TopKLists: a comprehensive R package for statistical inference, stochastic aggregation, and visualization of multiple omics ranked lists

Michael G. Schimek1, Eva Budínská2, Karl Kugler3, Vendula Švendová1, Jie Ding4, Shili Lin5
1Statistical Bioinformatics, IMI, Medical University of Graz, Auenbruggerplatz 2/V, 8036 Graz, Austria
2Bioinformatics in Translational Research, IBA, Masaryk University, Kotlarska 2, 61137 Brno, Czech Republic
3Institute for Bioinformatics and Systems Biology, Helmholtz Centre Munich, Ingolstädter Landstrasse 1, 85764 Neuherberg, Germany
4Stanford Cancer Institute, Stanford University, 265 Campus Drive, Stanford, CA 94305-5456, USA
5Department of Statistics, The Ohio State University, 1958 Neil Avenue, Columbus OH 43210, USA

Abstract

High-throughput sequencing techniques are increasingly affordable and produce massive amounts of data. Together with data from other high-throughput technologies, such as microarrays, this has produced an enormous amount of resources in databases. The collection of these valuable data has been routine for more than a decade. Although the technologies differ, many experiments share the same goal. For instance, the aims of RNA-seq studies often coincide with those of differential gene expression experiments based on microarrays. It would therefore be logical to utilize all available data. However, biostatistical tools for integrating results obtained from different technologies are lacking. Although diverse technological platforms produce different raw data, experiments with the same goal have one thing in common: all their outcomes can be transformed into a platform-independent data format, namely rankings, for the same set of items. Here we present the
