Minfi: a flexible and comprehensive Bioconductor package for the analysis of Infinium DNA methylation microarrays

Bioinformatics (Oxford, England) - Volume 30, Issue 10 - Pages 1363-1369 - 2014
Martin J. Aryee, Andrew E. Jaffe, Héctor Corrada Bravo, Christine Ladd‐Acosta, Andrew P. Feinberg, Kasper D. Hansen, Rafael A. Irizarry
1Department of Pathology, Massachusetts General Hospital and Harvard Medical School, Boston, MA 02114, USA, 2Department of Biostatistics, Johns Hopkins School of Public Health, 615 N Wolfe Street, Baltimore, MD 21205, USA, 3Lieber Institute of Brain Development, Johns Hopkins Medical Campus, 855 N Wolfe Street, Baltimore, MD 21205, USA, 4Department of Computer Science, University of Maryland, College Park, MD 20742, USA, 5Department of Epidemiology, Johns Hopkins School of Public Health, 615 N Wolfe Street, Baltimore, MD 21205, USA, 6Center for Epigenetics and Department of Medicine, Johns Hopkins University School of Medicine, 570 Rangos, 725 N Wolfe Street, Baltimore, MD 21205, USA and 7Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA

Abstract

Motivation: The recently released Infinium HumanMethylation450 array (the ‘450k’ array) provides a high-throughput assay to quantify DNA methylation (DNAm) at ∼450 000 loci across a range of genomic features. Although less comprehensive than high-throughput sequencing-based techniques, this product is more cost-effective and promises to be the most widely used DNAm high-throughput measurement technology over the next several years.

Results: Here we describe a suite of computational tools that incorporate state-of-the-art statistical techniques for the analysis of DNAm data. The software is structured to easily adapt to future versions of the technology. We include methods for preprocessing, quality assessment and detection of differentially methylated regions from the kilobase to the megabase scale. We show how our software provides a powerful and flexible development platform for future methods. We also illustrate how our methods empower the technology to make discoveries previously thought to be possible only with sequencing-based methods.

Availability and implementation: http://bioconductor.org/packages/release/bioc/html/minfi.html

Contact:  [email protected]; [email protected]

Supplementary information: Supplementary data are available at Bioinformatics online.
