IPCAPS: an R package for iterative pruning to capture population structure

Kridsadakorn Chaichoompu1, Fentaw Abegaz1, Sissades Tongsima2, Philip Shaw3, Anavaj Sakuntabhai4, Luísa Pereira5, Kristel Van Steen6
1GIGA-R Medical Genomics - BIO3, University of Liege, Avenue de l'Hôpital 11, 4000, Liege, Belgium
2Genome Technology Research Unit, National Center for Genetic Engineering and Biotechnology, 113 Thailand Science Park, Phahonyothin Road, Khlong Neung, Khlong Luang, Pathum Thani, 12120, Thailand
3Medical Molecular Biology Research Unit, National Center for Genetic Engineering and Biotechnology, 113 Thailand Science Park, Phahonyothin Road, Khlong Neung, Khlong Luang, Pathum Thani, 12120, Thailand
4Functional Genetics of Infectious Diseases Unit, Institut Pasteur, 25-28, rue du Docteur Roux, 75015, Paris, France
5Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Rua Alfredo Allen 208, 4200-135 Porto, Portugal
6WELBIO (Walloon Excellence in Lifesciences and Biotechnology), Avenue Pasteur 6, 1300, Wavre, Belgium

Abstract

Keywords


References

Neuditschko M, Khatkar MS, Raadsma HW. NetView: a high-definition network-visualization approach to detect fine-scale population structures from genome-wide patterns of variation. PLoS One. 2012;7:e48375.

Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet. 2006;38:904–9.

Lawson DJ, Hellenthal G, Myers S, Falush D. Inference of population structure using dense haplotype data. PLoS Genet. 2012;8:e1002453.

Corander J, Marttinen P, Sirén J, Tang J. Enhanced Bayesian modelling in BAPS software for learning genetic structures of populations. BMC Bioinformatics. 2008;9:539.

Intarapanich A, Shaw PJ, Assawamakin A, Wangkumhang P, Ngamphiw C, Chaichoompu K, et al. Iterative pruning PCA improves resolution of highly structured populations. BMC Bioinformatics. 2009;10:382.

Limpiti T, Intarapanich A, Assawamakin A, Shaw PJ, Wangkumhang P, Piriyapongsa J, et al. Study of large and highly stratified population datasets by combining iterative pruning principal component analysis and structure. BMC Bioinformatics. 2011;12:255.

Chaichoompu K, Abegaz F, Tongsima S, Shaw PJ, Sakuntabhai A, Cavadas B, et al. A methodology for unsupervised clustering using iterative pruning to capture fine-scale structure. bioRxiv. 2017;234989.

Chang CC, Chow CC, Tellier LC, Vattikuti S, Purcell SM, Lee JJ. Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience. 2015;4.

Lebret R, Iovleff S, Langrognet F, Biernacki C, Celeux G, Govaert G. Rmixmod: The R Package of the model-based unsupervised, supervised, and semi-supervised Classification Mixmod Library. J Stat Softw. 2015;67.

Clayton D. snpStats: SnpMatrix and XSnpMatrix classes and methods. R package version 1.32.0. 2018. Available from: https://doi.org/10.18129/B9.bioc.snpStats.

Balding DJ, Nichols RA. A method for quantifying differentiation between populations at multi-allelic loci and its implications for investigating identity and paternity. Genetica. 1995;96:3–12.

Liu L, Zhang D, Liu H, Arendt C. Robust methods for population stratification in genome wide association studies. BMC Bioinformatics. 2013;14:132.

Gibbs RA, Belmont JW, Hardenbol P, Willis TD, Yu F, Yang H, et al. The international HapMap project. Nature. 2003;426:789–96.

Alanis-Lobato G, Cannistraci CV, Eriksson A, Manica A, Ravasi T. Highlighting nonlinear patterns in population genetics datasets. Sci Rep. 2015;5:8140.