Clumpak: a program for identifying clustering modes and packaging population structure inferences across K

Molecular Ecology Resources - Volume 15, Issue 5 - Pages 1179-1191 - 2015

Naama M. Kopelman¹, Jonathan Mayzel¹, Mattias Jakobsson², Noah A. Rosenberg³, Itay Mayrose¹

¹Department of Molecular Biology and Ecology of Plants, Tel Aviv University, Ramat Aviv, 69978, Israel.

²Department of Evolutionary Biology and SciLife Lab, Uppsala University, Uppsala 75236, Sweden

³Department of Biology, Stanford University, Stanford, CA 94305, USA

Abstract

The identification of the genetic structure of populations from multilocus genotype data has become a central component of modern population-genetic data analysis. Application of model-based clustering programs often entails a number of steps, in which the user considers different modelling assumptions, compares results across different predetermined values of the number of assumed clusters (a parameter typically denoted K), examines multiple independent runs for each fixed value of K, and distinguishes among runs belonging to substantially distinct clustering solutions. Here, we present Clumpak (Cluster Markov Packager Across K), a method that automates the postprocessing of results of model-based population structure analyses. For analysing multiple independent runs at a single K value, Clumpak identifies sets of highly similar runs, separating distinct groups of runs that represent distinct modes in the space of possible solutions. This procedure, which generates a consensus solution for each distinct mode, is performed by the use of a Markov clustering algorithm that relies on a similarity matrix between replicate runs, as computed by the software Clumpp. Next, Clumpak identifies an optimal alignment of inferred clusters across different values of K, extending a similar approach implemented for a fixed K in Clumpp and simplifying the comparison of clustering results across different K values. Clumpak incorporates additional features, such as implementations of methods for choosing K and comparing solutions obtained by different programs, models, or data subsets. Clumpak, available at http://clumpak.tau.ac.il, simplifies the use of model-based analyses of population structure in population genetics and molecular ecology.
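The mode-identification step described above (Markov clustering applied to a similarity matrix between replicate runs) can be illustrated with a minimal sketch. This is not Clumpak's code: the function name `markov_cluster`, the fixed `threshold` and `inflation` values, and the toy similarity matrix are all assumptions made for illustration, and the pairwise similarities that Clumpp would compute are simply typed in by hand here.

```python
import numpy as np


def markov_cluster(similarity, threshold=0.0, inflation=2.0,
                   max_iter=100, tol=1e-8):
    """Group replicate runs into modes with a bare-bones Markov clustering pass.

    `similarity` is a symmetric run-by-run matrix (hypothetical input; in the
    workflow described in the abstract it would come from Clumpp).
    """
    sim = np.asarray(similarity, dtype=float)
    # Drop edges below the similarity threshold (assumed fixed here).
    adj = np.where(sim >= threshold, sim, 0.0)
    # Self-loops guarantee that every column has positive mass.
    np.fill_diagonal(adj, 1.0)
    # Column-stochastic transition matrix of a random walk on the run graph.
    m = adj / adj.sum(axis=0, keepdims=True)

    for _ in range(max_iter):
        expanded = m @ m                      # expansion: two-step random walk
        inflated = expanded ** inflation      # inflation: favour strong flows
        inflated /= inflated.sum(axis=0, keepdims=True)
        converged = np.abs(inflated - m).max() < tol
        m = inflated
        if converged:
            break

    # Each row that retains mass is an attractor; the columns it reaches
    # form one cluster (mode). Identical member sets are reported once.
    seen, clusters = set(), []
    for row in m:
        members = frozenset(np.flatnonzero(row > 1e-6).tolist())
        if members and members not in seen:
            seen.add(members)
            clusters.append(sorted(members))
    return sorted(clusters, key=lambda c: (-len(c), c))


# Toy example: five runs at one K; runs 0-2 resemble each other, as do runs 3-4.
toy_similarity = [
    [1.00, 0.97, 0.95, 0.40, 0.42],
    [0.97, 1.00, 0.96, 0.41, 0.39],
    [0.95, 0.96, 1.00, 0.43, 0.40],
    [0.40, 0.41, 0.43, 1.00, 0.96],
    [0.42, 0.39, 0.40, 0.96, 1.00],
]
modes = markov_cluster(toy_similarity, threshold=0.8)
print(modes)
```

With the threshold set to 0.8, the low cross-group similarities are discarded and the sketch separates the runs into a larger mode (runs 0, 1, 2) and a smaller one (runs 3, 4). Averaging the aligned membership matrices within each group would then give one consensus solution per mode, which is the role the abstract assigns to this step.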
