Implications of non-uniqueness in phylogenetic deconvolution of bulk DNA samples of tumors
Tóm tắt
Tumors exhibit extensive intra-tumor heterogeneity, the presence of groups of cellular populations with distinct sets of somatic mutations. This heterogeneity is the result of an evolutionary process, described by a phylogenetic tree. In addition to enabling clinicians to devise patient-specific treatment plans, phylogenetic trees of tumors enable researchers to decipher the mechanisms of tumorigenesis and metastasis. However, the problem of reconstructing a phylogenetic tree T given bulk sequencing data from a tumor is more complicated than the classic phylogeny inference problem. Rather than observing the leaves of T directly, we are given mutation frequencies that are the result of mixtures of the leaves of T. The majority of current tumor phylogeny inference methods employ the perfect phylogeny evolutionary model. The underlying Perfect Phylogeny Mixture (PPM) combinatorial problem typically has multiple solutions. We prove that determining the exact number of solutions to the PPM problem is #P-complete and hard to approximate within a constant factor. Moreover, we show that sampling solutions uniformly at random is hard as well. On the positive side, we provide a polynomial-time computable upper bound on the number of solutions and introduce a simple rejection-sampling based scheme that works well for small instances. Using simulated and real data, we identify factors that contribute to and counteract non-uniqueness of solutions. In addition, we study the sampling performance of current methods, identifying significant biases. Awareness of non-uniqueness of solutions to the PPM problem is key to drawing accurate conclusions in downstream analyses based on tumor phylogenies. This work provides the theoretical foundations for non-uniqueness of solutions in tumor phylogeny inference from bulk DNA samples.
Tài liệu tham khảo
Nowell PC. The clonal evolution of tumor cell populations. Science. 1976;194(4260):23–8.
Tabassum DP, Polyak K. Tumorigenesis: it takes a village. Nat Rev Cancer. 2015;15(8):473–83.
Schwartz R, Schäffer AA. The evolution of tumour phylogenetics: principles and practice. Nat Rev Genet. 2017;18(4):213–29.
Fisher R, Pusztai L, Swanton C. Cancer heterogeneity: implications for targeted therapeutics. Br J Cancer. 2013;108(3):479–85.
Venkatesan S, Swanton C. Tumor evolutionary principles: how intratumor heterogeneity influences cancer treatment and outcome. American Society of Clinical Oncology educational book. American Society of Clinical Oncology. Meeting. 2016;35:141–9.
Kandoth C, McLellan MD, Vandin F, Ye K, Niu B, Lu C, Xie M, Zhang Q, McMichael JF, Wyczalkowski MA, Leiserson MDM, Miller CA, Welch JS, Walter MJ, Wendl MC, Ley TJ, Wilson RK, Raphael BJ, Ding L. Mutational landscape and significance across 12 major cancer types. Nature. 2013;502(7471):333–9.
Gerstung M, Jolly C, Leshchiner I, Dentro SC, Gonzalez S, Mitchell TJ, Rubanova Y, Anur P, Rosebrock D, Yu K, Tarabichi M, Deshwar A, Wintersinger J, Kleinheinz K, Vázquez-García I, Haase K, Sengupta S, Macintyre G, Malikic S, Donmez N, Livitz DG, Cmero M, Demeulemeester J, Schumacher S, Fan Y, Yao X, Lee J, Schlesner M, Boutros PC, Bowtell DD, Zhu H, Getz G, Imielinski M, Beroukhim R, Sahinalp SC, Ji Y, Peifer M, Markowetz F, Mustonen V, Yuan K, Wang W, Morris QD, Spellman PT, Wedge DC, Van Loo P, Evolution P, Group HW, network P. The evolutionary history of 2,658 cancers. bioRxiv. 2017;161562.
Strino F, Parisi F, Micsinai M, Kluger Y. Trap: a tree approach for fingerprinting subclonal tumor composition. Nucleic Acids Res. 2013;41(17):165. https://doi.org/10.1093/nar/gkt641. http://nar.oxfordjournals.org/content/41/17/e165.full.pdf+html.
Jiao W, Vembu S, Deshwar AG, Stein L, Morris Q. Inferring clonal evolution of tumors from single nucleotide somatic mutations. BMC Bioinform. 2014;15:35. https://doi.org/10.1186/1471-2105-15-35.
El-Kebir M, Oesper L, Acheson-Field H, Raphael BJ. Reconstruction of clonal trees and tumor composition from multi-sample sequencing data. Bioinformatics. 2015;31(12):62–70.
Malikic S, McPherson AW, Donmez N, Sahinalp CS. Clonality inference in multiple tumor samples using phylogeny. Bioinformatics. 2015;31(9):1349–56. https://doi.org/10.1093/bioinformatics/btv003.
Yuan K, Sakoparnig T, Markowetz F, Beerenwinkel N. BitPhylogeny: a probabilistic framework for reconstructing intra-tumor phylogenies. Genome Biol. 2015;16(1):1.
Popic V, Salari R, Hajirasouliha I, Kashef-Haghighi D, West RB, Batzoglou S. Fast and scalable inference of multi-sample cancer lineages. Genome Biol. 2015;16(1):91.
Deshwar AG, Vembu S, Yung CK, Jang GH, Stein L, Morris Q. PhyloWGS: reconstructing subclonal composition and evolution from whole-genome sequencing of tumors. Genome Biol. 2015;16(1):35.
Jiang Y, Qiu Y, Minn AJ, Zhang NR. Assessing intratumor heterogeneity and tracking longitudinal and spatial clonal evolutionary history by next-generation sequencing. Proc Natl Acad Sci USA. 2016;113(37):5528–37.
Malikic S, Jahn K, Kuipers J, Sahinalp C, Beerenwinkel N. Integrative inference of subclonal tumour evolution from single-cell and bulk sequencing data. bioRxiv. 2017;234914.
McGranahan N, Favero F, de Bruin EC, Birkbak NJ, Szallasi Z, Swanton C. Clonal status of actionable driver events and the timing of mutational processes in cancer evolution. Sci Transl Med. 2015;7(283):283–5428354.
Jamal-Hanjani M, Wilson GA, McGranahan N, Birkbak NJ, Watkins TBK, Veeriah S, Shafi S, Johnson DH, Mitter R, Rosenthal R, Salm M, Horswell S, Escudero M, Matthews N, Rowan A, Chambers T, Moore DA, Turajlic S, Xu H, Lee SM, Forster MD, Ahmad T, Hiley CT, Abbosh C, Falzon M, Borg E, Marafioti T, Lawrence D, Hayward M, Kolvekar S, Panagiotopoulos N, Janes SM, Thakrar R, Ahmed A, Blackhall F, Summers Y, Shah R, Joseph L, Quinn AM, Crosbie PA, Naidu B, Middleton G, Langman G, Trotter S, Nicolson M, Remmen H, Kerr K, Chetty M, Gomersall L, Fennell DA, Nakas A, Rathinam S, Anand G, Khan S, Russell P, Ezhil V, Ismail B, Irvin-sellers M, Prakash V, Lester JF, Kornaszewska M, Attanoos R, Adams H, Davies H, Dentro S, Taniere P, O’Sullivan B, Lowe HL, Hartley JA, Iles N, Bell H, Ngai Y, Shaw JA, Herrero J, Szallasi Z, Schwarz RF, Stewart A, Quezada SA, Le Quesne J, Van Loo P, Dive C, Hackshaw A, Swanton C. Tracking the evolution of non-small-cell lung cancer. N Engl J Med. 2017;376(22):2109–21.
Zhang AW, McPherson A, Milne K, Kroeger DR, Hamilton PT, Miranda A, Funnell T, Little N, de Souza CPE, Laan S, LeDoux S, Cochrane DR, Lim JLP, Yang W, Roth A, Smith MA, Ho J, Tse K, Zeng T, Shlafman I, Mayo MR, Moore R, Failmezger H, Heindl A, Wang YK, Bashashati A, Grewal DS, Brown SD, Lai D, Wan ANC, Nielsen CB, Huebner C, Tessier-Cloutier B, Anglesio MS, Bouchard-Côté A, Yuan Y, Wasserman WW, Gilks CB, Karnezis AN, Aparicio S, McAlpine JN, Huntsman DG, Holt Ra, Nelson BH, Shah SP. Interfaces of malignant and immunologic clonal dynamics in ovarian cancer. Cell. 2018;173(7):1755–176922.
Łuksza M, Riaz N, Makarov V, Balachandran VP, Hellmann MD, Solovyov A, Rizvi NA, Merghoub T, Levine AJ, Chan TA, Wolchok JD, Greenbaum BD. A neoantigen fitness model predicts tumour response to checkpoint blockade immunotherapy. Nature. 2017;551(7681):517.
Turajlic S, Xu H, Litchfield K, Rowan A, Horswell S, Chambers T, O’Brien T, Lopez JI, Watkins TBK, Nicol D, Stares M, Challacombe B, Hazell S, Chandra A, Mitchell TJ, Au L, Eichler-Jonsson C, Jabbar F, Soultati A, Chowdhury S, Rudman S, Lynch J, Fernando A, Stamp G, Nye E, Stewart A, Xing W, Smith JC, Escudero M, Huffman A, Matthews N, Elgar G, Phillimore B, Costa M, Begum S, Ward S, Salm M, Boeing S, Fisher R, Spain L, Navas C, Gronroos E, Hobor S, Sharma S, Aurangzeb I, Lall S, Polson A, Varia M, Horsfield C, Fotiadis N, Pickering L, Schwarz RF, Silva B, Herrero J, Luscombe NM, Jamal-Hanjani M, Rosenthal R, Birkbak NJ, Wilson GA, Pipek O, Ribli D, Krzystanek M, Csabai I, Szallasi Z, Gore M, McGranahan N, Van Loo P, Campbell P, Larkin J, Swanton C. Deterministic evolutionary trajectories influence primary tumor growth: TRACERx renal. Cell. 2018;173(3):595–610.
Turajlic S, Xu H, Litchfield K, Rowan A, Chambers T, Lopez JI, Nicol D, O’Brien T, Larkin J, Horswell S, Stares M, Au L, Jamal-Hanjani M, Challacombe B, Chandra A, Hazell S, Eichler-Jonsson C, Soultati A, Chowdhury S, Rudman S, Lynch J, Fernando A, Stamp G, Nye E, Jabbar F, Spain L, Lall S, Guarch R, Falzon M, Proctor I, Pickering L, Gore M, Watkins TBK, Ward S, Stewart A, DiNatale R, Becerra MF, Reznik E, Hsieh JJ, Richmond TA, Mayhew GF, Hill SM, McNally CD, Jones C, Rosenbaum H, Stanislaw S, Burgess DL, Alexander NR, Swanton C. Tracking Cancer evolution reveals constrained routes to metastases: TRACERx renal. Cell. 2018;173(3):581–94.
El-Kebir M, Satas G, Oesper L, Raphael BJ. Inferring the mutational history of a tumor using multi-state perfect phylogeny mixtures. Cell Syst. 2016;3(1):43–53.
Pradhan D, El-Kebir M. On the Non-uniqueness of solutions to the perfect phylogeny mixture problem. RECOMB comparative genomics. Cham: Springer; 2018. p. 277–93.
Kirchhoff G. Ueber die Auflösung der Gleichungen, auf welche man bei der Untersuchung der linearen Vertheilung galvanischer Ströme geführt wird. Annalen der Physik. 1847;148:497–508. https://doi.org/10.1002/andp.18471481202.
Tutte WT. The dissection of equilateral triangles into equilateral triangles. Math Proc Camb Philos Soc. 1948;44(4):463–82.
Gabow HN, Myers EW. Finding all spanning trees of directed and undirected graphs. SIAM J Comput. 1978;7(3):280–7. https://doi.org/10.1137/0207024.
Creignou N, Hermann M. On #P completeness of some counting problems. Research Report RR-2144, INRIA. 1993. https://hal.inria.fr/inria-00074528.
Jerrum M. Counting, sampling and integrating: algorithms and complexity. New York: Springer; 2003.
Deshwar AG, Boyles L, Wintersinger J, Boutros PC, Teh YW, Morris Q. Abstract B2–59: PhyloSpan: using multi-mutation reads to resolve subclonal architectures from heterogeneous tumor samples. Cancer Res. 2015;75(22 Supplement 2):2–59259.
Malikic S, Jahn K, Kuipers J, Sahinalp SC, Beerenwinkel N. Integrative inference of subclonal tumour evolution from single-cell and bulk sequencing data. Nat Commun. 2019;10(1):1–12.
Propp JG, Wilson DB. How to get a perfectly random sample from a generic Markov Chain and generate a random spanning tree of a directed graph. J Algorithms. 1998;27(2):170–217.
El-Kebir M, Satas G, Raphael BJ. Inferring parsimonious migration histories for metastatic cancers. Nat Genet. 2018;50(5):718–26.
Roth A, Khattra J, Yap D, Wan A, Laks E, Biele J, Ha G, Aparicio S, Bouchard-Côté A, Shah SP. PyClone: statistical inference of clonal population structure in cancer. Nat Methods. 2014;11(4):396–8.
El-Kebir M. Parsimonious migration history problem: complexity and algorithms. In: Parida L, Ukkonen E, editors. 18th international workshop on algorithms in bioinformatics (WABI 2018). Leibniz international proceedings in informatics (LIPIcs), 2018;113:24–12414. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany. https://doi.org/10.4230/LIPIcs.WABI.2018.24. http://drops.dagstuhl.de/opus/volltexte/2018/9326.