SaAlign: Multiple DNA/RNA sequence alignment and phylogenetic tree construction tool for ultra-large datasets and ultra-long sequences based on suffix array
Tài liệu tham khảo
Hong, 2021, ENJ algorithm can construct triple phylogenetic trees, Mol Ther Nucleic Acids, 23, 286, 10.1016/j.omtn.2020.11.004
Kolomvatsos, 2019, A distributed, proactive intelligent scheme for securing quality in large scale data processing, Computing, 101, 1687, 10.1007/s00607-018-0683-9
Wooley, 2010, A primer on metagenomics, PLoS Comput Biol, 6, 10.1371/journal.pcbi.1000667
Wooley, 2010, Metagenomics: facts and artifacts, and computational challenges, J Comp Sci Technol, 25, 71, 10.1007/s11390-010-9306-4
Godini, 2019, A brief overview of the concepts, methods and computational tools used in phylogenetic tree construction and gene prediction, Meta Gene, 21, 10.1016/j.mgene.2019.100586
Smith, 2014, Buying in to bioinformatics: An introduction to commercial sequence analysis software, Briefings Bioinf, 16, 700, 10.1093/bib/bbu030
Nakamura, 2018, Parallelization of MAFFT for large-scale multiple sequence alignments, Bioinformatics, 34, 2490, 10.1093/bioinformatics/bty121
Katoh, 2002, MAFFT: A novel method for rapid multiple sequence alignment based on fast Fourier transform, Nucleic Acids Res, 30, 3059, 10.1093/nar/gkf436
Pal, 2016, Use of FFT in protein sequence comparison under their binary representations, Comput Mol Biosci, 06, 33, 10.4236/cmb.2016.62003
Mirarab, 2014, PASTA: ultra-large multiple sequence alignment for nucleotide and amino-acid sequences, J Comput Biol, 22, 377, 10.1089/cmb.2014.0156
Zhan, 2019, ProbPFP: a multiple sequence alignment algorithm combining hidden Markov model optimized by particle swarm optimization with partition function, BMC Bioinf, 20, 1, 10.1186/s12859-019-3132-7
Li, 2018, Minimap2: Pairwise alignment for nucleotide sequences, Bioinformatics, 34, 3094, 10.1093/bioinformatics/bty191
Jiang X, Fu X, Dong G, Li H. Research on Pairwise Sequence Alignment Needleman-Wunsch Algorithm 2017;141:1041–6. 10.2991/icmmcce-17.2017.187.
Lu, 2020, Parallel and distributed architecture of genetic algorithm on Apache Hadoop and Spark, Appl Soft Comp J, 95
Abuín, 2016, SparkBWA: speeding up the alignment of high-throughput DNA sequencing data, PLoS ONE, 11, 10.1371/journal.pone.0155461
Zou, 2015, HAlign: Fast multiple similar DNA/RNA sequence alignment based on the centre star strategy, Bioinformatics, 31, 2475, 10.1093/bioinformatics/btv177
Wan, 2017, HAlign-II: Efficient ultra-large multiple sequence alignment and phylogenetic tree reconstruction with distributed and parallel computing, Algorithms Mol Biol, 12, 1, 10.1186/s13015-017-0116-x
Song, 2020, Complete mitochondrial genome of Aspergillus japonicus from the built environment and its phylogenetic analysis, Mitochondrial DNA Part B, 5, 1445, 10.1080/23802359.2020.1735972
Merheb, 2019, Mitochondrial DNA, a powerful tool to decipher ancient human civilization from domestication to music, and to uncover historical murder cases, Cells, 8, 10.3390/cells8050433
Abuín, 2020, Big Data in metagenomics: Apache Spark vs MPI, PLoS ONE, 15, 10.1371/journal.pone.0239741
Junier, 2010, The Newick utilities: high-throughput phylogenetic tree processing in the Unix shell, Bioinformatics, 26, 1669, 10.1093/bioinformatics/btq243
Thompson, 1999, BAliBASE: a benchmark alignment database for the evaluation of multiple alignment programs, Bioinformatics, 15, 87, 10.1093/bioinformatics/15.1.87
Carrillo, 1988, The multiple sequence alignment problem in biology, SIAM J Appl Math, 48, 1073, 10.1137/0148063
Darling, 2010, progressiveMauve: multiple genome alignment with gene gain, loss and rearrangement, PLOS ONE, 5, 10.1371/journal.pone.0011147
Efron, 1996, Bootstrap confidence levels for phylogenetic trees, PNAS, 93, 13429, 10.1073/pnas.93.23.13429
Soltis, 2003, Applying the bootstrap in phylogeny reconstruction, Statistical Sci, 18, 256, 10.1214/ss/1063994980
Hill, 2008, Amdahl’s law in the multicore era, Computer, 41, 33, 10.1109/MC.2008.209
Li, 2003, ClustalW-MPI: ClustalW analysis using distributed and parallel computing, Bioinformatics, 19, 1585, 10.1093/bioinformatics/btg192
Niemenmaa, 2012, Hadoop-BAM: directly manipulating next generation sequencing data in the cloud, Bioinformatics, 28, 876, 10.1093/bioinformatics/bts054
Zou Q, Wan S, Zeng X. HPTree: Reconstructing phylogenetic trees for ultra-large unaligned DNA sequences via NJ model and Hadoop. 2016 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2016, p. 53–8. 10.1109/BIBM.2016.7822492.
Pearson, 1991, Searching protein sequence libraries: comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms, Genomics, 11, 635, 10.1016/0888-7543(91)90071-L
Su, 2017, Multiple sequence alignment based on a suffix tree and center-star strategy: a linear method for multiple nucleotide sequence alignment on spark parallel framework, J Comput Biol, 24, 1230, 10.1089/cmb.2017.0040
Saitou, 1987, The neighbor-joining method: a new method for reconstructing phylogenetic trees, Mol Biol Evol, 4, 406
Edgar, 2004, MUSCLE: A multiple sequence alignment method with reduced time and space complexity, BMC Bioinf, 5, 1, 10.1186/1471-2105-5-113
Zaharia, 2012, Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing, 15
Xin RS, Gonzalez JE, Franklin MJ, Stoica I. GraphX: A resilient distributed graph system on spark. 1st International Workshop on Graph Data Management Experiences and Systems, GRADES 2013 – Co-Located with SIGMOD/PODS 2013 2013. 10.1145/2484425.2484427.
Sun, 2013, A novel algorithm for DNA multiple sequence alignment based on the sliding window and the keyword tree, Int J Biosci, Biochem Bioinform, 3, 271
Na JC, Park H, Lee S, Hong M, Lecroq T, Mouchard L, et al. Suffix array of alignment: A practical index for similar data. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) 2013;8214 LNCS:243–54. 10.1007/978-3-319-02432-5_27.
Bingmann, 2018, Scalable string and suffix sorting, Algorith. Techn. Tools, 1
