A probabilistic approach for long read-length DNA sequence analysis
Abstract
This paper introduces a new algorithm for DNA sequence analysis that uses a reference DNA sequence to estimate base positions and a probabilistic model of trace peaks. The algorithm has been applied to long read-length DNA sequences, and its performance has been compared with that of the base-calling program Phred. After cross-matching with a finished consensus, the results show that the new algorithm significantly improves both the final sequence read-length and the number of correct bases extracted from DNA traces.
Keywords
#DNA #Bioinformatics #Genomics #Signal processing algorithms #Phase estimation #Image sequence analysis #Signal analysis #Libraries #Algorithm design and analysis #Humans