Base-Calling of Automated Sequencer Traces Using Phred. I. Accuracy Assessment
Abstract
The availability of massive amounts of DNA sequence information has begun to revolutionize the practice of biology. As a result, current large-scale sequencing output, while impressive, is not adequate to keep pace with growing demand and, in particular, is far short of what will be required to obtain the 3-billion-base human genome sequence by the target date of 2005. To reach this goal, improved automation will be essential, and it is particularly important that human involvement in sequence data processing be significantly reduced or eliminated. Progress in this respect will require both improved accuracy of the data processing software and reliable accuracy measures to reduce the need for human involvement in error correction and make human review more efficient. Here, we describe one step toward that goal: a base-calling program for automated sequencer traces,