The Genome Sequence of the Malaria Mosquito Anopheles gambiae
Abstract
Keywords
#genome sequence #Anopheles gambiae #malaria vector #single nucleotide polymorphism #physiological adaptation #cell adhesion #immunity #PEST strain
References
Y. T. Touré et al., Parassitologia 40, 477 (1998).
See supporting data on Science Online.
A mate pair is a set of two sequence reads derived from either end of a clone insert such that their relative orientation and distance apart are known.
Unitigs are sets of sequence reads that have been uniquely assembled into a single contiguous sequence such that no fragment in the unitig overlaps a fragment not in the unitig. The depth of reads in a unitig and the mate pair structure between it and other unitigs are used to determine whether a given unitig has single or multiple copies in the genome. We define contigs as sets of overlapping unitigs. Unlike scaffolds, which comprise ordered and oriented contigs, unitigs and contigs do not have internal gaps.
A nucleotide position was considered to be a SND if the respective column of the multialignment satisfied the following three criteria. First, two different bases (A, C, G, T, or unknown) had to be observed, each in at least two fragments. Second, the total number of fragments covering the column had to be ≤15 [halfway between single (10×) and double (20×) coverage] to reduce the frequency of false positives resulting from overcollapsed repeats. Third, we eliminated all but one of a run of adjacent SND columns, so that block mismatches or (more likely) block indels (insertions/deletions) were counted only once.
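The three criteria in this note can be sketched in a few lines of Python. This is only an illustration, not the authors' pipeline: the column representation (one character per covering fragment), the function names, and the choice to keep the first column of each adjacent run are assumptions made here. The note does not say which column of a run was retained.

```python
from collections import Counter

MAX_COVERAGE = 15           # coverage ceiling stated in the note
MIN_FRAGMENTS_PER_BASE = 2  # each of the two bases must appear in >= 2 fragments


def is_snd_column(column):
    """Apply criteria 1 and 2 to one alignment column.

    `column` is a sequence of characters, one per fragment covering the
    position (hypothetical representation, e.g. "AACC").
    """
    if len(column) > MAX_COVERAGE:
        return False  # likely an overcollapsed repeat
    counts = Counter(column)
    supported = [base for base, n in counts.items()
                 if n >= MIN_FRAGMENTS_PER_BASE]
    return len(supported) >= 2


def call_snds(columns):
    """Apply criterion 3: keep one column per run of adjacent SND columns.

    Assumption: the first column of each run is the one kept.
    """
    calls = []
    previous_was_snd = False
    for index, column in enumerate(columns):
        snd = is_snd_column(column)
        if snd and not previous_was_snd:
            calls.append(index)
        previous_was_snd = snd
    return calls
```

For example, `call_snds(["AAAA", "AACC", "AACC", "AAAC", "AAGG"])` keeps indices 1 and 4: columns 1 and 2 form one run and are counted once, and column 3 fails because its second base appears in only one fragment.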
SND “balance” is the ratio of the number of fragments showing the second most frequent character in a column to the number showing the most frequent character.
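As a rough illustration, the balance ratio could be computed as below. The function name and the string representation of a column are hypothetical, and returning 0.0 for a column with only one character is a convention chosen here, not something the note specifies.

```python
from collections import Counter


def snd_balance(column):
    """Ratio of second-most-frequent to most-frequent character counts.

    `column` holds one character per fragment covering the position.
    Assumption: a column with a single distinct character yields 0.0.
    """
    top_two = Counter(column).most_common(2)
    if len(top_two) < 2:
        return 0.0
    most_frequent_count = top_two[0][1]
    second_count = top_two[1][1]
    return second_count / most_frequent_count
```

A column such as `"AAAACC"` gives 2/4 = 0.5, while an evenly split column such as `"AACC"` gives 1.0.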
SND “association” shows, for a sliding window of 100 kb, the fraction of polymorphic columns that can be partitioned into two consistent haplotypes. For an SND column A of the multiple sequence alignment and the previous such column B, each fragment might have one of four possible haplotype phases: AB, Ab, aB, or ab, where the upper- and lowercase letters indicate alternative nucleotides. We say that columns A and B are consistent if only two of these four haplotypes are present. For the test to be nontrivial, we require that at least two fragments be observed with each of the two haplotype phases.
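The consistency test between adjacent SND columns might be sketched as follows. Several details are assumptions made for this illustration only: each column is a dictionary mapping a hypothetical fragment identifier to its base, a pair counts as testable when at least two phases each have two or more supporting fragments, and the caller supplies the SND columns that fall inside one 100-kb window in genomic order.

```python
from collections import Counter


def pair_status(col_a, col_b, min_support=2):
    """Classify a pair of SND columns.

    Returns True (consistent), False (inconsistent), or None when the
    test is trivial. Only fragments covering both columns are phased.
    """
    shared = col_a.keys() & col_b.keys()
    phases = Counter((col_a[frag], col_b[frag]) for frag in shared)
    well_supported = [p for p, n in phases.items() if n >= min_support]
    if len(well_supported) < 2:
        return None  # not enough fragments per phase to be informative
    # Consistent when exactly two haplotype phases are observed.
    return len(phases) == 2


def association(columns):
    """Fraction of testable adjacent pairs in a window that are consistent.

    Returns None if no pair in the window is testable.
    """
    statuses = [pair_status(prev, cur)
                for prev, cur in zip(columns, columns[1:])]
    tested = [s for s in statuses if s is not None]
    if not tested:
        return None
    return sum(tested) / len(tested)
```

Under these assumptions, a pair in which a third phase appears in even a single fragment is counted as inconsistent. A stricter or looser treatment of weakly supported phases would change the reported fraction.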
M. Ashburner, Drosophila: A Laboratory Handbook (Cold Spring Harbor Laboratory Press, Plainview, NY, 1989), p. 74.
F. H. Collins, unpublished data.
On the basis of empirical tests, homologous proteins were required to be one of the five best mutual Blast hits within the entire genome, to fall within 15 gene calls of the closest neighboring pair, and to consist of three or more spatial matches.
S. L. Salzberg, R. Wides, unpublished data.
The complete hierarchy of InterPro entries is described at www.ebi.ac.uk/interpro; the hierarchy for GO is described at www.geneontology.org.
A. N. Clements, Biology of Mosquitoes, Vol. I: Development, Nutrition, Reproduction (Chapman & Hall, Wallingford, UK, 1992).
Supported in part by NIH grant U01AI50687 (R.A.H.) and grants U01AI48846 and R01AI44273 (F.H.C.) on behalf of the Anopheles gambiae Genome Consortium, and by the French Ministry of Research. We thank K. Aultman (NIAID) for her insights and effective coordination, D. Lilley (Celera) for competent financial and administrative management, and all members of the sequencing and support teams at the sequencing centers Celera, Genoscope, and TIGR.