Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences
Abstract
Keywords
References
Apweiler, 2004, UniProt: the Universal Protein knowledgebase, Nucleic Acids Res., 32, D115, 10.1093/nar/gkh131
Bourne, 2004, The distribution and query systems of the RCSB Protein Data Bank, Nucleic Acids Res., 32, D223, 10.1093/nar/gkh096
Li, 2001, Clustering of highly homologous sequences to reduce the size of large protein databases, Bioinformatics, 17, 282, 10.1093/bioinformatics/17.3.282
Li, 2002, Sequence clustering strategies improve remote homology recognitions while reducing search times, Protein Eng., 15, 643
Li, 2002, Tolerating some redundancy significantly speeds up clustering of large protein databases, Bioinformatics, 18, 77