Term identification in the biomedical literature

Journal of Biomedical Informatics - Volume 37 - Pages 512-526 - 2004
Michael Krauthammer1, Goran Nenadic2,3
1Department of Biomedical Informatics, Columbia Genome Center, Columbia University, New York, USA
2Department of Computation, UMIST, Manchester, UK
3National Centre for Text Mining, Manchester, UK

References

Gaizauskas R, Demetriou G, Humphreys K. Term recognition and classification in biological science journal articles. In: Proceedings of Workshop on Computational Terminology for Medical and Biological Applications. Patras, Greece; 2000. pp. 37–44

Hirschman, 2002, Rutabaga by any other name: extracting biological names, J. Biomed. Inform., 35, 247, 10.1016/S1532-0464(03)00014-5

Tuason O, Chen L, Liu H, Blake JA, Friedman C. Biological nomenclature: a source of lexical knowledge and ambiguity. In: Proceedings of Pacific Symposium on Biocomputing; 2004. pp. 238–49

Boeckmann, 2003, The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003, Nucleic Acids Res., 31, 365, 10.1093/nar/gkg095

Collier N, Nobata C, Tsujii J. Automatic term identification and classification in biological texts. In: Proceedings of Natural Language Pacific Rim Symposium. Beijing, China; 1999. pp. 369–74

Pruitt, 2001, RefSeq and LocusLink: NCBI gene-centered resources, Nucleic Acids Res., 29, 137, 10.1093/nar/29.1.137

Benson, 2000, GenBank, Nucleic Acids Res., 28, 15, 10.1093/nar/28.1.15

Altschul, 1997, Gapped BLAST and PSI-BLAST: a new generation of protein database search programs, Nucleic Acids Res., 25, 3389, 10.1093/nar/25.17.3389

Blake C, Pratt W. Better Rules, Fewer Features: A semantic approach to selecting features from text. In: Proceedings of IEEE Data Mining Conference. San Jose, California; 2001. pp. 59–66

Gaizauskas, 2003, Protein structures and information extraction from biological texts: the PASTA system, Bioinformatics, 19, 135, 10.1093/bioinformatics/19.1.135

Narayanaswamy M, Ravikumar KE, Vijay-Shanker K. A biological named entity recognizer. In: Proceedings of Pacific Symposium on Biocomputing. 2003. pp. 427–38

Hobbs, 2002, Information extraction from biomedical text, J. Biomed. Inform., 35, 260, 10.1016/S1532-0464(03)00015-7

Andrade, 1998, Automatic extraction of keywords from scientific text: application to the knowledge domain of protein families, Bioinformatics, 14, 600, 10.1093/bioinformatics/14.7.600

Hatzivassiloglou, 2001, Disambiguating proteins, genes, and RNA in text: a machine learning approach, Bioinformatics, 17, S97, 10.1093/bioinformatics/17.suppl_1.S97

Hodges, 1998, The Yeast Protein Database (YPD): a curated proteome database for Saccharomyces cerevisiae, Nucleic Acids Res., 26, 68, 10.1093/nar/26.1.68

Tanabe, 2002, Tagging gene and protein names in biomedical text, Bioinformatics, 18, 1124, 10.1093/bioinformatics/18.8.1124

Humphreys, 1998, The unified medical language system: an informatics research collaboration, J. Am. Med. Inform. Assoc., 5, 1, 10.1136/jamia.1998.0050001

Nenadic G, Rice S, Spasic I, Ananiadou S, Stapley BJ. Selecting text features for gene name classification: from documents to terms. In: Proceedings of NLP in Biomedicine, ACL 2003. Sapporo, Japan; 2003. pp. 121–8

Nenadic, 2003, Terminology-driven mining of biomedical literature, Bioinformatics, 19, 938, 10.1093/bioinformatics/btg105

Adar, 2002

Chang, 2002, Creating an online dictionary of abbreviations from medline, J. Am. Med. Inform. Assoc., 9, 612, 10.1197/jamia.M1139

Rimer, 1998, BioABACUS: a database of abbreviations and acronyms in biotechnology and computer science, Bioinformatics, 14, 888, 10.1093/bioinformatics/14.10.888

Yoshida, 2000, PNAD-CSS: A Workbench for Constructing a Protein name abbreviation dictionary, Bioinformatics, 16, 169, 10.1093/bioinformatics/16.2.169

Yu, 2002, Mapping abbreviations to full forms in biomedical articles, J. Am. Med. Inform. Assoc., 9, 262, 10.1197/jamia.M0913

Schwartz AS, Hearst MA. A simple algorithm for identifying abbreviation definitions in biomedical text. In: Proceedings of Pacific Symposium on Biocomputing. 2003. pp. 451–62

Nobata C, Collier N, Tsujii J. Automatic term identification and classification in biological texts. In: Proceedings of Natural Language Pacific Rim Symposium. 1999. pp. 369–74

Raychaudhuri, 2002, Associating genes with gene ontology codes using a maximum entropy analysis of biomedical literature, Genome Res., 12, 203, 10.1101/gr.199701

Seewald A. Towards recognizing domain and species from MEDLINE publications. In: Proceedings of European Workshops on Data Mining and Text Mining for Bioinformatics. 2003. pp. 51–8

Jacquemin, 2001

Jacquemin, 1999, NLP for Term Variant Extraction: A Synergy of Morphology, Lexicon and Syntax, 25

Yu, 2003, Extracting synonymous gene and protein terms from biological literature, Bioinformatics, 19, I340, 10.1093/bioinformatics/btg1047

Liu, 2002, Automatic resolution of ambiguous terms based on machine learning and conceptual relations in the UMLS, J. Am. Med. Inform. Assoc., 9, 621, 10.1197/jamia.M1101

Blaschke, 2002, Molecular biology nomenclature thwarts information-extraction progress, IEEE Intell. Syst., 17, 73

Ogren P, Cohen K, Acquaah-Mensah G, Eberlein J, Hunter L. The compositional structure of gene ontology terms. In: Proceedings of Pacific Symposium on Biocomputing; 2004. pp. 214–25

Hisamitsu T, Tsujii J. Measuring term representativeness. In: Pazienza MT, editor. Information Extraction in the Web Era, LNAI 2700. New York, NY: Springer; 2003. pp. 45–76