Term identification in the biomedical literature

Journal of Biomedical Informatics - Volume 37 - Pages 512-526 - 2004

Michael Krauthammer¹, Goran Nenadic²,³

¹Department of Biomedical Informatics, Columbia Genome Center, Columbia University, New York, USA

²Department of Computation, UMIST, Manchester, UK

³National Centre for Text Mining, Manchester, UK

References

Gaizauskas R, Demetriou G, Humphreys K. Term recognition and classification in biological science journal articles. In: Proceedings of Workshop on Computational Terminology for Medical and Biological Applications. Patras, Greece; 2000. pp. 37–44

Hirschman, 2002, Rutabaga by any other name: extracting biological names, J. Biomed. Inform., 35, 247, 10.1016/S1532-0464(03)00014-5

Tuason O, Chen L, Liu H, Blake JA, Friedman C. Biological nomenclature: a source of lexical knowledge and ambiguity. In: Proceedings of Pacific Symposium on Biocomputations; 2004. pp. 238–49

The FlyBase database of the Drosophila genome projects and community literature. Nucleic Acids Res 2003;31(1):172–5

Boeckmann, 2003, The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003, Nucleic Acids Res, 31, 365, 10.1093/nar/gkg095

Collier N, Nobata C, Tsujii J. Automatic term identification and classification in biological texts. In: Proceedings of Natural Language Pacific Rim Symposium. Beijing, China; 1999. pp. 369–74

Pruitt, 2001, RefSeq and LocusLink: NCBI gene-centered resources, Nucleic Acids Res., 29, 137, 10.1093/nar/29.1.137

Ohta T, Tateisi Y, Mima H, Tsujii J. GENIA corpus: an annotated research abstract corpus in molecular biology domain. In: Proceedings of Human Language Technology Conference (HLT 2002). 2002. pp. 73–7

Nenadic G, Spasic I, Ananiadou S. Mining biomedical abstracts: What is in a term? In: Proceedings of International Joint Conference on NLP. Sanya, China; 2004. pp. 247–54

Krauthammer, 2000, Using BLAST for identifying gene and protein names in journal articles, Gene, 259, 245, 10.1016/S0378-1119(00)00431-5

Benson, 2000, Genbank, Nucleic Acids Res., 28, 15, 10.1093/nar/28.1.15

Altschul, 1990, Basic local alignment search tool, J. Mol. Biol., 215, 403, 10.1016/S0022-2836(05)80360-2

Altschul, 1997, Gapped BLAST and PSI-BLAST: a new generation of protein database search programs, Nucleic Acids Res., 25, 3389, 10.1093/nar/25.17.3389

Tsuruoka Y, Tsujii J. Probabilistic term variant generator for biomedical terms. In: Proceedings of 26th Annual ACM SIGIR Conference. 2003. pp. 167–73

Tsuruoka Y, Tsujii J. Boosting precision and recall of dictionary-based protein name recognition. In: Proceedings of NLP in Biomedicine, ACL 2003. Sapporo, Japan; 2003. pp. 41–8

Bourigault D, Gonzalez-Mullier I, Gros C. LEXTER, a Natural language processing tool for terminology extraction. In: Proceedings of EURALEX ’96. 1996. pp. 771–9

Blake C, Pratt W. Better Rules, Fewer Features: A semantic approach to selecting features from text. In: Proceedings of IEEE Data Mining Conference. San Jose, California; 2001. pp. 59–66

Ananiadou S. A Methodology for automatic term recognition. In: Proceedings of COLING-94. Kyoto, Japan; 1994. pp. 1034–8

Humphreys K, Demetriou G, Gaizauskas R. Two applications of information extraction to biological science journal articles: enzyme interactions and protein structures. In: Proceedings of Pacific Symposium on Biocomputations. 2000. pp. 505–16

Gaizauskas, 2003, Protein structures and information extraction from biological texts: the PASTA system, Bioinformatics, 19, 135, 10.1093/bioinformatics/19.1.135

Fukuda K, Tamura A, Tsunoda T, Takagi T. Toward information extraction: identifying protein names from biological papers. In: Proceedings of Pacific Symposium on Biocomputations. 1998. pp. 707–18

Narayanaswamy M, Ravikumar KE, Vijay-Shanker K. A biological named entity recognizer. In: Proceedings of Pacific Symposium on Biocomputations. 2003. pp. 427–38

Franzen, 2002, Protein names and how to find them, Int. J. Med. Inf., 67, 49, 10.1016/S1386-5056(02)00052-7

Hou W, Chen H. Enhancing performance of protein name recognizers using collocation. In: Proceedings of NLP in Biomedicine, ACL 2003. Sapporo, Japan; 2003. pp. 25–32

Hobbs, 2002, Information extraction from biomedical text, J. Biomed. Inform., 35, 260, 10.1016/S1532-0464(03)00015-7

Thomas J, Milward D, Ouzounis C, Pulman S, Carroll M. Automatic extraction of protein interactions from scientific abstracts. In: Proceedings of Pacific Symposium on Biocomputations. 2000. pp. 541–52

Hobbs, 1997, FASTUS: A Cascaded Finite-State Transducer for Extracting Information from Natural-Language Text, 383

Andrade, 1998, Automatic extraction of keywords from scientific text: application to the knowledge domain of protein families, Bioinformatics, 14, 600, 10.1093/bioinformatics/14.7.600

Hatzivassiloglou, 2001, Disambiguating proteins, genes, and RNA in text: a machine learning approach, Bioinformatics, 17, S97, 10.1093/bioinformatics/17.suppl_1.S97

Craven M, Kumlien J. Constructing biological knowledge bases by extracting information from text sources. In: Proceedings of Int Conf Intell Syst Mol Biol. 1999. pp. 77–86

Hodges, 1998, The Yeast Protein Database (YPD): a curated proteome database for Saccharomyces cerevisiae, Nucleic Acids Res., 26, 68, 10.1093/nar/26.1.68

Morgan A, Yeh A, Hirschman L, Colosimo M. Gene name extraction using flybase resources. In: Proceedings of NLP in Biomedicine, ACL 2003. Sapporo, Japan; 2003. pp. 1–8

Collier N, Nobata C, Tsujii J. Extracting the names of genes and gene products with a hidden Markov model. In: Proceedings of COLING 2000. Saarbruecken; 2000. pp. 201–7

Shen D, Zhang J, Zhou G, Su J, Tan C. Effective adaptation of hidden Markov model-based named entity recognizer for biomedical domain. In: Proceedings of NLP in Biomedicine, ACL 2003. Sapporo, Japan; 2003. pp. 49–56

Kazama J, Makino T, Ohta Y, Tsujii J. Tuning support vector machines for biomedical named entity recognition. In: Proceedings of Workshop on NLP in the Biomedical Domain, ACL 2002. Philadelphia, PA; 2002. pp. 1–8

Takeuchi K, Collier N. Bio-medical entity extraction using support vector machines. In: Proceedings of NLP in Biomedicine, ACL 2003. Sapporo, Japan; 2003. pp. 57–64

Yamamoto K, Kudo T, Konagaya A, Matsumoto Y. Protein name tagging for biomedical annotation in text. In: Proceedings of NLP in Biomedicine, ACL 2003. Sapporo, Japan; 2003. pp. 65–72

Lee K, Hwang Y, Rim H. Two-phase biomedical NE recognition based on SVMs. In: Proceedings of NLP in Biomedicine, ACL 2003. Sapporo, Japan; 2003. pp. 33–40

Tanabe, 2002, Tagging gene and protein names in biomedical text, Bioinformatics, 18, 1124, 10.1093/bioinformatics/18.8.1124

Brill E. A simple rule-based part-of-speech tagger. In: Proceedings of ANLP-92. Trento, IT; 1992. pp. 152–5

Harris, 2004, The Gene Ontology (GO) database and informatics resource, Nucleic Acids Res., 32, D258

Proux D, Rechenmann F, Julliard L, Pillet VV, Jacq B. Detecting gene symbols and names in biological texts: a first step toward pertinent information extraction. In: Proceedings of Ninth Workshop on Genome Informatics. 1998. pp. 72–80

Rindflesch TC, Hunter L, Aronson AR. Mining molecular binding terminology from biomedical text. In: Proceedings of AMIA Symposium. 1999. pp. 127–31

Humphreys, 1998, The unified medical language system: an informatics research collaboration, J. Am. Med. Inform. Assoc., 5, 1, 10.1136/jamia.1998.0050001

Rindflesch TC, Tanabe L, Weinstein JN, Hunter L. EDGAR: extraction of drugs, genes and relations from the biomedical literature. In: Proceedings of Pacific Symposium on Biocomputations. 2000. pp. 517–28

Frantzi, 2000, Automatic recognition of multi-word terms: the c-value/nc-value method, Int J Digit Libr, 3, 115, 10.1007/s007999900023

Ananiadou, 2000, Evaluation of automatic term recognition of nuclear receptors from medline, Genome Informatics Series

Nenadic G, Rice S, Spasic I, Ananiadou S, Stapley BJ. Selecting text features for gene name classification: from documents to terms. In: Proceedings of NLP in Biomedicine, ACL 2003. Sapporo, Japan; 2003. pp. 121–8

Nenadic G, Spasic I, Ananiadou S. Automatic Acronym acquisition and term variation management within domain-Specific texts. In: Proceedings of LREC-3. Las Palmas, Spain; 2002. pp. 2155–62

Nenadic, 2003, Terminology-driven mining of biomedical literature, Bioinformatics, 19, 938, 10.1093/bioinformatics/btg105

Adar, 2002

Chang, 2002, Creating an online dictionary of abbreviations from medline, J. Am. Med. Inform. Assoc., 9, 612, 10.1197/jamia.M1139

Rimer, 1998, BioABACUS: a database of abbreviations and acronyms in biotechnology and computer science, Bioinformatics, 14, 888, 10.1093/bioinformatics/14.10.888

Yu, 2003, Automatically identifying gene/protein terms in Medline abstracts, J. Biomed. Inform., 35, 322

Yoshida, 2000, PNAD-CSS: A Workbench for Constructing a Protein name abbreviation dictionary, Bioinformatics, 16, 169, 10.1093/bioinformatics/16.2.169

Liu H, Aronson AR, Friedman C. A study of abbreviations in MEDLINE abstracts. In: Proceedings of AMIA Symposium. 2002. pp. 464–8

Yu, 2002, Mapping abbreviations to full forms in biomedical articles, J Am. Med. Inform. Assoc., 9, 262, 10.1197/jamia.M0913

Schwartz AS, Hearst MA. A simple algorithm for identifying abbreviation definitions in biomedical text. In: Proceedings of Pacific Symposium on Biocomputations. 2003. pp. 451–62

Pustejovsky J, Castano J, Cochran B, Kotecki M, Morrell M, Rumshisky A. Extraction and Disambiguation of Acronym–Meaning Pairs in Medline. In: Proceedings of Medinformatics. 2001

Torii M, Kamboj S, Vijay-Shanker K. An Investigation of Various Information Sources for Classifying Biological Names. In: Proceedings of NLP in Biomedicine, ACL 2003. Sapporo, Japan; 2003. pp. 113–20

Nobata C, Collier N, Tsujii J. Automatic term identification and classification in biological texts. In: Proceedings of Natural Language Pacific Rim Symposium. 1999. pp. 369–74

Torii M, Vijay-Shanker K. Using unlabeled MEDLINE abstracts for biological named entity classification. In: Proceedings of Genome Informatics Workshop 2002. 2002. pp. 567–658

Spasic I, Nenadic G, Ananiadou S. Using domain-specific verbs for term classification. In: Proceedings of NLP in Biomedicine, ACL 2003. Sapporo, Japan; 2003. pp. 17–24

Raychaudhuri, 2002, Associating genes with gene ontology codes using a maximum entropy analysis of biomedical literature, Genome Res., 12, 203, 10.1101/gr.199701

Seewald A. Towards recognizing domain and species from MEDLINE publications. In: Proceedings of European Workshops on Data Mining and Text Mining for Bioinformatics. 2003. pp. 51–8

Cohen KB, Acquaah-Mensah GK, Dolbey AE, Hunter L. Contrast and variability in gene names. In: Proceedings of Workshop on NLP in the Biomedical Domain, ACL 2002. Philadelphia, PA; 2002. pp. 14–20

Jacquemin, 2001

Jacquemin, 1999, NLP for Term Variant Extraction: A Synergy of Morphology, Lexicon and Syntax, 25

Aronson AR. Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program. In: Proceedings of AMIA Symposium. 2001. pp. 17–21

Yu, 2003, Extracting synonymous gene and protein terms from biological literature, Bioinformatics, 19, I340, 10.1093/bioinformatics/btg1047

Liu, 2002, Automatic resolution of ambiguous terms based on machine learning and conceptual relations in the UMLS, J. Am. Med. Inform. Assoc., 9, 621, 10.1197/jamia.M1101

Pakhomov S. Semi-supervised maximum entropy based approach to acronym and abbreviation normalization in medical texts. In: Proceedings of 40th annual meeting of ACL. 2002. pp. 160–7

Blaschke, 2002, Molecular biology nomenclature thwarts information-extraction progress, IEEE Intell Syst, 17, 73

Ogren P, Cohen K, Acquaah-Mensah G, Eberlein J, Hunter L. The compositional structure of gene ontology terms. In: Proceedings of Pacific Symposium on Biocomputations 2004. pp. 214–25

Hisamitsu T, Tsujii J. Measuring term representativeness. In: Pazienza MT, editor. Information Extraction in the Web Era, LNAI 2700. New York, NY: Springer; 2003. pp. 45–76
