Stacked ensemble combined with fuzzy matching for biomedical named entity recognition of diseases

Journal of Biomedical Informatics - Volume 64 - Pages 1-9 - 2016

Balu Bhasuran¹, Gurusamy Murugesan², Sabenabanu Abdulkadhar², Jeyakumar Natarajan¹,²

¹DRDO-BU Center for Life Sciences, Bharathiar University Campus, Coimbatore 641046, India

²Data Mining and Text Mining Laboratory, Department of Bioinformatics, Bharathiar University, Coimbatore 641046, India

