Meta-clustering of gene expression data and literature-based information

Association for Computing Machinery (ACM) - Volume 5, Issue 2 - Pages 101-112 - 2003
Patrick Glenisson1, Janick Mathys1, Bart De Moor1
1ESAT-SCD KULeuven, Leuven, Belgium

Abstract

The current tendency in the life sciences to spawn ever-growing amounts of high-throughput assays has led to a situation where the interpretation of data and the formulation of hypotheses lag behind the pace at which information is produced. Although the first generation of statistical algorithms scrutinizing single, large-scale data sets has found its way into the biological community, the great challenge of connecting their results to existing knowledge remains. Despite the fairly large number of biological databases currently available, a lot of relevant information is found in free-text format (such as textual annotations, scientific abstracts and full publications). In this paper, we explore how an integrated analysis of expression data and literature-extracted information can reveal biologically meaningful clusters that are not identified when using microarray information alone. The joint analysis is validated in terms of transcriptional regulation.

