HashGO: hashing gene ontology for protein function prediction

Computational Biology and Chemistry - Volume 71 - Pages 264-273 - 2017
Guoxian Yu1, Yingwen Zhao1, Chang Lu1, Jun Wang1
1College of Computer and Information Science, Southwest University, Chongqing 400715, China
