Using machine learning techniques for rising star prediction in co-author network

Scientometrics - Volume 102 - Pages 1687-1711 - 2014

Ali Daud¹, Muhammad Ahmad², M. S. I. Malik¹, Dunren Che³

¹Department of Computer Science and Software Engineering, International Islamic University, Islamabad, Pakistan

²Department of Computer Science, Allama Iqbal Open University, Islamabad, Pakistan

³Department of Computer Science, Southern Illinois University, Carbondale, USA

Abstract

Online bibliographic databases are powerful resources for research in data mining and social network analysis, especially for the study of co-author networks. Predicting future rising stars means finding promising scholars/researchers in co-author networks. In this paper, we propose a solution for rising star prediction that applies machine learning techniques. For the classification task, both discriminative and generative modeling techniques are considered, and two algorithms are chosen from each category. Author, co-authorship, and venue-based information is incorporated, resulting in eleven features, each with its mathematical formulation. Extensive experiments analyze the impact on classification accuracy of each individual feature, of each feature category, and of their combination. Two ranking lists of the top 30 scholars are then presented from the predicted rising stars. In addition, the concept is demonstrated by predicting rising stars in the database domain. Data from the DBLP and Arnetminer databases (1996–2000, covering a wide range of disciplines) are used for the experimental analysis of the algorithms.
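As a rough illustration of the generative versus discriminative comparison described in the abstract, the sketch below trains one classifier of each kind on a tiny synthetic dataset. Everything in it is an assumption made for illustration: the three features (paper count, co-author count, venue score) are hypothetical stand-ins for the paper's eleven features, the data are made up, and Gaussian naive Bayes and logistic regression are chosen only as familiar examples of each category, not as the algorithms the authors used.

```python
import math

# Synthetic toy data (NOT from DBLP or Arnetminer). Each row is a hypothetical
# author: [papers published, distinct co-authors, average venue score].
X = [[2, 3, 0.2], [1, 2, 0.1], [3, 4, 0.3], [2, 2, 0.2],
     [8, 10, 0.8], [9, 12, 0.9], [7, 9, 0.7], [10, 11, 0.9]]
y = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = labeled "rising star" in this toy set


def standardize(rows):
    """Return a function that z-scores a row using the statistics of `rows`."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
            for c, m in zip(cols, means)]
    return lambda row: [(v - m) / s for v, m, s in zip(row, means, stds)]


def fit_gaussian_nb(rows, labels):
    """Generative model: per-class prior plus per-feature Gaussian."""
    model = {}
    for c in set(labels):
        members = [r for r, l in zip(rows, labels) if l == c]
        cols = list(zip(*members))
        means = [sum(col) / len(col) for col in cols]
        # Small constant keeps variances strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / len(col) + 1e-6
                     for col, m in zip(cols, means)]
        model[c] = (math.log(len(members) / len(rows)), means, variances)
    return model


def predict_gaussian_nb(model, row):
    """Pick the class with the highest log joint probability."""
    def log_joint(c):
        log_prior, means, variances = model[c]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(row, means, variances))
    return max(model, key=log_joint)


def fit_logistic(rows, labels, lr=0.1, epochs=500):
    """Discriminative model: logistic regression by batch gradient descent."""
    w, b, n = [0.0] * len(rows[0]), 0.0, len(rows)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for r, l in zip(rows, labels):
            z = sum(wi * xi for wi, xi in zip(w, r)) + b
            err = 1.0 / (1.0 + math.exp(-z)) - l
            for j, xi in enumerate(r):
                grad_w[j] += err * xi / n
            grad_b += err / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict_logistic(model, row):
    """Classify as 1 when the linear score is positive."""
    w, b = model
    return 1 if sum(wi * xi for wi, xi in zip(w, row)) + b > 0 else 0


def accuracy(predict, model, rows, labels):
    """Fraction of rows whose predicted label matches the given label."""
    return sum(predict(model, r) == l for r, l in zip(rows, labels)) / len(rows)


scale = standardize(X)
Xs = [scale(r) for r in X]
nb_model = fit_gaussian_nb(Xs, y)
lr_model = fit_logistic(Xs, y)
print("naive Bayes training accuracy:", accuracy(predict_gaussian_nb, nb_model, Xs, y))
print("logistic training accuracy:", accuracy(predict_logistic, lr_model, Xs, y))
```

On real bibliographic data the two model families would be compared on held-out authors rather than on the training set. The toy classes here are deliberately well separated, so the scores say nothing about the accuracy reported in the paper.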

