Data mining model for scientific research classification: the case of digital workplace accessibility

DECISION - 2024
Radka Nacheva¹, Maciej Czaplewski², Pavel Petrov¹
¹ Department of Informatics, University of Economics – Varna, Varna, Bulgaria
² Institute of Spatial Management and Socio-Economic Geography, University of Szczecin, Szczecin, Poland

Abstract

Classifying research literature is an important part of conducting research projects: it lets researchers efficiently identify papers that reflect the latest work in a field and are relevant to the project at hand. Approaches to classifying research papers include subject-based, methodology-based, text-based, and machine learning-based methods. Each has advantages and disadvantages, and the choice depends on the specific research question and the available data. Classification helps organize and structure the vast amount of information and knowledge that scientific research generates, so that researchers and other interested parties can reach relevant information quickly. It also makes the extraction of scientific knowledge easier and more accurate, providing a basis for further research in each subject area. This paper therefore proposes a research classification model based on data mining methods and techniques. To test the model, we selected scientific articles on digital workplace accessibility for people with disabilities, retrieved from the Scopus and Web of Science repositories. We believe that the classification model is universal and can be applied in other scientific fields.
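
As a rough illustration of the text-based approach mentioned above, the following sketch assigns a short text to the class whose TF-IDF centroid it most resembles. It is not the model proposed in the paper: the nearest-centroid rule, the function names, the two class labels, and the four training snippets are all assumptions chosen only to keep the example self-contained.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on non-letters, and drop tokens shorter than 3 characters."""
    words = "".join(c if c.isalpha() else " " for c in text.lower()).split()
    return [w for w in words if len(w) > 2]


def tfidf_vectors(docs):
    """Return one sparse TF-IDF dict per tokenized document, plus the IDF table."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF so that every known term keeps a positive weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def train_centroids(labeled):
    """Average the TF-IDF vectors of each class; labeled is a list of (text, label)."""
    vectors, idf = tfidf_vectors([tokenize(text) for text, _ in labeled])
    sums, counts = {}, Counter()
    for vec, (_, label) in zip(vectors, labeled):
        counts[label] += 1
        acc = sums.setdefault(label, Counter())
        for term, weight in vec.items():
            acc[term] += weight
    centroids = {
        label: {t: w / counts[label] for t, w in acc.items()}
        for label, acc in sums.items()
    }
    return centroids, idf


def classify(text, centroids, idf):
    """Return the label whose centroid is most similar to the text."""
    tf = Counter(tokenize(text))
    vec = {t: c * idf[t] for t, c in tf.items() if t in idf}  # ignore unseen terms
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


# Hypothetical training snippets, invented for this example only.
TRAIN = [
    ("Screen reader support for remote work software used by blind employees", "accessibility"),
    ("Assistive technology and accessible interfaces in the digital workplace", "accessibility"),
    ("Decision tree and clustering algorithms for mining large datasets", "data_mining"),
    ("Text mining with TF-IDF features and supervised classification algorithms", "data_mining"),
]

centroids, idf = train_centroids(TRAIN)
print(classify("Accessible workplace software for employees with disabilities", centroids, idf))
```

In practice one would train on full abstracts retrieved from a bibliographic database and would likely add stop-word removal and stemming before weighting; the sketch omits both for brevity.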
