A semi-supervised approach to short-text topic modeling using embedded fuzzy clustering for Twitter hashtag recommendation
Abstract
Keywords
#social network #hashtag #topic modeling #fuzzy clustering #Word2Vec #tweet analysis