Experimental explorations on short text topic mining between LDA and NMF based Schemes

Knowledge-Based Systems - Volume 163 - Pages 1-13 - 2019
Yong Chen (1,2,3), Hui Zhang (1,2,3), Rui Liu (1,3), Zhiwen Ye (1,3), Jianying Lin (1,3)
(1) State Key Laboratory of Software Development Environment, Beihang University, Beijing 100191, PR China
(2) Beijing Advanced Innovation Center for Big Data and Brain Computing, Beihang University, Beijing 100191, PR China
(3) School of Computer Science and Engineering, Beihang University, Beijing 100191, PR China

References

Song, 2014, Short text classification: A survey, J. Multimed., 9, 635, 10.4304/jmm.9.5.635-643

Kiritchenko, 2014, Sentiment analysis of short informal texts, J. Artificial Intelligence Res., 50, 723, 10.1613/jair.4272

Zhang, 2017, Processing long queries against short text: Top-k advertisement matching in news stream applications, ACM Trans. Inf. Syst., 35, 28:1, 10.1145/3052772

Liu, 2015, IncreSTS: Towards real-time incremental short text summarization on comment streams from social network services, IEEE Trans. Knowl. Data Eng., 27, 2986, 10.1109/TKDE.2015.2405553

Xuan, 2016, Uncertainty analysis for the keyword system of web events, IEEE Trans. Syst. Man Cybern., 46, 829, 10.1109/TSMC.2015.2470645

Lee, 1999, Learning the parts of objects by non-negative matrix factorization, Nature, 401, 788, 10.1038/44565

Walker, 2007, Sampling the Dirichlet mixture model with slices, Comm. Statist. Simulation Comput., 36, 45, 10.1080/03610910601096262

Cheng, 2014, BTM: topic modeling over short texts, IEEE Trans. Knowl. Data Eng., 26, 2928, 10.1109/TKDE.2014.2313872

Francois Caron, Manuel Davy, Arnaud Doucet, Generalized Pólya urn for time-varying Dirichlet process mixtures, UAI, 2007, pp. 33–40.

Chen, 2013, Generalized Pólya urn models, J. Appl. Probab., 50, 1169, 10.1239/jap/1389370106

Xu, 2017, Incorporating Wikipedia concepts and categories as prior knowledge into topic models, Intell. Data Anal., 21, 443, 10.3233/IDA-160021

Xiaojun Quan, Chunyu Kit, Yong Ge, Sinno Jialin Pan, Short and sparse text topic modeling via self-aggregation, IJCAI, 2015, pp. 2270–2276.

Yu, 2016, Understanding short texts through semantic enrichment and hashing, IEEE Trans. Knowl. Data Eng., 28, 566, 10.1109/TKDE.2015.2485224

Jey Han Lau, David Newman, Timothy Baldwin, Machine reading tea leaves: Automatically evaluating topic coherence and topic model quality, EACL, 2014, pp. 530–539.

Landis, 1977, The measurement of observer agreement for categorical data, Biometrics, 33, 159, 10.2307/2529310