Phrase2Vec: Phrase embedding based on parsing

Information Sciences - Volume 517 - Pages 100-127 - 2020
Yongliang Wu (1), Shuliang Zhao (2, 3, 4), Wenbin Li (5)
(1) College of Mathematics and Information Science, Hebei Normal University, Hebei 050024, China
(2) College of Computer and Cyber Security, Hebei Normal University, Hebei 050024, China
(3) Hebei Provincial Engineering Research Center for Supply Chain Big Data Analytics & Data Security, Hebei 050024, China
(4) Hebei Provincial Key Laboratory of Network and Information Security, Hebei 050024, China
(5) College of Information Engineering, Hebei GEO University, Hebei 050024, China
