A Survey of the Usages of Deep Learning for Natural Language Processing

IEEE Transactions on Neural Networks and Learning Systems - Volume 32, Issue 2, Pages 604-624 - 2021
Daniel W. Otter, Julian Richard Medina, Jugal Kalita
University of Colorado at Colorado Springs, Colorado Springs, CO, USA

Abstract

Keywords

