Unsupervised grammar induction using history based approach

Computer Speech & Language - Volume 20 - Pages 644-658 - 2006
Heshaam Feili, Gholamreza Ghassem-Sani
Department of Computer Engineering, Sharif University of Technology, Azadi Avenue, Tehran, Iran

References

Adriaans, 1999, Grammar induction as substructural inductive logic programming, 117
Allen, 1995
Amaya, F., Benedi, J.M., Sanchez, J.A., 1999. Learning of stochastic context-free grammars from bracketed corpora by means of reestimation algorithms. In: The VIII Symposium on Pattern Recognition and Image Analysis, vol. 1, pp. 19–126, Bilbao.
Amtrup, J.W., Rad, H.R., Megerdoomian, K., Zajac, R., 2000. Persian-English Machine Translation: An Overview of the Shiraz Project, NMSU, CRL. Memoranda in Computer and Cognitive Science MCCS-00-319.
Baker, J.K., 1979. Trainable grammars for speech recognition. Speech communication papers for the 97th Meeting of the Acoustical Society of America, pp. 547–550.
Bateni, 1995
Bijankhan, 2003, Emkansanji baraye Tarhe Modelsaziye Zabane Farsi (The feasibility study for Persian language modeling), The Journal of Literature, 162–163, 81
Bijankhan, 2005, Naghshe Peykarehaye Zabani dar Neveshtane Dasture Zaban: Mo’arrefiye yek Narmafzare Rayane’i, Iranian Linguistic Journal, 2, 48
Black, E., Abney, S., Flickinger, D., et al., 1991. A procedure for quantitatively comparing the syntactic coverage of English grammars. In: DARPA Speech and Natural Language Workshop, pp. 306–311.
Black, E., Lafferty, J., Roukos, S., 1992. Development and evaluation of a broad-coverage probabilistic grammar of English-language computer manuals. In: The Proceedings of the 30th Annual Meeting of the Association for Computational Linguistics, pp. 185–192.
Black, E., Jelinek, F., Lafferty, J., Magerman, D., Mercer, R., Roukos, S., 1992. Towards history-based grammars: using richer models for probabilistic parsing. In: The Proceedings of the 5th DARPA Speech and Natural Languages Workshop, Harriman, NY.
Brill, E., 1993. A corpus-based approach to language learning. Ph.D. Thesis, Department of Computer and Information Science, University of Pennsylvania.
Briscoe, T., Waegner, N., 1992. Robust stochastic parsing using the inside-outside algorithm. In: AAAI-92 Workshop on Statistically Based NLP Techniques.
Carroll, G., Charniak, E., 1992. Two experiments on learning probabilistic dependency grammars from corpora. Technical Report CS-92-16, Department of Computer Science, Brown University, March.
Casacuberta, 1996, Growth transformations for probabilistic functions of stochastic grammars, IJPRAI, 10, 183
Charniak, 1993
Charniak, 1996
Charniak, 1997, Statistical parsing with a context-free grammar and word statistics, 598
Charniak, 1997, Statistical techniques for natural language parsing, AI Magazine, 18, 33
Charniak, E., 2000. A maximum-entropy-inspired parser. In: NAACL 1, pp. 132–139.
Chen, S.F., 1995. Bayesian grammar induction for language modeling. In: Proceedings of the Association for Computational Linguistics, pp. 228–235.
Church, K., 1988. A stochastic parts program and noun phrase parser for unrestricted text. In: The Proceedings of the Second Conference on Applied Natural Language Processing, pp. 136–143.
Clark, A., 2001. Unsupervised induction of stochastic context-free grammars using distributional clustering. In: The Fifth Conference on Natural Language Learning.
Clark, A., 2001. Unsupervised language acquisition: theory and practice. Ph.D. Thesis, University of Sussex.
Collins, M., 1996. A new statistical parser based on bigram lexical dependencies. In: The Proceedings of the 34th Annual Meeting of the ACL, Santa Cruz.
Collins, M.J., 1997. Three generative, lexicalized models for statistical parsing. In: ACL 35/EACL 8, pp. 16–23.
Feili, 2004, An application of lexicalized grammars in English-Persian translation, 596
Grune, 1990
Hemphill, C.T., Godfrey, J., Doddington, G., 1990. The ATIS spoken language systems pilot corpus. In: DARPA Speech and Natural Language Workshop, Hidden Valley, Pennsylvania, June.
Homes, 1988
Jelinek, F., Lafferty, J.D., Magerman, D., Mercer, R., Ratnaparkhi, A., Roukos, S., 1994. Decision-tree parsing using hidden derivation model. In: The Proceedings of the 1994 Human Language Technology Workshop, pp. 272–277.
Johnson, M., 1998. The effect of alternative tree representations on tree bank grammars. In: Powers, D.M.W. (Ed.), NeMLaP3/CoNLL98: New Methods in Language Processing and Computational Natural Language Learning, ACL, pp. 39–48.
Jurafsky, 2000
Kasami, T., 1965. An efficient recognition and syntax algorithm for context-free languages. Scientific Report AFCRL-65-758, Air Force Cambridge Research Laboratory, Bedford, MA.
Kehler, A., Stolcke, A., 1999. Preface. In: Kehler, A., Stolcke, A. (Eds.), Unsupervised Learning in Natural Language Processing. Proceedings of the Workshop. Association for Computational Linguistics.
Khanlari, 1995
Klein, D., Manning, C.D., 2001. Distributional phrase structure induction. In: Proceedings of the Fifth Conference on Natural Language Learning (CoNLL 2001), pp. 113–120.
Klein, 2001, Natural language grammar induction using a constituent-context model, vol. 1, 35
Klein, D., Manning, C.D., 2002. A generative constituent-context model for improved grammar induction. In: ACL 40, pp. 128–135.
Klein, D., Manning, C.D., 2004. Corpus-based induction of syntactic structure: models of dependency and constituency. In: Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL 04).
Klein, D., Manning, C.D., 2005. The Unsupervised Learning of Natural Language Structure. Ph.D. Thesis, Department of Computer Science, Stanford University.
Lari, 1990, The estimation of stochastic context-free grammar using the inside-outside algorithm, Computer Speech and Language, 4, 35, 10.1016/0885-2308(90)90022-X
Magerman, D.M., Marcus, M.P., 1990. Parsing a natural language using mutual information statistics. In: Proceedings of the Eighth National Conference on Artificial Intelligence, August.
Magerman, D., Marcus, M., 1991. Pearl: a probabilistic chart parser. In: The Proceedings of the 1991 European ACL Conference, Berlin, Germany.
Magerman, D., Weir, D., 1992. Efficiency, robustness and accuracy in picky chart parsing. In: The Proceedings of the 30th Annual Meeting of the Association for Computational Linguistics, pp. 40–47.
Magerman, D.M., 1995. Statistical decision-tree models for parsing. In: The Proceedings of ACL Conference, June 1995.
Manning, 1999
Marcken, C., 1995. On the unsupervised induction of phrase-structure grammars. In: The Proceedings of the 3rd Workshop on Very Large Corpora.
Marcken, C., 1996. Unsupervised language acquisition. Ph.D. Thesis, Department of Electrical Engineering and Computer Science, MIT.
Marcus, 1993, Building a large annotated corpus of English: the Penn Treebank, Computational Linguistics, 19, 313
Megerdoomian, K., 2000. Persian Computational Morphology: A Unification-Based Approach, NMSU, CRL. Memoranda in Computer and Cognitive Science (MCCS-00-320).
Paskin, 2002, Grammatical bigrams, 14
Pereira, F., Schabes, Y., 1992. Inside-outside re-estimation from partially bracketed corpora. In: The Proceedings of the 30th Annual Meeting of the ACL, pp. 128–135.
Sanchez, J.A., Benedi, J.M., Casacuberta, F., 1996. Comparison between the inside-outside algorithm and the Viterbi algorithm for stochastic context-free grammars. In: The Proceedings of the 6th International Workshop on Advances in Structural and Syntactical Pattern Recognition, pp. 50–59.
Sanchez, J.A., Benedi, J.M., 1998. Estimation of the probability distributions of stochastic context-free grammars from the K-best derivations. In: The 5th International Conference on Spoken Language Processing.
Sanchez, J.A., Benedi, J.M., 1999. Probabilistic estimation of stochastic context-free grammars from the k-best derivations. In: The VIII Symposium on Pattern Recognition and Image Analysis, vol. 2, pp. 7–8, Bilbao.
Schabes, Y., Roth, M., Osborne, R., 1993. Parsing the Wall Street Journal with the inside-outside algorithm. In: The Proceedings of the 6th Conference of the European Chapter of the ACL, pp. 341–347.
Stolcke, 1994, Inducing probabilistic grammars by Bayesian model merging
Thanaruk, 1995, Grammar acquisition and statistical parsing, Journal of Natural Language Processing, 2
Van Zaanen, M., 2000. ABL: alignment-based learning. In: COLING 2000, pp. 961–967.
Van Zaanen, M., 2002. Bootstrapping structure into language: alignment-based learning. Ph.D. Thesis, School of Computing, University of Leeds.
Van Zaanen, M., Adriaans, P.W., 2001. Comparing two unsupervised grammar induction systems: alignment-based learning vs. EMILE. Technical Report TR2001.05, School of Computing, University of Leeds.
Younger, 1967, Recognition and parsing of context-free languages in time O(n^3), Information and Control, 10, 189, 10.1016/S0019-9958(67)80007-X
Yuret, D., 1998. Discovery of linguistic relations using lexical attraction. Ph.D. Thesis, MIT.