K-SPAN: A lexical database of Korean surface phonetic forms and phonological neighborhood density statistics
Abstract
This article presents K-SPAN (Korean Surface Phonetics and Neighborhoods), a database of surface phonetic forms and several measures of phonological neighborhood density for 63,836 Korean words. Publicly available Korean corpora currently provide only orthographic representations in Hangeul, which is problematic because phonetic forms in Korean cannot be reliably predicted from orthographic forms. We describe the method used to derive the surface phonetic forms from a publicly available orthographic corpus of Korean and report several statistics calculated from the database: segment unigram frequencies, which are compared with previously reported results, and segment-based and syllable-based neighborhood density statistics for three types of representation. These are an “orthographic” form, which is a quasi-phonological representation; a “conservative” form, which maintains all known contrasts; and a “modern” form, which represents the pronunciation of contemporary Seoul Korean. The representations are rendered in an ASCII-encoded scheme, which allows users to query the corpus without having to read Korean orthography and permits the calculation of a wide range of phonological measures.
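To make the two kinds of statistic named above concrete, here is a minimal Python sketch of segment unigram frequencies and segment-based neighborhood density. It assumes the common one-edit definition of a neighbor (two forms differ by the substitution, insertion, or deletion of exactly one segment); the abstract does not state which definition K-SPAN uses. The function names and the toy lexicon are hypothetical and are not taken from the database.

```python
from collections import Counter


def is_neighbor(a, b):
    """Return True if segment tuples a and b differ by exactly one
    substitution, insertion, or deletion (assumed one-edit definition)."""
    if a == b or abs(len(a) - len(b)) > 1:
        return False
    if len(a) == len(b):
        # Same length: exactly one position may differ (substitution).
        return sum(x != y for x, y in zip(a, b)) == 1
    # Lengths differ by one: removing one segment from the longer form
    # must yield the shorter form (insertion/deletion).
    short, long_ = (a, b) if len(a) < len(b) else (b, a)
    return any(long_[:i] + long_[i + 1:] == short for i in range(len(long_)))


def neighborhood_density(lexicon):
    """Map each distinct form to its number of neighbors in the lexicon."""
    forms = sorted(set(lexicon))
    return {w: sum(is_neighbor(w, v) for v in forms) for w in forms}


def segment_unigram_frequencies(lexicon):
    """Relative frequency of each segment over all forms (type-based)."""
    counts = Counter(seg for word in lexicon for seg in word)
    total = sum(counts.values())
    return {seg: n / total for seg, n in counts.items()}


# Invented toy forms in a made-up ASCII transcription (not K-SPAN data).
toy_lexicon = [
    ("p", "a", "l"),
    ("m", "a", "l"),
    ("p", "a", "m"),
    ("p", "a"),
    ("s", "a", "l", "m"),
]

density = neighborhood_density(toy_lexicon)
frequencies = segment_unigram_frequencies(toy_lexicon)
```

Because the functions operate on tuples of arbitrary units, the same sketch would give a syllable-based count if each tuple element were a syllable rather than a segment. The pairwise comparison is quadratic in lexicon size, so a lexicon of tens of thousands of words would likely need an indexed approach instead.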