This is a preprint

This article presents a new multimedia corpus with 22 hours of recordings of a Mandarin-speaking child from the age of 1;7 to 3;4. We review the state of the art in the use of corpora for research on the first language acquisition of Mandarin, and highlight the importance of corpus studies in evaluating children's patterns of language development vis-à-vis adult input. The transcripts in our new corpus are annotated with a morphological tier indicating parts of speech, and are linked to audio or video files. This corpus goes beyond existing published corpora of child Mandarin in offering more data for a single child, as well as media linking. It contributes to a number of fields, including language acquisition, Chinese linguistics, corpus linguistics, developmental psycholinguistics, education, and speech and language therapy.


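This article releases the first-phase results of a new multimedia corpus. This portion documents the language development of a Mandarin-speaking child from the age of 1;7 to 3;4, comprising 22 hours of recordings. We take this opportunity to review the current state of corpus use in research on the first language acquisition of Mandarin Chinese, and emphasize the important role of corpus studies in examining children's language development and adult language input. In our new corpus, the transcripts are annotated with a part-of-speech tier and linked to the multimedia files. The corpus surpasses existing published corpora in the amount of data for a single Mandarin-speaking child and in its audio and video linking. It will contribute to fields including language acquisition, Chinese linguistics, corpus linguistics, developmental psycholinguistics, education, and speech therapy.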


Child language corpus, Mandarin Chinese, Language input, Media linking, Morphological tier


Child corpus, Mandarin Chinese, Language input, Multimedia linking, Part-of-speech annotation tier


