Abstract
In this paper, we present the newly established Danish speech corpus PiTu. The corpus consists of recordings of 28 native Danish talkers (14 female and 14 male), each reproducing (i) a series of nonsense syllables and (ii) a set of authentic natural language sentences. The speech corpus is tailored for investigating the relationship between early stages of the speech perceptual process and later stages. We present the considerations involved in preparing the experimental set-up, producing the anechoic recordings, compiling the data, and exploring the materials in linguistic research. We report on a small pilot experiment demonstrating how PiTu and similar speech corpora can be used in studies of prosody as a function of semantic content. The experiment addresses the issue of whether the governing principles of Danish prosody assignment are mainly talker-specific or mainly content-typical (under the specific experimental conditions).
Original language | English |
---|---|
Title of host publication | 8th International Conference on Language Resources and Evaluation |
Number of pages | 6 |
Publication date | 2012 |
Publication status | Published - 2012 |
Event | LREC 2012 Istanbul - Istanbul, Turkey Duration: 21 May 2012 → 27 May 2012 |
Conference
Conference | LREC 2012 Istanbul |
---|---|
Country/Territory | Turkey |
City | Istanbul |
Period | 21/05/2012 → 27/05/2012 |
Bibliographical note
The corpus is available at http://amtoolbox.sourceforge.net/pitu/
Keywords
- Speech corpus
- Danish language
- Nonsense syllables
- Prosodic structure
- Corpus-based spoken language analysis