Danish SpeechDat(M) database - DB1
The Danish SpeechDat(M) database is the speech database collected within the SpeechDat(M) project. It consists of polyphone-like data recorded from 1,523 speakers.
The speech files are stored as sequences of 8-bit, 8 kHz A-law samples. Each prompted utterance is stored in a separate file, and the associated label files are stored in SAM file format.
An attached ASCII file lists information about each speaker: speaker code, sex, age, region, prompt number.
The lexicon is presented in a TAB-delimited ASCII file containing an alphabetically ordered list of the distinct lexical items occurring in the database. Each entry contains a frequency count and the corresponding pronunciation information.
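As an illustration of the sample format, the following is a minimal sketch of expanding A-law bytes to linear PCM using the standard G.711 A-law expansion. It assumes that a speech file is a headerless stream of A-law bytes, which this description does not state explicitly; the function names are hypothetical and are not part of the database distribution.

```python
def alaw_to_linear(code: int) -> int:
    """Expand one 8-bit G.711 A-law code to a signed linear PCM value."""
    code ^= 0x55                      # undo the even-bit inversion applied by the encoder
    mantissa = (code & 0x0F) << 4     # 4-bit mantissa, shifted into position
    segment = (code & 0x70) >> 4      # 3-bit segment (exponent)
    if segment == 0:
        magnitude = mantissa + 8
    else:
        magnitude = (mantissa + 0x108) << (segment - 1)
    # In A-law, a set sign bit marks a positive sample.
    return magnitude if code & 0x80 else -magnitude


def decode_alaw(data: bytes) -> list[int]:
    """Decode a byte string of A-law samples (assumed headerless)."""
    return [alaw_to_linear(b) for b in data]


def duration_seconds(data: bytes, sample_rate: int = 8000) -> float:
    """Duration of an utterance at one byte per sample, 8 kHz by default."""
    return len(data) / sample_rate
```

At one byte per sample and 8 kHz, each second of speech occupies 8,000 bytes, so a file's length in bytes divided by 8,000 gives its duration in seconds under the headerless assumption.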
WORD              FREQUENCY  PHONEMIC TRANSCRIPTIONS
åbnede            104        O b n @ D | O b n @ D @
adresseangivelse  97         a d R a s @ a n g i: u l s @
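A lexicon entry of this shape could be read with a few lines of code. The sketch below is based only on the sample entries shown here: it assumes three TAB-separated fields (word, frequency, transcriptions), that alternative pronunciations are separated by "|", and that phoneme symbols are separated by spaces. The names LexiconEntry and parse_lexicon_line are hypothetical, and the character encoding of the actual file is not specified in this description.

```python
from typing import NamedTuple


class LexiconEntry(NamedTuple):
    word: str
    frequency: int
    pronunciations: list[list[str]]   # one list of phoneme symbols per variant


def parse_lexicon_line(line: str) -> LexiconEntry:
    """Parse one TAB-delimited lexicon line (field layout is an assumption)."""
    word, frequency, transcriptions = line.rstrip("\r\n").split("\t", 2)
    # Assumed: "|" separates alternative pronunciations, spaces separate phonemes.
    variants = [variant.split() for variant in transcriptions.split("|")]
    return LexiconEntry(word, int(frequency), variants)


entry = parse_lexicon_line("åbnede\t104\tO b n @ D | O b n @ D @")
print(entry.word, entry.frequency, len(entry.pronunciations))
```

Keeping each pronunciation as a list of symbols rather than a single string avoids ambiguity with multi-character symbols such as "i:".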
The complete Danish SpeechDat database consists of 5 CD-ROMs. The first three CD-ROMs contain the application-oriented subset. The last two CD-ROMs contain the phonetically rich sentences.
The included items are:
· 5 application word phrases (semi-spontaneous)
· 12 connected digit strings with 8 digits
· 24 natural numbers (3-4 digits)
· 27 application words
· 3 dates, D3 spontaneous (birthday)
· 3 spelled words
· 2 money amounts, M1 small, M2 large
· City name (spontaneous)
· 3 yes/no questions (spontaneous)
· 22-25 sentences
· T1 time phrase, T2 time of day (spontaneous)
There are 1,523 speakers in the SpeechDat database from 11 linguistic regions of Denmark and five age groups (under 16, 16-30, 31-45, 46-60, over 60). 78% of them are between 16 and 60 years old.
A pronunciation lexicon with a phonemic transcription in SAMPA is also included.