Speech Corpus Development for Speaker Independent Speech Recognition for Indian Languages

Conference: McGraw-Hill International Conference on Signal, Image Processing Communication and Automation
Author(s): Amaresh P Kandagal, V Udayashankara
Year: 2017
Grenze ID: 02.MH-ICSIPCA.2017.1.23
Page: 146-151

Abstract

In this paper, we discuss the development of a speech corpus for speaker-independent speech recognition for Indian airports, and its extension to continuous speech recognition for Indian languages. We collected speech from 801 speakers to build a large-vocabulary ASR engine. The corpus was recorded over telephone lines and microphones, from speakers aged between 20 and 60 years. The microphone data comprises 4.5 hours recorded from 375 male and 244 female speakers. The telephone data comprises 1.3 hours and includes both male and female voices. In total, 6.2 hours of speech were collected. Recording was conducted in office, college, and home environments. We also discuss preliminary isolated speech recognition results obtained with acoustic models built on this corpus using the Hidden Markov Model Toolkit (HTK).
