Automatic Speech Recognition System

Today, many learning applications use automatic speech recognition (ASR). ASR can capture children's interest and engage them in their learning (Husniza. H, Fauziah. AR, Sobihatun. NAS, 2012). It can also improve the quality of learning and teaching, and it helps ensure that e-learning is accessible to all through the cost-effective production of synchronized and subtitled multimedia content (Mike. W., 2002). IMELDA is one application that uses ASR technology to help the most challenging children in primary schools in Malaysia, and it encourages children to take an interest in learning English.

However, when IMELDA was tested at school with six primary school children, some limitations were found in its ASR system that affect its accuracy. For an application that uses ASR technology to work properly, it must be able to handle any effects on the performance of the ASR engine (Victoria. Y., 2012). The literature examines ASR recognition performance in terms of accuracy, measured as the number of correctly recognized words divided by the total number of words used, expressed as a percentage.

ASR ACCURACY

Accuracy is a challenging problem for automatic speech recognition systems. According to Victoria Y (2012), recognition errors are caused by two kinds of factors: external and internal. The external factor is the acoustic environment; the internal factors are errors in the components and the language model (LM) of the ASR system. In IMELDA, ASR problems arise from both kinds of factors (e.g. the child's voice, the acoustic environment, pronunciation errors, and a language model (LM) that is not suitable for L2 speakers).

Noisy environment

Noise in the environment is one of the external factors that determine... ...
...ion error occurs when younger children do not have correct pronunciation. Furthermore, the model of the ASR engine in IMELDA is not suitable for L2 children because the phonics of L2 children differ from those of L1 children. Sometimes young children do not know how to articulate specific phonemes (Schotz, 2001). The ASR system therefore causes problems for L2 children, especially in terms of accuracy, because their pronunciation of English differs from that of L1 children. According to Muhirwe.J (2005), building a speech engine requires a corpus, which can be obtained from collections of texts and speech; both are used as the basis for statistical natural language processing (NLP). Cole et al. (1994) also state that the development of a speech corpus could involve the collection and transcription of data.
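
The accuracy measure described earlier, the percentage of words recognized correctly out of the total number of words, can be illustrated with a short sketch. This is only an illustration under stated assumptions, not the scoring procedure used for IMELDA: the function name word_accuracy is hypothetical, words are compared case-insensitively after splitting on whitespace, and a word counts as correct when it is matched in a longest-common-subsequence alignment between the reference transcript and the recognizer output.

```python
def word_accuracy(reference: str, hypothesis: str) -> float:
    """Return the percentage of reference words recognized correctly.

    Assumption: a word is "correct" if it is part of the longest common
    subsequence of the reference and hypothesis word sequences.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # lcs[i][j] = length of the longest common subsequence
    # of ref[:i] and hyp[:j]
    lcs = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i, ref_word in enumerate(ref, start=1):
        for j, hyp_word in enumerate(hyp, start=1):
            if ref_word == hyp_word:
                lcs[i][j] = lcs[i - 1][j - 1] + 1
            else:
                lcs[i][j] = max(lcs[i - 1][j], lcs[i][j - 1])

    correct = lcs[len(ref)][len(hyp)]
    return 100.0 * correct / len(ref)


if __name__ == "__main__":
    # Hypothetical example: the child says the reference sentence and the
    # recognizer returns a hypothesis with one wrong and one missing word.
    score = word_accuracy("the cat sat on the mat", "the cat sit on mat")
    print(f"word accuracy: {score:.1f}%")
```

In this made-up example four of the six reference words are matched, so the score is about 66.7 percent. A measure of this kind ignores extra words inserted by the recognizer; scoring tools that also penalize insertions report a lower figure for the same output.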