A Hidden Markov Model-Based Speech Recognition System Using Baum-Welch, Forward-Backward and Viterbi Algorithms
Benjamin Kommey, Ernest O. Addo, Elvis Tamakloe | Pages: 509-536

Abstract— Speech is the most complex component of human intelligence, and speech signal processing is therefore very important. Speech is highly variable, and this variability makes recognition difficult. Dialect, speech duration, context dependency, speaking rate, speaker differences, environment and locality all add to the difficulty of speech processing. The absence of distinct boundaries between tones or words causes additional problems. Speech also has speaker-dependent characteristics, so no one can reproduce a phrase in exactly the same way as another speaker. Nevertheless, a speech recognition system should be able to model and recognize the same words and phrases reliably, regardless of who speaks them. Digital signal processors (DSPs) are often used in speech signal processing systems to manage these complexities. This paper presents Hidden Markov Model (HMM) based speech signal modulation through the application of the Baum-Welch, Forward-Backward and Viterbi algorithms. The system was implemented on a 16-bit floating point DSP (TMS320C6701) from Texas Instruments, and the vocabulary was trained using the Microsoft Hidden Markov Model Toolkit (HTK). The proposed system achieved about 79% correct word recognition, which corresponds to approximately 11,804 words recognized correctly out of a total of 14,960 words provided. This result indicates that the proposed speaker-independent system achieves a very good evaluation score for accuracy, and thus can be used to aid dictation for speech-impaired persons and in real-time applications with a 10 ms data exchange rate.
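
For readers unfamiliar with the algorithms named in the abstract, the following is a minimal Python sketch of the forward recursion (the scoring half of Forward-Backward) and of Viterbi decoding on a toy discrete HMM. It is not the authors' DSP implementation. The two-state model, its probabilities, the observation sequence and the names `PI`, `A`, `B`, `OBS`, `forward` and `viterbi` are all hypothetical and chosen only for illustration. Baum-Welch training is omitted.

```python
# Toy discrete HMM (hypothetical numbers, not taken from the paper).
PI = [0.6, 0.4]                    # initial state probabilities
A = [[0.7, 0.3],                   # A[i][j] = P(next state j | current state i)
     [0.4, 0.6]]
B = [[0.5, 0.4, 0.1],              # B[i][k] = P(observing symbol k | state i)
     [0.1, 0.3, 0.6]]
OBS = [0, 1, 2]                    # observed symbol indices


def forward(pi, A, B, obs):
    """Return P(obs | model) by summing over all state paths."""
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
            for j in range(n)
        ]
    return sum(alpha)


def viterbi(pi, A, B, obs):
    """Return the single most likely state path and its joint probability."""
    n = len(pi)
    delta = [pi[i] * B[i][obs[0]] for i in range(n)]
    backptr = []  # backptr[t][j] = best predecessor of state j at step t + 1
    for o in obs[1:]:
        best_prev = [
            max(range(n), key=lambda i: delta[i] * A[i][j]) for j in range(n)
        ]
        delta = [
            delta[best_prev[j]] * A[best_prev[j]][j] * B[j][o]
            for j in range(n)
        ]
        backptr.append(best_prev)
    # Trace back from the best final state.
    state = max(range(n), key=lambda i: delta[i])
    best_prob = delta[state]
    path = [state]
    for best_prev in reversed(backptr):
        state = best_prev[state]
        path.append(state)
    path.reverse()
    return path, best_prob


if __name__ == "__main__":
    print(forward(PI, A, B, OBS))
    print(viterbi(PI, A, B, OBS))
```

In a word recognizer, one such model would typically be trained per vocabulary entry, and the forward or Viterbi score would be used to pick the best-matching model for an utterance. Practical systems work in the log domain or rescale at each step to avoid numerical underflow on long observation sequences, which this sketch does not do.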



DOI: http://doi.org/10.5455/jjee.204-1675950756