"Robust Speaker Identification System Under Adverse Conditions"

ORAL

Abstract

Speaker identification is the task of determining, from a speech utterance alone, which member of a group of reference speakers produced it. It is also known as voice biometrics. Reliable speaker identification could substantially change how people interact with machines by making spoken communication between humans and machines easier and more secure, and it would particularly benefit elderly users. Factors such as voice disguise, the speaker's emotional state, background noise, and throat diseases create a mismatch between the training and test speech data, known as the mismatch problem. This mismatch lowers identification accuracy and needs to be addressed. To make the speaker identification system robust to these mismatched conditions, we have developed speech frame selection methods for feature extraction. These methods capture the characteristics of the speech signal efficiently from the time-domain signal. Each speaker is modeled with a Gaussian mixture model (GMM) with 64 components. The system has shown good performance under voice disguise and environmental noise conditions. Future work will test its performance on emotional and diseased speech.
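The identification step with per-speaker GMMs can be illustrated with a minimal sketch. It shows only the standard decision rule of scoring the test frames against each speaker's model and picking the highest average log-likelihood; it is not the authors' implementation. The function names, the diagonal-covariance assumption, the toy two-component models, and the two-dimensional feature vectors are all hypothetical and chosen for brevity. The abstract specifies 64 components and does not describe the features or the frame selection methods, which are omitted here.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_frame_log_likelihood(x, gmm):
    """log p(x | gmm), using log-sum-exp over components for stability.

    gmm is a list of (weight, mean, variance) tuples (assumed layout).
    """
    terms = [
        math.log(w) + log_gaussian_diag(x, mean, var)
        for w, mean, var in gmm
    ]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def average_log_likelihood(frames, gmm):
    """Mean per-frame log-likelihood of an utterance under one speaker model."""
    return sum(gmm_frame_log_likelihood(x, gmm) for x in frames) / len(frames)

def identify_speaker(frames, speaker_models):
    """Return the best-scoring speaker name and all per-speaker scores."""
    scores = {
        name: average_log_likelihood(frames, gmm)
        for name, gmm in speaker_models.items()
    }
    return max(scores, key=scores.get), scores

# Toy models with 2 components each (illustration only; parameters are made up).
speaker_models = {
    "speaker_a": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "speaker_b": [(0.5, [5.0, 5.0], [1.0, 1.0]), (0.5, [6.0, 6.0], [1.0, 1.0])],
}

# Hypothetical feature vectors from the selected frames of a test utterance.
test_frames = [[0.2, -0.1], [0.9, 1.1], [0.4, 0.6]]

best, scores = identify_speaker(test_frames, speaker_models)
print(best)  # prints speaker_a
```

In practice the model parameters would be estimated from each speaker's training data, for example with the expectation-maximization algorithm, before this scoring step is applied.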

Presenters

  • Swati Prasad

    Electronics and Communication Engineering, Birla Institute of Technology, Mesra, Ranchi, Jharkhand, India

Authors

  • Swati Prasad

    Electronics and Communication Engineering, Birla Institute of Technology, Mesra, Ranchi, Jharkhand, India