Presentation On Introduction To Speech Recognition
Introduction To Speech Recognition Presentation Transcript:
1.Speech Recognition
2.Introduction
Speech Recognition System (speaker recognition): the process of automatically recognizing who is speaking, based on the unique characteristics contained in the speech waves.
Speaker recognition systems involve two phases :
1. Training
2. Testing
Training is the process of familiarizing the system with the voice characteristics of the speakers being registered. Testing is the actual recognition task.
3.System Overview
4.Steps to construct speech recognition SYSTEM
5.TOOL USED
MATLAB
A high-level language and interactive environment for numerical computation, visualization, and programming.
Used to analyze data, develop algorithms, and create models and applications.
6.ELEMENTS IN SPEECH RECOG.
The two elements of a speech recognition system are:
Feature Extraction: the process of extracting unique information from speech files.
Feature Matching: the process of identifying the speaker by comparing the features of the unknown speech with those of the known speakers.
7.Techniques
Feature Extraction:
Mel-Frequency Cepstral Coefficients (MFCC)
Feature Matching:
Vector Quantization (VQ)
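The feature-matching step can be sketched as follows. This is only an illustration in Python, not the presentation's MATLAB code: it assumes each enrolled speaker is represented by a codebook of feature vectors, that distance is Euclidean, and that the speaker with the lowest average distortion wins. The names `identify_speaker`, `average_distortion`, and the toy codebooks are hypothetical, and codebook training is not shown.

```python
import math

def nearest_distance(vector, codebook):
    """Euclidean distance from `vector` to its closest codeword."""
    return min(math.dist(vector, codeword) for codeword in codebook)

def average_distortion(features, codebook):
    """Mean nearest-codeword distance over all feature vectors."""
    return sum(nearest_distance(v, codebook) for v in features) / len(features)

def identify_speaker(features, codebooks):
    """Return the speaker whose codebook gives the lowest average distortion."""
    return min(codebooks, key=lambda name: average_distortion(features, codebooks[name]))

# Toy 2-D codebooks (assumed values; real ones would hold MFCC vectors
# learned during the training phase).
codebooks = {
    "speaker_a": [(0.0, 0.0), (1.0, 1.0)],
    "speaker_b": [(5.0, 5.0), (6.0, 4.0)],
}
unknown = [(0.9, 1.1), (0.1, -0.2), (1.2, 0.8)]  # features from the test utterance
print(identify_speaker(unknown, codebooks))
```

In this toy case the unknown vectors lie close to the codewords of `speaker_a`, so that speaker is selected.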
8.MFCC
Frame Blocking
In frame blocking, the continuous speech signal is divided into frames of N samples, with the starts of adjacent frames separated by M samples (M < N). The first frame consists of the first N samples. The second frame begins M samples after the first frame and overlaps it by N - M samples. Similarly, the third frame begins 2M samples after the first frame (or M samples after the second frame) and overlaps the first frame by N - 2M samples.
Typical values for N and M are N = 256 and M = 100.
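A minimal Python sketch of frame blocking with the typical values N = 256 and M = 100 is given below. The function name `frame_blocking` is hypothetical, and dropping a trailing partial frame is an assumption made here; the slides do not say how the end of the signal is handled.

```python
def frame_blocking(signal, frame_len=256, hop=100):
    """Split `signal` into overlapping frames of `frame_len` samples
    whose start positions are `hop` samples apart."""
    frames = []
    start = 0
    # Keep only complete frames; a trailing partial frame is dropped (assumption).
    while start + frame_len <= len(signal):
        frames.append(signal[start:start + frame_len])
        start += hop
    return frames

signal = list(range(1000))  # stand-in for real speech samples
frames = frame_blocking(signal)

print(len(frames))                  # number of complete frames
print(frames[1][0], frames[2][0])   # start offsets of the 2nd and 3rd frames
# Overlap with the first frame: N - M for the 2nd frame, N - 2M for the 3rd.
print(len(set(frames[0]) & set(frames[1])), len(set(frames[0]) & set(frames[2])))
```

Because the stand-in signal is just the sample indices, the overlap counts can be read directly from the shared values: 156 (N - M) and 56 (N - 2M).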
9.Windowing
The next step in the processing is to window each individual frame so as to minimize the signal discontinuities at the beginning and end of each frame. The concept here is to minimize the spectral distortion by using the window to taper the signal to zero at the beginning and end of each frame.
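As an illustration, the sketch below windows one frame in Python. The slide does not name a window, so a Hamming window is assumed here as a common choice; note that it tapers the frame edges toward a small value (0.08) rather than exactly to zero. The function names are hypothetical.

```python
import math

def hamming_window(n_samples):
    """Hamming window: w[n] = 0.54 - 0.46 * cos(2*pi*n / (N - 1))."""
    if n_samples == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_samples - 1))
            for n in range(n_samples)]

def apply_window(frame, window):
    """Multiply each sample of the frame by the matching window weight."""
    return [x * w for x, w in zip(frame, window)]

frame = [1.0] * 256                  # constant test frame, N = 256
window = hamming_window(len(frame))
windowed = apply_window(frame, window)

# The edges are strongly attenuated while the middle is left almost unchanged.
print(round(windowed[0], 4), round(windowed[128], 4), round(windowed[-1], 4))
```

A window that reaches exactly zero at the edges, such as the Hann window, could be swapped in by changing the two coefficients to 0.5 and 0.5.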
10.Fast Fourier Transform
The next processing step is the Fast Fourier Transform (FFT), which converts each frame of N samples from the time domain into the frequency domain. The FFT is a fast algorithm for computing the Discrete Fourier Transform (DFT), which is defined on the set of N samples {x_n} as follows:
X_k = \sum_{n=0}^{N-1} x_n e^{-j 2\pi k n / N},   k = 0, 1, 2, ..., N-1
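To make the relationship between the DFT and the FFT concrete, here is a small Python sketch. It assumes the standard DFT definition X_k = sum over n of x_n * exp(-j*2*pi*k*n/N) and uses a textbook recursive radix-2 FFT, which requires the frame length to be a power of two (true for N = 256). The function names are hypothetical; in MATLAB the built-in `fft` function would normally be used instead.

```python
import cmath
import math

def dft(x):
    """Direct O(N^2) evaluation of the DFT definition."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def fft(x):
    """Recursive radix-2 FFT; the length must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])   # DFT of even-indexed samples
    odd = fft(x[1::2])    # DFT of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

# Test frame: a cosine that completes exactly 8 cycles in N = 256 samples.
frame = [math.cos(2 * math.pi * 8 * t / 256) for t in range(256)]
spectrum = fft(frame)
magnitudes = [abs(c) for c in spectrum[:128]]   # first half of the spectrum
peak_bin = magnitudes.index(max(magnitudes))
print(peak_bin)
```

Both functions return the same spectrum up to rounding error; the FFT simply gets there in O(N log N) operations instead of O(N^2).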