Forums - how to make speech recognition


I want to build a speech recognition system that can
recognize the vowels (A, I, U, E, O) using a backpropagation
neural network.
I am using the IntelligenceLab 5.0.1 and AudioLab 5.0.1 components for Delphi, with Delphi XE2.

I run into problems when wiring it up. I use the OpenWire Editor with
the following components:
1. ALAudio1 (captures sound from the microphone)
2. ALSpectrum1 (spectrum component)
3. ILNeuralNetworkBackpropTrain1 (backprop training algorithm component)
4. ILNeuralNetwork1 (neural network classifier)

I can connect the output pin of the spectrum component to the
input of the classifier ILNeuralNetwork1, but I cannot connect the
spectrum output to the input of ILNeuralNetworkBackpropTrain1.
My question: are the steps I am taking correct, or is there a
mistake somewhere? Do I need additional program code?

Greetings, Hymae

Hi!

You may want to have a peek at the ILNeuralNetworkDemo.

Regards,
Dave

Yes, I have studied the neural network demos, but I am still confused about how to connect my data to the backprop training function.

ATrainingData: ISLRealMatrixBuffer;
....
....
ILNeuralNetworkBackpropTrain1.Train (ATrainingData, AResposes);

In the demo, the training data is passed in the ATrainingData variable, which has the type ISLRealMatrixBuffer. How do I feed a sound file into the backprop training function? Does the sound file have to be converted to a variable of type ISLRealMatrixBuffer first, or can it be passed directly to the training function? And what would an example program listing look like?

Thanks.

It sounds like you're attempting a huge first step with neural networks. Starting with something simpler (an XOR function, for example) might be a good idea.
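To make the XOR suggestion concrete, here is a minimal sketch of a backpropagation network in plain Python. It is not Delphi and does not use the IntelligenceLab API; the function name train_xor, the layer size, the learning rate and the seed are all assumptions chosen for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_xor(hidden=3, epochs=5000, lr=0.5, seed=1):
    """Train a 2-input, one-hidden-layer, 1-output network on XOR.

    Uses full-batch gradient descent on a squared-error loss.
    Returns (predict, losses), where losses holds the loss per epoch.
    """
    rng = random.Random(seed)
    data = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0),
            ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
    # w_h[j] = [weight for x1, weight for x2, bias] of hidden unit j
    w_h = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(hidden)]
    # w_o = one weight per hidden unit, followed by the output bias
    w_o = [rng.uniform(-1, 1) for _ in range(hidden + 1)]

    def forward(x):
        h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_h]
        y = sigmoid(sum(w_o[j] * h[j] for j in range(hidden)) + w_o[hidden])
        return h, y

    losses = []
    for _ in range(epochs):
        g_h = [[0.0] * 3 for _ in range(hidden)]
        g_o = [0.0] * (hidden + 1)
        loss = 0.0
        for x, t in data:
            h, y = forward(x)
            loss += 0.5 * (y - t) ** 2
            d_o = (y - t) * y * (1 - y)          # error signal at the output
            for j in range(hidden):
                g_o[j] += d_o * h[j]
                # propagate the output error back to hidden unit j
                d_h = d_o * w_o[j] * h[j] * (1 - h[j])
                g_h[j][0] += d_h * x[0]
                g_h[j][1] += d_h * x[1]
                g_h[j][2] += d_h
            g_o[hidden] += d_o
        for j in range(hidden):
            for k in range(3):
                w_h[j][k] -= lr * g_h[j][k]
        for j in range(hidden + 1):
            w_o[j] -= lr * g_o[j]
        losses.append(loss)

    return (lambda x: forward(x)[1]), losses


if __name__ == "__main__":
    predict, losses = train_xor()
    print("first loss:", losses[0], "last loss:", losses[-1])
    for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        print(x, round(predict(x), 3))
```

How close the four predictions get to 0 and 1 depends on the initial weights and the number of epochs, so treat the printed values as something to experiment with rather than a fixed result.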

If you're wondering about sound files, see https://secure.wikimedia.org/wikipedia/en/wiki/Wav for example, or search Google for more info.
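As a rough illustration of what a WAV file contains, the sketch below reads 16-bit mono PCM samples into a list of numbers using only Python's standard wave and struct modules. It is not Delphi and has nothing to do with AudioLab; read_wav_mono16 and make_test_wav are hypothetical helper names, and the tone is generated in memory so that no real recording is assumed.

```python
import io
import math
import struct
import wave


def read_wav_mono16(source):
    """Read a 16-bit mono PCM WAV (path or file object).

    Returns (sample_rate, samples) with samples scaled to [-1.0, 1.0).
    """
    with wave.open(source, "rb") as wf:
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError("this sketch handles 16-bit mono PCM only")
        rate = wf.getframerate()
        raw = wf.readframes(wf.getnframes())
    count = len(raw) // 2
    ints = struct.unpack("<%dh" % count, raw)   # little-endian signed 16-bit
    return rate, [v / 32768.0 for v in ints]


def make_test_wav(freq=440.0, rate=8000, n=800):
    """Build an in-memory WAV holding a sine tone (stand-in for a recording)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(rate)
        frames = b"".join(
            struct.pack("<h", int(20000 * math.sin(2 * math.pi * freq * i / rate)))
            for i in range(n)
        )
        wf.writeframes(frames)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    rate, samples = read_wav_mono16(make_test_wav())
    print(rate, len(samples), samples[:5])
```

The point is that a sound file is just a header plus a long sequence of numbers; those numbers still have to be cut into frames and turned into features before a network can train on them.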

Regards,
Dave

Do you have any examples of simple programs related to speech recognition? I am still having problems with this, and I hope an example would help me understand it more easily.
Thanks for the help.

Sorry, no samples. For general information about speech recognition, read some of this: https://secure.wikimedia.org/wikipedia/e...ecognition

About halfway down, neural networks and speech recognition are explained.

Regards,
Dave

I get an error when I combine the output of the spectrum component with the backpropagation training component.
The error message is:
[DCC Error] Unit1.pas (83): E2250 There is no overloaded version of 'Train' that can be called with these arguments

This happens when I pass the spectrum component to the training component. The method has the form:
TILNeuralNetworkTrain.Train (ISLRealBuffer [], ISLRealBuffer []) method
which I changed to:
TILNeuralNetworkTrain.Train (ALSpectrum1, AResposes)

So that we understand each other, here is what I have done so far. Please correct me if anything is wrong.
- I use a computer running Windows 7 with Delphi XE2.
- I use the AudioIn component to capture the sound signal.
- I use a Wav logger to store the sound in a file. Since I want to detect the vowels (A, I, U, E, O), I record WAV files as training data. For every vowel I use five training samples, spoken by 5 different people.
- To train the network, I use a wav player to load the training voice files.
- After that, I use a spectrum component for signal feature extraction.
My questions: what data incompatibility is causing the error?
Are the steps I have taken to build the speech recognition system correct, or is there a mistake, and what should I do?
Thanks for your help.

Are you trying to recognize a single audio frame from the spectrum as a vowel? The error is self-explanatory: you're feeding the Train function one or more arguments it can't deal with. An ALSpectrum object isn't the same as a TSLCRealMatrixBuffer.
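To show the kind of data a training call generally wants instead of a component object, here is a sketch in plain Python that turns labelled audio frames into a feature matrix (one row of spectrum magnitudes per frame) and a response matrix (one one-hot row per frame). The row-per-example layout, the one-hot encoding and all the names (magnitude_spectrum, one_hot, build_training_set) are assumptions for illustration, not the IntelligenceLab API, and the naive DFT stands in for whatever the spectrum component computes.

```python
import cmath
import math

VOWELS = ["A", "I", "U", "E", "O"]


def magnitude_spectrum(frame):
    """Naive DFT: magnitudes of the first len(frame) // 2 frequency bins."""
    n = len(frame)
    out = []
    for k in range(n // 2):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        out.append(abs(acc) / n)
    return out


def one_hot(label):
    """Encode a vowel label as five numbers, e.g. 'U' gives 0, 0, 1, 0, 0."""
    return [1.0 if v == label else 0.0 for v in VOWELS]


def build_training_set(labelled_frames):
    """labelled_frames: list of (frame_samples, vowel_label) pairs.

    Returns (features, responses) as lists of equal-length rows.
    """
    features = [magnitude_spectrum(frame) for frame, _ in labelled_frames]
    responses = [one_hot(label) for _, label in labelled_frames]
    return features, responses


if __name__ == "__main__":
    # Two synthetic 32-sample frames standing in for recorded vowels.
    frame_a = [math.sin(2 * math.pi * 4 * t / 32) for t in range(32)]
    frame_u = [math.sin(2 * math.pi * 7 * t / 32) for t in range(32)]
    features, responses = build_training_set([(frame_a, "A"), (frame_u, "U")])
    print(len(features), len(features[0]), responses)
```

In the Delphi code, the equivalent step would be copying such rows into the matrix buffer types that Train accepts, rather than passing ALSpectrum1 itself.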

Regards,
Dave

Yes, that is what I mean. I want the system to recognize a person's speech: when someone says "A", the system recognizes it as "A"; when someone says "I", the system recognizes it as "I"; and so on for "U", "E" and "O". What should I do? Please guide me, thanks.

Well, like I said before, you may want to try something easier to get the hang of neural networks, for example a neural network that adds 2 numbers or one that detects a simple pattern. Taking small steps to start with is a good way to understand what a NN does. You can't just feed it very complex data and tell it to make sense of it somehow.
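As one possible small step, the sketch below trains a single linear neuron to add 2 numbers, written in plain Python rather than Delphi. The name train_adder, the input grid, the learning rate and the epoch count are assumptions picked for illustration; a real exercise could use any inputs.

```python
def train_adder(epochs=2000, lr=0.1):
    """Fit y = w1*x1 + w2*x2 + bias to the target x1 + x2.

    Uses full-batch gradient descent on the mean squared error.
    Returns the learned (w1, w2, bias).
    """
    # Inputs on a small grid in [0, 1]; the target is simply their sum.
    data = [((a / 4.0, b / 4.0), (a + b) / 4.0)
            for a in range(5) for b in range(5)]
    w1 = w2 = bias = 0.0
    n = len(data)
    for _ in range(epochs):
        g1 = g2 = gb = 0.0
        for (x1, x2), target in data:
            err = (w1 * x1 + w2 * x2 + bias) - target
            g1 += err * x1
            g2 += err * x2
            gb += err
        # Step against the averaged gradient.
        w1 -= lr * g1 / n
        w2 -= lr * g2 / n
        bias -= lr * gb / n
    return w1, w2, bias


if __name__ == "__main__":
    w1, w2, bias = train_adder()
    print("weights:", w1, w2, "bias:", bias)
    print("0.2 + 0.3 is predicted as", w1 * 0.2 + w2 * 0.3 + bias)
```

The learned weights should approach 1, 1 and a bias near 0, which makes it easy to see what training has actually done before moving on to spectra and vowels.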

Dave
