
Full Version: Pitch analysis and silence vs. speech detection with SignalLab

angelcaf

Hello,
I am new to SignalLab, but I have been successfully using VideoLab before.
Before asking my question, here is some context:

- Real time audio processing
- Input is a mic
- Single user

This is what I need to do, and my question is how to do it with SignalLab:

1 - Detect whether the user is speaking or there is silence
2 - Do basic pitch analysis: within the past 100ms of input, I need to take two points (pitch values) and check whether the pitch has risen or fallen by 30Hz.

Which SignalLab components do I need for that?
Is it possible to detect (1)?
Please note that I need to do this in code and not just with the visual components, since I will feed this information (voice vs. silence) into a higher-level algorithm.

Thanks a lot in advance!
Angelo

Dave

Hi! Are you able to use AudioLab as well? For the input-level detection you could use a VU-meter; for the pitch analysis you could perhaps use a Fast Fourier Transform (FFT) component.
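
As a library-independent illustration of what level-based detection amounts to, here is a minimal Python sketch. It does not use the AudioLab VU-meter; the function names and the -40 dB threshold are assumptions chosen for the example, and samples are assumed to be floats in [-1.0, 1.0].

```python
import math

def level_dbfs(samples):
    """RMS level of a block of samples in [-1.0, 1.0], in dB relative to full scale."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(rms) if rms > 0.0 else float("-inf")

def is_speech(samples, threshold_db=-40.0):
    # threshold_db is a hypothetical starting value; tune it against
    # the noise floor of the actual microphone and room.
    return level_dbfs(samples) > threshold_db
```

A single fixed threshold tends to flicker near the boundary, so in practice the result is usually smoothed over several blocks.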

Regards,
Dave

angelcaf

(12-08-2012 09:38 AM) Dave wrote: Hi! Are you able to use AudioLab as well? For the input-level detection you could use a VU-meter

Hello Dave,
I do have AudioLab.
I have put the VU-Meter component on my form and set the Mode to Decibel and the number of channels to 1 to have a single output pin.
My question is: how can I read, in code, the data coming out of the Output pin of the VU-Meter?

(Even if I connect the output pin to a level detector component, the question stays the same: how can I get the detector output/result in code when the level is below a certain threshold?)

Thanks a lot for your support!
Angelo

Dave

Not sure how this is done under VC/.NET, but the VU-meter could or should have a ValueChange event or a callback function. That way you can read the current value and check it. If they are included, you could have a peek at some of the AudioLab demos as well.

Regards,
Dave

angelcaf

(12-08-2012 12:16 PM) Dave wrote: Not sure how this is done under VC/.NET, but the VU-meter could or should have a ValueChange event or a callback function. That way you can read the current value and check it. If they are included, you could have a peek at some of the AudioLab demos as well.

Hello Dave, yes, in .NET C# this is done by attaching a callback function to the ValueChange event.
I am starting to like the library more and more, and the wire editor too! Great tool.

I have another question. I think listing the 4 steps of the algorithm I am implementing will help explain it.

I need to set a boolean flag to TRUE.
My goal is to set FLAG = TRUE upon detection of:
(1) A pause of 400ms (using vuMeter + detectLevels)
(2) preceded by at least 1000ms of speech,
(3) where the last 100ms,
(4) contain a rising or falling pitch of at least 30Hz.
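
The four conditions above can be sketched as a small state machine, independent of SignalLab. This is only an illustration under stated assumptions: the class name, the fixed 10ms frame length, and the per-frame inputs (a speaking/silent boolean and a pitch estimate in Hz) are hypothetical, and any single silent frame is treated as ending the current speech run.

```python
from collections import deque

class PauseAfterSpeechDetector:
    def __init__(self, frame_ms=10, pause_ms=400, speech_ms=1000,
                 tail_ms=100, min_pitch_change_hz=30.0):
        self.frame_ms = frame_ms
        self.pause_ms = pause_ms
        self.speech_ms = speech_ms
        self.min_change = min_pitch_change_hz
        # Pitch values covering the last tail_ms of speech.
        self.tail = deque(maxlen=max(1, tail_ms // frame_ms))
        self.speech_run_ms = 0
        self.silence_run_ms = 0
        self.fired = False

    def update(self, speaking, pitch_hz):
        """Feed one frame; returns True on the frame where all four conditions hold."""
        if speaking:
            if self.silence_run_ms > 0:
                # Speech resumed after silence: start a new speech run.
                self.speech_run_ms = 0
                self.tail.clear()
            self.silence_run_ms = 0
            self.fired = False
            self.speech_run_ms += self.frame_ms
            self.tail.append(pitch_hz)
            return False
        self.silence_run_ms += self.frame_ms
        if self.fired or self.silence_run_ms < self.pause_ms:
            return False
        self.fired = True  # evaluate each pause only once
        long_enough = self.speech_run_ms >= self.speech_ms
        moved = bool(self.tail) and max(self.tail) - min(self.tail) >= self.min_change
        return long_enough and moved
```

Here "rising or falling" is approximated as max minus min over the last 100ms of speech, which matches the two-point idea but does not distinguish the direction of the change.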

I can easily do (1) without a single line of code. I am planning to do the rest similarly, adding only the minimum code required.

However, what puzzles me is how to take the previous 1000ms of speech into account (assuming that I detect "speech" in the same way as in (1)).

Then, using Fourier, how can I compute (4) for just the last 100ms?
By rising and falling I mean selecting two points (maybe the min and max) from the FFT output and taking the difference.

What would you suggest I use?

Your help is really appreciated.
Thanks!

Dave

You could indeed use just the Fourier analysis for both your needs with an SLGenericReal component: get the timing and do some frequency checking. For the timing you could use the input stream (samples/second), for example; GetTickCount() might be an alternative.
For your trues and falses to be set correctly you will have to get a little creative: check how long a certain minimum wasn't reached (that would be your silence), and check when some higher minimum was first and last reached (that would be the start and end of your speech, for example).
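
One way to read the two-minimum idea is as hysteresis: speech starts when the level reaches a higher threshold and ends when it drops below a lower one, with time derived from the sample count rather than a wall clock. The Python sketch below is only an illustration; the function name and both threshold values are assumptions, and the levels are taken to be one dB value per fixed-size block of samples.

```python
def speech_segments(levels_db, samples_per_level, sample_rate,
                    start_db=-30.0, stop_db=-45.0):
    """Return (start_s, end_s) pairs; time comes from the sample count."""
    segments = []
    start = None
    for i, level in enumerate(levels_db):
        t = i * samples_per_level / sample_rate
        if start is None and level >= start_db:
            start = t                      # higher threshold reached: speech begins
        elif start is not None and level < stop_db:
            segments.append((start, t))    # fell below lower threshold: speech ends
            start = None
    if start is not None:
        # Input ended while still in speech: close the segment at the end.
        segments.append((start, len(levels_db) * samples_per_level / sample_rate))
    return segments
```

The gap between the two thresholds keeps the detector from toggling when the level hovers around a single cutoff.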

Hope that gets you going.

Regards,
Dave

angelcaf

Hello Dave,
I am almost there with the implementation.
I am using the FFT component and I have set the SamplingWindowStep to 100.
This number came out of some calculations: the FFT should be applied every 100 samples, which translates to about every 9ms at a sample rate of 11,000 Hz.

This is the formula used: SamplingRate = NumSamples / Time.
Therefore Time = NumSamples / SamplingRate = 100 / 11000 ≈ 0.0091 s ≈ 9ms
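
The same arithmetic, worked through in a few lines of Python (the variable names are only for illustration), also shows how many steps fit into a 100ms region:

```python
sample_rate_hz = 11_000
step_samples = 100

# Duration of one step in milliseconds.
step_ms = 1000.0 * step_samples / sample_rate_hz
# Number of steps that cover a 100 ms region.
steps_per_100ms = round(100.0 / step_ms)

print(f"{step_ms:.2f} ms per step, {steps_per_100ms} steps per 100 ms")
# prints: 9.09 ms per step, 11 steps per 100 ms
```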

Now I am doing the following:
- I am using the callback method FrequencyEvent of the FFT component.
- In the callback implementation I get the value Args.MaxFrequency (let's name it MAX_FREQ);
if I am not wrong, this should give me the frequency with the highest intensity in the window just calculated.
- Then I add it to a queue containing my last 60 outputs (MAX_FREQ) from the FFT. This should give me access to about 60 x 9ms ≈ 545ms of data.
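
A bounded queue like the one described can be sketched in Python as follows. The handler name is hypothetical and stands in for the FFT callback. The second function is a general FFT property rather than anything confirmed about SignalLab: a peak frequency can only move in steps of one bin width, so with a hypothetical FFT size of 512 at 11,000 Hz the resolution is about 21.5 Hz, and heavily overlapping windows may report the same peak many times in a row.

```python
from collections import deque

# Keeps only the 60 most recent values; older ones are dropped automatically.
history = deque(maxlen=60)

def on_peak_frequency(freq_hz):
    """Hypothetical stand-in for the FFT callback: store the latest peak frequency."""
    history.append(freq_hz)

def bin_width_hz(sample_rate_hz, fft_size):
    """Spacing between adjacent FFT bins, i.e. the smallest peak change that can show up."""
    return sample_rate_hz / fft_size
```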

The problem is that all values in the queue are always the same. It looks like either the callback is not called every 100 samples as set by the step parameter of the FFT component, or I am misunderstanding the meaning of Args.MaxFrequency or of the callback itself.
Do you have any suggestions?

Thanks a bunch,
Angelo

Dave

That timing ought to be correct.

Are you sure the FFT gets data? Perhaps you can hook up a Scope either to the same input as the FFT is connected to or to an output of the FFT itself.

Regards,
Dave

angelcaf

(12-15-2012 09:32 AM) Dave wrote: Are you sure the FFT gets data?

Yes, I used a scope to display the output frequencies.
I don't know how accurate this is, but using a StopWatch I measured the time between callback calls, and it looks like I get a value every 1ms regardless of the parameters I have used. This raised my doubts about how often the FourierUpdate callback is called.
I am now trying to feed the Fourier frequency output of the FFT into a GenericReal component. It would be nice to have a more detailed description of what GenericReal does and how its internal buffer handles the input data, in particular as new data (from the FFT) arrives over time. Could you please clarify that for me?
The solution is really close :)

Angelo

Dave

To determine how often something is called, an easy way is to use a counter (a global int) that is incremented in the event and is read and cleared once a second, by a timer for example.
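
A minimal Python sketch of that counter idea, assuming the event and the timer may run on different threads (the class and method names are made up for the example):

```python
import threading

class CallRateCounter:
    def __init__(self):
        self._count = 0
        self._lock = threading.Lock()

    def tick(self):
        """Call from the event handler being measured."""
        with self._lock:
            self._count += 1

    def read_and_clear(self):
        """Call once per second from a timer; returns the calls since the last read."""
        with self._lock:
            n, self._count = self._count, 0
        return n
```

If the timer fires once a second, the returned value is directly the number of calls per second.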

If you connect the GenericReal to the FFT-SpectrumOutputPin, the input buffer simply holds the spectrum data (volume per frequency, as real/double), and you can get the buffer size by calling InBuffer.GetSize().
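
Given such a buffer of volume per frequency, the strongest frequency could be picked out roughly as in the Python sketch below. It assumes the bins are spread evenly from 0 Hz up to half the sample rate, with the last bin at exactly half the sample rate; the thread does not confirm how SignalLab lays out the buffer, so check that before relying on the conversion.

```python
def peak_frequency_hz(spectrum, sample_rate_hz):
    """Frequency of the strongest bin, assuming bins span 0 Hz to sample_rate_hz / 2 evenly."""
    if len(spectrum) < 2:
        return 0.0
    # Index of the bin with the largest magnitude.
    peak = max(range(len(spectrum)), key=lambda i: spectrum[i])
    # Assumption: bin 0 is 0 Hz and the last bin is sample_rate_hz / 2.
    return peak * (sample_rate_hz / 2.0) / (len(spectrum) - 1)
```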

Regards,
Dave