Speech recognition (Isolated words 1-9)

Question

Chan on 10 Sep 2011

Direct link to this question

https://la.mathworks.com/matlabcentral/answers/15554-speech-recognition-isolated-words-1-9

Commented: Javier on 11 Nov 2020

Hi there,

I'm an electronics student building an isolated-word speech recognition system (words 1-9) for my school project. The goal is to recognize nine words (One, Two, ..., Eight, Nine) spoken by any speaker. Every word is spoken in isolation.

At the moment I have some code for:

i. Saving a wav file from the microphone input

%This program records the voice
function [norm_voice,h] = Voice_Rec(sample_freq)
option = 'n';
option_rec = 'n';
record_len = 3;         %Record time length in seconds
if nargin < 1
    sample_freq = 8192;    %Default sampling frequency in Hertz
end
sample_time = sample_freq * record_len;   %Number of samples to record
disp('Get ready to record your voice');
name = input('Enter the file name you want to save the file with: ','s');
file_name = sprintf('%s.wav',name);
option_rec = input('Press y to record: ','s');
if option_rec=='y'
    while option=='n',
        input('Press enter when ready to record--> ');
        record = wavrecord(sample_time, sample_freq);       %Records the input through the sound card to the variable with specified sampling frequency
        input('Press enter to listen the recorded voice--> ');
        sound(record, sample_freq);
        option = input('Press y to save or n to record again: ','s');
    end    
    wavwrite(record, sample_freq, file_name);  %Save the recorded data to a file with the specified file name in .wav format
end
[voice_read,FS,NBITS]=wavread(file_name);
norm_voice = normalize(voice_read);
norm_voice = downsmpl(norm_voice, sample_freq);
le=32;
h=daubcqf(le,'min');
function vec = normalize(vec)
temp_vec = vec-mean(vec);
sum_temp_vec = sum(temp_vec.*temp_vec);
sqrt_temp_vec = sqrt(sum_temp_vec);
vec = (1/sqrt_temp_vec)*temp_vec;
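function sampled = downsmpl(voice, freq)
%Halves the sample count over the first freq samples: each output sample
%is the geometric mean of the magnitudes of two adjacent input samples
n_out = floor(freq/2);
sampled = zeros(n_out, 1);
for a = 1:n_out
    z = 2*a - 1;
    sampled(a) = sqrt(abs(voice(z)*voice(z+1)));
end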
function [h_0,h_1] = daubcqf(N,TYPE)
%    [h_0,h_1] = daubcqf(N,TYPE); 
%
%    Function computes the Daubechies' scaling and wavelet filters
%    (normalized to sqrt(2)).
%
%    Input: 
%       N    : Length of filter (must be even)
%       TYPE : Optional parameter that distinguishes the minimum phase,
%              maximum phase and mid-phase solutions ('min', 'max', or
%              'mid'). If no argument is specified, the minimum phase
%              solution is used.
%
%    Output: 
%       h_0 : Minimal phase Daubechies' scaling filter 
%       h_1 : Minimal phase Daubechies' wavelet filter 
%
%    Example:
%       N = 4;
%       TYPE = 'min';
%       [h_0,h_1] = daubcqf(N,TYPE)
%       h_0 = 0.4830 0.8365 0.2241 -0.1294
%       h_1 = 0.1294 0.2241 -0.8365 0.4830
%
if(nargin < 2),
  TYPE = 'min';
end;
if(rem(N,2) ~= 0),
  error('No Daubechies filter exists for ODD length');
end;
K = N/2;
a = 1;
p = 1;
q = 1;
h_0 = [1 1];
for j  = 1:K-1,
  a = -a * 0.25 * (j + K - 1)/j;
  h_0 = [0 h_0] + [h_0 0];
  p = [0 -p] + [p 0];
  p = [0 -p] + [p 0];
  q = [0 q 0] + a*p;
end;
q = sort(roots(q));
qt = q(1:K-1);
if strcmp(TYPE,'mid'),
  if rem(K,2)==1,  
    qt = q([1:4:N-2 2:4:N-2]);
  else
    qt = q([1 4:4:K-1 5:4:K-1 N-3:-4:K N-4:-4:K]);
  end;
end;
h_0 = conv(h_0,real(poly(qt)));
h_0 = sqrt(2)*h_0/sum(h_0);   %Normalize to sqrt(2);
if strcmp(TYPE,'max'),
  h_0 = fliplr(h_0);
end;
if(abs(sum(h_0 .^ 2)-1) > 1e-4) 
  error('Numerically unstable for this value of "N".');
end;
h_1 = rot90(h_0,2);
h_1(1:2:N)=-h_1(1:2:N);
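
The `normalize` helper in the listing above removes the mean and scales the signal to unit energy. A minimal sketch of the same idea on a synthetic signal follows, so it can be checked without a microphone. The name `unit_energy` and the test tone are assumptions made for illustration; they are not part of the original post. A different name is used because newer MATLAB releases ship a built-in function called `normalize`.

```matlab
% Hypothetical helper: zero-mean, unit-energy scaling of a column vector.
unit_energy = @(v) (v - mean(v)) / norm(v - mean(v));

fs = 8192;                          % sampling frequency from the post (Hz)
t  = (0:fs-1)' / fs;                % one second of sample times
x  = 0.3 + 0.5*sin(2*pi*440*t);     % synthetic stand-in for a recording

x_norm = unit_energy(x);

% After scaling, the mean should be ~0 and the energy (sum of squares) ~1.
fprintf('mean   = %.3g\n', mean(x_norm));
fprintf('energy = %.3g\n', sum(x_norm.^2));
```

Scaling every recording this way makes later comparisons less sensitive to how loudly a word was spoken.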

ii. Computing the FFT directly from the microphone input

% An example showing how to obtain a speech signal from microphone
% and compute its Fourier Transform (FFT)
Fs = 10000;   % Sampling Frequency (Hz)
Nseconds = 5; % Length of speech signal
fprintf('say a word immediately after hitting enter: ');
input('');
% Get time-domain speech signal from microphone
y  = wavrecord(Nseconds*Fs, Fs, 'double');
% Plot time-domain signal
subplot(2,1,1);
t = (0:(Nseconds*Fs)-1)/Fs;   % Time axis in seconds
plot(t,y);
xlabel('Time (s)');
% Compute FFT
x = fft(y);
% Get response until Fs/2 (for frequency from Fs/2 to Fs, response is repeated)
x = x(1:floor(Nseconds*Fs/2));
% Plot magnitude vs. frequency
subplot(2,1,2);
m = abs(x);
f = (0:length(x)-1)*(Fs/2)/length(x);
plot(f,m);
xlabel('Frequency (Hz)');
ylabel('Magnitude');
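
To check that the frequency axis in a script like the one above is scaled correctly, it can help to run the same steps on a signal whose spectrum is known. The sketch below is only an illustration: the 1 kHz test tone and the variable names are assumptions, and a synthetic tone replaces the microphone input.

```matlab
Fs = 10000;                  % sampling frequency used in the post (Hz)
N  = 5000;                   % number of samples (0.5 s)
t  = (0:N-1)' / Fs;          % sample times in seconds
y  = sin(2*pi*1000*t);       % synthetic 1 kHz tone standing in for speech

X    = fft(y);
half = floor(N/2) + 1;       % keep bins from 0 Hz up to Fs/2
mag  = abs(X(1:half)) / N;   % magnitude, scaled by the signal length
f    = (0:half-1)' * Fs / N; % frequency of each retained bin in Hz

% The largest bin should sit at the tone frequency.
[~, k]  = max(mag);
peak_hz = f(k);
fprintf('Peak at %g Hz\n', peak_hz);
```

If the peak does not land at the tone frequency, the bin-to-Hz mapping `f` is the first thing to inspect.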

I have some sample code for BOF and LPC, but I am not sure how it works: I do not yet fully understand how they operate, and I seem to be missing some of the libraries they need.
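
For readers unfamiliar with LPC (linear predictive coding): it models each sample as a weighted sum of the previous p samples, and the weights serve as compact features. The sketch below is one common way to compute them, the autocorrelation method with the Levinson-Durbin recursion, written without any toolbox. It is not the code the poster refers to; the order `p`, the model `a_true`, and the test signal are assumptions chosen so the result can be verified. The Signal Processing Toolbox also provides an `lpc` function for this task.

```matlab
p      = 2;                  % LPC order (illustrative; speech typically needs more)
a_true = [1 -1.2 0.72];      % assumed all-pole model 1/A(z) for the demo
x = filter(1, a_true, [1; zeros(399,1)]);   % its impulse response as test signal

% Autocorrelation at lags 0..p
r = zeros(p+1, 1);
for k = 0:p
    r(k+1) = sum(x(1:end-k) .* x(1+k:end));
end

% Levinson-Durbin recursion
a = 1;                       % predictor polynomial [1 a1 ... am], as a column
E = r(1);                    % prediction error energy
for m = 1:p
    k_m = -(r(m+1:-1:2)' * a(:)) / E;        % reflection coefficient
    a   = [a(:); 0] + k_m * [0; flipud(a(:))];
    E   = E * (1 - k_m^2);
end
a = a.';                     % return as a row vector

disp(a);                     % estimated coefficients of A(z)
```

In a recognizer this would typically be applied to short, windowed frames of the recording rather than to the whole signal at once.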

I know I am still far from the overall goal of this project. I hope someone can guide me on which steps I still need, or share some reference code for speech recognition.

I hope you understand my difficulty: our course only teaches the basics of MATLAB, not the details, and I still do not fully understand the speech recognition process.

Any help or reply will be greatly appreciated!!!

Thanks in advance!

Regards,

ckchoy

ckchoy_0123@hotmail.com

Answer 1

Wayne King on 10 Sep 2011

Direct link to this answer

https://la.mathworks.com/matlabcentral/answers/15554-speech-recognition-isolated-words-1-9#answer_21119

Hi, I would recommend you look through this demo:

http://www.mathworks.com/company/newsletters/digest/2010/jan/word-recognition-system-matlab.html

4 comments (2 older comments not shown)

Jussi Tuovinen on 4 Mar 2015

This looks like an interesting article. Where is the full code available? What do you mean by "the top part of this page"? Thanks.

Javier on 11 Nov 2020

It appears the article is no longer available. I tried to open it and could not.

Answer 2

Brian Hemmat on 30 Dec 2019

Direct link to this answer

https://la.mathworks.com/matlabcentral/answers/15554-speech-recognition-isolated-words-1-9#answer_408157

Edited: Brian Hemmat on 20 Mar 2020

Spoken Digit Recognition with Wavelet Scattering and Deep Learning illustrates two different approaches to spoken digit recognition:

wavelet scattering + support vector machine
mel spectrograms + deep convolutional neural nets

Both methods achieve ~98% test accuracy.

Another approach, using LSTMs and achieving ~97% accuracy: Sequential Feature Selection for Audio Features.
