Sir, I tried to extract features from a speech signal using mel-frequency cepstral coefficients (MFCC), but the code now shows an error and I don't know how to fix it. Could you please help me resolve it?

The code is given below:
[audio, fs1] = audioread('cryrumble.wav');
%sound(x,fs1);
ts1=1/fs1;
N1=length(audio);
Tmax1=(N1-1)*ts1;
t1=(0:ts1:Tmax1);
figure;
plot(t1,audio),xlabel('Time'),title('Original audio');
% fs2 = (20/441)*fs1;
% y=resample(audio,2000,44100);
% %sound(y,fs2);
% ts2=1/fs2;
% N2=length(y);
% Tmax2=(N2-1)*ts2;
% t2=(0:ts2:Tmax2);
% figure;
% plot(t2,y),xlabel('Time'),title('resampled audio');
%Step 1: Pre-Emphasis
a=[1];
b=[1 -0.95];
z=filter(b,a,audio);
subplot(413),plot(t1,z),xlabel('Time'),title('Signal After High Pass Filter - Time Domain');
subplot(414),plot(fs1,fftshift(abs(fft(z)))),xlabel('Freq (Hz)'),title('Signal After High Pass Filter - Frequency Spectrum');
nchan = size(audio,2);
for chan = 1 : nchan
%subplot(1, nchan, chan)
spectrogram(y(:,chan), 256, [], 25, 2000, 'yaxis');
title( sprintf('spectrogram of resampled audio ' ) );
end
% Step 2: Frame Blocking
frameSize=1000;
% frameOverlap=128;
% frames=enframe(y,frameSize,frameOverlap);
% NumFrames=size(frames,1);
frame_duration=0.03;
frame_len = frame_duration*fs1;
framestep=0.01;
framestep_len=framestep*fs1;
% N = length (x);
num_frames =floor(N2/frame_len);
% new_sig =zeros(N,1);
% count=0;
% frame1 =x(1:frame_len);
% frame2 =x(frame_len+1:frame_len*2);
% frame3 =x(frame_len*2+1:frame_len*3);
frames=[];
for j=1:num_frames
frame=z((j-1)*framestep_len + 1: ((j-1)*framestep_len)+frame_len);
% frame=x((j-1)*frame_len +1 :frame_len*j);
% identify the silence by finding frames with max amplitude less than
% 0.025
max_val=max(frame);
if (max_val>0.025)
% count = count+1;
% new_sig((count-1)*frame_len+1:frame_len*count)=frames;
frames=[frames;frame];
end
end
% Step 3: Hamming Windowing
NumFrames=size(frames,1);
hamm=hamming(1000)';
windowed = bsxfun(@times, frames, hamm);
% Step 4: FFT
% Taking only the positive values in the FFT that is the first half of the frame after being computed.
ft = abs( fft(windowed,500, 2) );
plot(ft);
% Step 5: Mel Filterbanks
Lower_Frequency = 100;
Upper_Frequency = fs1/2;
% With a total of 22 points we can create 20 filters.
Nofilters=20;
lowhigh=[300 fs/2];
%Here logarithm is of base 'e'
lh_mel=1125*(log(1+lowhigh/700));
mel=linspace(lh_mel(1),lh_mel(2),Nofilters+2);
figure;
plot(mel);
xlabel('frequency in Hertz');ylabel('mels');
title('melscale');
melinhz=700*(exp(mel/1125)-1);
%Converting to frequency resolution
fres=floor(((frameSize)+1)*melinhz/fs2);
%Creating the filters
for m =2:length(mel)-1
for k=1:frameSize/2
if k<fres(m-1)
H(m-1,k) = 0;
elseif (k>=fres(m-1)&&k<=fres(m))
H(m-1,k)= (k-fres(m-1))/(fres(m)-fres(m-1));
elseif (k>=fres(m)&&k<=fres(m+1))
H(m-1,k)= (fres(m+1)-k)/(fres(m+1)-fres(m));
elseif k>fres(m+1)
H(m-1,k) = 0;
end
end
end
%H contains the 20 filterbanks, we now apply it to the processed signal.
for i=1:NumFrames
for j=1:Nofilters
bankans(i,j)=sum((ft(i,:).*H(j,:)).^2);
end
end
figure;
plot(bankans(i,j));
figure;
plot(H);
xlabel('Frequency');ylabel('Magnitude');
title('Mel-Frequency Filter bank');
% Step 6: Natural Log and DCT
% pkg load signal
%Here logarithm is of base '10'
logged=log10(bankans);
for i=1:NumFrames
mfcc(i,:)=dct2(logged(i,:));
end
%plotting the MFCC
figure
hold on
for i=1:NumFrames
plot(mfcc(i,1:13));
title('mfcc');
end
hold off
% save c5 mfcc
i= mfcc;
save i i
load i.mat
X=i;
k=1;
[IDXi,ci] = kmeans(X,k);
save c41i ci
The error is shown below:
>> mfccfinal
Error using bsxfun
Non-singleton dimensions of the two input arrays must match each other.
Error in mfccfinal (line 70)
windowed = bsxfun(@times, frames, hamm);
  1 comment
KALYAN ACHARJYA on 1 Feb 2019
It has multiple errors. Define fs1, and for debugging, comment out the plot statements and see whether any error remains.

Accepted Answer

Walter Roberson on 1 Feb 2019
Edited: Walter Roberson on 2 Feb 2019
You did not set the Hamming window size to match either the frame size or the number of frames.
Also, your frames variable is probably a column vector. You construct frame by indexing a column vector with a row vector. When you index a vector with a vector, the result has the same orientation as the vector being indexed, which is a column vector in this case. Therefore frame is a column vector, and vertically concatenating those frames gives a column vector result.
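The orientation rule described above can be checked on a small example. This is only an illustrative sketch: `z` here is a short stand-in for the filtered signal (assumed to be a column vector, as a mono file read by audioread would give), and `frameRows` is a hypothetical name that does not appear in the original code.

```matlab
% Indexing a column vector with a row-vector index still yields a column.
z = (1:10)';                 % stand-in for the filtered signal (column vector)
frame = z(1:4);              % the index 1:4 is a row vector
disp(size(frame))            % size is [4 1], not [1 4]

% Stacking such frames with [a; b] gives one long column, not one row per frame.
frames = [];
frames = [frames; z(1:4)];
frames = [frames; z(3:6)];
disp(size(frames))           % size is [8 1]

% Transposing each frame before stacking gives one frame per row instead.
frameRows = [];
frameRows = [frameRows; z(1:4).'];
frameRows = [frameRows; z(3:6).'];
disp(size(frameRows))        % size is [2 4]
```

With frames stored one per row, a 1-by-N window can be applied to every row with bsxfun, provided N equals the frame length.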
I recommend that you use buffer() instead of breaking up the array yourself.
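A minimal sketch of one way to apply these points is shown here. It is not the poster's final code: the sample rate and the random stand-in signal are assumptions, the silence check from the question is left out, and the framing is done with an index matrix so that it runs without any toolbox. The buffer() call in the comment is the toolbox route; it returns frames as columns and zero-pads the last partial frame, so its output can differ slightly in the final frame. In the question the window has 1000 points while frame_len is 0.03*fs1, so here the window length is derived from frame_len.

```matlab
fs1 = 8000;                               % assumed sample rate for this sketch
z = randn(fs1, 1);                        % stand-in for the pre-emphasised signal (column)

frame_len = round(0.03 * fs1);            % 30 ms frame -> 240 samples
framestep_len = round(0.01 * fs1);        % 10 ms step  -> 80 samples

% Number of complete frames that fit in the signal.
num_frames = floor((length(z) - frame_len) / framestep_len) + 1;

% Index matrix: row j holds the sample indices of frame j.
starts = (0:num_frames-1).' * framestep_len;        % num_frames x 1
idx = bsxfun(@plus, starts, 1:frame_len);           % num_frames x frame_len
frames = z(idx);                                    % one frame per row

% Toolbox alternative (frames as columns, hence the transpose):
%   frames = buffer(z, frame_len, frame_len - framestep_len, 'nodelay').';

% Hamming window written out from its formula, so its length is always frame_len.
n = 0:frame_len-1;
hamm = 0.54 - 0.46 * cos(2*pi*n / (frame_len - 1)); % 1 x frame_len

windowed = bsxfun(@times, frames, hamm);            % num_frames x frame_len
disp(size(windowed))
```

Because frames is num_frames-by-frame_len and hamm is 1-by-frame_len, the only non-singleton dimension they share has the same length, which is the condition bsxfun requires.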
  4 comments
Romody Momoto Sogavo on 7 Jun 2020
Sir, I tried running this code on several .WAV audio files (each less than 5 seconds long), but frames came up empty. My workspace shows "frame [ ]" while hamm is a 1x1000 double. I think the problem is with frames, not hamm. If so, what should I do to fix this?
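One possible cause, which the thread does not confirm, is that no frame passes the silence check in the question's loop (max_val>0.025), so nothing is ever appended to frames. The sketch below counts how many frames exceed that threshold. The sample rate and the quiet test tone are assumptions, and `peak` and `kept` are hypothetical names introduced only for this check.

```matlab
fs1 = 8000;                                     % assumed sample rate
t = (0:fs1-1).' / fs1;
z = 0.01 * sin(2*pi*200*t);                     % stand-in: a quiet 200 Hz tone (column)

frame_len = round(0.03 * fs1);                  % 30 ms frame
framestep_len = round(0.01 * fs1);              % 10 ms step
num_frames = floor((length(z) - frame_len) / framestep_len) + 1;

threshold = 0.025;                              % silence threshold used in the question
peak = zeros(num_frames, 1);
for j = 1:num_frames
    first = (j-1)*framestep_len + 1;
    peak(j) = max(abs(z(first : first + frame_len - 1)));
end

kept = peak > threshold;                        % frames that would be appended
fprintf('%d of %d frames exceed the threshold %.3f\n', ...
        sum(kept), num_frames, threshold);
```

If the count is zero for a real recording, the threshold is too high for that file's amplitude range. If num_frames itself is zero, the frame count calculation is the place to look instead.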
