Sir, I tried to extract features from speech using mel-frequency cepstral coefficients (MFCC), but the code now shows an error. I don't know how to rectify it. Could you please help me fix this error?
Suchithra K S on 1 Feb 2019
Edited: Walter Roberson on 7 Jun 2020
The code is given below:
[audio, fs1] = audioread('cryrumble.wav');
%sound(x,fs1);
   ts1=1/fs1;
    N1=length(audio);
    Tmax1=(N1-1)*ts1;
    t1=(0:ts1:Tmax1);
    figure;
  plot(t1,audio),xlabel('Time'),title('Original audio');
%   fs2 = (20/441)*fs1;
%   y=resample(audio,2000,44100);
% %sound(y,fs2);
%    ts2=1/fs2;
%    N2=length(y);
%    Tmax2=(N2-1)*ts2;
%    t2=(0:ts2:Tmax2);
%    figure;
%    plot(t2,y),xlabel('Time'),title('resampled audio');
 %Step 1: Pre-Emphasis
 a=[1];
     b=[1 -0.95];
    z=filter(b,a,audio);
    subplot(413),plot(t1,z),xlabel('Time'),title('Signal After High Pass Filter - Time Domain');
     subplot(414),plot(fs1,fftshift(abs(fft(z)))),xlabel('Freq (Hz)'),title('Signal After High Pass Filter - Frequency Spectrum');
       nchan = size(audio,2);
for chan = 1 : nchan
  %subplot(1, nchan, chan)
  spectrogram(y(:,chan), 256, [], 25, 2000, 'yaxis');
  title( sprintf('spectrogram of resampled audio ' ) );
end
% Step 2: Frame Blocking
     frameSize=1000;
%     frameOverlap=128;
%     frames=enframe(y,frameSize,frameOverlap);
%     NumFrames=size(frames,1);
frame_duration=0.03;
frame_len = frame_duration*fs1;
framestep=0.01;
framestep_len=framestep*fs1;
% N = length (x);
num_frames =floor(N2/frame_len);
% new_sig =zeros(N,1);
% count=0;
% frame1 =x(1:frame_len);
% frame2 =x(frame_len+1:frame_len*2);
% frame3 =x(frame_len*2+1:frame_len*3);
frames=[];
for j=1:num_frames
     frame=z((j-1)*framestep_len + 1: ((j-1)*framestep_len)+frame_len);
%     frame=x((j-1)*frame_len +1 :frame_len*j);
%     identify the silence by finding frames with max amplitude less than
%     0.025
max_val=max(frame);
   if (max_val>0.025)
%     count = count+1;
%     new_sig((count-1)*frame_len+1:frame_len*count)=frames;
    frames=[frames;frame];
   end
   end
   % Step 3: Hamming Windowing
NumFrames=size(frames,1);
hamm=hamming(1000)';
windowed = bsxfun(@times, frames, hamm);
     % Step 4: FFT 
% Taking only the positive values in the FFT that is the first half of the frame after being computed. 
       ft = abs( fft(windowed,500, 2) );
       plot(ft);
% Step 5: Mel Filterbanks
Lower_Frequency = 100;
Upper_Frequency = fs1/2;
% With a total of 22 points we can create 20 filters.
    Nofilters=20;
    lowhigh=[300 fs/2];
    %Here logarithm is of base 'e'
    lh_mel=1125*(log(1+lowhigh/700));
    mel=linspace(lh_mel(1),lh_mel(2),Nofilters+2);
    figure;
    plot(mel);
    xlabel('frequency in Hertz');ylabel('mels');
    title('melscale');
    melinhz=700*(exp(mel/1125)-1);
    %Converting to frequency resolution
    fres=floor(((frameSize)+1)*melinhz/fs2); 
    %Creating the filters
    for m =2:length(mel)-1
        for k=1:frameSize/2
     if k<fres(m-1)
        H(m-1,k) = 0;
    elseif (k>=fres(m-1)&&k<=fres(m))
        H(m-1,k)= (k-fres(m-1))/(fres(m)-fres(m-1));
    elseif (k>=fres(m)&&k<=fres(m+1))
       H(m-1,k)= (fres(m+1)-k)/(fres(m+1)-fres(m));
    elseif k>fres(m+1)
        H(m-1,k) = 0;    
     end 
        end
    end
        %H contains the 20 filterbanks, we now apply it to the processed signal.
    for i=1:NumFrames
    for j=1:Nofilters
        bankans(i,j)=sum((ft(i,:).*H(j,:)).^2);
    end
    end
    figure;
    plot(bankans(i,j));
    figure;
    plot(H);
    xlabel('Frequency');ylabel('Magnitude');
    title('Mel-Frequency Filter bank');
    % Step 6: Natural Log and DCT
%     pkg load signal
    %Here logarithm is of base '10'
    logged=log10(bankans);
    for i=1:NumFrames
        mfcc(i,:)=dct2(logged(i,:));
    end
    %plotting the MFCC
    figure 
    hold on
    for i=1:NumFrames
        plot(mfcc(i,1:13));
        title('mfcc');
    end
    hold off
% save c5 mfcc
i= mfcc;
save i i
load i.mat
X=i;
k=1;
[IDXi,ci] = kmeans(X,k);
save c41i ci
The error shown is:
>> mfccfinal
Error using bsxfun
Non-singleton dimensions of the two input arrays must match each other.
Error in mfccfinal (line 70)
windowed = bsxfun(@times, frames, hamm);
1 comment
KALYAN ACHARJYA on 1 Feb 2019
It has multiple errors. Define fs1, and for debugging comment out the plot statements, then check whether any error remains.
Accepted Answer
Walter Roberson on 1 Feb 2019
Edited: Walter Roberson on 2 Feb 2019
You did not set the Hamming window size to match either the frame size or the number of frames.
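To illustrate the size rule, here is a minimal sketch with made-up data rather than the asker's recording. `frames` is a hypothetical 5-by-1323 matrix, which is what 30 ms frames would look like at an assumed sample rate of 44100 Hz if stored one per row. `bsxfun` needs every dimension to either match or be 1, so the hard-coded 1-by-1000 window fails while a window built from the actual frame length works. `hamming` requires the Signal Processing Toolbox.

```matlab
% Hypothetical data: 5 frames stored as rows, 1323 samples each
% (0.03 s at an assumed fs1 of 44100 Hz).
frames = rand(5, 1323);
hamm   = hamming(1000)';                       % 1-by-1000, as in the question

failed = false;
try
    windowed = bsxfun(@times, frames, hamm);   % 1323 vs 1000: sizes clash
catch err
    failed = true;
    disp(err.message);                         % reports the dimension mismatch
end

% Build the window from the frame length instead of a hard-coded 1000.
hamm     = hamming(size(frames, 2))';          % 1-by-1323
windowed = bsxfun(@times, frames, hamm);       % 5-by-1323
```

Comparing `size(frames)` with `size(hamm)` just before the `bsxfun` call is usually the quickest way to see which dimension disagrees.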
Also, your frames variable is probably a column vector. You construct frame by indexing a column vector with a row vector. When you index a vector with a vector, the result has the same orientation as the vector being indexed, which is a column vector in this case. Therefore frame is a column vector, and you vertcat those together, which gives you a column vector result.
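The orientation rule can be checked on a toy vector. The sketch below uses a hypothetical 10-element column `z` in place of the filtered audio, and the variable `rows` is introduced only for illustration.

```matlab
z   = (1:10)';                  % column vector standing in for a filtered mono signal
idx = 1:4;                      % a:b always produces a row vector

frame  = z(idx);                % orientation follows z: 4-by-1
frames = [];
frames = [frames; frame];       % 4-by-1
frames = [frames; z(idx + 4)];  % 8-by-1: one long column, not a 2-by-4 matrix

% Forcing each frame to a row before stacking gives one frame per row.
rows = [];
rows = [rows; z(idx).'];        % 1-by-4
rows = [rows; z(idx + 4).'];    % 2-by-4
```

This rule holds when `z` is a vector. If the file is stereo, `z` is a two-column matrix, and linear indexing a matrix with a row vector returns a row vector instead, so it is worth checking `size(z)` before assuming either case.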
I recommend that you use buffer() instead of breaking up the array yourself.
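A rough sketch of what framing with `buffer()` could look like is given here. It is one possible arrangement, not the code from the question: the sample rate and the synthetic tone are assumptions, and the silence threshold is applied to the absolute amplitude rather than the raw maximum. `buffer` and `hamming` both come from the Signal Processing Toolbox.

```matlab
% Sketch only: a synthetic tone stands in for the recording.
fs = 16000;                            % assumed sample rate (Hz)
x  = sin(2*pi*440*(0:fs-1)'/fs);       % 1 s mono test signal, column vector

frame_len = round(0.03*fs);            % 30 ms frame -> 480 samples
step_len  = round(0.01*fs);            % 10 ms hop   -> 160 samples
overlap   = frame_len - step_len;      % samples shared by consecutive frames

% buffer() returns one frame per COLUMN; 'nodelay' skips the leading zero padding.
frames = buffer(x, frame_len, overlap, 'nodelay');   % frame_len-by-numFrames

% The window length must equal the frame length; hamming() returns a column.
win      = hamming(frame_len);
windowed = bsxfun(@times, frames, win);              % sizes agree along dimension 1

% Drop near-silent frames (0.025 threshold taken from the question).
keep     = max(abs(frames), [], 1) > 0.025;
windowed = windowed(:, keep);
```

Because the frames are stored as columns here, a later FFT would run along dimension 1, for example `fft(windowed, [], 1)`, rather than along dimension 2 as in the question.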
4 comments
Romody Momoto Sogavo on 7 Jun 2020
Sir, regarding this code: I tried running it on different .wav audio files (each less than 5 seconds long), but frames came up empty. My workspace shows "frame [ ]" but hamm = 1x1000 double. I think the problem is with frames, not hamm. If so, what should I do to rectify this?
Walter Roberson on 7 Jun 2020
Edited: Walter Roberson on 7 Jun 2020
Which of the 8 choices I listed did you choose?