How to generate audio from colored MFCC in the form of image?

Question

Shilpa Sonawane on 9 May 2023

https://la.mathworks.com/matlabcentral/answers/1960399-how-to-generate-audio-from-colored-mfcc-in-the-form-of-image

Commented: Shilpa Sonawane on 15 May 2023

In the figure above, 41 images are tiled. Each image is a generated MFCC of size 28x28x3, stored as an RGB image. I need to invert the MFCC to generate sound. I used an inverse MFCC function from the MathWorks website, but it only applies to a 2-D matrix of MFCC and cannot be used with an RGB image. I am unable to recover an audio signal from the colored MFCC.

Please provide guidance.

Answer 1

Sarthak on 15 May 2023

https://la.mathworks.com/matlabcentral/answers/1960399-how-to-generate-audio-from-colored-mfcc-in-the-form-of-image#answer_1235564

Hi Shilpa,

MFCC extraction involves several computations, such as the STFT and the mel projection. Assuming you mean the invmfccs function available on File Exchange, it should reverse those computations and introduce the necessary approximations. If it only accepts 2-D matrices of type double, you can convert the image with the im2double function first. However, if the function doesn't behave as expected, you may need to write your own inverse MFCC function or use a third-party library.

Links to the functions mentioned, for reference:

https://www.mathworks.com/help/matlab/ref/im2double.html

https://www.mathworks.com/matlabcentral/fileexchange/53186-invmfccs
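
One caveat to the suggestion above: im2double changes the data type but leaves a 28x28x3 array three-dimensional, so the result still cannot be passed to a function that expects a 2-D matrix. If the RGB tiles were produced by mapping MFCC values through a colormap, the values can be approximated by matching each pixel to its nearest colormap entry. The following is a minimal sketch under stated assumptions, not a confirmed description of how these images were made: the colormap (`parula(256)` here), the value range `lo`/`hi`, and the stand-in matrix `coeffs` are all hypothetical and would need to be replaced with whatever was actually used to create the images.

```matlab
% Hypothetical stand-in for one 28x28 MFCC tile (not the poster's data).
coeffs = peaks(28);

% ASSUMPTION: the image was made by scaling values to [lo, hi] and
% mapping them through a known colormap. Both must match the originals.
cmap = parula(256);
lo = min(coeffs(:));
hi = max(coeffs(:));
n = size(cmap,1);

% Forward mapping, used here only to build a test RGB image.
idx = round((n-1)*(coeffs-lo)/(hi-lo)) + 1;      % 1-based colormap rows
rgb = reshape(cmap(idx(:),:), 28, 28, 3);        % 28x28x3 double in [0,1]

% Inverse mapping: nearest colormap entry for every pixel.
pixels = reshape(im2double(rgb), [], 3);         % one row per pixel
d2 = sum((permute(pixels,[1 3 2]) - permute(cmap,[3 1 2])).^2, 3);
[~, idxRec] = min(d2, [], 2);                    % 1-based colormap rows

% Undo the scaling to get a 2-D matrix back.
coeffsRec = reshape((idxRec-1)/(n-1)*(hi-lo) + lo, size(rgb,1), size(rgb,2));
maxErr = max(abs(coeffsRec(:) - coeffs(:)));     % error of the round trip
```

The recovered matrix is only as precise as the colormap resolution, and its orientation (frames along rows or along columns) has to match what the inverse MFCC function expects. If the colormap or the original value range is unknown, the mapping from color back to coefficient value cannot be determined this way.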

Shilpa Sonawane on 15 May 2023

Thank you so much. I will use the im2double function before invmfccs. Thanks a lot.

Shilpa Sonawane on 15 May 2023

Sir,

In my project, the MFCCs are stored in RGB image format, as a 3-D matrix. I used the im2double function before calling invmfccs, which is available on the MathWorks website, but I am getting many errors.

Answer 2

Brian Hemmat on 15 May 2023

https://la.mathworks.com/matlabcentral/answers/1960399-how-to-generate-audio-from-colored-mfcc-in-the-form-of-image#answer_1235829

Edited: Brian Hemmat on 15 May 2023

Inverting MFCC requires knowledge of the algorithm and parameters used to extract them. Note that perfectly reconstructing the audio is impossible, because the feature extraction discards a lot of information. The inversion is mainly an interesting toy for understanding what information MFCC actually preserve.

Here is an example of extracting MFCC using Audio Toolbox functionality and then attempting to reconstruct.

[audioIn,fs] = audioread("Counting-16-44p1-mono-15secs.wav");
audioIn = resample(audioIn,16e3,fs);
fs = 16e3;
audioIn = audioIn./max(abs(audioIn));
%% Extract MFCC
% This is roughly equivalent to how audioFeatureExtractor works with
% default settings.
win = hann(round(fs*0.025),"periodic");
overlap = round(fs*0.01);
fftlength = 2048;
numBands = 40;
% Design auditory filter bank.
filterBank = designAuditoryFilterBank(fs, ...
    FrequencyScale="mel", ...
    NumBands=numBands, ...
    FFTLength=fftlength, ...
    FrequencyRange=[0,4000], ...
    Normalization="none");
% Compute STFT of speech signal.
X = stft(audioIn, ...
    Window=win, ...
    OverlapLength=overlap, ...
    FFTLength=fftlength, ...
    FrequencyRange="onesided");
% Compute Mel auditory spectrogram
B = filterBank*abs(X);
% Compute cepstral coefficients
coeffs = cepstralCoefficients(B);
%% Visualize MFCC
figure
imh = imagesc(normalize(coeffs'));
ylabel('Coefficient')
xlabel('Frame')
set(imh.Parent,YDir="normal")

%% Inverse MFCC
% Inverse Cepstral Coefficients
B_reconstruct = 10.^(idct(coeffs',size(filterBank,1),Type=2));
% Inverse Mel Spectrum
% Scale the band per time step energy.
B_reconstruct = permute(B_reconstruct,[1,3,2]);
bands = filterBank.*B_reconstruct;
% Sum the bands at each time step so that you have a single spectrum per
% time step.
X = squeeze(sum(bands,1));
% Reconstruct signal from magnitude spectrum
audioOut = stftmag2sig(X,fftlength,fs,FrequencyRange="onesided", ...
    OverlapLength=overlap,Window=win,Method='fgla');
% Clean up the edges
audioOut([1:numel(win),end-numel(win):end]) = 0;
audioOut = audioOut./max(abs(audioOut));
%% Listen to and plot the reconstruction
soundsc(audioIn,fs),pause(size(audioIn,1)/fs+1)
soundsc(audioOut,fs)
figure
tiledlayout(3,1)
nexttile
plot(audioIn,'bo'),hold on
plot(audioOut,'r*'),hold off
legend("Original","Reconstruction")
nexttile
stft(audioIn,fs,Window=win,OverlapLength=overlap,FrequencyRange="onesided")
title("Original audio")
nexttile
stft(audioOut,fs,Window=win,OverlapLength=overlap,FrequencyRange="onesided")
title("Reconstructed audio")
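
To illustrate why the reconstruction can only be approximate: the extraction step above keeps only `abs(X)`, so the phase is discarded and has to be estimated by `stftmag2sig`, and the mel filter bank then reduces each spectrum to 40 bands. The small sketch below uses a random test signal rather than the speech file and shows two different waveforms that share the same magnitude spectrum. It is meant as an illustration of the phase ambiguity only, not as part of the pipeline.

```matlab
rng(0);                          % fixed seed so the run is repeatable
x = randn(64,1);                 % arbitrary real test signal
y = flipud(x);                   % time-reversed copy, a different waveform

magX = abs(fft(x));              % magnitude spectrum of x
magY = abs(fft(y));              % magnitude spectrum of y

magDiff = max(abs(magX - magY)); % near zero: the magnitudes coincide
sigDiff = max(abs(x - y));       % clearly nonzero: the waveforms differ
fprintf("magnitude difference: %.2e, waveform difference: %.2f\n", ...
    magDiff, sigDiff);
```

Because the magnitude alone does not identify the waveform, any phase estimate is one of many candidates consistent with the same features.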

Shilpa Sonawane on 15 May 2023

Thank you, sir. I will definitely go through it.
