How do you turn an object into a matrix for the posterior function?
I'm trying to write code that recognizes the digits spoken in a selected sound file and converts them to text (a word recognition program). I'm having trouble with the posterior function, which I'm trying to pass a GMM distribution model into. Every time I do, though, it gives me an error, so I'm not sure how to get it to work. The code right now is:
% Define system parameters
speech = data; % data is a sound file of some spoken digits
seglength = 160; % Length of frames
overlap = seglength/2; % # of samples to overlap
stepsize = seglength - overlap; % Frame step size
nframes = length(speech)/stepsize-1;
% Visualizes the power spectral density estimates of each spoken digit in 'speech'
order = 12;
nfft = 512;
Fs = 22050;
pyulear(speech,order,nfft,Fs);
% Number of Gaussian component densities
M = 8;
model_1 = gmdistribution.fit(matrix_ones,M);
model_2 = gmdistribution.fit(matrix_twos,M);
model_3 = gmdistribution.fit(matrix_threes,M);
model_4 = gmdistribution.fit(matrix_fours,M);
model_5 = gmdistribution.fit(matrix_fives,M);
model_6 = gmdistribution.fit(matrix_sixes,M);
model_7 = gmdistribution.fit(matrix_sevens,M);
model_8 = gmdistribution.fit(matrix_eights,M);
model_9 = gmdistribution.fit(matrix_nines,M);
model_10 = gmdistribution.fit(matrix_tens,M,'regularize',1e-5);
% Find the digit model with the maximum a posteriori probability for the
% set of test feature vectors, which reduces to maximizing a log-likelihood
% value
[P, log_like] = posterior(model_1, speech);
and every time I run it, I get
Warning: Failed to converge in 100 iterations for gmdistribution with 8 components
> In gmcluster (line 202)
In gmdistribution.fit (line 98)
In word_classifier (line 58)
Error using checkdata (line 18)
X must be a matrix with 10 columns.
Error in gmdistribution/posterior (line 24)
checkdata(X,obj);
Error in word_classifier (line 74) %the name of my file
[P, log_like] = posterior(model_1, speech);
I have no idea what to do. I've tried replacing model_1 with a cell array of all the models, and I've tried using the matrices. I need help :(
Answers (1)
Chris Perkins
on 8 Aug 2017
Hi Jonathan,
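The error given in your sample output means that the data passed to the "posterior" function does not match the dimensions of the model. As the message says, X must be a matrix with 10 columns, because model_1 was fitted to 10-column data, whereas "speech" is the raw audio signal. Check the size of each input. You will likely need to pass a matrix of feature vectors, with one row per frame and the same 10 columns as the training matrices, instead of the raw signal.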
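As a rough illustration of that check, here is a minimal sketch. It uses synthetic random data in place of your real feature matrices, and the names train, test_feats, models, and best are hypothetical. It also assumes fitgmdist (the newer replacement for gmdistribution.fit) is available in your release. Note that the second output of "posterior" is the negative log-likelihood, so the sign is flipped before comparing models.

```matlab
% Sketch only: synthetic data stands in for the real feature matrices.
rng(0);                                   % reproducible random data
nDims = 10;                               % the models in the question expect 10 columns

% Hypothetical stand-ins for matrix_ones and matrix_twos (n-by-10 each).
train = {randn(300, nDims), randn(300, nDims) + 3};

% Fit one GMM per digit class (2 components keeps the sketch fast).
models = cell(size(train));
for k = 1:numel(train)
    models{k} = fitgmdist(train{k}, 2, 'RegularizationValue', 1e-5);
end

% Hypothetical test features: one 10-D feature vector per frame (row),
% NOT the raw audio samples. Here they are drawn like the second class.
test_feats = randn(40, nDims) + 3;

% The test data must have as many columns as the model has dimensions.
expectedCols = size(models{1}.mu, 2);
assert(size(test_feats, 2) == expectedCols, ...
    'X must be a matrix with %d columns.', expectedCols);

% Score the test features against each model and keep the best one.
log_like = zeros(1, numel(models));
for k = 1:numel(models)
    [~, nlogl] = posterior(models{k}, test_feats);  % nlogl is the NEGATIVE log-likelihood
    log_like(k) = -nlogl;
end
[~, best] = max(log_like);                % index of the most likely digit model
```

In your program, test_feats would be whatever per-frame features you used to build matrix_ones and the other training matrices, computed for the frames of "speech", and the loop would run over all ten models.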