Invalid training data. Responses must be nonempty.

Question

Hello,

I am trying to build a simple network that recognizes gender from voice. I have many recordings. I read them into a datastore, but I can't get them into a sequenceInputLayer. I have tried everything. I know my neural network may not work well because of its layers, but I only want to get it started, and then I will make it accurate. Every recording is longer than 6000 samples.

It gives me this error:

Error using trainNetwork (line 183)
Invalid training data. Responses must be nonempty.
Error in Program2 (line 31)
net = trainNetwork(audioTrain,layers, options)

clc;
close all;
clear all;
net = network
audio = audioDatastore(fullfile('E:\Projekt\M or F'), ...
    'IncludeSubfolders',true, ...
    'FileExtension', '.wav', ...
    'LabelSource','foldernames');
labelCount = countEachLabel(audio)
numTrainFiles = 1000;
[audioTrain,audioValidation] = splitEachLabel(audio,numTrainFiles,'randomize');
layers = [ ...
    sequenceInputLayer(6000)
    fullyConnectedLayer(10)
    softmaxLayer
    classificationLayer];
options = trainingOptions("adam", ...
    "MaxEpochs",4, ...
    "MiniBatchSize",256, ...
    "Plots","training-progress", ...
    "Verbose",false, ...
    "Shuffle","every-epoch", ...
    "LearnRateSchedule","piecewise", ...
    "LearnRateDropFactor",0.1, ...
    "LearnRateDropPeriod",1, ...
    'ValidationFrequency',100);
net = trainNetwork(audioTrain,layers, options)

Answer 1

jibrahim on 1 Mar 2021

Hi Martin,

You can't pass an audioDatastore directly to the network: reading from it returns only the audio, so trainNetwork finds no responses. Create a transform datastore that organizes the data into (audio,label) pairs.

The code below is a simple example where we try to recognize a speaker using an idea similar to yours. The accuracy is not good, but hopefully it is a good starting point.

If you have not done so already, I also recommend looking into this gender ID example in Audio Toolbox:

https://www.mathworks.com/help/deeplearning/ug/classify-gender-using-long-short-term-memory-networks.html

You might have better luck extracting features from the audio, rather than passing the raw audio to a network.
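
As a rough sketch of the feature-extraction route (this is not the code from the linked example), something like the following could turn a recording into a feature sequence. The synthetic signal, the choice of MFCCs, and the layer sizes are assumptions made for illustration, and audioFeatureExtractor requires Audio Toolbox.

```matlab
% Synthetic one-second signal standing in for a voice recording (assumption).
fs = 8000;
x = randn(fs,1);

% Extract MFCCs per analysis window using the extractor's default window.
afe = audioFeatureExtractor('SampleRate',fs,'mfcc',true);
features = extract(afe,x);          % numWindows-by-numFeatures

% sequenceInputLayer expects numFeatures-by-numTimeSteps, so transpose.
sequence = features.';
numFeatures = size(sequence,1);

% Illustrative layers sized to the feature dimension (two classes assumed).
layers = [ ...
    sequenceInputLayer(numFeatures)
    bilstmLayer(50,'OutputMode','last')
    fullyConnectedLayer(2)
    softmaxLayer
    classificationLayer];
```

The input layer is then sized by the number of features per window rather than by the number of raw samples, so recordings of different lengths simply become sequences with different numbers of time steps.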

In any case, here is some example code:

% Download the FSDD data set 
url = 'https://ssd.mathworks.com/supportfiles/audio/FSDD.zip';
datasetFolder = tempdir;
unzip(url,datasetFolder)
% Create datastore
% Use speaker name in file name as label
ads = audioDatastore(fullfile(datasetFolder,'FSDD'), ...
    'IncludeSubfolders',true);
[~,filenames] = fileparts(ads.Files);
ads.Labels = categorical(extractBetween(filenames,'_','_'));
[adsTrain,adsValidation] = splitEachLabel(ads,.9);
inputSize = 500;
numHiddenUnits = 100;
numClasses = length(unique(ads.Labels));
layers = [ ...
    sequenceInputLayer(inputSize)
    bilstmLayer(numHiddenUnits,"OutputMode","sequence")
    bilstmLayer(numHiddenUnits,"OutputMode","last")
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer];
% Transformed datastores that can be passed directly to trainNetwork
% (datastore first, then the transform function, as in the documented syntax)
tdsTrain = transform(adsTrain,@(x,info)processData(x,inputSize,info),'IncludeInfo',true);
tdsValidation = transform(adsValidation,@(x,info)processData(x,inputSize,info),'IncludeInfo',true);
options = trainingOptions("adam", ...
    "MaxEpochs",4, ...
    "MiniBatchSize",256, ...
    "Plots","training-progress", ...
    "Verbose",false, ...
    "Shuffle","every-epoch", ...
    "LearnRateSchedule","piecewise", ...
    "LearnRateDropFactor",0.1, ...
    "LearnRateDropPeriod",1, ...
    "ValidationData",tdsValidation,...
    'ValidationFrequency',100);
net = trainNetwork(tdsTrain,layers, options)

Here is the transform function I used:

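function [data,info] = processData(audio,inputSize,info)
    % Break audio into sequences of length inputSize with an overlap of
    % inputSize/2 samples
    audio = buffer(audio,inputSize,floor(inputSize/2));
    % Put each sequence (column) into its own cell: numSequences-by-1
    audio = mat2cell(audio,inputSize,ones(1,size(audio,2))).';
    % Repeat the file's label once per sequence
    label = repmat(info.Label,size(audio,1),1);

    % Return predictors and responses together in one table
    data = table(audio,label);
end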
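
To see the difference between reading the raw audioDatastore and reading a transformed one, a quick check along these lines may help. It is only a sketch: the scratch folder name, the synthetic tones, and the anonymous pairing function are made up for illustration, and it assumes Audio Toolbox is installed.

```matlab
% Build a tiny labeled data set from synthetic tones (illustration only).
fs = 8000;
t = (0:fs-1).'/fs;
root = fullfile(tempdir,'genderDemo');      % hypothetical scratch folder
classes = {'F','M'};
for k = 1:numel(classes)
    folder = fullfile(root,classes{k});
    if ~exist(folder,'dir'), mkdir(folder); end
    audiowrite(fullfile(folder,'tone.wav'),0.5*sin(2*pi*200*k*t),fs);
end

ads = audioDatastore(root, ...
    'IncludeSubfolders',true, ...
    'LabelSource','foldernames');

% A plain read returns only the audio; the label sits in the info struct.
[x,info] = read(ads);
reset(ads)

% Pair each clip with its label. deal hands back the two outputs that
% transform requests when 'IncludeInfo' is true: the data and the info.
pairWithLabel = @(audio,info) deal({audio,info.Label},info);
tds = transform(ads,pairWithLabel,'IncludeInfo',true);

% Each read of the transformed datastore is now a {predictor,response} pair.
pair = read(tds);
```

Here x is just a numeric column of samples, while pair holds both the audio and its categorical label, which is the kind of output the transform in the answer produces (there as a table, one row per sequence).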

Martin Hájek on 3 Mar 2021

thanks
