Can't use a Validation set when training a sequence-to-sequence BiLSTM Classification model
Mahdi Sharara
on 8 Apr 2021
Answered: Ruth
on 11 Dec 2023
I am trying to train a sequence-to-sequence classification model that uses a BiLSTM layer, with data X and labels Y. I am getting the following error:
Error using trainNetwork (line 184)
Training and validation responses must have the same categories. To view the categories of the responses, use the categories function.
Error in my_DNN_script (line 146)
[net , netInfo] = trainNetwork(X,Y,layers,options);
Caused by:
Error using nnet.internal.cnn.trainNetwork.DLTDataPreprocessor>iAssertClassNamesAreTheSame (line 213)
Training and validation responses must have the same categories. To view the categories of the responses, use the categories function.
--------------------------------------------------------------------------
I set a breakpoint at the corresponding line in nnet.internal.cnn.trainNetwork.DLTDataPreprocessor>iAssertClassNamesAreTheSame (line 213).
function iAssertClassNamesAreTheSame(trainingCategories, validationCategories)
    % iHaveSameClassNames Assert that the class names for the training and
    % validation responses are the same.
    trainingClassNames = categories(trainingCategories);
    validationClassNames = categories(validationCategories);
    if ~isequal(trainingClassNames, validationClassNames)
        error(message('nnet_cnn:trainNetwork:TrainingAndValidationDifferentClasses'));
    end
end
The problem appears to be one of ordering. I printed the variables trainingClassNames and validationClassNames. The number of classes is the same, but the order is different:
>> trainingClassNames =
13×1 cell array
{'2' }
{'3' }
{'4' }
{'5' }
{'6' }
{'7' }
{'8' }
{'9' }
{'10'}
{'11'}
{'12'}
{'1' }
{'0' }
>> validationClassNames
validationClassNames =
13×1 cell array
{'0' }
{'1' }
{'3' }
{'4' }
{'5' }
{'6' }
{'7' }
{'8' }
{'9' }
{'10'}
{'11'}
{'12'}
{'2' }
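The comparison above can be reproduced outside trainNetwork. The following is a minimal sketch, assuming hypothetical label arrays YTrain and YVal built to have the two category orders printed above; it separates "same set of classes" from "same order of classes":

```matlab
% Hypothetical labels reproducing the situation above: the same 13 classes,
% stored with two different category orders.
trainOrder = {'2','3','4','5','6','7','8','9','10','11','12','1','0'};
valOrder   = {'0','1','3','4','5','6','7','8','9','10','11','12','2'};
YTrain = categorical(trainOrder, trainOrder);
YVal   = categorical(valOrder, valOrder);

% isequal on the category lists is order-sensitive, as in the internal check.
sameOrder = isequal(categories(YTrain), categories(YVal));

% setxor returns the names that appear in only one list, ignoring order.
sameSet = isempty(setxor(categories(YTrain), categories(YVal)));

fprintf('Same order: %d, same set: %d\n', sameOrder, sameSet);
```

If sameSet is true while sameOrder is false, the two label sets differ only in category order, which is the case described in this question.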
I modified the function nnet.internal.cnn.trainNetwork.DLTDataPreprocessor>iAssertClassNamesAreTheSame, using reordercats to put both category lists into the same order before they are compared:
function iAssertClassNamesAreTheSame(trainingCategories, validationCategories)
    % iHaveSameClassNames Assert that the class names for the training and
    % validation responses are the same.
    trainingClassNames = categories(reordercats(trainingCategories));
    validationClassNames = categories(reordercats(validationCategories));
    if ~isequal(trainingClassNames, validationClassNames)
        error(message('nnet_cnn:trainNetwork:TrainingAndValidationDifferentClasses'));
    end
end
With this modification, the trainNetwork function ran without an error.
Could you please tell me whether this modification is flawless, or whether it could lead to hidden, incorrect training behaviour?
1 Comment
Steve Philbert
on 12 Oct 2023
I am training a classification network with k-fold cross-validation and ran into this same error. When the training and validation categories are reordered, as shown above, their values are equal.
Accepted Answer
Ruth
on 11 Dec 2023
Hi Mahdi,
As you have noted, the order of the categories in the training and validation data must be the same to avoid an error.
The category order must match between the training and validation data because it is used when calculating the validation loss and accuracy for each category. Reordering the categories immediately before the function checks the order therefore avoids the error but leads to a silently wrong answer.
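To illustrate why the order matters, here is a small sketch with hypothetical arrays a and b. It only shows that the same labels map to different category indices when the category order differs; how trainNetwork uses those indices internally is not shown here.

```matlab
% Two hypothetical arrays holding the same labels, but with different
% category orders.
a = categorical({'0','1','2'}, {'0','1','2'});
b = categorical({'0','1','2'}, {'2','0','1'});

% The label text is identical element by element ...
sameLabels = isequal(cellstr(a), cellstr(b));

% ... but the category index behind each label is not.
codesA = double(a);   % position of each element's category in categories(a)
codesB = double(b);   % position of each element's category in categories(b)

disp(sameLabels)
disp(codesA)
disp(codesB)
```

Any computation that lines up the two arrays by category index rather than by label text would therefore pair the wrong classes.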
Instead, please ensure that the category order is the same before training the model.
For example, if your training labels are called "TTrain" and your validation labels are called "TValidation", you can execute:
TValidation = reordercats(TValidation, categories(TTrain));
This makes sure the order of the categories is identical. You must do this before you call the trainingOptions function.
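For sequence-to-sequence classification, the labels are often stored as a cell array with one categorical sequence per observation. A minimal sketch for that case follows, assuming hypothetical variables YTrain and YVal and assuming every sequence carries the full set of categories (reordercats requires the new order to be a permutation of the existing categories):

```matlab
% Hypothetical sequence labels: one categorical row vector per sequence.
trainCats = {'0','1','2'};
valCats   = {'2','0','1'};   % same classes, different order
YTrain = {categorical({'0','1','1'}, trainCats), ...
          categorical({'2','2','0'}, trainCats)};
YVal   = {categorical({'1','0'}, valCats), ...
          categorical({'2','1','0'}, valCats)};

% Use the training category order as the reference.
trainOrder = categories(YTrain{1});

% Reorder the categories of every validation sequence; the labels
% themselves are unchanged.
YVal = cellfun(@(y) reordercats(y, trainOrder), YVal, 'UniformOutput', false);

% Check that every validation sequence now uses the training order.
allMatch = all(cellfun(@(y) isequal(categories(y), trainOrder), YVal));
```

The reordered YVal would then be the array passed as validation responses when building the options with trainingOptions.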
Alternatively, you can modify the way you partition the data to make sure the category order is preserved. Without example code, it's hard to say exactly how to do this, but generally if you convert all your labels to a single categorical array before partitioning, that should preserve the order of the categories.
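As a rough sketch of that second approach, assuming hypothetical numeric labels rawLabels and a simple index-based hold-out split, the conversion to categorical happens once with an explicit list of class values, and the partitions are taken afterwards:

```matlab
% Hypothetical numeric labels for 8 observations with classes 0..3.
rawLabels   = [2 3 1 0 3 2 1 0];
classValues = 0:3;                         % explicit, fixed category order

% Convert once, before splitting, so every subset shares the same categories.
Y = categorical(rawLabels, classValues);

% Assumed hold-out split by index; any other indexing behaves the same way.
idxTrain = 1:6;
idxVal   = 7:8;
YTrain = Y(idxTrain);
YVal   = Y(idxVal);

% Indexing keeps the full category list, even for classes absent in a subset.
sameOrder = isequal(categories(YTrain), categories(YVal));
```

For cell arrays of label sequences, the same idea would apply per sequence: convert each one with the same classValues so that all sequences share one category order.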
Thanks,
Ruth