How to divide a data set randomly into training and testing data sets?
Hello guys, I have a dataset stored as a 399*6 matrix of type double, and I want to divide it randomly into two subsets, a training set and a testing set, using cross-validation.
I have tried this code but did not get what I want: https://www.mathworks.com/help/stats/cvpartition-class.html
Could anyone help me to do that?
Expected outputs:
training_data: k*6 double
testing_data: l*6 double
1 comment
chocho
on 16 Apr 2018
Accepted answer
More answers (8)
Jeremy Breytenbach
on 24 May 2019
Edited: Jeremy Breytenbach
on 24 May 2019
3 votes
Hi there.
If you have the Deep Learning toolbox, you can use the function dividerand: https://www.mathworks.com/help/deeplearning/ref/dividerand.html
[trainInd,valInd,testInd] = dividerand(Q,trainRatio,valRatio,testRatio) separates targets into three sets: training, validation, and testing.
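For a two-way split like the one asked about, one option is to set the validation ratio to zero. This is a minimal sketch, assuming the Deep Learning Toolbox is installed; `data` is a random stand-in for the 399*6 dataset, and the 70/30 ratio and the seed are arbitrary choices:

```matlab
% Stand-in for the real 399*6 matrix (assumption: replace with your data)
data = rand(399, 6);

% Fix the seed so the split is reproducible (optional)
rng(1);

% dividerand returns index vectors; valRatio = 0 leaves valInd empty
[trainInd, valInd, testInd] = dividerand(size(data, 1), 0.7, 0, 0.3);

training_data = data(trainInd, :);   % k*6 double
testing_data  = data(testInd, :);    % l*6 double
```

The indices are what gets randomized, so the same `trainInd` and `testInd` can also be used to split a matching vector of targets.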
1 comment
Koosha
on 30 Mar 2022
Thank you
ALDO
on 2 Feb 2020
2 votes
You can use the helper function 'helperRandomSplit', which performs the random split. helperRandomSplit accepts the desired split percentage for the training data and the data itself. It outputs two data sets along with a set of labels for each. Each row of trainData and testData is a signal. Each element of trainLabels and testLabels contains the class label for the corresponding row of the data matrices.
percent_train = 70;
[trainData,testData,trainLabels,testLabels] = ...
helperRandomSplit(percent_train,Data);
Make sure you have the required toolbox to use it.
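If helperRandomSplit is not available on your path, a comparable split can be sketched in base MATLAB. This is not the helper's actual implementation, only an illustration that produces the same four outputs; `Data` and `Labels` below are made-up stand-ins, and the variable names other than those in the call above are hypothetical:

```matlab
% Stand-in data: 100 signals of length 50 with one class label per row
% (assumption: replace with your own data matrix and label vector)
Data   = rand(100, 50);
Labels = randi([0 1], 100, 1);

percent_train = 70;                      % percentage of rows used for training
n      = size(Data, 1);
nTrain = round(percent_train / 100 * n);

idx      = randperm(n);                  % random ordering of the row indices
trainIdx = idx(1:nTrain);
testIdx  = idx(nTrain+1:end);

% Rows of data and their labels are selected with the same indices
trainData   = Data(trainIdx, :);
testData    = Data(testIdx, :);
trainLabels = Labels(trainIdx);
testLabels  = Labels(testIdx);
```

Note that this plain shuffle does not balance the classes between the two sets; whether the original helper does so is not stated here.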
1 comment
Lucrezia Cester
on 7 Feb 2021
Could you please send a link to this function?
sidra ashiq
on 23 Nov 2018
1 vote
Training = A(idx(1:round(P*m)),:) ;
What is the A function?
2 comments
Mohamed Marei
on 17 Dec 2018
A is the vector or array indexed by the elements inside the bracket. It is not a function.
madhan ravi
on 17 Dec 2018
A is a matrix.
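To make the indexing in that line concrete, here is a small sketch. The definitions of `A`, `P`, `m` and `idx` are assumptions about what the surrounding code presumably looked like, and the `Testing` line is added for illustration:

```matlab
A = rand(399, 6);          % stand-in for the 399*6 data matrix (assumption)
P = 0.70;                  % fraction of rows used for training (assumption)

m   = size(A, 1);          % number of rows (observations)
idx = randperm(m);         % random permutation of 1..m

% idx(1:round(P*m)) picks the first 70% of the shuffled row numbers;
% the colon keeps every column of those rows
Training = A(idx(1:round(P*m)), :);
Testing  = A(idx(round(P*m)+1:end), :);
```

With these values, round(P*m) is 279, so Training is 279*6 and Testing is 120*6.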
Mehernaz Savai
on 26 May 2022
Edited: Mehernaz Savai
on 26 May 2022
You can partition data in a number of ways.
Let X be your input matrix. You can also use a similar workflow for tables.
% Option 1: cvpartition (Statistics and Machine Learning Toolbox)
% Partition with 40% of the data held out for testing
hpartition = cvpartition(size(X,1),'Holdout',0.4);
% Extract logical indices for training and test
trainId = training(hpartition);
testId = test(hpartition);
% Use the indices to partition the matrix
trainData = X(trainId,:);
testData = X(testId,:);
% Option 2: dividerand (Deep Learning Toolbox)
% Partition with a 60:20:20 ratio for training, validation and testing
% respectively
[trainId,valId,testId] = dividerand(size(X,1),0.6,0.2,0.2);
% Use the indices to partition the matrix
trainData = X(trainId,:);
valData = X(valId,:);
testData = X(testId,:);
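Since the question mentions cross-validation, the same cvpartition object can also describe a k-fold split instead of a single holdout. A minimal sketch, assuming the Statistics and Machine Learning Toolbox; `X` is random stand-in data and k = 5 is an arbitrary choice:

```matlab
X = rand(399, 6);                       % stand-in data (assumption)
k = 5;                                  % number of folds (arbitrary choice)

c = cvpartition(size(X, 1), 'KFold', k);

for i = 1:c.NumTestSets
    trainData = X(training(c, i), :);   % rows used for training in fold i
    testData  = X(test(c, i), :);       % held-out rows for fold i
    fprintf('Fold %d: %d training rows, %d test rows\n', ...
        i, size(trainData, 1), size(testData, 1));
end
```

Each row lands in the test set of exactly one fold, so a model can be trained and evaluated k times on different splits.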
Pramod Hullole
on 5 Mar 2019
0 votes
Hello sir,
I'm new to neural networks. I am working on a project on leaf disease detection using image processing. I have finished feature extraction and am not sure what the next step is. I know that I should apply a neural network and divide the data into training and testing sets, but I don't understand how to proceed in practice. Please help me with this, and please describe each step in detail.
1 comment
Savas Yaguzluk
on 8 Mar 2019
Dear Pramod,
Open a new topic and ask your question there, so people can see your topic title and help you.
Hossein Amini
on 15 Jul 2019
0 votes
Hi there, it worked for me, but I have a problem with the rest of the code. The newrb documentation describes how to write the code, but however many times I tried, I got an error like the one below.

Hossein Amini
on 15 Jul 2019
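% X holds the inputs and Y the targets, one sample per row
% Ptrain is the training fraction and is defined elsewhere
[z,r] = size(X);
idx = randperm(z);   % random ordering of the row indices
% Select the rows, then transpose so that each column is one sample
TrainX = (X(idx(1:round(Ptrain.*z)),:))';
TrainY = (Y(idx(1:round(Ptrain.*z)),:))';
TestX = (X(idx(round(Ptrain.*z)+1:end),:))';
TestY = (Y(idx(round(Ptrain.*z)+1:end),:))';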
If I'm not mistaken, according to the newrb documentation, the input and output data should have the same number of columns (for example 4x266 and 1x266), which is why I transposed the matrices. But the error I get refers to specifying a zeros matrix. I don't know how to fix that.
ranjana roy chowdhury
on 15 Jul 2019
0 votes
The dataset is the WS Dream dataset, a 339*5825 matrix. The entries have values between 0 and 0.1, and a few entries are -1. I want to set 96% of this dataset to 0, excluding the entries that are -1.