
Preparing input data for classification using LSTM

I am interested in classifying graph (sequence) data into category labels. I saw that I could use an LSTM; however, I would like to know how the primary sequence data is stored for input into the LSTM. I also want to know how to attach known labels to each graph for the purpose of training.
In this example there is a variable/structure called waveform. How was it constructed?
Please assist.
2 Comments
Cris LaPierre on 31 May 2024
The format is described at the top of the linked example.
You can also find it described at the top of this example: Sequence-to-One Regression Using Deep Learning
Ernest Modise - Kgamane on 1 Jun 2024
Edited: Ernest Modise - Kgamane on 1 Jun 2024
Hi Cris,
I am looking at your response and trying to understand it. Please see my code and input file and explain where I went wrong.
label = strings(997,1);
label(1:200) = 'graphtype1';
label(201:399) = 'graphtype2';
label(400:598) = 'graphtype3';
label(599:798) = 'graphtype4';
label(799:997) = 'graphtype5';
className = categorical(label);
className2 = categories(className);
Datain = xlsread('C:\Users\ernes\OneDrive\Documents\MATLAB\LSTMdataIn.xlsx');
% Datain above has 997 graphs, each with 100 samples
% E.g. graphs Datain(1:200,:) - graphtype 1
% graphs Datain(201:399,:) - graphtype 2
% So my objective is to train my LSTM to map the graphs to labels
numObservations = 997;
[idxTrain,idxTest] = trainingPartitions(numObservations,[0.9 0.1]);
XTrain = Datain(idxTrain,:); % XTrain holds 897 graphs, each with 100 values,
% so XTrain is 897 x 100
TTrain = className(idxTrain,:);
numHiddenUnits = 120;
numClasses = 5;
layers = [
sequenceInputLayer(100) % I am not sure about this input, because my data comes as 1-by-100 arrays, one per sequence,
% with timestamps 1 - 100 ms
bilstmLayer(numHiddenUnits,OutputMode="last")
fullyConnectedLayer(numClasses)
softmaxLayer]
options = trainingOptions("adam", ...
MaxEpochs=200, ...
InitialLearnRate=0.002,...
GradientThreshold=1, ...
Shuffle="never", ...
Plots="training-progress", ...
Metrics="accuracy", ...
Verbose=false);
net = trainnet(XTrain,TTrain,layers,"crossentropy",options);

Accepted Answer

Cris LaPierre on 31 May 2024
It is a MAT file, which is a way of saving MATLAB variables to a file (see save). Loading it adds 3 variables to the Workspace:
  • data - a 1000x1 cell array. Each cell contains an nx3 array of signal data
  • freq - a 1000x1 array. This is the frequency of the corresponding observation
  • labels - a 1000x1 categorical array containing the waveform label for the corresponding observation
You don't need to create a MAT file. You just need to organize your data into a numObservations-by-1 cell array of sequences as the input data.
Each sequence (cell of data) is a numTimeSteps-by-numChannels numeric array, where numTimeSteps is the number of time steps of the sequence and numChannels is the number of channels of the sequence.
The label data is a numObservations-by-1 categorical vector.
You do not need to use freq for the example you are using.
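As a rough illustration of the layout described above, here is a minimal sketch that turns a matrix with one signal per row into a cell array of sequences plus a categorical label vector. The random matrix X, the sizes, and the class names typeA and typeB are placeholders made up for this sketch, and it treats each signal as its own single-channel observation, which is only one possible way to arrange the data.

```matlab
% Hypothetical stand-in for real measurements: 6 signals with 100 time
% steps each, stored one signal per row (the layout used in the question).
numSignals = 6;
numTimeSteps = 100;
X = rand(numSignals,numTimeSteps);

% One cell per observation; each cell is numTimeSteps-by-numChannels.
% Every observation here has a single channel, so each cell is 100-by-1.
data = cell(numSignals,1);
for k = 1:numSignals
    data{k} = X(k,:)';
end

% Labels: a numObservations-by-1 categorical vector (made-up class names).
labels = categorical(["typeA";"typeA";"typeA";"typeB";"typeB";"typeB"]);

% Inspect the resulting shapes.
disp(size(data))
disp(size(data{1}))
disp(categories(labels))
```

With this arrangement the number of channels is 1, so the network input size would be 1 rather than the number of time steps.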
3 Comments
Ernest Modise - Kgamane on 7 Jun 2024
The above answer did not help me. Regards.
Cris LaPierre on 12 Jun 2024
Edited: Cris LaPierre on 14 Jun 2024
Thank you for adding your data. That makes it easier to help.
If you separate your data into a cell array where each cell contains 3 signals of the same type (a 100x3 matrix), you can then use the example code. The challenging part is that the number of signals of each type is not an exact multiple of 3, which makes the actual code a little more complicated. Still, something like this should work.
Datain = readmatrix('LSTMdataIn.xlsx');
% orient the data to be time x sample
Datain = Datain';
% split the data into a numObservations x 1 cell array
% Each cell contains a 100x3 matrix. All 3 signals are of the same type
% Also create the matching numObservations x 1 cell array of labels
data = {};
labels = {};
% idx(s+1) is the last column of signal type s; idx(1) = 0 is a sentinel
idx = [0 200 399 598 798 997];
L = {'graphtype1' 'graphtype2' 'graphtype3' 'graphtype4' 'graphtype5'};
for s = 1:numel(L)
    % step through the columns of this signal type in groups of 3
    for first = idx(s)+1:3:idx(s+1)-2
        data(end+1,:) = {Datain(:,first:first+2)};
        labels(end+1,:) = L(s);
    end
    % leftover columns that cannot fill a group of 3 are skipped, so
    % no cell ever contains a mix of signal types
end
You can then pick up using the code from the example
numChannels = size(data{1},2);
idx = [3 4 5 12];
figure
tiledlayout(2,2)
for i = 1:4
    nexttile
    stackedplot(data{idx(i)},DisplayLabels="Channel "+string(1:numChannels))
    xlabel("Time Step")
    title("Class: " + string(labels(idx(i))))
end
labels = categorical(labels);
classNames = categories(labels)
classNames = 5x1 cell array
    {'graphtype1'}
    {'graphtype2'}
    {'graphtype3'}
    {'graphtype4'}
    {'graphtype5'}
numObservations = numel(data);
[idxTrain,idxTest] = trainingPartitions(numObservations,[0.9 0.1]);
XTrain = data(idxTrain);
TTrain = labels(idxTrain);
XTest = data(idxTest);
TTest = labels(idxTest);
The only change you must make is numClasses = 5;
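For orientation, here is a hedged sketch of where numClasses enters the network definition. The layer stack and numHiddenUnits = 120 are copied from the code posted in the question, not from the example itself, and numChannels = 3 assumes each observation is a 100x3 matrix as built above. Note that sequenceInputLayer takes the number of channels, not the number of time steps. Deep Learning Toolbox is required.

```matlab
% Assumed sizes: 3 channels per observation and 5 classes.
numChannels = 3;
numHiddenUnits = 120;   % value taken from the code in the question
numClasses = 5;

layers = [
    sequenceInputLayer(numChannels)                 % input size = channels
    bilstmLayer(numHiddenUnits,OutputMode="last")   % one output per sequence
    fullyConnectedLayer(numClasses)                 % one score per class
    softmaxLayer];
```

The sequence length (100 time steps) does not appear anywhere in the layer definition; it is carried by the first dimension of each cell in the data.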
To get the helper function trainingPartitions, you must open the LSTM example locally and set its folder as your current folder.
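If opening the example is not convenient, a random 90/10 split can be sketched with randperm from base MATLAB. This is not the trainingPartitions helper and its behavior may differ in detail; numObservations = 100 is a placeholder that you would replace with numel(data).

```matlab
rng(0)                      % fixed seed so the split is repeatable
numObservations = 100;      % placeholder; use numel(data) with real data
numTrain = floor(0.9*numObservations);

% Shuffle the observation indices, then cut them into two disjoint sets.
shuffled = randperm(numObservations);
idxTrain = shuffled(1:numTrain);
idxTest = shuffled(numTrain+1:end);
```

The resulting idxTrain and idxTest can be used to index data and labels in the same way as the outputs of the helper.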

Release

R2024a