Image Regression using .mat Files and a datastore

I would like to train a CNN for image regression using a datastore. My images are stored in .mat files (not png or jpeg). This is not image-to-image regression, but rather an image-to-single-label regression problem. Is it possible to do this using a datastore, or at least some other out-of-memory approach?

Accepted Answer

luisa di monaco on 7 Dec 2019
Edited: luisa di monaco on 2 Jan 2020
I have solved something similar.
I'm trying to train a CNN for regression. My inputs are numeric matrices of size 32x32x2 (each input includes 2 grayscale images as two channels). My outputs are numeric vectors of length 6.
The dataset contains 500 000 samples in total.
I created 500 000 .mat files for the inputs in the folder 'inputData' and 500 000 .mat files for the targets in the folder 'targetData'. Each .mat file contains a single variable of type double named 'C'.
The size of C is 32x32x2 (for an input) or 1x6 (for a target).
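% Each datastore reads one .mat file per call; load returns a struct with field C
inputData=fileDatastore(fullfile('inputData'),'ReadFcn',@load,'FileExtensions','.mat');
targetData=fileDatastore(fullfile('targetData'),'ReadFcn',@load,'FileExtensions','.mat');
% Unwrap C from the loaded struct and place it in a cell
inputDatat = transform(inputData,@(data) rearrange_datastore(data));
targetDatat = transform(targetData,@(data) rearrange_datastore(data));
% Pair each input with its target; this relies on both folders listing
% their files in the same order
trainData=combine(inputDatat,targetDatat);
% here I defined my network architecture
% here I defined my training options
net=trainNetwork(trainData, Layers, options);
% Local function (must sit at the end of the script file)
function image = rearrange_datastore(data)
image=data.C;
image= {image};
end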
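Before starting a long training run, it can help to check what the combined datastore actually returns. The following is a minimal sketch, not part of the original answer: it writes a few small placeholder files to a temporary folder (the folder names, file names and the sample count of 4 are assumptions made for illustration), rebuilds the pipeline with an anonymous function in place of rearrange_datastore, and previews one sample. It assumes a release that provides transform and combine for datastores.

```matlab
% Sketch: build a tiny stand-in dataset and inspect one combined sample.
% Folder names, file names and the sample count are illustrative assumptions.
rootDir   = tempname;                       % temporary working folder
inputDir  = fullfile(rootDir,'inputData');
targetDir = fullfile(rootDir,'targetData');
mkdir(inputDir);
mkdir(targetDir);

for k = 1:4
    C = rand(32,32,2);                      % stand-in for one two-channel input
    save(fullfile(inputDir, sprintf('sample_%04d.mat',k)),'C');
    C = rand(1,6);                          % stand-in for one 1x6 target
    save(fullfile(targetDir,sprintf('sample_%04d.mat',k)),'C');
end

inputData  = fileDatastore(inputDir, 'ReadFcn',@load,'FileExtensions','.mat');
targetData = fileDatastore(targetDir,'ReadFcn',@load,'FileExtensions','.mat');

% Same job as rearrange_datastore: unwrap C and put it in a cell
unwrapC   = @(s) {s.C};
trainData = combine(transform(inputData,unwrapC),transform(targetData,unwrapC));

sample = preview(trainData);                % 1x2 cell: {input, target}
disp(size(sample{1}))                       % size of the input array
disp(size(sample{2}))                       % size of the target vector
```

The first cell holds the predictor and the second the response, so the two printed sizes should agree with the network's input layer and the length of its regression output.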

18 Comments

Matthew Fall on 10 Dec 2019
Seems like it should work! I've already moved away from this project, but I will give this a try if it comes up again. What version of MATLAB did you use?
R2019b, because I needed to define a custom regression layer. I don't know whether R2019b is required for fileDatastore, combine and transform.
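According to the documentation, fileDatastore has been available since R2016a, while transform and combine for datastores were introduced in R2019a (MATLAB version 9.6), so R2019b should not be strictly required for the datastore part. A minimal guard, assuming that version mapping, could look like this:

```matlab
% Sketch: stop early if the release predates datastore transform/combine.
% Assumes these functions arrived in R2019a, i.e. MATLAB version 9.6.
hasTransformCombine = ~verLessThan('matlab','9.6');
if ~hasTransformCombine
    error('This workflow needs R2019a or later (transform and combine on datastores).');
end
```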
shi long liu on 14 Jul 2020
Edited: shi long liu on 14 Jul 2020
Thank you.
This is a wonderful answer that I came across while handling this problem.
supriya Naik on 7 Oct 2020
Does "500 000 data" mean 500 000 images?
Yes!
Sofia Esteves on 11 Mar 2021
Hello Luisa,
Precious info!
How did you define the validation set using this scheme?
Hi Sofia,
this is my main file (I used it to run both generation of inputs and training).
I hope this code can answer your question =)
clear
close all
clc
%% IMAGE GENERATION
tot_imagepairs=500000; % total number of image pairs (validation + training)
val_fraction=1/10; % fraction of tot_imagepairs for validation set
val_imagepairs=tot_imagepairs*val_fraction; % validation set
train_imagepairs=tot_imagepairs-val_imagepairs; % training set
mkdir inputData train
mkdir inputData val
mkdir targetData train
mkdir targetData val
% image_generator4 is a function that generates synthetic images (size 32x32x2).
% It uses a for loop and saves the images as .mat files in the folder passed
% as input ('val' or 'train').
tic
image_generator4('val', val_imagepairs,1)
toc
tic
image_generator4('train', tot_imagepairs,val_imagepairs+1)
toc
%% TRAINING
train_cnn_piv4_x2019b % script for datastore set up and training
Sofia Esteves on 13 Mar 2021
Yes, thank you for your reply! I assume you also use the fileDatastore, transform and combine functions for the validation set to later insert in the options field?
Yes!
Sofia Esteves on 14 Apr 2021
Edited: Sofia Esteves on 14 Apr 2021
Hello again, Luisa!
I have another question: did you use the 'Shuffle','every-epoch' training option and parallel or multi-GPU training?
When I do, I get this warning: Input datastore is not shuffleable but trainingOptions specified shuffling. Training will proceed without shuffling.
The following isPartitionable and isShuffleable functions return 1 if the datastore is partitionable/shuffleable and 0 if it is not.
tf = isPartitionable(inputDatat)
tf = 1
tf = isShuffleable(inputDatat)
tf = 0
tf = isPartitionable(trainData)
tf = 0
tf = isShuffleable(trainData)
tf = 0
Were you able to solve this problem? Thank you
Hi, Sofia! No, I faced this problem and found no solution. Fortunately, it was not a critical issue in my case, because I used randomly generated synthetic data and I managed to train my net even without shuffling. If your data absolutely need to be shuffled, I think you can try to shuffle them somehow before training and then you may be able to perform training without shuffling.
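One way to "shuffle them somehow before training" is to permute the list of file names once and build the datastores from that permuted list. The sketch below is only an illustration under stated assumptions: the folder names, file names and sample count are made up, inputs and targets are assumed to share file names, and fileDatastore is assumed to read explicitly listed files in the order given. It produces one fixed random order, not a new order every epoch.

```matlab
% Sketch: shuffle the sample order once, before the datastores are built.
% Folder names, file names and the sample count are illustrative assumptions.
rootDir   = tempname;
inputDir  = fullfile(rootDir,'inputData');
targetDir = fullfile(rootDir,'targetData');
mkdir(inputDir);
mkdir(targetDir);
numSamples = 6;
for k = 1:numSamples
    C = k*ones(32,32,2);                    % input tagged with its index k
    save(fullfile(inputDir, sprintf('sample_%04d.mat',k)),'C');
    C = k*ones(1,6);                        % matching target, same tag
    save(fullfile(targetDir,sprintf('sample_%04d.mat',k)),'C');
end

% Inputs and targets share file names, so one permutation keeps pairs aligned
names = arrayfun(@(k) sprintf('sample_%04d.mat',k),(1:numSamples)', ...
    'UniformOutput',false);
order = randperm(numSamples);
inputFiles  = fullfile(inputDir, names(order));
targetFiles = fullfile(targetDir,names(order));

% Assumption: fileDatastore reads explicitly listed files in the listed order
inputData  = fileDatastore(inputFiles, 'ReadFcn',@load);
targetData = fileDatastore(targetFiles,'ReadFcn',@load);
unwrapC    = @(s) {s.C};
trainData  = combine(transform(inputData,unwrapC),transform(targetData,unwrapC));

% Check that every input is still paired with its own target
pairsMatch = true;
reset(trainData);
while hasdata(trainData)
    sample = read(trainData);
    pairsMatch = pairsMatch && (sample{1}(1) == sample{2}(1));
end
```

With the order fixed this way, setting 'Shuffle' to 'never' in trainingOptions should avoid the warning quoted above.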
Sofia Esteves on 18 Apr 2021
Ok, thank you so much once again :)
tianliang wang on 28 Apr 2021
Edited: tianliang wang on 28 Apr 2021
Hi Luisa, I have two folders (input and target), each containing 100 .mat files (images). How can I define the validation and test datasets? With an imageDatastore one would use the splitEachLabel function. Also, how should I set the training options?
luisa di monaco on 28 Apr 2021
Edited: luisa di monaco on 28 Apr 2021
Hi.
I think the easiest way to set training options is to find a way to separate training and validation data before datastore generation.
I put the training data and the validation data into different folders (it was easy in my case because I generated synthetic data using MATLAB code). Then, I defined one datastore for the training data and a different datastore for the validation data. Here is my code. I hope this can help!
%% IMAGE GENERATION
tot_imagepairs=500000; % total number of image pairs (validation + training)
val_fraction=1/10; % validation data [fraction of tot_imagepairs]
val_imagepairs=tot_imagepairs*val_fraction;
train_imagepairs=tot_imagepairs-val_imagepairs;
mkdir inputData train
mkdir inputData val
mkdir targetData train
mkdir targetData val
image_generator4('val', val_imagepairs,1) % generation of validation dataset
image_generator4('train', tot_imagepairs,val_imagepairs+1) % generation of training dataset
%% LOAD AND REARRANGE DATA
% training data
inputData=fileDatastore(fullfile('inputData', 'train'),'ReadFcn',@load,'FileExtensions','.mat');
targetData=fileDatastore(fullfile('targetData','train'),'ReadFcn',@load,'FileExtensions','.mat');
inputDatat = transform(inputData,@(data) rearrange_datastore(data));
targetDatat = transform(targetData,@(data) rearrange_datastore(data));
trainData=combine(inputDatat,targetDatat);
% validation data
inputData=fileDatastore(fullfile('inputData', 'val'),'ReadFcn',@load,'FileExtensions','.mat');
targetData=fileDatastore(fullfile('targetData','val'),'ReadFcn',@load,'FileExtensions','.mat');
inputDatat = transform(inputData,@(data) rearrange_datastore(data));
targetDatat = transform(targetData,@(data) rearrange_datastore(data));
valData=combine(inputDatat,targetDatat);
%%
options = trainingOptions(... % solver name and other options go here
'ValidationData', valData,...
'ValidationFrequency',1000);
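When all files already sit in one input folder and one target folder, as in the question above, the split can also be made on the file list instead of on disk. The following is a hedged sketch, not the original author's method: the folder names, file names, sample count and the 80/10/10 ratio are assumptions, and inputs and targets are assumed to share file names.

```matlab
% Sketch: split paired .mat files into training/validation/test by index.
% Folder names, file names, sample count and the 80/10/10 ratio are assumptions.
rootDir   = tempname;
inputDir  = fullfile(rootDir,'input');
targetDir = fullfile(rootDir,'target');
mkdir(inputDir);
mkdir(targetDir);
numSamples = 10;
for k = 1:numSamples
    C = rand(32,32,2);                      % stand-in input
    save(fullfile(inputDir, sprintf('sample_%04d.mat',k)),'C');
    C = rand(1,6);                          % stand-in target
    save(fullfile(targetDir,sprintf('sample_%04d.mat',k)),'C');
end

listing = dir(fullfile(inputDir,'*.mat'));  % file names shared by both folders
names   = {listing.name}';
order   = randperm(numel(names));           % random assignment to the subsets
numTrain = round(0.8*numel(names));
numVal   = round(0.1*numel(names));
trainIdx = order(1:numTrain);
valIdx   = order(numTrain+1:numTrain+numVal);
testIdx  = order(numTrain+numVal+1:end);

% Helper: combined {input, target} datastore for a given subset of names
unwrapC = @(s) {s.C};
makeDs  = @(idx) combine( ...
    transform(fileDatastore(fullfile(inputDir, names(idx)),'ReadFcn',@load),unwrapC), ...
    transform(fileDatastore(fullfile(targetDir,names(idx)),'ReadFcn',@load),unwrapC));

trainData = makeDs(trainIdx);
valData   = makeDs(valIdx);
testData  = makeDs(testIdx);
```

valData can then be passed as 'ValidationData' in trainingOptions in the same way as in the code above.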
OK! Thanks for your reply. I have solved my problem!
@tianliang wang Hello, could I please get in touch with you? I need help solving the same problem.
Fadhurrahman on 6 Jan 2022
Edited: Fadhurrahman on 6 Jan 2022
Hello @luisa di monaco, how did you create all 500 000 .mat files of size 32x32? Is there any reference for doing it?
Hi,
the creation process was part of my thesis work. Here you can download my thesis:
http://webthesis.biblio.polito.it/id/eprint/14716. Dataset creation is described in chapter 4 (4.2, 4.3 and 4.5).
Here you can find some MATLAB code: https://github.com/lu-p/standard-PIV-image-generator
I hope this can help.


More Answers (2)

Johanna Pingel on 29 Apr 2019
Edited: Johanna Pingel on 29 Apr 2019

I've used a .mat to imageDatastore conversion here:
imds = imageDatastore(ImagesDir,'FileExtensions','.mat','ReadFcn',@matRead);
function data = matRead(filename)
% Load the .mat file and return its first variable as the image
inp = load(filename);
f = fieldnames(inp);
data = inp.(f{1});
end

2 Comments

Matthew Fall on 29 Apr 2019
Thank you for your swift reply.
Unfortunately, the MATLAB regression example requires loading all of the training and validation data in memory, which I want to avoid by using the datastore.
I've tried using the imageDatastore with regression labels before, but then trainNetwork gives me the error:
Error using trainNetwork (line 150)
Invalid training data. The labels of the ImageDatastore must be a categorical vector.
Is it more convenient to use .mat files as the training set for image-to-vector regression?


Lykke Kempfner on 16 Aug 2019

I have the same problem.
I have many *.mat files with data that cannot fit in memory. You may think of the files as non-standard images. I have a read function for the files. I would like to create a datastore (?) in which each sample is associated with two scalar values rather than a class.
Is there any solution to this issue?

2 Comments

Tomer Nahshon on 22 Jan 2020
Same here
tanfeng on 12 Oct 2020
You could try this
tblTrain=table(X,Y)
net = trainNetwork(tblTrain,layers,options);



Release: R2018b

Asked: 29 Apr 2019

Commented: 6 Jan 2022
