The output of a custom datastore differs depending on its initial state

ababa on 22 Jul 2023
Edited: ababa on 27 Jul 2023
Thank you for taking a look at this question.
The code below is a MATLAB example of a custom datastore.
It wraps two imageDatastore objects, ImagesX and ImagesY.
ImagesX and ImagesY are built-in datastores rather than custom ones, so I assume I cannot change how they work internally.
I edited the example to extract the label information from each imageDatastore,
and I want to pass the labels through my custom read function.
If the line 'obj.NumObservations = numel(obj.ImagesX.Files)' is commented out, the shuffle function breaks,
but read outputs the labels.
If that line is not commented out, the shuffle function works,
but read does not output the labels.
What is the problem?
classdef cycleGanImageDatastore < matlab.io.Datastore & ...
        matlab.io.datastore.Shuffleable
    % cycleGanImageDatastore Create a Datastore to work with collections of images in two directories.
    % IMDS = cycleGanImageDatastore(Xsize,Ysize,dirX,dirY) creates a Datastore,
    % where Xsize/Ysize give the output size of the images in the X or Y directory
    % and dirX/dirY are the paths of the directories holding the image data used for training a cycleGAN model.
    % By calling read(IMDS), you can get unpaired images from each directory.
    % Copyright 2019-2020 The MathWorks, Inc.
    properties
        Xsize
        Ysize
        DirX
        DirY
        ImagesX
        ImagesY
        MiniBatchSize
    end
    properties (SetAccess = protected)
        NumObservations
    end
    methods
        function obj = cycleGanImageDatastore(Xsize,Ysize,dirX, dirY)
            obj.Xsize = Xsize;
            obj.Ysize = Ysize;
            obj.DirX = dirX;
            obj.DirY = dirY;
            obj.ImagesX = imageDatastore(obj.DirX,'IncludeSubfolders',true,'LabelSource','foldernames');
            obj.ImagesY = imageDatastore(obj.DirY,"IncludeSubfolders",true,'LabelSource','foldernames');
            obj.MiniBatchSize = 1;
            obj.ImagesX.ReadSize = 1;
            obj.ImagesY.ReadSize = 1;
            %num = min(numel(obj.ImagesX.Files),numel(obj.ImagesY.Files));
            %obj.ImagesX = obj.ImagesX.subset(1:num);
            %obj.ImagesY = obj.ImagesY.subset(1:num);
            %obj.NumObservations = numel(obj.ImagesX.Files);
        end
        function tf = hasdata(obj)
            tf = obj.ImagesX.hasdata() && obj.ImagesY.hasdata();
        end
        function [data] = read(obj) % Read an image
            obj.ImagesX.ReadSize = obj.MiniBatchSize;
            obj.ImagesY.ReadSize = obj.MiniBatchSize;
            [ImagesX,infoX] = obj.ImagesX.read();
            [ImagesY,infoY] = obj.ImagesY.read();
            labelsX = infoX.Label;
            labelsY = infoY.Label;
            ImagesX = imresize(ImagesX,[128,128]);
            ImagesY = imresize(ImagesY,[128,128]);
            %imshow(ImagesX)
            % Set the output data type to cell
            if ~iscell(ImagesX)
                ImagesX = {ImagesX};
                ImagesY = {ImagesY};
            end
            % Do the preprocessing
            %[transformedX, transformedY] = transformImagePair(obj,ImagesX, ImagesY);
            transformedX = ImagesX;
            transformedY = ImagesY;
            % Call the normalization function
            [X, Y] = obj.normalizeImages(transformedX, transformedY);
            % Output as a table
            data = table(X, Y);
        end
        function reset(obj)
            obj.ImagesX.reset();
            obj.ImagesY.reset();
        end
        function objNew = shuffle(obj)
            objNew = obj.copy();
            numObservations = objNew.NumObservations;
            idx1 = randperm(numObservations);
            objNew.ImagesX.Files = objNew.ImagesX.Files(idx1);
            idx2 = randperm(numObservations);
            objNew.ImagesY.Files = objNew.ImagesY.Files(idx2);
        end
        function [xOut, yOut] = normalizeImages(obj, xIn, yIn)
            % Normalize with the max value
            xOut = cellfun(@(x) rescale(x,'InputMin',0,'InputMax',255), xIn, 'UniformOutput', false);
            yOut = cellfun(@(x) rescale(x,'InputMin',0,'InputMax',255), yIn, 'UniformOutput', false);
        end
    end
end

function [transformedX, transformedY] = transformImagePair(obj,ImagesX, ImagesY)
    arguments
        obj
        ImagesX (:,1) cell
        ImagesY (:,1) cell
    end
    finalSize = obj.Xsize(1:2); % Define the size of the output
    initialSize = finalSize + 30; % Initial image size before cropping
    mirror = rand(1) < 0.5; % Flip with 50% probability
    % Apply augmentation
    transformedX = cellfun(@(im) applyAugmentation(im, initialSize, finalSize, mirror), ...
        ImagesX, ...
        'UniformOutput', false);
    transformedY = cellfun(@(im) applyAugmentation(im, initialSize, finalSize, mirror), ...
        ImagesY, ...
        'UniformOutput', false);
end

function imOut = applyAugmentation(imIn, initialSize, finalSize, mirror)
    imInit = imresize(imIn, initialSize); % Resize the image
    win = randomCropWindow2d(initialSize,finalSize);
    imOut = imcrop(imInit,win);
    if mirror
        imOut = fliplr(imOut); % Flip the image left to right
    end
end

Answers (1)

LeoAiE on 23 Jul 2023
In your code, obj.NumObservations is essential to the shuffle operation. It holds the number of files (images) in your dataset. The shuffle function creates a random permutation of the indices 1 to numObservations and uses it to reorder the Files property of ImagesX and ImagesY. If NumObservations is never set, it stays empty, so randperm has no valid range of indices to permute and shuffle fails.
The commented-out block also contains num = min(numel(obj.ImagesX.Files),numel(obj.ImagesY.Files)); and the two subset calls, which trim ImagesX and ImagesY to the same number of files. Without them the two datastores can differ in length, which could be another reason the shuffle function breaks.
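As a possible alternative to tracking NumObservations by hand, here is a minimal sketch that uses the built-in shuffle method of imageDatastore, which returns a new datastore with Files and Labels reordered together. The temporary folder tree and the names classA, classB, and labelsMatch are made up for the illustration. Whether assigning directly to Files keeps Labels aligned is something I have not verified, so treat that as an open question to check in your MATLAB release.

```matlab
% Build a tiny throwaway image folder tree (hypothetical names) so the
% example is self-contained.
root = tempname;
classNames = {'classA', 'classB'};
for c = 1:numel(classNames)
    folder = fullfile(root, classNames{c});
    mkdir(folder);
    for k = 1:3
        img = uint8(255 * rand(8, 8, 3));
        imwrite(img, fullfile(folder, sprintf('img%d.png', k)));
    end
end

imds = imageDatastore(root, 'IncludeSubfolders', true, ...
    'LabelSource', 'foldernames');

% Built-in shuffle returns a new datastore; no stored observation
% count is needed.
shuffled = shuffle(imds);

% Check that each label still matches the parent folder of its file.
[~, parentNames] = cellfun(@(f) fileparts(fileparts(f)), ...
    shuffled.Files, 'UniformOutput', false);
labelsMatch = isequal(cellstr(shuffled.Labels), parentNames);

% Inside the custom shuffle method, the same idea would read
% (assumption: ImagesX and ImagesY are imageDatastore objects):
%   objNew.ImagesX = shuffle(obj.ImagesX);
%   objNew.ImagesY = shuffle(obj.ImagesY);
```

If labelsMatch is true on your system, the labels survived the shuffle, and the custom shuffle method no longer depends on NumObservations being initialized.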
If you want to output the label information, you should modify your read function. Currently, you store the labels in labelsX and labelsY but never use them. To return them in the data table, add them to the table construction:
function [data] = read(obj)
    % ... Your existing code ...
    data = table(X, Y, labelsX, labelsY);
end
This outputs a table with columns for the images and the labels. You might need to adjust this if there is not exactly one label per image, or if the labels are in a format that cannot be put directly into a table.
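To make the shape requirement concrete, here is a small sketch with dummy data standing in for the values computed inside read. The image contents and the label names horse and zebra are invented for the illustration. The point is only that every table variable needs one row per image.

```matlab
% Stand-ins for the values computed inside read() (dummy data, two
% observations in the batch).
X = {rand(128, 128, 3); rand(128, 128, 3)};
Y = {rand(128, 128, 3); rand(128, 128, 3)};
labelsX = categorical({'horse', 'horse'});   % row vector on purpose
labelsY = categorical({'zebra'; 'zebra'});

% Force column shape so every table variable has one row per image.
labelsX = labelsX(:);
labelsY = labelsY(:);

% One row per observation: image cells plus their labels.
data = table(X, Y, labelsX, labelsY);
```

With a MiniBatchSize of 1 the labels are scalars, so the table has a single row and the same construction applies.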
Please make sure you uncomment the line obj.NumObservations = numel(obj.ImagesX.Files); so that the shuffle operation is properly defined.
Remember that your read function should return the same structure whether or not a label is available for every observation. This could mean returning a missing value, NaN, or some other placeholder when a label is not available.
1 Comment
ababa on 27 Jul 2023
Edited: ababa on 27 Jul 2023
Thank you for the comment.
I think the problem is with obj.copy() in the shuffle function.
It seems that obj.copy() does not copy every element to objNew.
When I initialize the number of observations, the shuffle function works, but the read function does not output the labels.
When I do not initialize the number of observations, the shuffle function does not work, but the read function outputs the labels.
What is wrong with obj.copy()? How can I copy all of the elements?
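On the copy question, one thing that may be relevant: copy, as provided by matlab.mixin.Copyable, makes a shallow copy, so handle-valued properties such as ImagesX and ImagesY end up shared between obj and objNew rather than duplicated. A minimal sketch of a deep copy, assuming it is added as an extra methods block inside the cycleGanImageDatastore classdef, overrides the protected copyElement method. Whether this explains the missing labels has not been confirmed here.

```matlab
methods (Access = protected)
    function objNew = copyElement(obj)
        % Shallow copy of all properties first.
        objNew = copyElement@matlab.mixin.Copyable(obj);
        % Then copy the inner datastores so objNew no longer shares
        % them with obj (assumption: both are imageDatastore objects).
        objNew.ImagesX = copy(obj.ImagesX);
        objNew.ImagesY = copy(obj.ImagesY);
    end
end
```

With this in place, changes made to objNew.ImagesX inside shuffle should leave the original datastore untouched.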
