imageDatastore and tall arrays: how to save labels and 4-D matrices in a for loop?

In each iteration I generate a 4-D matrix of images (double) and the corresponding labels (double), both very large.
For example, in one iteration I get images of size 120*120*1*6000, where 6000 is the number of images and 1 is the number of channels, with labels of size 1*6000.
How can I use imageDatastore and tall arrays to store the labels and 4-D matrices in each iteration of a for loop, so that I can use them later for machine learning?
Also, is there any other efficient way to deal with huge data?
Note that the number of images and labels varies from iteration to iteration.

4 comments

Hi @Walter Roberson, do you have any idea, please?
In some cases you might be able to use tall arrays.
M
M on 13 Nov 2023
Edited: M on 13 Nov 2023
@Walter Roberson, any hint on how to pass the 4-D double matrices and double labels to imageDatastore and tall arrays in each iteration, please?
Because the documentation does not list double as a supported format!
SupportedOutputFormats: ["png" "jpg" "jpeg" "tif" "tiff"]
Hi @Divya Gaddipati, do you have any idea, please? Thanks.

Answers (3)

I'm not sure exactly what your workflow looks like, but if you convert your images from double-precision floating-point numbers (64 bits per pixel) to uint8 (8 bits per pixel), you'll need only one eighth of the memory.
In most cases this won't even cause a loss of data, as images are often encoded at 8-bit depth anyway. In each iteration, you can then convert the image (or set of images) you're working on back to double precision, so you don't have to change your workflow.
Convert the whole stack (the factor of 255 assumes the values are scaled to [0, 1]):
Ims_uint8 = cast(Ims_double*255,'uint8');
Then in each iteration:
Ims_double = cast(Ims_uint8,'double')/255;
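As a rough check of the saving and of what the round trip costs, here is a minimal sketch. The random stack is a hypothetical stand-in for one iteration's batch and is assumed to be scaled to [0, 1]. For data that did not start out as 8-bit, the conversion rounds each pixel, so a small quantization error should be expected.

```matlab
% Hypothetical stand-in for one iteration's batch, assumed scaled to [0, 1].
Ims_double = rand(120, 120, 1, 100);

% Compress to 8 bits per pixel; the uint8 conversion rounds and saturates.
Ims_uint8 = cast(Ims_double * 255, 'uint8');

% Restore to double in [0, 1] when the batch is needed again.
Ims_restored = cast(Ims_uint8, 'double') / 255;

% Compare storage and the worst-case quantization error.
bytesDouble = numel(Ims_double) * 8;   % 8 bytes per double element
bytesUint8  = numel(Ims_uint8)  * 1;   % 1 byte per uint8 element
maxErr = max(abs(Ims_double(:) - Ims_restored(:)));
fprintf('double: %d bytes, uint8: %d bytes, max error: %.4g\n', ...
    bytesDouble, bytesUint8, maxErr);
```

The error is bounded by half a gray level (0.5/255), which may or may not matter for the later analysis.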
imageDatastores are provided for situations like that.

8 comments

"this solution is good to store all the matrices in cell arrays" <== Who told you that, or what makes you say that? I'm not sure exactly what you're doing, but it's not what I would recommend, as I said in my answer. One reason is that it takes up a lot of memory unnecessarily, as you've obviously already found out. I would process each image one at a time inside a loop, or with a single function called by the imageDatastore.
@Matt J, any suggestion, please? I am struggling with this problem and can't proceed with the work.
Matt J
Matt J on 12 Nov 2023
Edited: Matt J on 12 Nov 2023
I can't imagine what problems you're running into with imageDatastore. Your posted code doesn't seem to show any attempt at using it, nor the error messages you encounter. For that matter, the code immediately above doesn't resemble that in your original post, where the variables were Images and Labels.
M
M on 13 Nov 2023
Edited: M on 13 Nov 2023
Any hint on how to pass the 4-D double matrices and double labels to imageDatastore and tall arrays in each iteration, please?
Because the documentation does not list double as a supported format!
SupportedOutputFormats: ["png" "jpg" "jpeg" "tif" "tiff"]
When using an image data store, the matrix would not exist. Each image should be kept in its own file, which the data store would read in one-by-one as you loop. If your images must be doubles (though I wonder why), you can put them in individual .mat files. The image data store has a ReadFcn property that lets you define your own file read function, so any format that you can read should be supportable.
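A minimal sketch of that idea, under stated assumptions: the temporary folder, the file name pattern `img_%06d.mat`, and the variable name `img` are all hypothetical choices made for illustration. The datastore is told to accept .mat files through `FileExtensions`, and `ReadFcn` loads the stored matrix.

```matlab
% Hypothetical output folder, created fresh for this sketch.
outDir = tempname;
mkdir(outDir);

% Write three stand-in double images, one per .mat file, in variable "img".
for k = 1:3
    img = rand(120, 120);
    save(fullfile(outDir, sprintf('img_%06d.mat', k)), 'img');
end

% Custom reader: load the .mat file and return the stored matrix.
readMat = @(f) getfield(load(f, 'img'), 'img');

imds = imageDatastore(outDir, ...
    'FileExtensions', '.mat', ...
    'ReadFcn', readMat);

% Images are loaded from disk one at a time, only when requested.
while hasdata(imds)
    I = read(imds);
    fprintf('%s %s\n', class(I), mat2str(size(I)));
end
```

Because the reader returns whatever was saved, the images stay double and nothing is re-encoded as png or jpg.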
M
M on 13 Nov 2023
Edited: M on 13 Nov 2023
@Matt J I still don't get the process. What do you mean by "Each image should be kept in its own file, which the data store would read in one-by-one as you loop"?
Do you mean that I have to save all the files first and then put them in the imds? What is the benefit of doing that?
"If your images must be doubles (though I wonder why),"
So as not to change the workflow.
"you can put them in individual .mat files."
How do I do that? By saving them to disk?
I can't find any simple example; the process is complex!
"do you mean that I have to save all the files first then put them in the imds? What is the benefits of doing that??"
Yes. That way, the images are stored on disk, rather than consuming RAM.
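One way this could look inside the generating loop is sketched below. The random batches stand in for the real per-iteration data, and the folder, file naming scheme, and variable names `img`, `allLabels`, and `count` are hypothetical. Each image goes to its own .mat file as soon as it is produced, only the small label vector stays in RAM, and the batch size is allowed to vary between iterations.

```matlab
outDir = tempname;            % hypothetical output folder
mkdir(outDir);

allLabels = [];               % one entry per saved image
count = 0;                    % running image index across all iterations

for iter = 1:3
    % Stand-in for the real per-iteration data; the batch size varies.
    n = 2 + iter;
    Images = rand(120, 120, 1, n);
    Labels = randi([0 1], 1, n);

    for k = 1:n
        count = count + 1;
        img = Images(:, :, :, k);
        save(fullfile(outDir, sprintf('img_%06d.mat', count)), 'img');
    end
    allLabels = [allLabels; Labels(:)]; %#ok<AGROW>
    clear Images Labels       % release the batch before the next iteration
end

imds = imageDatastore(outDir, 'FileExtensions', '.mat', ...
    'ReadFcn', @(f) getfield(load(f, 'img'), 'img'));

% Recover each file's running index from its name so that the labels
% line up with imds.Files regardless of the listing order.
[~, names] = cellfun(@fileparts, imds.Files, 'UniformOutput', false);
idx = cellfun(@(s) sscanf(s, 'img_%d'), names);
imds.Labels = categorical(allLabels(idx));
```

The labeled datastore can then be read image by image, so peak memory is set by one batch rather than by the whole data set.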

Cell arrays are very inefficient compared to regular numerical arrays. They use a lot more memory. You could even use single instead of double to preallocate using half the memory.
Not sure what you're doing but you may be doing what a lot of beginners do and that is to read ALL the input images into one huge array in a loop, then after the loop process each slice of the huge array as an individual image. This is usually NOT the way to process a sequence of images. You usually (if it's a series of 2-D images you want to analyze and not a volumetric image that you're reading in slice-by-slice) want to have a loop where you read in one image at a time and then analyze it immediately in that same iteration. See the FAQ:
Another option is an imageDatastore as @Matt J already mentioned.
If you must read all the images into memory at one time, then try to use a single array rather than a cell array.
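A minimal sketch of the single-array approach, assuming the number of images `N` is known before the loop (here a hypothetical 50) and that single precision is acceptable for the analysis. The random image stands in for whatever actually produces image k.

```matlab
N = 50;                                    % hypothetical number of images
stack = zeros(120, 120, 1, N, 'single');   % preallocated, 4 bytes per element

for k = 1:N
    img = rand(120, 120);                  % stand-in for reading image k
    stack(:, :, 1, k) = single(img);       % write into the k-th slice
end

info = whos('stack');
fprintf('%d images use %.2f MB as single\n', N, info.bytes / 2^20);
```

Preallocating avoids growing the array on every iteration, and the same stack stored as double would take twice the space.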

Asked: M on 7 Nov 2023

Commented: on 13 Nov 2023
