How can I read a huge number of image files and store the corresponding histogram data in a matrix in order?

Hi all. Suppose I have a database of 100000 image files, named numerically from 1.jpg to 100000.jpg. I want to compute the histogram of the grayscale version of each image and store it in a matrix, ordered numerically by file name. That is, I will end up with a matrix of size 256 x 100000, where the first column is the histogram of 1.jpg, the second column is the histogram of 2.jpg, and so on.
As a first step, I tried the code below. But since the output was a cell array, I could not sort it the way I wanted. It also took a lot of time.
file = dir('*.jpg');
n = length(file);
images = cell(n,1);
for k = 1 : n
images{k} = imread(fullfile( file(k).name));
end
Here file(2).name is 10.jpg, but what I need is 2.jpg. What is an efficient way to code this? Can I do it without a for loop?

Accepted Answer

Code for creating those filenames is in the FAQ: <http://matlab.wikia.com/wiki/FAQ#How_can_I_process_a_sequence_of_files.3F>. Be sure to add an "if exist(fullFileName, 'file')" check before you attempt to use the file. As Dishant said, there is no need to store all the images; just append each histogram onto your accumulator array of histograms.
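A minimal sketch of the approach described in this answer, not the FAQ's exact code. It assumes the images are in the current folder and are 8-bit JPEGs; folder and numFiles are illustrative names, and the rgb2gray call is an assumption for color files. Because each file name is built from the loop index, column k always belongs to k.jpg and no sorting is needed.

```matlab
folder   = pwd;                       % assumption: images live in the current folder
numFiles = 100000;                    % 1.jpg ... 100000.jpg, as in the question
allHists = zeros(256, numFiles);      % preallocate: one 256-bin histogram per column

for k = 1 : numFiles
    % Build the name from the index, so column k always matches k.jpg.
    fullFileName = fullfile(folder, sprintf('%d.jpg', k));
    if exist(fullFileName, 'file')
        img = imread(fullFileName);
        if size(img, 3) == 3
            img = rgb2gray(img);      % imhist expects a grayscale image
        end
        allHists(:, k) = imhist(img); % 256 counts for a grayscale image
    end
    % Columns for missing files are left as zeros.
end
```

Preallocating allHists instead of growing it inside the loop avoids repeated reallocation, which matters at this size (256 x 100000 doubles is roughly 200 MB).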

5 Comments

Thanks for the link. Is there any option other than using a for loop for 100000 files?
See Anand's suggestion. But it's not the loop that's taking up all the time. Look:
tic
for k = 1:100000
a=1;
end
toc
Elapsed time is 0.000390 seconds.
So don't worry about the loop - a small loop of 100,000 iterations doesn't take any time at all, just 390 microseconds. It's the image analysis within the loop that will take all the time and you should look into Anand's suggestion to speed it up.
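One way to check where the time goes is to time the real per-image work on a small sample and extrapolate. This is only a rough sketch: the sample size of 20 and the variable names are arbitrary choices, and the estimate ignores disk caching effects.

```matlab
sample  = dir('*.jpg');               % whatever JPEG files are in the current folder
nSample = min(20, numel(sample));     % time at most 20 images (arbitrary choice)

tic
for k = 1 : nSample
    img = imread(sample(k).name);
    if size(img, 3) == 3
        img = rgb2gray(img);          % assumption: color images need converting first
    end
    h = imhist(img);                  % the actual work done per image
end
elapsed = toc;

if nSample > 0
    perImage = elapsed / nSample;     % average seconds per image
    fprintf('About %.4f s per image, so roughly %.1f minutes for 100000 images.\n', ...
            perImage, perImage * 100000 / 60);
end
```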
OK, I will do. But which answer must I accept now? :o
Well, do you even have the parallel computing toolbox? I don't so I can't really help you. Most of my images just take 2-3 seconds to analyze, and I have usually less than a hundred or so, so speed is not really a big concern for me. If you have the toolbox, give it a shot. You could also try to perform some computations on the GPU.
Yes, I do have the Parallel Computing Toolbox, but I am on an isolated machine.


More Answers (2)

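If you have the Parallel Computing Toolbox, you could try something like this:
file = dir('*.jpg');                     % list the image files, as in the question
allHists = zeros(256, numel(file));      % one 256-bin histogram per column
parpool;                                 % start a pool of workers
parfor n = 1 : numel(file)
    img = imread(file(n).name);
    if size(img, 3) == 3, img = rgb2gray(img); end   % imhist expects a grayscale image
    allHists(:, n) = imhist(img);
end
Ideally you send this out to a cluster and not your local machine. If you have an older version of MATLAB, you might have to use matlabpool instead of parpool.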
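The parfor idea can be combined with index-based file names so that the columns come out in numeric order without sorting. This is a sketch under assumptions, not this answer's original code: numFiles and the sprintf naming follow the question, gcp('nocreate') is used so that an already open pool is reused, and missing files are left as zero columns.

```matlab
numFiles = 100000;                        % 1.jpg ... 100000.jpg, as in the question
allHists = zeros(256, numFiles);          % sliced output: column k is written by iteration k

if isempty(gcp('nocreate'))               % only start a pool if none is open
    parpool;
end

parfor k = 1 : numFiles
    fullFileName = sprintf('%d.jpg', k);  % name built from the index, so order is numeric
    h = zeros(256, 1);                    % default for a missing file
    if exist(fullFileName, 'file')
        img = imread(fullFileName);
        if size(img, 3) == 3
            img = rgb2gray(img);          % imhist expects a grayscale image
        end
        h = imhist(img);
    end
    allHists(:, k) = h;                   % exactly one assignment per iteration
end
```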

4 Comments

But I have only one machine, with no cluster or network. Will it still work for me?
It will, but the benefits won't be as high. It should still be faster than doing it serially.
If your machine is relatively recent (within the last few years), it should be multicore and that should do it for you.
I have a dual core processor, and yes it was relatively faster.


file = dir('*.jpg');
fileNames = {file.name};
fileNames = sort_nat(fileNames); % natural-order sort: 1.jpg, 2.jpg, ..., 10.jpg
sort_nat is not a built-in function. You can get it from the File Exchange: sort_nat
Also, you do not need to stack all the images in a cell array; that consumes more memory. Just read one image, compute its histogram, and append it to your result, one image at a time in the loop.
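If sort_nat is not available, the same ordering can be obtained with built-in functions when every name has the form number.jpg, as in the question. A small sketch with a hard-coded list standing in for {file.name}; the variable names nums and idx are illustrative:

```matlab
% Stand-in for {file.name}; dir returns names in this character-by-character order.
fileNames = {'1.jpg', '10.jpg', '100.jpg', '2.jpg'};

% Read the leading integer of each name, then sort by that number.
nums = cellfun(@(s) sscanf(s, '%d'), fileNames);
[~, idx]  = sort(nums);
fileNames = fileNames(idx);

disp(fileNames)   % 1.jpg, 2.jpg, 10.jpg, 100.jpg
```

This relies on each name starting with digits; names without a leading number would make sscanf return an empty result and cellfun would raise an error.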

1 Comment

Thanks for the answer. sort_nat is a good function. And yes, I am trying to append the histograms, but the for loop takes a lot of time for 100000 images. Any suggestions would be helpful.



Asked:

2 Apr 2014

Commented:

3 Apr 2014
