
Question on saving and reading .mat files

Kai Wu on 5 Nov 2016
Commented: Walter Roberson on 8 Nov 2016
I'm working on a dynamic programming algorithm and have run into an interesting and weird problem. My algorithm has three steps. The first step creates N .mat files; the second step reads those N .mat files one by one and creates another N files in the same directory. If N is less than 10000, the running time of the second step is affordable: it takes about 10 seconds to save a new file. However, if N is 60000, the second step goes crazy, and it may take several minutes to save a single new file into the existing folder.
My observation is that adding a new file to a folder of 10k files is fine, but adding one to a folder of 60k files is practically impossible. I checked my computer: only about 50% of the memory is in use, and the disk has enough free space. Each .mat file from the first step is about 3 MB, and each .mat file from the second step is about 300 KB. Can anybody help me explain this problem?
Thanks,

Answers (1)

Walter Roberson on 5 Nov 2016
Have you experimented with creating folders for each group of (say) 10000, like
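per_folder = 10000;   % maximum number of files written into any one folder
for K = 1 : N
    ...
    % (K-1) so that files 1..10000 go to subset_00000, 10001..20000 to subset_00001, and so on
    this_folder = sprintf('subset_%05d', floor((K-1)/per_folder));
    if ~exist(this_folder, 'dir'); mkdir(this_folder); end
    this_filename = sprintf('output_%08d.mat', K);   % semicolon suppresses echoing the name each iteration
    mat_filename = fullfile(this_folder, this_filename);
    save(mat_filename, 'Variables_to_save');
end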
That is, if the problem has to do with the number of files in the folder, then write fewer files into any one folder. You can merge the folders afterwards.
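As a rough illustration of the split-then-merge idea, here is a minimal sketch that runs in a scratch directory. The tiny `per_folder` and `N`, the `result` payload, and the `merged` folder name are all assumptions made for the demo, not values from the question. It uses `floor((K - 1) / per_folder)` so that indices 1 to `per_folder` share the first folder.

```matlab
% Sketch: spread files over subfolders, then merge them back afterwards.
% per_folder and N are tiny here only so that the demo runs quickly.
per_folder = 3;
N = 7;
root = tempname();                 % scratch directory for the demo
mkdir(root);

for K = 1:N
    % (K - 1) keeps indices 1..per_folder together in subset_00000
    this_folder = fullfile(root, sprintf('subset_%05d', floor((K - 1) / per_folder)));
    if ~exist(this_folder, 'dir')
        mkdir(this_folder);
    end
    result = K^2;                  % placeholder payload (hypothetical)
    save(fullfile(this_folder, sprintf('output_%08d.mat', K)), 'result');
end

% Merge: move every .mat file from each subset folder into one folder.
merged = fullfile(root, 'merged');
mkdir(merged);
subsets = dir(fullfile(root, 'subset_*'));
subsets = subsets([subsets.isdir]);    % keep only directories
for i = 1:numel(subsets)
    src = fullfile(root, subsets(i).name);
    movefile(fullfile(src, '*.mat'), merged);
    rmdir(src);                    % the subset folder is empty now
end

num_merged = numel(dir(fullfile(merged, '*.mat')));
fprintf('%d files merged into %s\n', num_merged, merged);
```

The sketch leaves its scratch directory behind so the result can be inspected. Note that if the slowdown really does come from the number of files in one folder, the merge step may be slow for the same reason, so it is only worth doing if a later step needs everything in one place.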
2 Comments
Kai Wu on 8 Nov 2016
Dear Walter,
Thanks so much for your answer. I tested this approach, and it costs me more time in the first step. I'm really curious about why that happens. Do you think it may be because of virtual memory, or some other cause that I could set up a test for?
Thanks so much for your time!
Kai
Walter Roberson on 8 Nov 2016
Sorry, I am confused about which version turned out to be faster, and how much of a difference there was.
If my test code turned out to be slightly slower, then I would suggest reducing to 5000 per folder.
It probably is not virtual memory, but it might be file system limitations. But there is a possibility that you have a memory leak; check to see if the used memory keeps going up as the program runs.
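One way to set up such a test is to time every `save` call and record how much memory the workspace variables occupy as the loop runs. The following is only a sketch under assumptions: the small `N`, the `rand(100)` payload, and the scratch output folder are placeholders, and `whos` only counts variables in the current workspace, so it would not reveal memory held elsewhere in the process.

```matlab
% Sketch: log per-file save time and workspace size while writing N files.
N = 200;                           % small count for the demo; the thread used up to 60000
out_dir = tempname();              % scratch folder (hypothetical location)
mkdir(out_dir);

save_seconds = zeros(N, 1);        % wall-clock time of each save call
workspace_bytes = zeros(N, 1);     % total bytes held by workspace variables

for K = 1:N
    result = rand(100);            % placeholder payload (hypothetical)
    mat_filename = fullfile(out_dir, sprintf('output_%08d.mat', K));

    t = tic;
    save(mat_filename, 'result');
    save_seconds(K) = toc(t);

    info = whos();                 % one struct per workspace variable
    workspace_bytes(K) = sum([info.bytes]);
    clear info                     % keep the measurement itself out of the next count
end

half = floor(N / 2);
fprintf('mean save time, first half:  %.4f s\n', mean(save_seconds(1:half)));
fprintf('mean save time, second half: %.4f s\n', mean(save_seconds(half+1:end)));
fprintf('workspace growth over the run: %d bytes\n', ...
        workspace_bytes(end) - workspace_bytes(1));
```

If the save time climbs steadily as the folder fills while the workspace size stays flat, that points toward the file system rather than a leak. If the workspace size keeps growing, some variable is accumulating data across iterations. On Windows, MATLAB's `memory` function can additionally report memory used by the whole MATLAB process.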
