Matlab doesn't release memory after imported table is cleared

10 visualizaciones (últimos 30 días)
javiayk
javiayk el 8 de Ag. de 2023
Editada: Jeremy Hughes el 4 de En. de 2024
In an iterative process I'm importing data from a series of .csv files (~2 GB) into Matlab tables. For each iteration I process the data in Matlab using that format. In order to release the memory for the next import I use clear at the end of the iteration.
However, after some 'out of memory' issues I've realised that actually Matlab is not clearing the memory at the end of each iteration. The table is removed from the workspace but there is not memory released.
I've beeing checking if this is the case for tables with random sizes created within the code instead of imported ones, but in that case Matlab releases the memory after using clear T. Then, I think it's something linked to the fact that I'm importing the data from external files.
Here below some code to ilustrate my issue:
list_of_dir=dir(all_files);
for i=1:size(list_of_dir,1)
%import data from .csv file into table by iterating over list of
%directories
my_filepath=list_of_dir(i).folder + "/" + list_of_dir(i).name;
T=readtable(my_filepath);
%here I do my processing of T
%block of code
%
%I check memory before clearing
m=memory;
mymem(2*i-1)=m.MemUsedMATLAB/10^6;
%I clear T in order to relase memory for the next iteration
clear T
%leave time to memory for clearing before checking again
pause(10)
%I check memory after clearing
m=memory;
mymem(2*i)=m.MemUsedMATLAB/10^6;
end
Unrecognized function or variable 'all_files'.
plot(mymem)
I would expect to see some sawtooth-ish graph (both here and in the task manager) with the memory decreasing to a baseline after the 'clear', but instead I obtain something like the image below, which finally leads to an out-of-memory error.
In the graph, the odd numbers in the horizontal axis correspond to the memory before 'clear T' and the even positions correspond to the memory after 'clear T'. The memory use slightly decreases after using 'clear' but not at the previous level before the creation of T, so it finally collapses.
  7 comentarios
dpb
dpb el 11 de Ag. de 2023
Indeed; if @javiayk can reproduce this in a test case that others can duplicate, then there's apparently a memory in that case but there's insufficient data provided to be able to diagnose anything...
javiayk
javiayk el 14 de Ag. de 2023
Editada: javiayk el 14 de Ag. de 2023
Thank you @Siddharth Bhutiya for checking it.
It is indeed strange that you cannot reproduce the memory leak that I'm seeing. Maybe it's linked to the type of data and file size, or maybe it's a problem of my machine
Just in case it's related to the data, my .csv files (of which unfortunately I cannot provide any copy) it's not just double data, but it would be something like this, with ~50 variables of different types (double, int64, text) and hundred of thousands or millions of rows.
By the way, I realised the same problem when working with readgeotable() and using .shp as input.
Dummy example of my data:

Iniciar sesión para comentar.

Respuestas (1)

Jeremy Hughes
Jeremy Hughes el 4 de En. de 2024
Editada: Jeremy Hughes el 4 de En. de 2024
I'd caution against using the memory function to diagnose anything. This diagnostic code block may not have the same dynamics as the code you actually care about, and the results of memory aren't related at all to the cause of your particular issue.
I typically ask these questions when someone asks this kind of question:
  1. Are you getting an "Out of Memory" error?
  2. Does MATLAB become sluggish or unresponsive?
  3. Does MATLAB Crash?
If you're not experiencing one of those things, you likely don't have a memory problem.
Since you do see an OOM error, I'd look at the original code and try to determine if there is something else causing that issue. The whos function is useful for seeing what each variable is reporting to use.
MATLAB has it's own memory management system and can reuse allocated memory even if, according to the memory command or the OS, you might suspect that it doesn't have enough memory to do what you want.
T = readtable(...) should implicitly clear T each loop, but MATLAB might also reuse the memory.
Does this show anything different that what you're seeing?
N=100;
m = zeros(1,N);
for i = 1:N;
T = readtable("airlinesmall_subset.xlsx"); % feel free to change the file for something else.
mem = memory();
m(i) = mem.MemUsedMATLAB;
end
clear T;
mem = memory();
m(end+1) = mem.MemUsedMATLAB;
plot(1:N+1,m);
ylim([0 1.4*max(m)]);
xlim([1 N+1]);
This is what I see:

Categorías

Más información sobre Matrix Indexing en Help Center y File Exchange.

Productos


Versión

R2021b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by