Save -append increases mat file size
Hi, I have a MAT-file that is updated with the call save('myfile.mat', 'myvar', '-append'). Every time this call runs, the MAT-file grows considerably in size, even though 'myvar' has not changed. Does anybody know how I can fix this?
6 comments
I'm unable to reproduce your problem. Please post an example that results in the observed behavior.
a=rand(10,1);
save('tmp.mat','a')
for n=1:3
x=dir('tmp.mat');
disp(x.bytes)
save('tmp.mat','a','-append')
end
Alex Mesnier
18 Feb 2022
I am experiencing the same problem. I have a large dataset saved in a structure 'data', which is roughly 20 MB. I have that saved with a second structure 'SCdata' (16 MB) in a .mat file 'myFile', which totals ~ 70 MB. I want to be able to save the two structures independently without overwriting the other, and I often don't have both structures loaded simultaneously.
In theory, save('myFile.mat', 'data', '-append') should be perfect: overwrite the old 'data' structure while keeping 'SCdata' untouched. However, when I used '-append', the .mat file size nearly doubled, while the contents seemingly remained unchanged. This continues every time I try to "append" to the MAT-file. Performing save('myFile.mat', 'data', 'SCdata') reduces the file back to its original 70 MB size.
It seems like '-append' is not truly overwriting the existing data, which I do not understand. I have also tried overwriting the data using matfile objects, which produces the same problem. This seems to be the same problem Azucena is encountering.
"It seems like '-append' is not truly overwriting the existing data"
It doesn't.
The point of -append is to be fast. To achieve that, it does not check and rearrange the entire file content, which would be slow.
Which type of MAT-file are you using? v4, v6, v7, v7.3, or v7.3 with -nocompression?
If compression is enabled, it is not trivial to overwrite an existing variable in place, e.g. if its compressed size changes. The files then suffer from the same fragmentation problems as files on a disk. Freeing the unused space would reduce the file size, but costs time.
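One rough way to answer the format question is to read the descriptive text at the start of the file. This is only a sketch: the file name is the one used in this thread, and it assumes the file is either a Level 5 MAT-file (v6 or v7, which begin with 116 bytes of text) or a v7.3 file (which begins with similar text naming version 7.3). v4 files carry no text header, and the header alone does not distinguish v6 from compressed v7.

```matlab
% Print the descriptive text at the start of a MAT-file.
fname = 'myFile.mat';                  % file name taken from this thread
fid = fopen(fname, 'r');
assert(fid ~= -1, 'Could not open %s', fname);
header = fread(fid, 116, '*char')';    % first 116 bytes as a char row vector
fclose(fid);
disp(strtrim(header));                 % e.g. begins with "MATLAB 5.0 MAT-file" or "MATLAB 7.3 MAT-file"
```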
Azucena Mendoza
18 Feb 2022
Walter Roberson
19 Feb 2022
When you -append to a -v7 or earlier .mat file, it doesn't update anything existing in the file: it just adds on the blob that represents the new variable. Part of the semantics of those files is that any program reading from them is responsible for scanning the content of the file, and finding the last blob with the desired variable name and using that. Earlier blobs with the same name are not even marked as deleted or as something that can potentially be released.
It is the presentation_final_final_final_last_final.doc kind of storage: you don't stop looking in your folder when you see presentation.doc or presentation_final.doc, you keep looking for the newest version and use that, leaving all the other ones where they are. Any cleanup comes later.
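The behavior described above can be probed with a short experiment. This is a sketch, not a guaranteed reproduction: the scratch file name and the matrix size are arbitrary choices, and whether the file actually grows may depend on the MATLAB release and the file format (an earlier comment in this thread could not reproduce it with a small variable).

```matlab
% Re-save one variable with -append and watch the file size.
fname = fullfile(tempdir, 'append_demo.mat');   % hypothetical scratch file
data = rand(500);
save(fname, 'data', '-v7');                     % start from a fresh -v7 file
info = dir(fname);
fprintf('after initial save: %d bytes\n', info.bytes);

for k = 1:3
    data = rand(500);                           % new content, same variable name
    save(fname, 'data', '-append');
    info = dir(fname);
    fprintf('after append %d: %d bytes\n', k, info.bytes);
end

% whos reports the variables a reader sees in the file, regardless of
% how many bytes the file occupies on disk.
vars = whos('-file', fname);
disp({vars.name});

delete(fname);                                  % clean up the scratch file
```

If the sizes grow while whos still lists a single 'data' entry, that is consistent with older copies being left in the file and only the most recent one being used.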
Answers (1)
Pritesh Parmar
15 Apr 2025
Edited: Pritesh Parmar
17 Apr 2025
0 votes
As a workaround, you could load the variables you want to preserve and save the MAT-file again with both the preserved variables and the updated ones. See the File Exchange submission "fastSaveUpdate - Efficiently update variables in MAT-file" on MATLAB Central.
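A minimal sketch of the load-and-resave workaround described above follows. The function name resaveWithUpdate is hypothetical, and this is not the implementation of the File Exchange submission. It assumes the file already exists, that all of its variables fit in memory at once, and that the default save format is acceptable (variables larger than 2 GB would need '-v7.3').

```matlab
function resaveWithUpdate(fname, varName, value)
% resaveWithUpdate  Rewrite a MAT-file with one variable replaced.
%   Hypothetical helper: loads every variable in FNAME, replaces (or adds)
%   the variable named VARNAME, and writes the whole file from scratch so
%   that no stale copies remain.
    contents = load(fname);               % struct with one field per variable
    contents.(varName) = value;           % overwrite or add the target variable
    save(fname, '-struct', 'contents');   % full rewrite, without -append
end
```

With the names from this thread, the call would be resaveWithUpdate('myFile.mat', 'data', data). The trade-off is the one raised in the comments: the full rewrite keeps the file compact, but it costs the time and memory needed to load and re-save the untouched variables.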