Creating multiple equally sized matrices from a single numerical cell
Aaron Smith
on 7 Feb 2017
Commented: Stephen23
on 15 Feb 2017
I have a very large text file that is, in essence, one row of numbers. Once I have reorganized the file into a matrix of, for example, 500 x 10, I wish to create a new matrix every 10 rows and save each one under its own name. A major problem with my text file is that it is too big for MATLAB: I get an out-of-memory error. This is why I need to separate each matrix into its own set of data. I have already turned a row of 1049600 numbers into a matrix of 1025 x 1024, but now the file holds 50 of these sets (1049600 x 50 numbers), and I need to create 50 matrices of 1025 x 1024.
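fid = fopen('test0001.asc');
Cell = textscan(fid, '%d', 'delimiter', ';'); % read every ';'-separated integer
fclose(fid); % close the file once it has been read
Data = cell2mat(Cell); % single column vector
N = 1024;
Finish = reshape(Data, N, [])'; % N values per row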
The above is the code I had for the smaller files.
I considered organizing the data into 51250 rows of 1024 and then using a while ~feof loop, but this seems like it would require too much code and would thus be too slow. My thought was to have, say:
F1 = Data(1:1025, :);
F2 = Data(1026:2050, :);
.....
Any thoughts at all would be much appreciated.
Accepted Answer
Stephen23
on 7 Feb 2017
Edited: Stephen23
on 10 Feb 2017
Firstly, the idea of generating lots of variables is popular with beginners, but it really should be avoided.
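A minimal sketch of one alternative to numbered variables, assuming the data are already in memory as a single tall matrix: keep each block in one cell of a cell array and index into it. The sizes and the names `blocks` and `third` are illustrative only (the question uses 1025 x 1024 blocks, 50 of them).

```matlab
% Illustrative sizes; the question has R = 1025, C = 1024, nBlocks = 50.
R = 3; C = 4; nBlocks = 5;
Data = reshape(1:R*C*nBlocks, C, []).';   % (R*nBlocks)-by-C, like the 51250-by-1024 layout

% Split into nBlocks matrices of R rows each, stored in one cell array.
blocks = mat2cell(Data, repmat(R, 1, nBlocks), C);

% Access any block by index instead of by variable name.
third = blocks{3};                        % same rows as Data(2*R+1:3*R, :)
```

A loop over `blocks` then replaces any code that would otherwise have to refer to F1, F2, and so on by name.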
Also note that the MATLAB documentation is really good. It is readable and has articles on lots of topics, such as this one, which gives a good, robust method for reading a large file into MATLAB.
The core idea of that code is to call textscan in a loop, use textscan's N option to specify how much data to read, and save the data into a cell array. The N option simply defines how many times the format is applied when reading the file.
You should be able to work it out from the examples in the documentation.
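The loop described above might look roughly like the following. This is not the documentation's code, only a small self-contained sketch: it first writes a tiny fake file, and the file name, sizes, and the variable `blocks` are assumptions made for illustration.

```matlab
% Tiny stand-in for the real file: 3 blocks of R*C integers on one line, each followed by ';'.
R = 3; C = 4; nBlocks = 3;
fname = [tempname, '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, '%d;', 1:R*C*nBlocks);
fclose(fid);

blocks = {};                                 % one matrix per cell
fid = fopen(fname, 'rt');
while ~feof(fid)
    % The third input (N) applies the format '%d' at most R*C times per call.
    Z = textscan(fid, '%d', R*C, 'EndOfLine', ';');
    if isempty(Z{1})
        break                                % nothing left to read
    end
    % Assumes every block is complete; a partial block would make reshape fail.
    blocks{end+1} = reshape(Z{1}, C, []).';  %#ok<AGROW>
end
fclose(fid);
delete(fname);                               % remove the temporary file
```

Each call to textscan resumes where the previous one stopped, so only one block of R*C values is held in `Z` at a time.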
As an alternative, you might like to read about tall arrays, a data type designed for working with data sets that are too large to fit in memory.
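A rough sketch of the tall-array idea, under the assumption that the data have first been rewritten with one matrix row per line (the single-line file from the question would not suit this directly) and that the release supports tall arrays (R2016b or later). The file and variable names are made up for the example.

```matlab
% Assumption: one matrix row per line, ';' between values.
M = magic(4);
fname = [tempname, '.txt'];
dlmwrite(fname, M, ';');

ds = tabularTextDatastore(fname, 'Delimiter', ';', 'ReadVariableNames', false);
tt = tall(ds);                  % tall table backed by the file; nothing is read yet
colMax = gather(max(tt.Var1));  % the file is only processed when gather is called
delete(fname);                  % remove the temporary file
```

Operations on `tt` are deferred and evaluated in chunks, which is what lets them work on files larger than memory.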
EDIT 2017-02-10: add code from comment:
%% Create Fake Datafile %%
% fid = fopen('temp2.txt','wt');
% for k = 1:50
%     fprintf(fid,'%d;',randi([0,255],1,1025*1024));
% end
% fclose(fid);
%% Read DataFile %%
R = 1025; % rows per matrix
C = 1024; % columns per matrix
opt = {'EndOfLine',';', 'CollectOutput',true};
fid = fopen('temp2.txt','rt');
k = 0;
while ~feof(fid)
    Z = textscan(fid,'%d', R*C, opt{:}); % read one matrix worth of values
    if ~isempty(Z{1})
        k = k+1;
        S = sprintf('temp2_%02d.txt',k); % output file name, e.g. temp2_01.txt
        dlmwrite(S,reshape(Z{1},[],R).',';') % might need to transpose
    end
end
fclose(fid);
12 Comments
Stephen23
on 9 Feb 2017
Edited: Stephen23
on 9 Feb 2017
@Aaron Smith: take a look at these two lines:
opt = {'EndOfLine',';'};
...
Z = textscan(fid,'%d', R*C, opt{:});
One defines the cell array opt; the other passes the elements of opt as inputs to textscan. So it is simply a convenient way to supply the inputs without writing them all on one line like this:
Z = textscan(fid,'%d', R*C, 'EndOfLine',';');
For just two arguments it does not make much difference, but sometimes there can be quite a few arguments, and I find the cell array keeps things tidy. It is just a personal choice to do it like that; there is no deeper meaning. You can write the inputs on one line if you wish to.
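As a small check of that equivalence, here is a sketch that calls textscan on a short character vector instead of a file; the sample text and the names `A`, `B`, and `same` are invented for the example.

```matlab
txt = '1;2;3;4;5;6';
opt = {'Delimiter',';', 'CollectOutput',true};

% opt{:} expands into a comma-separated list of the cell array's elements.
A = textscan(txt, '%d', opt{:});

% The same call with the options written out by hand.
B = textscan(txt, '%d', 'Delimiter',';', 'CollectOutput',true);

same = isequal(A, B);   % both calls receive identical inputs
```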
More Answers (1)
Guillaume
on 8 Feb 2017
Edited: Guillaume
on 8 Feb 2017
MATLAB, since R2014b, has had tools for reading files in chunks when they are too big to fit in memory. Why not use these? See datastore and, in your particular case, tabularTextDatastore.
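A hedged sketch of how chunked reading with a datastore could look. It assumes the file has one matrix row per line with ';' between values, which is not the layout of the original single-line file, so that file would need rewriting first. The tiny sizes, the temporary file, and the variable `blocks` are illustrative.

```matlab
% Small stand-in file: nBlocks matrices of R rows and C columns, stacked vertically.
R = 3; C = 4; nBlocks = 4;
Data = reshape(1:R*C*nBlocks, C, []).';
fname = [tempname, '.txt'];
dlmwrite(fname, Data, ';');

ds = tabularTextDatastore(fname, 'Delimiter', ';', 'ReadVariableNames', false);
ds.ReadSize = R;                       % read at most R rows per call
blocks = {};
while hasdata(ds)
    T = read(ds);                      % table holding the next chunk of rows
    blocks{end+1} = table2array(T);    %#ok<AGROW>
end
delete(fname);                         % remove the temporary file
```

In real use, each chunk would be processed or saved inside the loop rather than collected, so that only one chunk is in memory at a time.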