Reading data into MATLAB

Baba on 31 Oct 2011
Hi, I have a text file of space-separated numbers that I need to import into MATLAB for processing. I cannot use the "load" command to import the whole file because it is far too big (5 GB). The text file looks like this:
1.2 4.2 5.2 5.33 6.45 7.64 3.45 7.34 ........
2.34 5.23 .235 .2343 2.34 3.4 3.42........
and so on.
What I'd like to do is read in and store the first 10 values of each row into a column vector, then the next 10 values of each row, and so on,
to have something like:
X=[row1 (1 thru 10); row2 (1 thru 10);...]
or more generally,
y=[row1 (start position thru end position); .....]
Any help appreciated,
Thank you!

Accepted Answer

Walter Roberson on 31 Oct 2011
I'm not so sure this will make you any happier, but...
To read in columns P through Q (inclusive) of file XYZ.TXT, ignoring H lines of headers:
fid = fopen('XYZ.TXT','rt');
Then for each combination of columns:
fseek(fid, 0, -1); %rewind
result = textscan(fid, [repmat('%*f',1,P-1) repmat('%f',1,Q-P+1) '%*[^\n]'], 'HeaderLines', H, 'CollectOutput', 1);
cols.(sprintf('C%d_%d',P,Q)) = result{1};
clear result
When you are done reading as much as you can hold or as you want to deal with:
fclose(fid);
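To make the steps above concrete, here is a minimal self-contained sketch of a single pass. It writes a tiny sample file first so it can be run as-is; the sample data, the temporary file name, and the variable names fname, fmt and block are assumptions for illustration only and are not part of the original answer.

```matlab
% Build a tiny sample file so the sketch is self-contained (made-up data).
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, '1.2 4.2 5.2 5.33 6.45 7.64\n');
fprintf(fid, '2.34 5.23 .235 .2343 2.34 3.4\n');
fclose(fid);

P = 2;   % first column to keep
Q = 4;   % last column to keep
H = 0;   % number of header lines to skip

fid = fopen(fname, 'rt');
% Skip P-1 numbers, keep Q-P+1 numbers, discard the rest of each line.
fmt = [repmat('%*f', 1, P-1), repmat('%f', 1, Q-P+1), '%*[^\n]'];
result = textscan(fid, fmt, 'HeaderLines', H, 'CollectOutput', 1);
fclose(fid);
delete(fname);

block = result{1};   % one row per line of the file, columns P through Q
disp(block)
```

With this sample, block should be a 2-by-3 array holding columns 2 through 4 of each line.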
Feel free to use something other than a structure to hold the values, keeping in mind that you have not specified that you will be using the same number of values each time, so a plain numeric array might not work.
There is a more elegant way to skip leading columns, which I knew about 3 days ago, but I'm having a heck of a time digging it up at the moment.
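As one possible alternative to a structure, the blocks could be kept in a cell array, which tolerates a different number of columns per block. The sketch below loops over consecutive column blocks and rewinds the file before each pass. It assumes the total number of columns (nCols) is known in advance; nCols, blockSize, blocks and the sample data are hypothetical names and values chosen for illustration. The question would use blockSize = 10.

```matlab
% Sample file with 6 columns (made-up data), read in blocks of 2 columns.
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, '%g %g %g %g %g %g\n', [1 2 3 4 5 6; 7 8 9 10 11 12].');
fclose(fid);

nCols = 6;       % assumed to be known in advance
blockSize = 2;   % the question asks for 10
starts = 1:blockSize:nCols;
blocks = cell(1, numel(starts));

fid = fopen(fname, 'rt');
for k = 1:numel(starts)
    P = starts(k);
    Q = min(P + blockSize - 1, nCols);
    fseek(fid, 0, 'bof');   % rewind before each pass
    % Only add the "discard rest of line" field when something is left over.
    if Q < nCols
        tail = '%*[^\n]';
    else
        tail = '';
    end
    fmt = [repmat('%*f', 1, P-1), repmat('%f', 1, Q-P+1), tail];
    result = textscan(fid, fmt, 'CollectOutput', 1);
    blocks{k} = result{1};   % rows of the file, columns P through Q
end
fclose(fid);
delete(fname);
```

Each pass rereads the whole file, so for a 5 GB file this trades time for memory: only one block of columns is parsed into memory per pass.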
3 Comments
Baba on 31 Oct 2011
Walter, could you annotate your code with a little bit of explanation?
Walter Roberson on 31 Oct 2011
%*f format means to read a floating point number and discard it. We repeat this read-discard enough times to read through to the column before the first one we are interested in.
%f format means to read a floating point number and save it. We repeat this read-save enough times to read from columns P to Q inclusive, which is Q-P+1 times.
%*[^\n] format means to find a sequence of characters that can match any character (including space) _except_ for \n which means newline in this context -- i.e., read to end of line. The * part means to discard it. Overall this means that we read whatever is left over after column Q on the line and discard it.
CollectOutput means to put all of the %f values read (columns P through Q) into a single numeric array.
textscan() always wraps its output in a cell array even if only one item is output, so the result{1} extracts the numeric array.
sprintf('C%d_%d',P,Q) constructs strings like C7_15 intended to symbolize column 7 through 15.
cols.(name), where name is the string above, is dynamic field name referencing of a structure. So the assignment would be to (e.g.)
cols.C7_15
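As a small illustration of the explanation above, the following sketch builds the format string and the field name for P = 3 and Q = 5 without touching any file. The placeholder zeros array stands in for result{1} and is an assumption made only so the snippet runs on its own.

```matlab
P = 3;
Q = 5;

% Two read-and-discard fields, three read-and-save fields, then discard to end of line.
fmt = [repmat('%*f', 1, P-1), repmat('%f', 1, Q-P+1), '%*[^\n]'];
disp(fmt)                          % prints %*f%*f%f%f%f%*[^\n]

name = sprintf('C%d_%d', P, Q);    % 'C3_5'
cols = struct();
cols.(name) = zeros(2, Q-P+1);     % placeholder standing in for result{1}
disp(fieldnames(cols))
```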
