
Trouble with textscan and large .dat files

kschau on 8 May 2013
I am trying to import specific values from a very large .dat file (call it dummy.dat).
These values are in a single, extremely long column (700000 rows). I am trying to pick out specific values within this column and then move on without importing the whole column.
When I use
A = importdata('dummy.dat')
I get a nice [700000 x 1] array in my workspace, so that works, but again, I don't want to take the time to import the whole thing.
When I use
fid=fopen('dummy.dat');
A = textscan(fid,'%f','delimiter','')
I get a 1 x 1 cell in which the cell is a [700000 x 1] double, so that works, but I am still importing the whole thing.
Say I want to pick out the number that is in the 5th row, and only that number. I am trying:
fid=fopen('dummy.dat');
A = textscan(fid,'%f',1,'delimiter','','headerlines',4)
For some reason, when I do this, the single-column layout of the .dat file is changed into 4 columns, so instead of reading
1
2
3
4
5
6...
I get
1 2 3 4
5 6 ...
This is screwing up my rows and headerlines, and therefore which values I am reading.
Anyone know what's going on here?
Thanks.
1 comment
Walter Roberson on 8 May 2013
What is your intention in setting the delimiter to ''? Why not just leave the delimiter unspecified?
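To illustrate the suggestion of leaving the delimiter unspecified, here is a minimal sketch of reading only the 5th row. It assumes a file with one number per line; the file name sample.dat and its contents are made up so the snippet runs on its own, and it has not been checked against the original dummy.dat.

```matlab
% Create a small stand-in file (hypothetical name) with one value per line.
fid = fopen('sample.dat', 'w');
fprintf(fid, '%d\n', 1:10);
fclose(fid);

% Skip the first 4 lines, then apply the format exactly once,
% so only the value on the 5th line is converted.
fid = fopen('sample.dat', 'r');
C = textscan(fid, '%f', 1, 'HeaderLines', 4);
fclose(fid);

value = C{1};   % numeric scalar taken from the 5th row
```

Note that textscan still has to step through the skipped lines, so this avoids storing the whole column but is not a true random-access read.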


Accepted answer

Walter Roberson on 8 May 2013
If you are importing the same file multiple times, I suggest reading it once and writing a binary version of it. After that, each time you want to read a value and you know its position in the column, you can fseek() to (position - 1) * (the size in bytes of a single entry) and fread() from there.
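A rough sketch of that approach, assuming the values are stored as 8-byte doubles. The file names sample.dat and sample.bin are hypothetical, and the small text file is created here only so the snippet is self-contained.

```matlab
% Stand-in for the large text file: one value per line.
fid = fopen('sample.dat', 'w');
fprintf(fid, '%d\n', 1:10);
fclose(fid);

% One-time conversion: read the text once and write it out as raw doubles.
A = importdata('sample.dat');
fid = fopen('sample.bin', 'w');
fwrite(fid, A, 'double');
fclose(fid);

% Later reads: jump straight to row k (1-based) and read a single value.
k = 5;
bytesPerEntry = 8;                            % size of one double in bytes
fid = fopen('sample.bin', 'r');
fseek(fid, (k - 1) * bytesPerEntry, 'bof');   % offset from beginning of file
value = fread(fid, 1, 'double');
fclose(fid);
```

The conversion step costs a full read, so this only pays off when the same file is queried more than once.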
2 comments
kschau on 8 May 2013
I would, but unfortunately I need to extract a few data points from one .dat file and then move on to another, many many times.
kschau on 11 Jun 2013
The trick was to compile ALL the files into one long binary string and then just remember the byte position where each one starts, to jump quickly between what were separate .dat files. Thanks for the advice!
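One way this combined-file idea could look is sketched below. It is a guess at the general shape, not kschau's actual code: the file names a.dat, b.dat and all.bin, the startByte index, and the sample data are all invented for illustration.

```matlab
% Two small stand-in .dat files (hypothetical names and contents).
names = {'a.dat', 'b.dat'};
data  = {(1:10)', (101:105)'};
for j = 1:numel(names)
    fid = fopen(names{j}, 'w');
    fprintf(fid, '%d\n', data{j});
    fclose(fid);
end

% Append every file to one binary file, recording the byte offset
% at which each source file begins.
bytesPerEntry = 8;                     % values are written as doubles
startByte = zeros(numel(names), 1);
fout = fopen('all.bin', 'w');
for j = 1:numel(names)
    startByte(j) = ftell(fout);        % current write position in bytes
    fwrite(fout, importdata(names{j}), 'double');
end
fclose(fout);

% Read row k of source file j directly.
j = 2;
k = 3;
fin = fopen('all.bin', 'r');
fseek(fin, startByte(j) + (k - 1) * bytesPerEntry, 'bof');
value = fread(fin, 1, 'double');
fclose(fin);
```

Saving startByte alongside all.bin (for example in a .mat file) would let later sessions reuse the index without rebuilding it.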


More answers (1)

Gabriel on 11 Jun 2013
If you don't care about speed at all, the easiest way is to use fgetl to read each line, then textscan on each line to grab what you want. Slow but easy.
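A minimal sketch of the fgetl approach, assuming one number per line. The file name sample.dat and the target row are made up for the example, and textscan is applied only to the line of interest rather than to every line.

```matlab
% Stand-in file with one value per line (hypothetical name).
fid = fopen('sample.dat', 'w');
fprintf(fid, '%d\n', 1:10);
fclose(fid);

targetRow = 5;
fid = fopen('sample.dat', 'r');
for n = 1:targetRow
    txt = fgetl(fid);                  % returns -1 (not char) at end of file
    if ~ischar(txt)
        fclose(fid);
        error('File has fewer than %d lines.', targetRow);
    end
end
fclose(fid);

% Parse just the line we stopped on.
C = textscan(txt, '%f');
value = C{1};
```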
