use textscan on a subset of a large ASCII file

Question

nori on 16 May 2011

https://la.mathworks.com/matlabcentral/answers/7545-use-textscan-on-a-subset-of-from-large-ascii-file

hello,

i am trying to use the textscan function to read a large ASCII file (2.5 GB) and break it up into smaller files.

the big file is the CRU TS 3.1 world monthly temperature data from jan 1901 to dec 2009, where each month is stored as a 360x720 grid. multiply that by 1308 (the number of months in the date range) and that is my big ASCII file.

http://badc.nerc.ac.uk/view/badc.nerc.ac.uk__ATOM__dataent_1256223773328276

now my problem is that i cannot find any documentation on how to use textscan to scan through the original file using a specified range (360x720).

the help does refer to the possibility of opening large files and subsetting them, but the examples show how to do it using a given number of characters. since the values in this data range from 0 to 255, i can't set a fixed number of characters for each line.

fyi: i am able to use textscan on smaller files and get the results i want, but i only use it to read the entire file, not a subset of the data.

is textscan able to do what i'm hoping, or is there another function? i searched and couldn't find anything suitable.

any help would be greatly appreciated.

thanks.

n


Answer 1

Walter Roberson on 16 May 2011

https://la.mathworks.com/matlabcentral/answers/7545-use-textscan-on-a-subset-of-from-large-ascii-file#answer_10410

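monthnumber = 17;  % for example; the first month is 1
fid = fopen('YourDataFile.txt','rt');
% repmat('%g',1,720) builds one format string with 720 fields; repmat('%g',720) would build a 720-by-1440 char matrix instead
monthcell = textscan(fid, repmat('%g',1,720), 360, 'HeaderLines', 360*(monthnumber-1), 'CollectOutput', 1);
fclose(fid);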

Your data would then be the 360-by-720 array monthcell{1}.
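
As a small illustration of how the format string, the repeat count, and 'HeaderLines' work together, here is a minimal sketch on a toy file of 4 lines with 3 values each. The temporary file and the toy dimensions are assumptions made for the example, and it uses '%f', the floating-point specifier listed in the textscan documentation, in place of '%g'.

```matlab
% Write a toy file: 4 lines of 3 values ("1 2 3", "4 5 6", "7 8 9", "10 11 12")
fname = [tempname '.txt'];            % hypothetical temporary file
fid = fopen(fname, 'wt');
fprintf(fid, '%d %d %d\n', 1:12);     % the format is reused until all values are written
fclose(fid);

% One format string with 3 numeric fields: '%f%f%f'
fmt = repmat('%f', 1, 3);

% Skip the first 2 lines, then apply the format 2 times (that is, read 2 rows)
fid = fopen(fname, 'rt');
block = textscan(fid, fmt, 2, 'HeaderLines', 2, 'CollectOutput', 1);
fclose(fid);
delete(fname);

data = block{1};                      % 2-by-3 array holding lines 3 and 4 of the file
```

With 'CollectOutput' set to 1, the three numeric columns are returned as one array instead of three separate cells, which is why the result is read from block{1}.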

nori on 17 May 2011

hi Walter,

thanks for the quick reply.

i tried to understand and apply your solution but i am confused on a few things, and when i applied it to my dataset, i got empty cell arrays returned.

first off, i don't understand the repmat('%g',720) part of the code. isn't repmat for replicating a matrix? which would result in the same matrix replicated over and over again? and could you elaborate on the '%g'? i couldn't find what parameter that represents.

i understand the rest of the code. use the headerlines option to skip the preceding datasets, 360 rows each. brilliant.

by looking at your code, i got the idea that i could try to read in the number of cells for each dataset (360*720) and use that to limit each month. and if that worked, i would try to use your idea of the headerlines option.

my idea sort of worked, but now when i convert the .asc to an image, i get 2 images that are rotated 90 degrees, so the south pole is pointing west and the north pole pointing east, and the image is doubled.

according to the metadata from CRU the dataset is definitely 720x360. so with what i am seeing, i would expect that i got a matrix of 1440x360. but that is not the case.

in order to see the image, i have to add this header to the top of the file to import it into ArcGIS.

ncols 720
nrows 360
xllcorner -180
yllcorner -89.9999999999999
cellsize 0.5
NODATA_value -999

the code below is what i tried; it gave the double image (rotated 90 degrees).

file = 'cru_ts_3_10.1901.2009.tmn.dat';
r = 720;
c = 360;
fid = fopen(file,'r');
a = r*c;
m = textscan(fid,'%d16',a);
b = m{1,1};
c = reshape(b, c, r);
fclose(fid);
dlmwrite('out.txt', c, 'delimiter', ' ');

now here's the really strange part of this problem.

if i transpose the final result, c = reshape(b, c, r)';

and use the exact same header from above, i get a single image with everything looking just fine.

however, if i open the file and count the rows and columns, it is 720 rows and 360 cols (which is what i expect), but ArcGIS properly displays the image using the header of 720 cols and 360 rows.

now i am definitely confused.

sorry for the long-winded reply, but it's been a number of days and i'm going bananas!
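
A likely explanation for the behaviour described above is MATLAB's column-major storage: reshape fills an array column by column, while the file stores each grid row as consecutive values. The sketch below reproduces the effect on a toy grid of 2 rows by 4 columns. The toy dimensions and variable names are assumptions made for illustration and stand in for 360 and 720.

```matlab
% Toy stand-in for one month: 2 rows by 4 columns, stored row by row in the file
nrows = 2;  ncols = 4;                  % assumed toy dimensions (real data: 360 and 720)
b = (1:nrows*ncols)';                   % values in file order: 1 2 3 4, then 5 6 7 8

% reshape fills column by column, so asking for nrows-by-ncols directly
% interleaves the two file rows
wrong = reshape(b, nrows, ncols);       % [1 3 5 7; 2 4 6 8]

% Reshaping to ncols-by-nrows and transposing restores the layout of the file
month_grid = reshape(b, ncols, nrows)'; % [1 2 3 4; 5 6 7 8]

% Transposing the interleaved array gives ncols rows of nrows values, yet
% writing it out row by row emits the values in the original file order
t = wrong';                             % [1 2; 3 4; 5 6; 7 8]
streamed = reshape(t', [], 1);          % row-by-row order of t, as a column vector
same_order = isequal(streamed, b);      % true
```

If the ASCII grid reader takes its shape from the ncols and nrows header values and reads the numbers as one stream (my understanding of the format, not something confirmed in this thread), a file of 720 lines with 360 values each would still display correctly under a 720-column header, because the value order is unchanged. Writing reshape(b, 720, 360)' instead would give 360 lines of 720 values, matching the header line for line.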

Answer 2

nori on 22 May 2011

https://la.mathworks.com/matlabcentral/answers/7545-use-textscan-on-a-subset-of-from-large-ascii-file#answer_10951

not a MATLAB solution, but it works.

i downloaded 7zip and used its split file utility.

worked like a charm.
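
For anyone who wants to do the split inside MATLAB, one possible approach is to keep the file open and call textscan once per month, since each call resumes where the previous one stopped. This is a minimal sketch on a toy file of 3 "months" of 2 rows by 4 columns. The file names, the toy dimensions, and the '%f' specifier are assumptions, and it has not been tried on the real CRU file.

```matlab
% Build a toy input: 3 months, each 2 rows of 4 values (real data: 1308 months of 360 by 720)
nrows = 2;  ncols = 4;  nmonths = 3;    % assumed toy dimensions
infile = [tempname '.dat'];             % hypothetical temporary input file
fid = fopen(infile, 'wt');
fprintf(fid, [repmat('%d ', 1, ncols-1) '%d\n'], 1:nrows*ncols*nmonths);
fclose(fid);

% Read one month per textscan call and write each month to its own file
fmt = repmat('%f', 1, ncols);           % one numeric field per grid column
outfiles = cell(nmonths, 1);
fid = fopen(infile, 'rt');
for k = 1:nmonths
    % applies the format nrows times, continuing from the current file position
    block = textscan(fid, fmt, nrows, 'CollectOutput', 1);
    outfiles{k} = sprintf('%s_month%04d.txt', infile, k);
    dlmwrite(outfiles{k}, block{1}, 'delimiter', ' ');
end
fclose(fid);
```

Only one month is held in memory at a time, so the size of the full file should not matter much. outfiles lists the pieces that were written.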
