error using textscan to read a csv file with number and text and date in one file

1 visualización (últimos 30 días)
Error using textscan Unable to read the DATETIME data with the format 'HH:mm:ss'. If the data is not a time, use %q to get string data.
One example data file I used is attached.
I tried several things, but no success. Please advise a good way of reading the data.
I try to write a code to read such a file. However, importdata can't handle the mixing types, and readtable don't skip the first 4 lines. Another try of using textscan as below gave me errors:
formatSpec = '%{HH:mm:ss}D,%C,%f,%f,%f,%f,%f,%f,%f,%{HH:mm:ss}D,%f,%f,%f,%f';
fid=fopen(cpc_name);
for j1=1:4
tmp1=fgetl(fid);
if j1==2
dstr=tmp1(6:14);
end
end
C=textscan(fid,formatSpec);
fclose(fid);

Respuesta aceptada

Peter Perkins
Peter Perkins el 19 de Dic. de 2016
Fan, leave the commas out fo the format and use readtable:
>> t = readtable('test1_V6.csv','Format','%{HH:mm:ss}D%f%C%f%f%f%f%f%f%f%{HH:mm:ss}D%f%f%f%f','Header',3)
Warning: Variable names were modified to make them valid MATLAB identifiers. The original names are saved in the
VariableDescriptions property.
t =
Time Comment ModuleID Alt_m_ P_mbar_ T_degC_ x_RH Wind_degrees_ Wind_m_s_ Supply_V_ UTCTime Latitude_deg_ Longitude_deg_ Course_deg_ Speed_m_s_
________ _______ ________ ______ _______ _______ ____ _____________ _________ _________ ________ _____________ ______________ ___________ __________
23:09:10 NaN F667D9 54.2 999.9 16.3 39.2 236 4.1 11.8 23:09:00 70.495 -149.89 78 0.02
23:09:11 NaN F667D9 54.2 999.9 16.3 39 237 3.9 11.8 23:09:05 70.495 -149.89 78 0.01
23:09:15 NaN F667D9 54.2 999.8 16.3 39.1 236 4.1 11.8 23:09:10 70.495 -149.89 78 0.01
23:09:20 NaN F667D9 54.2 999.9 16.3 38.9 236 4.1 11.8 23:09:15 70.495 -149.89 78 0.01
23:09:25 NaN F667D9 54.2 999.8 16.3 38.8 236 3.9 11.8 23:09:20 70.495 -149.89 78 0.02
23:09:30 NaN F667D9 54.2 999.8 16.3 38.7 236 4.1 11.8 23:09:25 70.495 -149.89 78 0.02
23:09:32 NaN F667D9 54.2 999.7 16.3 38.8 236 3.9 11.8 23:09:30 70.495 -149.89 78 0.02
23:09:36 NaN F667D9 54.2 999.8 16.3 39.4 235 4.1 11.8 23:09:35 70.495 -149.89 78 0.01
23:09:41 NaN F667D9 54.2 999.7 16.3 38.8 236 3.9 11.8 23:09:40 70.495 -149.89 78 0.01
23:09:46 NaN F667D9 54.2 999.8 16.3 39 236 4.1 11.8 23:09:45 70.495 -149.89 78 0.01
23:09:51 NaN F667D9 54.2 999.9 16.3 38.8 235 3.9 11.8 23:09:50 70.495 -149.89 78 0
23:09:56 NaN F667D9 54.2 999.8 16.3 39.1 235 4.1 11.8 23:09:55 70.495 -149.89 78 0.01
The times had no date portion, so they were read as "today", but the file header indocates that they're from 6/11/2016. Modify the two time variables accordingly:
>> t.Time = datetime(2016,6,11) + timeofday(t.Time);
>> t.UTCTime = datetime(2016,6,11) + timeofday(t.UTCTime)
t =
Time Comment ModuleID Alt_m_ P_mbar_ T_degC_ x_RH Wind_degrees_ Wind_m_s_ Supply_V_ UTCTime Latitude_deg_ Longitude_deg_ Course_deg_ Speed_m_s_
____________________ _______ ________ ______ _______ _______ ____ _____________ _________ _________ ____________________ _____________ ______________ ___________ __________
11-Jun-2016 23:09:10 NaN F667D9 54.2 999.9 16.3 39.2 236 4.1 11.8 11-Jun-2016 23:09:00 70.495 -149.89 78 0.02
11-Jun-2016 23:09:11 NaN F667D9 54.2 999.9 16.3 39 237 3.9 11.8 11-Jun-2016 23:09:05 70.495 -149.89 78 0.01
11-Jun-2016 23:09:15 NaN F667D9 54.2 999.8 16.3 39.1 236 4.1 11.8 11-Jun-2016 23:09:10 70.495 -149.89 78 0.01
11-Jun-2016 23:09:20 NaN F667D9 54.2 999.9 16.3 38.9 236 4.1 11.8 11-Jun-2016 23:09:15 70.495 -149.89 78 0.01
11-Jun-2016 23:09:25 NaN F667D9 54.2 999.8 16.3 38.8 236 3.9 11.8 11-Jun-2016 23:09:20 70.495 -149.89 78 0.02
11-Jun-2016 23:09:30 NaN F667D9 54.2 999.8 16.3 38.7 236 4.1 11.8 11-Jun-2016 23:09:25 70.495 -149.89 78 0.02
11-Jun-2016 23:09:32 NaN F667D9 54.2 999.7 16.3 38.8 236 3.9 11.8 11-Jun-2016 23:09:30 70.495 -149.89 78 0.02
11-Jun-2016 23:09:36 NaN F667D9 54.2 999.8 16.3 39.4 235 4.1 11.8 11-Jun-2016 23:09:35 70.495 -149.89 78 0.01
11-Jun-2016 23:09:41 NaN F667D9 54.2 999.7 16.3 38.8 236 3.9 11.8 11-Jun-2016 23:09:40 70.495 -149.89 78 0.01
11-Jun-2016 23:09:46 NaN F667D9 54.2 999.8 16.3 39 236 4.1 11.8 11-Jun-2016 23:09:45 70.495 -149.89 78 0.01
11-Jun-2016 23:09:51 NaN F667D9 54.2 999.9 16.3 38.8 235 3.9 11.8 11-Jun-2016 23:09:50 70.495 -149.89 78 0
11-Jun-2016 23:09:56 NaN F667D9 54.2 999.8 16.3 39.1 235 4.1 11.8 11-Jun-2016 23:09:55 70.495 -149.89 78 0.01
  3 comentarios
Jeremy Hughes
Jeremy Hughes el 20 de Dic. de 2016
I'll add to Peter's comment here. The reason the original call to textscan fails is that it expects to find spaces between each field. The default value for 'Delimiter' in TEXTSCAN is the same as 'Whitespace'
The initial code could be improved by removing the commas from the format string passed to textscan (as Peter suggested) and also by adding 'Delimiter',',' to the arguments.
Although I like Peter's solution better, the following should work:
fid = fopen(cpc_name);
for j1 = 1:4
tmp1 = fgetl(fid);
if j1==2
dstr = tmp1(6:14);
end
end
formatSpec = '%{HH:mm:ss}D%C%f%f%f%f%f%f%f%{HH:mm:ss}D%f%f%f%f';
C = textscan(fid,formatSpec,'Delimiter',',');
fclose(fid);
Incidentally, this looks like a CSV exported from Microsoft® Excel®. (It puts extra commas for empty cells when exporting to CSV) If you have the original spreadsheet file, you might have luck reading that directly with READTABLE.
T = readtable(ssFileName,'Range','A4:O16')

Iniciar sesión para comentar.

Más respuestas (1)

KSSV
KSSV el 16 de Dic. de 2016
Editada: KSSV el 16 de Dic. de 2016
fid = fopen('test1_V6.csv') ;
S = textscan(fid,'%s','delimiter','\n','HeadeRLines',3);
fclose(fid)
S = S{1}
  2 comentarios
Fan Mei
Fan Mei el 16 de Dic. de 2016
Thank you for your help. I hope to separate the columns after loading the file. For example, I can have a column of date, and several other columns with numerical data. Any suggestions?
KSSV
KSSV el 16 de Dic. de 2016
fid = fopen('test1_V6.csv') ;
S = textscan(fid,'%s','delimiter','\n','HeadeRLines',4);
S = S{1} ;
fclose(fid) ;
iwant = cell(12,14) ;
for i = 1:size(S,1)
iwant(i,:) = strsplit(S{i},',') ;
end
iwant

Iniciar sesión para comentar.

Categorías

Más información sobre Large Files and Big Data en Help Center y File Exchange.

Etiquetas

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by