error using textscan to read a csv file with number and text and date in one file
1 visualización (últimos 30 días)
Mostrar comentarios más antiguos
Fan Mei
el 16 de Dic. de 2016
Comentada: Jeremy Hughes
el 20 de Dic. de 2016
Error using textscan Unable to read the DATETIME data with the format 'HH:mm:ss'. If the data is not a time, use %q to get string data.
One example data file I used is attached.
I tried several things, but no success. Please advise a good way of reading the data.
I try to write a code to read such a file. However, importdata can't handle the mixing types, and readtable don't skip the first 4 lines. Another try of using textscan as below gave me errors:
formatSpec = '%{HH:mm:ss}D,%C,%f,%f,%f,%f,%f,%f,%f,%{HH:mm:ss}D,%f,%f,%f,%f';
fid=fopen(cpc_name);
for j1=1:4
tmp1=fgetl(fid);
if j1==2
dstr=tmp1(6:14);
end
end
C=textscan(fid,formatSpec);
fclose(fid);
0 comentarios
Respuesta aceptada
Peter Perkins
el 19 de Dic. de 2016
Fan, leave the commas out fo the format and use readtable:
>> t = readtable('test1_V6.csv','Format','%{HH:mm:ss}D%f%C%f%f%f%f%f%f%f%{HH:mm:ss}D%f%f%f%f','Header',3)
Warning: Variable names were modified to make them valid MATLAB identifiers. The original names are saved in the
VariableDescriptions property.
t =
Time Comment ModuleID Alt_m_ P_mbar_ T_degC_ x_RH Wind_degrees_ Wind_m_s_ Supply_V_ UTCTime Latitude_deg_ Longitude_deg_ Course_deg_ Speed_m_s_
________ _______ ________ ______ _______ _______ ____ _____________ _________ _________ ________ _____________ ______________ ___________ __________
23:09:10 NaN F667D9 54.2 999.9 16.3 39.2 236 4.1 11.8 23:09:00 70.495 -149.89 78 0.02
23:09:11 NaN F667D9 54.2 999.9 16.3 39 237 3.9 11.8 23:09:05 70.495 -149.89 78 0.01
23:09:15 NaN F667D9 54.2 999.8 16.3 39.1 236 4.1 11.8 23:09:10 70.495 -149.89 78 0.01
23:09:20 NaN F667D9 54.2 999.9 16.3 38.9 236 4.1 11.8 23:09:15 70.495 -149.89 78 0.01
23:09:25 NaN F667D9 54.2 999.8 16.3 38.8 236 3.9 11.8 23:09:20 70.495 -149.89 78 0.02
23:09:30 NaN F667D9 54.2 999.8 16.3 38.7 236 4.1 11.8 23:09:25 70.495 -149.89 78 0.02
23:09:32 NaN F667D9 54.2 999.7 16.3 38.8 236 3.9 11.8 23:09:30 70.495 -149.89 78 0.02
23:09:36 NaN F667D9 54.2 999.8 16.3 39.4 235 4.1 11.8 23:09:35 70.495 -149.89 78 0.01
23:09:41 NaN F667D9 54.2 999.7 16.3 38.8 236 3.9 11.8 23:09:40 70.495 -149.89 78 0.01
23:09:46 NaN F667D9 54.2 999.8 16.3 39 236 4.1 11.8 23:09:45 70.495 -149.89 78 0.01
23:09:51 NaN F667D9 54.2 999.9 16.3 38.8 235 3.9 11.8 23:09:50 70.495 -149.89 78 0
23:09:56 NaN F667D9 54.2 999.8 16.3 39.1 235 4.1 11.8 23:09:55 70.495 -149.89 78 0.01
The times had no date portion, so they were read as "today", but the file header indocates that they're from 6/11/2016. Modify the two time variables accordingly:
>> t.Time = datetime(2016,6,11) + timeofday(t.Time);
>> t.UTCTime = datetime(2016,6,11) + timeofday(t.UTCTime)
t =
Time Comment ModuleID Alt_m_ P_mbar_ T_degC_ x_RH Wind_degrees_ Wind_m_s_ Supply_V_ UTCTime Latitude_deg_ Longitude_deg_ Course_deg_ Speed_m_s_
____________________ _______ ________ ______ _______ _______ ____ _____________ _________ _________ ____________________ _____________ ______________ ___________ __________
11-Jun-2016 23:09:10 NaN F667D9 54.2 999.9 16.3 39.2 236 4.1 11.8 11-Jun-2016 23:09:00 70.495 -149.89 78 0.02
11-Jun-2016 23:09:11 NaN F667D9 54.2 999.9 16.3 39 237 3.9 11.8 11-Jun-2016 23:09:05 70.495 -149.89 78 0.01
11-Jun-2016 23:09:15 NaN F667D9 54.2 999.8 16.3 39.1 236 4.1 11.8 11-Jun-2016 23:09:10 70.495 -149.89 78 0.01
11-Jun-2016 23:09:20 NaN F667D9 54.2 999.9 16.3 38.9 236 4.1 11.8 11-Jun-2016 23:09:15 70.495 -149.89 78 0.01
11-Jun-2016 23:09:25 NaN F667D9 54.2 999.8 16.3 38.8 236 3.9 11.8 11-Jun-2016 23:09:20 70.495 -149.89 78 0.02
11-Jun-2016 23:09:30 NaN F667D9 54.2 999.8 16.3 38.7 236 4.1 11.8 11-Jun-2016 23:09:25 70.495 -149.89 78 0.02
11-Jun-2016 23:09:32 NaN F667D9 54.2 999.7 16.3 38.8 236 3.9 11.8 11-Jun-2016 23:09:30 70.495 -149.89 78 0.02
11-Jun-2016 23:09:36 NaN F667D9 54.2 999.8 16.3 39.4 235 4.1 11.8 11-Jun-2016 23:09:35 70.495 -149.89 78 0.01
11-Jun-2016 23:09:41 NaN F667D9 54.2 999.7 16.3 38.8 236 3.9 11.8 11-Jun-2016 23:09:40 70.495 -149.89 78 0.01
11-Jun-2016 23:09:46 NaN F667D9 54.2 999.8 16.3 39 236 4.1 11.8 11-Jun-2016 23:09:45 70.495 -149.89 78 0.01
11-Jun-2016 23:09:51 NaN F667D9 54.2 999.9 16.3 38.8 235 3.9 11.8 11-Jun-2016 23:09:50 70.495 -149.89 78 0
11-Jun-2016 23:09:56 NaN F667D9 54.2 999.8 16.3 39.1 235 4.1 11.8 11-Jun-2016 23:09:55 70.495 -149.89 78 0.01
3 comentarios
Jeremy Hughes
el 20 de Dic. de 2016
I'll add to Peter's comment here. The reason the original call to textscan fails is that it expects to find spaces between each field. The default value for 'Delimiter' in TEXTSCAN is the same as 'Whitespace'
The initial code could be improved by removing the commas from the format string passed to textscan (as Peter suggested) and also by adding 'Delimiter',',' to the arguments.
Although I like Peter's solution better, the following should work:
fid = fopen(cpc_name);
for j1 = 1:4
tmp1 = fgetl(fid);
if j1==2
dstr = tmp1(6:14);
end
end
formatSpec = '%{HH:mm:ss}D%C%f%f%f%f%f%f%f%{HH:mm:ss}D%f%f%f%f';
C = textscan(fid,formatSpec,'Delimiter',',');
fclose(fid);
Incidentally, this looks like a CSV exported from Microsoft® Excel®. (It puts extra commas for empty cells when exporting to CSV) If you have the original spreadsheet file, you might have luck reading that directly with READTABLE.
T = readtable(ssFileName,'Range','A4:O16')
Más respuestas (1)
KSSV
el 16 de Dic. de 2016
Editada: KSSV
el 16 de Dic. de 2016
fid = fopen('test1_V6.csv') ;
S = textscan(fid,'%s','delimiter','\n','HeadeRLines',3);
fclose(fid)
S = S{1}
2 comentarios
KSSV
el 16 de Dic. de 2016
fid = fopen('test1_V6.csv') ;
S = textscan(fid,'%s','delimiter','\n','HeadeRLines',4);
S = S{1} ;
fclose(fid) ;
iwant = cell(12,14) ;
for i = 1:size(S,1)
iwant(i,:) = strsplit(S{i},',') ;
end
iwant
Ver también
Categorías
Más información sobre Large Files and Big Data en Help Center y File Exchange.
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!