Error with readtable function

L'O.G. on 1 Apr 2022
Commented: L'O.G. on 2 Apr 2022
I have a non-standard file format that I wish to read and run the following:
Y = readtable(fileID.name,'FileType','text');
where fileID is a structure. However, I get an error:
Error using readtable (line 197)
An error occurred while trying to determine whether "readData" is a function name.
Note: readtable detected the following parameters:
'Delimiter', ',', 'HeaderLines', , 'ReadVariableNames', true, 'Format', ''
Why am I getting this error? What is readData? I can't find any information about it. Any ideas on how to get around this? readtable runs without error on my laptop in MATLAB R2021b, but on the cluster I'm using I only have access to the older 2018 version and don't have admin privileges. EDIT: I attach an example in the final comment here.
12 Comments
L'O.G. on 2 Apr 2022
@per isakson I want a huge array of all of the numbers in my example from 0 to 10 for all steps.
Stephen23 on 2 Apr 2022
Edited: Stephen23 on 2 Apr 2022


Accepted Answer

per isakson on 2 Apr 2022
Edited: per isakson on 2 Apr 2022
I assume that the entire text file and the result fit in your RAM.
"Huge" means different things to different people. This function reads and parses a 20K-line file in about 0.1 s.
I think you can improve performance significantly by replacing for with parfor.
%%
A = cssm_( 'test_lammpstrj.txt' );
size(A)
ans = 1×3
11 7 3
A(:,:,3)
ans = 11×7
     0     1     0    50    68    52    -1
     1     2     0    50    67    51    -1
     2     0     0    49    68    50    -1
     3     0     0    48    69    51    -1
     4     2     0    48    68    51    -1
     5     0     0    47    67    50    -1
     6     2     0    48    66    49    -1
     7     2     0    48    66    50    -1
     8     2     0    48    65    51    -1
     9     5     0    47    66    51    -1
%%
% tic
% A = cssm_( 'test_lammpstrj_large.txt' ); % 20,003 lines
% toc
% Elapsed time is 0.101381 seconds.
% Elapsed time is 0.101189 seconds.
%%
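function num = cssm_( ffs )
    %% Read the whole file into one character row vector
    fid = fopen( ffs, 'rt' );
    chr = reshape( fread( fid, '*char' ), 1,[] );
    [~] = fclose( fid );
    %% Split at the timestep markers; each piece holds one step
    cac = regexp( chr, 'ITEM: TIMESTEP\n', 'split' );
    len = size( cac, 2 );
    num = nan( 11, 7, len-1 ); % 11 rows x 7 columns per step; cac{1} is empty
    for jj = 2 : len
        % Skip the 8 header lines of the block, then read 7 integer columns
        ccc = textscan( cac{jj}, '%d%d%d%d%d%d%d', 'Headerlines',8, 'CollectOutput',true );
        num(:,:,jj-1) = ccc{1};
    end
end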
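A possible variation on cssm_, offered only as a sketch: it takes the number of rows from the first block instead of hard-coding 11, and uses parfor as suggested above. The function name read_steps_, the demo file, and its header layout (8 lines between the TIMESTEP marker and the data) are assumptions for illustration, and every step is assumed to hold the same number of atoms. Without Parallel Computing Toolbox, parfor simply runs as an ordinary loop; whether it helps on a given cluster has not been timed here.

```matlab
% Demo input (assumed layout): two timesteps, three atoms each.
atoms = [0 1 0 44 69 32 -1; 1 2 0 44 68 31 -1; 2 0 0 44 69 30 -1];
fname = [tempname, '.txt'];              % hypothetical scratch file
fid = fopen( fname, 'wt' );
for step = 0:1
    fprintf( fid, 'ITEM: TIMESTEP\n%d\n', step );
    fprintf( fid, 'ITEM: NUMBER OF ATOMS\n%d\n', size(atoms,1) );
    fprintf( fid, 'ITEM: BOX BOUNDS pp pp pp\n0 100\n0 100\n0 100\n' );
    fprintf( fid, 'ITEM: ATOMS id type mol x y z bP\n' );
    fprintf( fid, '%d %d %d %d %d %d %d\n', atoms.' );
end
fclose( fid );

A = read_steps_( fname );
disp( size(A) )
delete( fname );

function num = read_steps_( ffs )
    % Read the whole file into one character row vector
    fid = fopen( ffs, 'rt' );
    assert( fid ~= -1, 'Cannot open %s', ffs );
    chr = reshape( fread( fid, '*char' ), 1, [] );
    fclose( fid );
    % One cell per timestep; the text before the first marker is empty
    blocks = regexp( chr, 'ITEM: TIMESTEP\n', 'split' );
    blocks = blocks(2:end);
    nSteps = numel( blocks );
    % Parse the first block once to learn the number of rows per step
    fmt    = '%f%f%f%f%f%f%f';
    first  = textscan( blocks{1}, fmt, 'HeaderLines',8, 'CollectOutput',true );
    nAtoms = size( first{1}, 1 );
    num    = nan( nAtoms, 7, nSteps );
    % Each iteration is independent, so num can be sliced over kk
    parfor kk = 1 : nSteps
        ccc = textscan( blocks{kk}, fmt, 'HeaderLines',8, 'CollectOutput',true );
        num(:,:,kk) = ccc{1};
    end
end
```

Indexing the loop from 1 over blocks(2:end) keeps both blocks and num sliced on the plain loop variable, which is the simplest form for parfor to accept.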

More Answers (1)

Voss on 2 Apr 2022
I'm not sure what the error with readtable is about, but here's one way to read that text file (I've given it the extension .txt here but that doesn't matter) and return a cell array of tables:
fid = fopen('test.txt');
data = fread(fid,'*char').';
fclose(fid);
C = split(data,'ITEM: ');
C = split(C(startsWith(C,'ATOMS')),newline());
var_names = split(strtrim(C{1}));
C(:,[1 end]) = [];
T = cellfun(@(x)array2table(x,'VariableNames',var_names(2:end)), ...
num2cell(permute(str2double(split(C)),[2 3 1]),[1 2]), ...
'UniformOutput',false);
T{:}
ans = 11×7 table
    id    type    mol    x     y     z     bP
    __    ____    ___    __    __    __    __
     0     1       0     44    69    32    -1
     1     2       0     44    68    31    -1
     2     0       0     44    69    30    -1
     3     0       0     44    70    30    -1
     4     2       0     45    71    31    -1
     5     0       0     46    71    32    -1
     6     2       0     45    70    33    -1
     7     2       0     44    71    33    -1
     8     2       0     43    72    32    -1
     9     5       0     43    73    32    -1
    10     8       0     43    74    32    -1
ans = 11×7 table
    id    type    mol    x     y     z     bP
    __    ____    ___    __    __    __    __
     0     1       0     50    68    52    -1
     1     2       0     50    67    51    -1
     2     0       0     49    68    50    -1
     3     0       0     48    69    51    -1
     4     2       0     48    68    51    -1
     5     0       0     47    67    50    -1
     6     2       0     48    66    49    -1
     7     2       0     48    66    50    -1
     8     2       0     48    65    51    -1
     9     5       0     47    66    51    -1
    10     8       0     47    65    51    -1
ans = 11×7 table
    id    type    mol    x     y     z     bP
    __    ____    ___    __    __    __    __
     0     1       0     50    68    52    -1
     1     2       0     50    67    51    -1
     2     0       0     49    68    50    -1
     3     0       0     48    69    51    -1
     4     2       0     48    68    51    -1
     5     0       0     47    67    50    -1
     6     2       0     48    66    49    -1
     7     2       0     48    66    50    -1
     8     2       0     48    65    51    -1
     9     5       0     47    66    51    -1
    10     8       0     47    65    51    -1
Or if you want a 3D numeric array instead:
fid = fopen('test.txt');
data = fread(fid,'*char').';
fclose(fid);
C = split(data,'ITEM: ');
C = split(C(startsWith(C,'ATOMS')),newline());
C(:,[1 end]) = [];
M = permute(str2double(split(C)),[2 3 1]);
disp(M);
(:,:,1) =
     0     1     0    44    69    32    -1
     1     2     0    44    68    31    -1
     2     0     0    44    69    30    -1
     3     0     0    44    70    30    -1
     4     2     0    45    71    31    -1
     5     0     0    46    71    32    -1
     6     2     0    45    70    33    -1
     7     2     0    44    71    33    -1
     8     2     0    43    72    32    -1
     9     5     0    43    73    32    -1
    10     8     0    43    74    32    -1
(:,:,2) =
     0     1     0    50    68    52    -1
     1     2     0    50    67    51    -1
     2     0     0    49    68    50    -1
     3     0     0    48    69    51    -1
     4     2     0    48    68    51    -1
     5     0     0    47    67    50    -1
     6     2     0    48    66    49    -1
     7     2     0    48    66    50    -1
     8     2     0    48    65    51    -1
     9     5     0    47    66    51    -1
    10     8     0    47    65    51    -1
(:,:,3) =
     0     1     0    50    68    52    -1
     1     2     0    50    67    51    -1
     2     0     0    49    68    50    -1
     3     0     0    48    69    51    -1
     4     2     0    48    68    51    -1
     5     0     0    47    67    50    -1
     6     2     0    48    66    49    -1
     7     2     0    48    66    50    -1
     8     2     0    48    65    51    -1
     9     5     0    47    66    51    -1
    10     8     0    47    65    51    -1
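If str2double(split(...)) turns out to be slow on a large file, one thing that might be worth trying is to convert each ATOMS section with a single sscanf call instead. The following is a sketch under assumptions, not a timed comparison: the demo file and its layout are made up for illustration, every step is assumed to hold the same number of atoms, and the column count of 7 is taken from the example above.

```matlab
% Demo input (assumed layout): two timesteps, three atoms each.
atoms = [0 1 0 44 69 32 -1; 1 2 0 44 68 31 -1; 2 0 0 44 69 30 -1];
fname = [tempname, '.txt'];              % hypothetical scratch file
fid = fopen(fname, 'w');
for step = 0:1
    fprintf(fid, 'ITEM: TIMESTEP\n%d\n', step);
    fprintf(fid, 'ITEM: NUMBER OF ATOMS\n%d\n', size(atoms, 1));
    fprintf(fid, 'ITEM: BOX BOUNDS pp pp pp\n0 100\n0 100\n0 100\n');
    fprintf(fid, 'ITEM: ATOMS id type mol x y z bP\n');
    fprintf(fid, '%d %d %d %d %d %d %d\n', atoms.');
end
fclose(fid);

txt = fileread(fname);                   % whole file as one char vector
delete(fname);

sec = strsplit(txt, 'ITEM: ');           % one cell per ITEM section
sec = sec(startsWith(sec, 'ATOMS'));     % keep only the atom sections
nSteps = numel(sec);
vals = cell(1, nSteps);
for k = 1:nSteps
    % Drop the 'ATOMS id type ...' header line, then read all numbers at once
    body = regexprep(sec{k}, '^[^\n]*\n', '', 'once');
    vals{k} = sscanf(body, '%f');        % column vector, row-major atom data
end

nCols = 7;                               % columns per atom, as in the example
% Columns of [vals{:}] are steps; reshape to nCols x nAtoms x nSteps,
% then swap the first two dimensions to get nAtoms x nCols x nSteps.
M = permute(reshape([vals{:}], nCols, [], nSteps), [2 1 3]);
disp(size(M))
```

The result has the same shape as M above (atoms by columns by steps), so the rest of the workflow should not need to change.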
4 Comments
L'O.G. on 2 Apr 2022
Edited: L'O.G. on 2 Apr 2022
These all take several minutes and then I need to terminate them. My full data set is 600 x 30000, and each cell has 7 entries as you see above. A real shame readtable won't work for some reason. On my laptop, readtable takes a few seconds on the full data set, but on the cluster, I get the error as I mention above.
L'O.G. on 2 Apr 2022
Thank you very much.

Release

R2018a
