Large Cell Array Data Query (USDA FIA data)
Hi, I have two large cell array data sets (USDA FIA data). I am trying to join them (Data B to Data A) on TRE_CN (tree numbers stored as strings, e.g. '212152293031', '212152393031', ...).
I tried two options.
1. for loop and strcmp
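Fnl_mat=cell(rows_dataA,6);          % columns 1-5 from dataA, column 6 for BIOMASS
Fnl_mat(:,1:5)=dataA;
for i=1:rows_dataA
    % compare every TRE_CN in dataB against the i-th TRE_CN in dataA
    Qry_mat=strcmp([dataB{:,1}]',dataA{i,1}{1,1});
    Fnl_mat(i,6)=dataB(Qry_mat,2);   % assumes exactly one match in dataB
end
save(filename,'Fnl_mat');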
2. getnameidx
% look up each dataA TRE_CN in the list of dataB TRE_CNs
idx=getnameidx([dataB{:,1}],[dataA{:,1}]);
Fnl_mat=cell(rows_dataA,6);
Fnl_mat(:,1:5)=dataA;
Fnl_mat(:,6)=dataB(idx,2);           % BIOMASS from the matching dataB rows
save(filename,'Fnl_mat');
However, both options take too long (about 10,000 s) because of the large number of rows (>30,000 for dataA and >600,000 for dataB). How can I speed this up?
Dataset A
% TRE_CN PLT_CN INVYR SUBP HT
% String String Number Number Number
'291024' '12312' 2009 1 60
'291124' '12312' 2009 1 38
...
...
over 30000 rows
Dataset B
% TRE_CN BIOMASS
% String Number
'220324' 800
'220424' 345
...
...
'291024' 580
'291124' 304
...
...
over 600000 rows
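A join like this is commonly done with one vectorized ismember call on the key columns rather than a strcmp per row. The following is only a minimal sketch under stated assumptions: the toy tables reuse the sample rows above, column 1 of each table is assumed to hold plain char keys, TRE_CN is assumed to be unique in dataB, and keysA, keysB, found and loc are illustrative names that do not appear in the original code.

```matlab
% Toy stand-ins for the real tables (values taken from the samples above).
dataA = {'291024','12312',2009,1,60;
         '291124','12312',2009,1,38};
dataB = {'220324',800;
         '220424',345;
         '291024',580;
         '291124',304};

% Assumption: column 1 holds plain char keys. If each key is wrapped in
% its own 1x1 cell (as in the question), flatten first, e.g.
% keysB = [dataB{:,1}]';
keysA = dataA(:,1);
keysB = dataB(:,1);

% Single vectorized lookup: loc(i) is the row of dataB whose key equals
% keysA{i}, or 0 when that key does not occur in dataB.
[found, loc] = ismember(keysA, keysB);

Fnl_mat = cell(size(dataA,1), 6);
Fnl_mat(:,1:5) = dataA;
Fnl_mat(found,6) = dataB(loc(found),2);   % unmatched rows keep []
```

With the sample rows, column 6 of Fnl_mat ends up holding 580 and 304. Rows of dataA whose TRE_CN is missing from dataB are left empty instead of raising an error, which is a design choice that may or may not suit the real data.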
Answers (1)
SUNGHO
on 25 Sep 2012