Working with very big data faster?
Dear Matlab users,
I have to deal with very big data (point clouds, generally more than 30 000 000 points) in MATLAB. I can read the ASCII data with the "textscan" function. After reading, I need to detect invalid data (points with coordinates 0,0,0) and then perform some mathematical operations on each point, that is, on each line of the data. Currently, I first read the data with "textscan" and assign it to a matrix. Then I use for loops to detect invalid points and to do the mathematical operations on each point or line. A sample of my code is shown below. Is there a way to avoid the for loops, or what is the best way to speed up this computation? I am looking forward to hearing from you.
fileID = fopen('some ascii data with more than 10 000 000 points');
original_data = textscan(fileID,'%f %f %f %f %f %f %f', 'delimiter',' ');
fclose(fileID);
column = original_data{1}(1);
row = original_data{1}(2);
t_matrix = [original_data{1}(7) original_data{2}(7) original_data{3}(7) original_data{4}(7)
original_data{1}(8) original_data{2}(8) original_data{3}(8) original_data{4}(8)
original_data{1}(9) original_data{2}(9) original_data{3}(9) original_data{4}(9)
original_data{1}(10) original_data{2}(10) original_data{3}(10) original_data{4}(10)];
coordinate_list(:,1) = original_data{1}(11:length(original_data{1}));
coordinate_list(:,2) = original_data{2}(11:length(original_data{2}));
coordinate_list(:,3) = original_data{3}(11:length(original_data{3}));
coordinate_list(:,4) = 0;
coordinate_list(:,5) = original_data{4}(11:length(original_data{4}));
%detect invalid points and transform each point with t_matrix
for i = 1:length(coordinate_list)
    if coordinate_list(i,1) == 0 && coordinate_list(i,2) == 0 && coordinate_list(i,3) == 0
        transformed_list(i,:) = NaN;
    else
        %transformed_list(i,:) = coordinate_list(i,:)*t_matrix;
        transformed_list((i:i),(1:4)) = coordinate_list((i:i),(1:4))*t_matrix;
        transformed_list(i,5) = coordinate_list(i,5);
    end
    i
end
6 Comments
KSSV
on 26 Sep 2016
You have not initialized transformed_list, so it grows on every iteration, and that makes the code slow. You should consider preallocating it before the loop.
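As a minimal sketch of the preallocation suggested here: the small `coordinate_list` below is a hypothetical stand-in for the real point list, and the loop body is only a placeholder for the actual per-point work. The output is allocated once, filled with NaN, so rows for invalid points need no extra assignment.

```matlab
% Hypothetical stand-in for the real N-by-5 point list from the question.
coordinate_list = [1 2 3 0 10; 0 0 0 0 20; 4 5 6 0 30];
n_points = size(coordinate_list, 1);   % number of rows, regardless of column count

% Allocate the whole output once instead of growing it row by row.
transformed_list = NaN(n_points, 5);

for i = 1:n_points
    if any(coordinate_list(i, 1:3) ~= 0)
        % Placeholder for the real per-point operation.
        transformed_list(i, :) = coordinate_list(i, :);
    end
end
```

Using `size(coordinate_list, 1)` rather than `length(coordinate_list)` makes the row count explicit, since `length` returns the largest dimension.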
Adam
on 26 Sep 2016
Have you run the profiler on your code?
doc profile
You should always do this before making any attempt at speeding up your code, otherwise how do you know which part is taking the longest time? Assumptions are generally a very bad idea!
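A rough sketch of what a profiling run might look like: the matrix product in the loop is a hypothetical workload standing in for the real script, and the variable name `stats` is an assumption made for illustration.

```matlab
profile on                    % start collecting timing data

A = rand(200);                % hypothetical workload in place of the real script
for k = 1:50
    B = A * A;
end

profile off                   % stop collecting
stats = profile('info');      % struct whose FunctionTable field lists per-function timings
% profile viewer              % in MATLAB, opens the interactive report
```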
mustafa ozendi
on 26 Sep 2016
KSSV
on 26 Sep 2016
Does your text file contain any text, or only numbers? Can you attach a sample of the text file?
mustafa ozendi
on 26 Sep 2016
per isakson
on 26 Sep 2016
Edited: per isakson
on 26 Sep 2016
Use
textscan( ..., 'CollectOutput',true )
Neither of your two samples matches
textscan(fileID,'%f %f %f %f %f %f %f', 'delimiter',' ');
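To illustrate what 'CollectOutput' changes, here is a small sketch that parses a text string instead of a file. The three-line sample and its four-column layout are assumptions, since the real file layout is not shown in the thread.

```matlab
% Hypothetical three-line sample with four numeric columns.
sample_text = sprintf('1 2 3 4\n0 0 0 5\n7 8 9 6');

% Default behaviour: one cell per format specifier (1-by-4 cell array).
separate = textscan(sample_text, '%f %f %f %f');

% With CollectOutput, adjacent columns of the same class are merged into one matrix.
collected = textscan(sample_text, '%f %f %f %f', 'CollectOutput', true);
points = collected{1};   % 3-by-4 double matrix, ready for indexing
```

With the data already in one matrix, the column-by-column copying into `coordinate_list` is no longer needed.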
Answers (1)
You do not need a loop to find the points where (x,y,z) are all zero. A logical test on the first three columns finds them in one step.
id = all(coordinate_list(:,1:3)==0,2) ; % logical column, true where x, y and z are all zero
idx = find(id) ; % row numbers of those points
The rest of the loop can be done the same way, without a for loop.
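A loop-free version of the whole step might look like the sketch below. The small `coordinate_list` and the identity `t_matrix` are hypothetical stand-ins for the variables built in the question. The zero test uses `all(...)` on the first three columns rather than a row sum, so a point such as (-1, 1, 0), whose coordinates cancel, is not flagged as invalid.

```matlab
% Hypothetical stand-ins for the variables built in the question.
coordinate_list = [1 2 3 0 10; 0 0 0 0 20; -1 1 0 0 30];
t_matrix = eye(4);   % assumed 4-by-4 transform; identity used only for illustration

% A point is invalid only when x, y and z are all exactly zero.
invalid = all(coordinate_list(:, 1:3) == 0, 2);

% Preallocate with NaN so invalid rows need no further handling.
transformed_list = NaN(size(coordinate_list));

% Transform all valid points with one matrix product; copy column 5 unchanged.
transformed_list(~invalid, 1:4) = coordinate_list(~invalid, 1:4) * t_matrix;
transformed_list(~invalid, 5)   = coordinate_list(~invalid, 5);
```

The single matrix product replaces millions of 1-by-4 products, which is usually where most of the loop time goes, though the profiler is the way to confirm that for a given file.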