For loop for large database

Ronaldo on 26 Mar 2013
I have a large database which contains 1e7 rows and 3 columns. The columns hold the (x,y,z) coordinates of the point in each row. I want to write code that calculates the distance between each row and all the other rows. Here are the difficulties that I have:
1) The database is too large and I do not know how to handle it.
2) For calculating the distance of each row to all the others, I used vector calculation, but I still have one more loop, as follows:
for i = 1:LastRow
    % do vector calculation to find the distance of row i to all the other rows
end
I would highly appreciate it if you help me on this subject.
Thanks Ronaldo
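For readers following along, here is a minimal sketch of the loop described above, run on a tiny made-up table. The variable names `P` and `nearest` are assumptions, not from the original post, and the "nearest distance" reduction is only one example of keeping a summary per row instead of every pairwise result. `bsxfun` is used so that the sketch also runs on older MATLAB releases.

```matlab
% Hypothetical 3-row stand-in for the real 1e7-by-3 table.
P = [0 0 0; 3 4 0; 0 0 5];
LastRow = size(P, 1);
nearest = zeros(LastRow, 1);      % one summary value per row, not the full table
for i = 1:LastRow
    % Differences between row i and every row of P.
    diffs = bsxfun(@minus, P, P(i, :));
    d = sqrt(sum(diffs.^2, 2));   % Euclidean distance from row i to all rows
    d(i) = Inf;                   % ignore the distance of row i to itself
    nearest(i) = min(d);          % example reduction: nearest-neighbour distance
end
```

Even with the inner step vectorised like this, the real data would still need 1e7 passes over 1e7 rows, so the sketch only illustrates the structure of the loop.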

Answers (1)

Walter Roberson on 26 Mar 2013
Comparing 1e7 rows all against each other requires (1e7)*(1e7)/2 results (if the measure is symmetrical), each of which is 8 bytes (double precision):
1e7 * 1e7 / 2 * 2^3 = 400000000000000 bytes
log2() of that is approximately 48.51, so it requires a 49 bit address space to store the entries.
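The arithmetic can be checked with a few lines of MATLAB. This is only a sketch of the estimate above; the variable names are made up for illustration.

```matlab
nRows = 1e7;
bytesPerEntry = 8;                     % double precision
nPairs = nRows * nRows / 2;            % symmetric measure: about half the full table
totalBytes = nPairs * bytesPerEntry;   % 4e14 bytes
bitsNeeded = ceil(log2(totalBytes));   % address bits needed to index every byte
fprintf('%.0f bytes, log2 = %.2f, so %d address bits\n', ...
        totalBytes, log2(totalBytes), bitsNeeded);
```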
The description of the Intel 64-bit (x86-64) architecture at http://en.wikipedia.org/wiki/X86-64#Architectural_features says:
Larger virtual address space: The AMD64 architecture defines a 64-bit virtual address format, of which the low-order 48 bits are used in current implementations.
So, there simply isn't any Intel x64-based system (the only kind MATLAB is supported on) that has enough virtual address space to address the table of entries that would have to be created. Taking into account that one bit of that address space is reserved for kernel use, you would be short by approximately a factor of 2.8, even if you could afford to get a system with 256 terabytes of memory.
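A quick sketch of where the 2.8 and the 256 terabytes come from, assuming 48 implemented address bits with one of them reserved for the kernel, as described above (variable names are illustrative only):

```matlab
totalBytes  = 1e7 * 1e7 / 2 * 8;       % bytes needed for the pairwise table
userSpace   = 2^47;                    % 48 implemented bits minus one bit for the kernel
shortfall   = totalBytes / userSpace;  % roughly 2.8
fullSpaceTB = 2^48 / 2^40;             % full 48-bit space in binary terabytes
fprintf('short by a factor of %.1f; full 48-bit space is %d TB\n', ...
        shortfall, fullSpaceTB);
```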
"Captain, in view of the alternatives, are you sure this is wise?"
