How to make my algorithm work faster

Urs Grigore on 18 Aug 2017
Answered: Urs Grigore on 24 Aug 2017
Hello everyone! Sorry if this problem has a very simple solution, but I'm quite new to MATLAB and to programming. I have an algorithm in which I need to merge two very big tables. I need to do this merge three times, and each time it is a little different. I wrote three scripts, one per merge, each with a while loop that compares the values in the first column of both tables and adds the lower one, plus some other values, to a new table. This takes a long time, and I would be really thankful if someone could help me make it faster. This is my first question on this forum, so any tips on how to ask a question would be helpful as well.
  6 comments
Urs Grigore on 23 Aug 2017
Hello again! Sorry for taking so long to reply. I added some comments; I hope what I'm trying to do is clear now.
% I have 2 tables with different timestamps and different values
% for each timestamp.
% I need 1 table with all the timestamps and the values from both tables.
% When I add a timestamp from the first table to the output table,
% I add the values from the HIGHEST timestamp in the 2nd table which
% is LOWER or EQUAL to the timestamp from the first table.
% The same applies when I add a timestamp from the 2nd table.
hTabel1 = height(Tabel1);  % height of the first table, used to know
                           % how far the while loop should go
hTabel2 = height(Tabel2);  % same for the second table
kTabel1 = 1;  % row I am working with in the first table
kTabel2 = 1;  % row I am working with in the second table
if Tabel1.Var1(1) > Tabel2.Var1(1)
    kTabel2 = find(Tabel2.Var1 <= Tabel1.Var1(1), 1);
    kTabel2 = kTabel2 + 1;
    kTabel1 = kTabel1 + 1;
end
% The first column of each table holds a timestamp in milliseconds.
% I bring the two row counters to consecutive values
% and increase them by 1 because I work with the last values I had.
if Tabel1.Var1(1) < Tabel2.Var1(1)
    kTabel1 = find(Tabel1.Var1 <= Tabel2.Var1(1), 1);
    kTabel1 = kTabel1 + 1;
    kTabel2 = kTabel2 + 1;
end
% same thing as above
TIMP = 0;
LMAX_EURUSD_USDJPY_bid = 0;
LMAX_EURUSD_USDJPY_ask = 0;
Tabel3 = table(TIMP, LMAX_EURUSD_USDJPY_bid, LMAX_EURUSD_USDJPY_ask);
kTabel3 = 1;
% again, the row I am at in Tabel3 is represented by kTabel3
while kTabel1 <= hTabel1 || kTabel2 <= hTabel2
    % while I still have values in both tables
    ok = false;
    if Tabel1.Var1(kTabel1) < Tabel2.Var1(kTabel2)
        kTabel1 = kTabel1 + 1;
        time = Tabel1.Var1(kTabel1-1);
        ok = true;
    end
    % check if the row I am at in the first table has a lower timestamp
    % than the row in the 2nd table
    if ~ok && (Tabel1.Var1(kTabel1) == Tabel2.Var1(kTabel2))
        kTabel1 = kTabel1 + 1;
        kTabel2 = kTabel2 + 1;
        time = Tabel1.Var1(kTabel1-1);
        ok = true;
    end
    % check if the row I am at in the first table has a timestamp equal
    % to the one in the 2nd table
    if ~ok && Tabel1.Var1(kTabel1) > Tabel2.Var1(kTabel2)
        kTabel2 = kTabel2 + 1;
        time = Tabel2.Var1(kTabel2-1);
    end
    % check if the row I am at in the 2nd table has a lower timestamp
    % than the row in the 1st table
    % add the values to Tabel3 and increase the row counter
    cell = {time, Tabel1.Var3(kTabel1-1)*Tabel2.Var3(kTabel2-1), Tabel1.Var5(kTabel1-1)*Tabel2.Var5(kTabel2-1)};
    Tabel3(kTabel3,:) = cell;
    kTabel3 = kTabel3 + 1;
end
The tables with Var1 etc. are read from files, and these are the column names I get for them; that's why I use them this way. For the tables I create, I have started using specific column names. Also, thanks to everyone who gave me tips here. If someone is willing to help me via Skype or any other voice chat, that would be amazing. I have only worked in C++ and a little in Python, and I'm having some trouble getting used to MATLAB.
Stephen23 on 23 Aug 2017
" I only worked in c++ and a little bit in python and i'm having some issues getting used to Matlab"
Forget everything you know about C++ and Python: they work in totally different ways to MATLAB.
The introductory tutorials are the recommended way to learn important MATLAB concepts:


Answers (2)

Jan on 23 Aug 2017
Edited: Jan on 23 Aug 2017
Accessing a field of a table costs time. Because you only read from .Var1 in both tables, you can copy it into a temporary variable once:
T1 = Tabel1.Var1;
T2 = Tabel2.Var1;
The term "Tabel" occurs so frequently in the code that it looks rather redundant. Naming variables is a matter of taste, but anything that improves readability helps in understanding the code. Sometimes patterns in the code become clear with better readability. I prefer "k1" to "kTabel1".
Replace:
kTabel1 = 1;
kTabel2 = 1;
if Tabel1.Var1(1) > Tabel2.Var1(1)
    while Tabel1.Var1(kTabel1) > Tabel2.Var1(kTabel2)
        kTabel2 = kTabel2 + 1;
    end
    kTabel1 = kTabel1 + 1;
end
by:
k1 = 1;
k2 = 1;
if T1(1) > T2(1)
    % first row of T2 that is no longer below T1(1), which is where the while loop stops
    k2 = find(T2 >= T1(1), 1);
    k1 = 2;
end
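As a quick sanity check, the following sketch compares the loop and a find call on a few made-up timestamps (the values are hypothetical, not taken from the real tables). The loop stops at the first k2 for which T2(k2) is no longer below T1(1), so the find condition tested here is T2 >= T1(1):

```matlab
% Hypothetical sample timestamps (sorted, in ms); not from the original data.
T1 = [50; 60; 70];
T2 = [10; 20; 55; 65];

k1 = 1;
k2 = 1;
kLoop = 1;
if T1(1) > T2(1)
    % Loop version: advance until T2(kLoop) is no longer below T1(1).
    while T1(1) > T2(kLoop)
        kLoop = kLoop + 1;
    end
    % Vectorised version: first index where T2 has caught up with T1(1).
    k2 = find(T2 >= T1(1), 1);
    k1 = 2;
end
```

If no element of T2 reaches T1(1), find returns an empty array, so real code would probably need to handle that case explicitly.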
In contrast to the first version of your code, Tabel3 is not pre-allocated before the loop in the latest version. This slows down the processing substantially: growing an array iteratively copies the existing data at each step, so the total cost grows roughly quadratically with the number of rows. This was better, except for the name:
Tabel3 = zeros(hTabel1+hTabel2+5, 3);
I have only a little experience working with tables. I guess that creating a double matrix is faster. You can then create the table in one step after the loop.
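A minimal sketch of that pattern, with made-up sizes and a stand-in loop instead of the real merge (the names M, n1, n2 and the loop body are assumptions for illustration only):

```matlab
% Hypothetical sizes and values, only to show the preallocation pattern.
n1 = 4;                    % rows in the first input
n2 = 3;                    % rows in the second input
M  = zeros(n1 + n2, 3);    % upper bound: every timestamp is distinct
k3 = 0;                    % number of rows written so far
for k = 1:5                % stand-in for the merge loop
    k3 = k3 + 1;
    M(k3, :) = [k, 2*k, 3*k];
end
M = M(1:k3, :);            % trim the unused preallocated rows
```

After the loop, a single call such as array2table(M, 'VariableNames', {'TIMP', 'bid', 'ask'}) should turn the matrix into a table; the short column names here are placeholders.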
"cell" is an important builtin function. Shadowing it by a local variable is not an error, but confusing.
cell = {time, Tabel1.Var3(kTabel1-1)*Tabel2.Var3(kTabel2-1), ...
Tabel1.Var5(kTabel1-1)*Tabel2.Var5(kTabel2-1)};
Tabel3(kTabel3,:)=cell;
Or I assume this is faster:
Tabel3(k3, 1) = time;
Tabel3(k3, 2) = T1V3(k1 - 1) * T2V3(k2 - 1);
Tabel3(k3, 3) = T1V5(k1 - 1) * T2V5(k2 - 1);
Here "T1V3" is a shortcut for "Tabel1.Var3", and likewise for the other columns.
Instead of:
ok = false;
if xyz
    ok = true;
end
if ~ok && abc
    ok = true;
end
if ~ok ...
you can write:
if xyz
    ...
elseif abc
    ...
elseif ...
This will not reduce the runtime a lot, but it is nicer to read.
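Putting these suggestions together, a merge loop could look roughly like the sketch below. It is not the original poster's code: the sample data, the single value column per input, and the names V1, V2, i1, i2 and out are assumptions. It also guards against reading past the end of either input, and only writes a row once both inputs have supplied a value.

```matlab
% Hypothetical sample data: sorted timestamps (ms) and one value column each.
T1 = [10; 30; 40];   V1 = [1; 2; 3];
T2 = [20; 30; 50];   V2 = [10; 20; 30];

n1 = numel(T1);
n2 = numel(T2);
out = zeros(n1 + n2, 2);   % preallocated result: [time, product of values]
k1 = 1;  k2 = 1;  k3 = 0;  % next row to read from each input, rows written
i1 = 0;  i2 = 0;           % last row consumed from each input (0 = none yet)

while k1 <= n1 || k2 <= n2
    if k2 > n2 || (k1 <= n1 && T1(k1) < T2(k2))
        % next timestamp comes from the first input
        t = T1(k1);  i1 = k1;  k1 = k1 + 1;
    elseif k1 > n1 || T2(k2) < T1(k1)
        % next timestamp comes from the second input
        t = T2(k2);  i2 = k2;  k2 = k2 + 1;
    else
        % equal timestamps: consume one row from each input
        t = T1(k1);  i1 = k1;  i2 = k2;
        k1 = k1 + 1;  k2 = k2 + 1;
    end
    if i1 > 0 && i2 > 0    % skip rows until both inputs have a value
        k3 = k3 + 1;
        out(k3, :) = [t, V1(i1) * V2(i2)];
    end
end
out = out(1:k3, :);        % trim the unused preallocated rows
```

For the sample data this produces one row per timestamp from 20 onwards, each holding the product of the most recent value from each input.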
  2 comments
Urs Grigore on 23 Aug 2017
Edited: Urs Grigore on 23 Aug 2017
Thanks a lot for the tips! The thing is that I also need the rest of the values from the table in the last while loop, so I guess I cannot do that. I have already changed the first 2 while loops and am using the find function now. The last while loop is the one that takes the most time. Thanks a lot!
Jan on 23 Aug 2017
"I guess I cannot do that."
Cannot do what? Pre-allocation is essential.
Note that it is much easier to improve your code when we can run it, so please provide some representative inputs.



Urs Grigore on 24 Aug 2017

This is part of my first script:

%LMAX_CFH
lim=86400000000;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
T11=LMAX_EURUSD.Var1;
T12=LMAX_EURUSD.Var3;
T13=LMAX_EURUSD.Var5;
T21=LMAX_USDJPY.Var1;
T22=LMAX_USDJPY.Var3;
T23=LMAX_USDJPY.Var5;
table_merge1();
LMAX_EURUSD_USDJPY=T3;

This is my table_merge1 :

        %i have 2 tables with different timestamps and different values
        %for each timestamp
        %i need to have 1 table with all the timestamps and the values from both tables
        %when i add a timestamp from the first table to the the table i need to have
        %i will add the values from the HIGHEST timestamp from 2nd table which 
        %is LOWER or EQUAL to the timestamp from the first table
        %same works when i add a timestamp from the 2nd table
        h1=height(LMAX_EURUSD) ; %h1 is the height of the first table, i use it 
                                 %to know until where should i go with the while
        h2=height(LMAX_USDJPY) ; %same
        k1=1; %k1 represents the line i am working with from the first table
        k2=1; %k2 represents the line i am working with from the second table
        if T11(1)>T21(1)
            k2 = find(T11(1) > T21, 1);
            k2=k2+1;
            k1=2;
            k2=k2+1;
        end
        %on the first column in each tables i have a timestamp calculated in miliseconds
        %i will bring the two variables representing the line in tables to consecutive values
        %and increase them with 1 because i work with the last values i had
        if T11(1)<T21(1)
            k1 = find(T11 <= T21(1), 1);
            k1=k1+1;
            k2=2;
        end
        %same thing as above
        TIMP=0;
        LMAX_EURUSD_USDJPY_bid=0;
        LMAX_EURUSD_USDJPY_ask=0;
        T3=table(TIMP,LMAX_EURUSD_USDJPY_bid,LMAX_EURUSD_USDJPY_ask);
        k3=1;
        %again the line i am in T3 is represented by k3
        while k1<=h1 || k2<=h2
            %while i still have values in both tables
            if  T11(k1) < T21(k2)
                k1=k1+1;
                time=T11(k1-1);
            elseif (T11(k1) == T21(k2))
                k1=k1+1;
                k2=k2+1;
                time=T11(k1-1);
            elseif T11(k1) > T21(k2)
                k2=k2+1;
                time=T21(k2-1);
            end
            %check if the line i am at in the 2nd table has a lower Timestamp 
            %than the line in the 1st table
            %add the values in the T3 and increase the line counter.
            %cell={time,T12(k1-1)*T22(k2-1),T13(k1-1)*T23(k2-1)};
            T3.TIMP(k3)=time;
            T3.LMAX_EURUSD_USDJPY_bid(k3)=T12(k1-1)*T22(k2-1); %index with k3 so only the current row is written
            T3.LMAX_EURUSD_USDJPY_ask(k3)=T13(k1-1)*T23(k2-1);
            k3=k3+1;
        end

The files are too big to attach here, so I uploaded them to Dropbox: LMAX_EURUSD and LMAX_USDJPY
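For reference, the same kind of merge can probably be written without a loop at all. The sketch below is an alternative technique, not something proposed in the thread. It assumes the timestamps in each input are strictly increasing (no duplicates), and uses made-up data and names (V1, V2, merged). The idea is that, on the sorted union of timestamps, a running count of how many entries came from one input equals the index of that input's latest row so far.

```matlab
% Hypothetical sample data: strictly increasing timestamps (ms), one value column each.
T1 = [10; 30; 40];   V1 = [1; 2; 3];
T2 = [20; 30; 50];   V2 = [10; 20; 30];

t  = union(T1, T2);                      % sorted, unique timestamps from both inputs
i1 = cumsum(double(ismember(t, T1)));    % index of the last T1 row <= t (0 if none yet)
i2 = cumsum(double(ismember(t, T2)));    % same for T2
keep = i1 > 0 & i2 > 0;                  % drop rows before both inputs have a value
merged = [t(keep), V1(i1(keep)) .* V2(i2(keep))];
```

If the data can be held in timetables, the built-in synchronize function with the 'previous' method may be worth a look as well.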
