Nested Loops and efficiency

Hi,
I have the following function, and there are some nested loops with multiple if statements inside. The code, runs extremely slow in Matlab ( fortran executes the whole code in 15 min, matlab needs a couple of hours). Nonetheless, this is one part of the code, that Matlab needs some time to complete
function [ystar, cstar, plterm]=decision_rules(r,w, brolillas, model, pl_flag)
global nshock one update tinies aproxdr e a
du=zeros(length(a)+1);
errrel = 3.0e-30;
rules_converge =false;
iter=0;
%*************************************************************
% First Get last term
%*************************************************************
if(iter>1)
iterflag=1;
end
plterm=0.0;
% reset plterm
if(brolillas<1)
plterm0=0.0d0;
end
if(model==2 && pl_flag >= 1 && brolillas >= 1)
[plterm]=exterm(fdist, w, r, ystar) ;
if(abs(plterm-plterm0)<1.0d-6)
update_flag=1;
end
if(abs(plbnd(1)-plbnd(2))<1.0d-9)
update_flag=1;
end
if(plterm>=plterm0)
plbnd(1)=max(plterm0,plbnd(1));
end
if(plterm<=plterm0)
plbnd(2)=min(plterm0,plbnd(2));
end
if(brolillas >= 3)
plterm=update_pl.*plterm+(1.0d0-update_pl).*plterm0;
if(plterm>plbnd(2) || plterm<plbnd(1))
plterm=sum(sum(plbnd))./2.;
end
else
plterm=update_pl.*plterm+(1.0d0-update_pl).*plterm0;
end;
elseif(model==2 && brolillas==0)
plbnd(1)=-1.;
plbnd(2)=1.;
plterm=plterm0;
end
pl_flag=pl_flag+1;
brolillas=brolillas+1;
% Initialization 2: construct cstar from ystar
y0=zeros(length(a),nshock);
c0=zeros(length(a),nshock);
ystar=zeros(length(a),nshock);
cstar=zeros(length(a),nshock);
for i=1:nshock
for n=1: length(a)
cstar(n,i)=r.*a(n)+w.*e(i)-ystar(n,i);
end
end
% Initialization 3 : initial updating
y0 =(one-update).*y0+update.*ystar;
c0 =(one-update).*c0+update.*cstar;
%****************************************************************
% Main Loop
%****************************************************************
while(~rules_converge)
iter = iter + 1;
disp(['Iteration No:' num2str(iter)])
%*************************************************************
% Solve the decision rule
%*************************************************************
for i=1:nshock
for n=1:length(a)
%Find ystar
for np=1:length(a)+1
if(np>length(a))
break;
end
if(upap(n,a(np),i,r,w) > 100/tinies)
break;
end
du(np)=-upap(n,a(np),i,r,w)+ meru(a(np),i,r,c0)+plterm;
if(du(np)<=0)
break;
end
end
if(np>length(a))
yy1=a(length(a));
elseif(upap(n,a(np),i,r,w) > 100/tinies)
if(np==1)
yy1=a(1);
else
yexp=a(n)*r+w*e(i) - tinies;
dusup=upap(n,yexp,i,r,w)-meru(yexp,i,r,c0)-plterm;
if((yexp-a(np-1))>tinies && dusup>0.)
yy1=zbrent(a(np-1),yexp,errrel,i,n,r,w,c0,plterm);
else
yy1=a(np-1);
end
end
else
if(np==1)
yy1=a(1);
else
yy1=zbrent(a(np-1),a(np),errrel,i,n,r,w,c0,plterm);
end
end
ystar(n,i) = yy1;
cstar(n,i) = r.*a(n)+w.*e(i)-ystar(n,i);
end
end
%*************************************************************
% Check convergence
%*************************************************************
no_converge_y=zeros(length(a),nshock);
no_converge_y(abs(ystar-y0)>aproxdr)=1;
no_converge =fix(sum(sum(no_converge_y)));
tempmaxval=abs(ystar-y0);
error=max(tempmaxval(:));
disp(['the error in optimal plans is:' num2str(error)])
if(no_converge==0)
rules_converge=true;
else
rules_converge=false;
end
%*************************************************************
% Update y0 and c0
%*************************************************************
y0=ystar;
c0=cstar;
end
disp(['Total Iteration No:' num2str(iter)])
end
Can anybody here suggest some modifications that I could improve efficiency ?
Thanks

4 comentarios

dpb
dpb el 21 de Jun. de 2014
Editada: dpb el 21 de Jun. de 2014
Profile and post results...plus knowing precisely which areas show concentrated bottlenecks lets you look into those areas specifically.
Some specific questions--
Are r, w scalars?
Are upap and meru functions or arrays?
ADDENDUM
Intended to add, however, why not just convert the Fortran to a mex-file and call it from Matlab?
msh
msh el 22 de Jun. de 2014
Editada: msh el 22 de Jun. de 2014
Yes, r and w are scalars and upap and meru are functions giving a scalar as output. The 'a' is an array. The bottleneck, from the whole code is really what I posted, and more specifically, is the 'nested loops' within the 'while' loop, i.e the section I commented out as 'Main loop'. I will try for the mex function, although seems quite perplexed to me, as i am not that experienced. Nevertheless, I am also trying to learn a bit of Matlab, and I found it a useful exercise to translate the original code of fortran. Indeed in some parts of the code, you gave a very good advice in other posts of mine.
ADDENDUM
Profile ? what exactly I have to do? for the speed and time it takes, i guess I can use the tic-toc. I am sorry but i am not at all experienced with programming or matlab. But as it stands, do you think that I can alter the algorithm for the "Main loop" section to be faster?
dpb
dpb el 22 de Jun. de 2014
Editada: dpb el 23 de Jun. de 2014
See
doc profile
for running the profiler. It's not the overall time taken, it's where within the code precisely the largest fraction of time is taken up that's important so as to know precisely where in the overall function one should concentrate efforts.
The difficulty of this code in Matlab to generate speed while guaranteeing the same result is that the loops all have all the break conditions in them and the later results are dependent upon the point within the loop at which the exit occurred. A prime example is that condition on the first loop end index Jan and I were just discussing. The next step is dependent on where that loop exits and why.
To get best performance in Matlab one needs to be able to vectorize the loops but it would take a significant amount of detailed investigation of the algorithm to be able to tell what, specifically, would be the proper action after the vectorization of one of these loops that otherwise would have exited before completion. That level of effort is just too much to expect for a respondent here to be able to infer just how that should look.
There's the snippit section
if(upap(n,a(np),i,r,w) > 100/tinies), break; end
du(np)=-upap(n,a(np),i,r,w)+ meru(a(np),i,r,c0)+plterm;
if(du(np)<=0), break; end
One could easily enough write the function upap to take the vectors a and n and return a logical array or the first index of the condition but one might then find that doing the whole thing is as or more expensive than shortening the loop because the condition occurs early on in a high percentage of typical cases while the function could be relatively costly. As Jan points out, whether those functions are significant computation-wise is one of the things the profiler will tell you.
If one had a functional descripton of the inputs/expected outputs, one could possibly write a Matlab function that does that that does use typical Matlab constructs effectively, but to start from translating Fortran to Matlab for such a case is very difficult task other than the literal transliteration process you've got here and the resulting inefficiency that comes with that.
Jan
Jan el 22 de Jun. de 2014
Editada: Jan el 22 de Jun. de 2014
@Safis: To profile the code use the command profiler. Lokk in the documentation to learn more about it.
This function will reveal, if the loops are the main problem, of if the time is spent in meru and upap. Without profiling it is possible, that we optimize the nested loops, which need only some percent of the total computing time.

Iniciar sesión para comentar.

Respuestas (2)

Jan
Jan el 22 de Jun. de 2014

1 voto

The code can be improved. E.g. this is not useful:
for np=1:length(a)+1
if(np>length(a))
break;
end
...
What about:
for np=1:length(a)
...
This should be vectorized for a fair speed in Matlab:
for i=1:nshock
for n=1: length(a)
cstar(n,i)=r.*a(n)+w.*e(i)-ystar(n,i);
end
end
But I don't assume, that this is a bottleneck of the code.
Nevertheless, questions about improving the speed of the code cannot be answered only by seeing the code, but you need the profiler and the readers of the forum need a running example with matching testdata.
A general but a little bit disappointing advice: Avoid expressions like 0. and 0.0d0 in Matlab.Although they work currently and should work for reasons of compatibility, this is not documented.

5 comentarios

dpb
dpb el 22 de Jun. de 2014
Editada: dpb el 22 de Jun. de 2014
I initially posted the same comment, Jan, but--the +1 on the loop index does do something; now whether it's useful or needed or not I don't know but it can make a difference owing to the test in the following section. If you remove the +1 from the loop limit, then first clause of the following test
if(np>length(a))
yy1=a(length(a));
elseif(upap(n,a(np),i,r,w) > 100/tinies)
...
could never be true but it can be as written if the other loop exit conditions above don't ever come to pass. At the point of realizing that and not having any other info of purpose to try to recast to end objective instead of translate I suggested just turn it into a mex-function if need to use from Matlab.
I'd think the cautionary about d/D as exponents for floating point literals is excessively conservative--I don't think there's any way TMW could ever remove that from the language w/o it causing such a huge issue in compatibility that I'd certainly not worry over it.
Granted there's no express need for it in Matlab as there is in Fortran as the default casting over everything to double makes the expression superfluous.
msh
msh el 22 de Jun. de 2014
@Jan, thanks for you time. Can you please check my answer to dpb above. Thanks
Jan
Jan el 22 de Jun. de 2014
@dpb and Safis: It is even more tricky with the index:
for np=1:length(a)+1
if(np>length(a))
break;
end
% ... other reasons to break
end
if np>length(a)
But using a loop counter outside a loop is not exactly defined in Matlab. So this programming style is either confusing or prone to unexpected behaviour (also known as bugs).
dpb
dpb el 22 de Jun. de 2014
Editada: dpb el 22 de Jun. de 2014
Since the index condition of
np=length(a)+1
causes an immediate break the value of np is defined on exit of the loop. It's not the case the index has "run off the end".
That the behavior of the iterator variable after normal loop completion isn't documented at all (whether it is undefined or retains the value at the end of the loop as Fortran) is another case where the documentation of Matlab syntax is (and always has been) lacking. I've harped on this point of needing a definitive "Standard" (for lack of better nomenclature) for going on 30 yr now.
As a practical programming matter it should be defined and be the value at the last iteration but if it's TMW's desire that it not be relied on then that should definitely be documented as being so. It's this "black hole" of no definition that is a quagmire.
dpb
dpb el 26 de Jun. de 2014
I submitted a Service Request on the definition of the for iterator after loop completion being undocumented behavior. The "interpretation" is that it is defined and retains the value it had during at the exit of the loop. The text of the response is included below--
I am writing in reference to your Service Request, Case #01012664 regarding 'For iterator status on completion of loop'.
The iterator is always defined at the end of a loop. Whether a variable is defined or not in any context depends upon the workspace that you are looking at and the workspace that the variable exists in.
In MATLAB, loops run in their parent workspace. So, if a for loop is defined in a script, the for loop utilizes the base workspace (the workspace used by scripts) and the iterator is defined in the base workspace.
If the for loop is running in a function, the for loop utilizes the function's workspace and hence the iterator is also defined in the function workspace.
At the end of the for loop, the iterator will retain the final value that it reached during the execution of the for loop for the last time. [emphasis added as this was our point of uncertainty and the point of the SR being filed]
You are right in your observation that this particular behavior is not clearly documented and it should be. I will go ahead and put in an enhancement request so that the documentation team can consider adding this information to the documentation in a future release of MATLAB.

Iniciar sesión para comentar.

msh
msh el 27 de Jun. de 2014

0 votos

Sorry for my late reply, but really the whole code took ages to complete so i was trying to do something to get some feeling of the performance. I have attached the file from the results of the whole code and not only of this function. I would be very pleased to get your insights or further guidance.

Categorías

Más información sobre Loops and Conditional Statements en Centro de ayuda y File Exchange.

Etiquetas

Preguntada:

msh
el 21 de Jun. de 2014

Respondida:

msh
el 27 de Jun. de 2014

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by