Odd Numerical Issue - Can I select a Processor Type?

Question

0 votes

I am getting odd behavior with the following code:

 load question
disp(['isequal(xi,xl):        ' num2str(isequal(xi,xl))]);
for itr = 1:size(xi,2); 
  test1(itr) = xl(:,itr).'*f; 
  test2(itr) = xi(:,itr).'*f; 
end
disp(['isequal(test1,test2):  ' num2str(isequal(test1,test2))]);
for itr = 1:size(xi,2); 
  test3(itr) = xl(:,itr).'*f; 
  test4(itr) = xl(:,itr).'*f; 
end
disp(['isequal(test3,test4):  ' num2str(isequal(test3,test4))]);
disp(['isequal(test3,test1):  ' num2str(isequal(test3,test1))]);
disp(['isequal(test4,test2):  ' num2str(isequal(test4,test2))]);
for itr = 1:size(xi,2); 
  test5(itr) = xi(:,itr).'*f; 
end
for itr = 1:size(xi,2); 
  test6(itr) = xl(:,itr).'*f; 
end
disp(['isequal(test5,test6):  ' num2str(isequal(test5,test6))]);

The output (MATLAB 7.13.0.564 (R2011b), Win 7 Pro, 4-core 2nd-gen i5 processor) is

 question
 isequal(xi,xl):        1
 isequal(test1,test2):  0
 isequal(test3,test4):  0
 isequal(test3,test1):  1
 isequal(test4,test2):  1
 isequal(test5,test6):  1

The expected result is that all isequal calls should return true. However, on this machine, it appears that order of execution in the for loop matters.

The PC passes memtest86, and prime95. This code, using the accompanying .mat file, gives the expected result on a 2 Core 2nd Gen i7 Win 7 Pro laptop running 2012b, and on another Win7 Pro machine in the office.

I talked it over with a friend, and he started discussing IEEE 754's guard digits and the way intermediate results are stored. But, to me, this behavior is wrong. It's the sort of thing that would have happened in the bad old days, but which should not happen in 2012. test3 and test4 coming out different is particularly egregious.

The only thing I can think of is that maybe MATLAB is set to optimize for the wrong processor type. I know that with some numerical software, you get to select the processor architecture to optimize for. Is there any way to do that, or check such a setting, with MATLAB? Alternatively, can anyone else reproduce the problem or explain what is going on with this configuration?

Thanks,

Andrew

question.mat:

https://docs.google.com/open?id=0B8DYPdWIOdpyZHZUU2p5RDYzdU0

image:

https://docs.google.com/open?id=0B8DYPdWIOdpyOG84dl9vZHVtWmM

2 comments

Walter Roberson on 18 Dec 2012

Does the same problem occur if you drop the number of elements in the array down to less than (around) 5000?

Sean de Wolski on 18 Dec 2012

Hi Andrew,

Could you please try two things for me to test Jan's theories:

If you turn off the JIT using:

feature accel off

Does the odd behavior still occur?

Next, start MATLAB in -singleCompThread mode. Is the behavior the same?


Answer 1

Andrew on 19 Dec 2012

1 vote

MathWorks tech support looked into the issue, and they have said that the most likely cause is a known bug:

http://www.mathworks.com/support/bugreports/755531

"Linear algebra functions may return slightly different but correct results for each evaluation on machines with Intel® Sandy Bridge processors."

This bug is fixed in R2012a and later. Also, single-thread computation and disabling JIT acceleration (feature accel off) do not affect the result.

Thanks to those who helped.

Andrew


Answer 2

Roger Stafford on 18 Dec 2012

Edited: Walter Roberson on 18 Dec 2012

0 votes

Your results are surprising to me, Andrew. Are you consistently using double precision for all quantities involved? If so, it gives the appearance of some kind of compiler optimization trick gone awry. It would be interesting to see if the inequalities you observed possessed some kind of pattern such as occurring only on the first elements of the 'test' arrays, or are distributed over their full lengths. You can determine that by using "find(test3~=test4)" rather than 'isequal'. Perhaps it only occurs on the first matrix multiplication within a for-loop. Try writing

for itr = 1:size(xi,2);
    testa(itr) = xl(:,itr).'*f;
    testb(itr) = xl(:,itr).'*f; 
    testc(itr) = xl(:,itr).'*f; 
    testd(itr) = xl(:,itr).'*f; 
end

and testing all six pairs for inequalities. Also would this picture change if some other operations were interspersed between each of these lines?
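A rough sketch of that six-pair check follows. It assumes stand-in random data, because the real xl and f live in question.mat; the names results and pairs are hypothetical, and the four results are stored as rows of one matrix so the pairs can be looped over.

```matlab
% Stand-in data (assumption): the thread uses xl and f loaded from question.mat.
xl = rand(64, 16) + 1i*rand(64, 16);
f  = rand(64, 1);
n  = size(xl, 2);

% Rows 1..4 play the role of testa..testd.
results = zeros(4, n);
for itr = 1:n
    results(1, itr) = xl(:, itr).'*f;
    results(2, itr) = xl(:, itr).'*f;
    results(3, itr) = xl(:, itr).'*f;
    results(4, itr) = xl(:, itr).'*f;
end

% All six unordered pairs of the four rows.
pairs = nchoosek(1:4, 2);
for p = 1:size(pairs, 1)
    a = pairs(p, 1);
    b = pairs(p, 2);
    idx = find(results(a, :) ~= results(b, :));   % positions that differ
    fprintf('test%c vs test%c: %d differing elements\n', ...
            'a' + a - 1, 'a' + b - 1, numel(idx));
end
```

On a machine without the problem, every pair should report zero differing elements; the positions in idx show whether any mismatches cluster at the start of the arrays or are spread throughout.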

The 754 Standard only guarantees that a result from a single addition, subtraction, multiplication, or division operation will differ from the exact answer by no more than half a unit in the last place (half the least significant bit). This is accomplished by using more than one guard bit, which is a requirement in the Standard. However, changing the sequence of operations can nevertheless alter results, as for example doing a+(b+c) as compared with (a+b)+c. Also, some processors will carry out a series of successive additions and multiplications, such as occur in matrix multiplication, using higher precision temporarily rather than the regular double precision at each stage of computation, and that can affect the results when the final values are returned to double precision.
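As a small illustration of the a+(b+c) versus (a+b)+c point, here is a sketch with three arbitrarily chosen double-precision values (they are not taken from the data in the question):

```matlab
% Illustrative values only; any operands whose partial sums round
% differently would show the same effect.
a = 0.1;
b = 0.2;
c = 0.3;

left  = (a + b) + c;   % add a and b first
right = a + (b + c);   % add b and c first

% Print enough digits to see the last-bit difference.
fprintf('(a+b)+c = %.17g\n', left);
fprintf('a+(b+c) = %.17g\n', right);

sameResult = isequal(left, right);
fprintf('isequal: %d\n', sameResult);
```

Both results are correctly rounded for their own order of operations, yet they differ in the last bit, which is why a summation that is split up differently can fail an isequal test.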

However, I cannot think of any reasonable difference that would be a result of a different location in a for-loop list. One would think that when one line of code was completed all side effects would be removed and would not carry over to the next line. It has the flavor of a true bug. It sounds like something that should be presented to MathWorks' support people.

Roger Stafford


Answer 3

Jan on 18 Dec 2012

Edited: Jan on 18 Dec 2012

0 votes

In addition to Roger's answer, what is max(abs(test1(:) - test2(:)))? If this is far from eps, a bug in the JIT acceleration is more likely than a processor-related floating point problem. But if it is near eps, this is not hard evidence for the source of the problem.

It would be helpful if you post the dimensions of the arrays also, because this is more convenient than downloading and analysing your MAT file. When xl(:,itr).'*f is a large dot product, I guess a value of > 1e5 elements, there could be a multi-threading effect: The dot product is distributed to different threads, the order of the addition of the results matter. This problem occurred in Matlab 2009a for sum() with more than 89000 elements, see Bugreport 532399:

x = rand(1, 1e6);
y1 = sum(x);
y2 = sum(x);
isequal(y1, y2)   % Not always TRUE

You can start Matlab in single-thread mode and check if the problems still appear. If so, please contact the technical support, because this problem might not be documented yet.
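Besides starting Matlab with the -singleCompThread option, the thread limit can also be changed inside a running session with maxNumCompThreads. A minimal sketch, assuming that function is available in your release (it has carried deprecation warnings in some versions); the variable names are only illustrative:

```matlab
% Limit computation to one thread and remember the previous setting.
previousLimit = maxNumCompThreads(1);

% Repeat the sum() comparison from above with a single thread.
x  = rand(1, 1e6);
y1 = sum(x);
y2 = sum(x);
sameSum = isequal(y1, y2);

% Restore the original thread limit.
maxNumCompThreads(previousLimit);

fprintf('single-thread sums identical: %d\n', sameSum);
```

If the results still differ with one thread, multi-threading is unlikely to be the cause.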


Answer 4

Andrew on 18 Dec 2012

0 votes

Thank you Roger and Jan for your replies.

On Roger's suggested code, testa and testc are equal, testb and testd are equal, and the other 4 tests return false.

In reply to Jan,

 max(abs(test1(:) - test2(:)))
 ans =
  3.1402e-016
 eps(max(abs([test1(:);test2(:)])))
 ans =
   2.2204e-016
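The difference above is roughly 1.4 times the eps value below it. For results like this, a tolerance-based comparison is one alternative to isequal. A minimal sketch with made-up stand-in values; the threshold tolUlps is a hypothetical choice, not a recommendation:

```matlab
% Stand-in values (assumption); in the thread these come from xi, xl and f.
test1 = [1 + 2i, 3 - 4i];
test2 = test1 + [eps(1), 0];    % perturb the first element by eps(1)

% Largest difference, and the floating-point spacing at the largest magnitude.
maxDiff = max(abs(test1(:) - test2(:)));
scale   = eps(max(abs([test1(:); test2(:)])));

tolUlps = 4;                    % hypothetical tolerance, in units of scale

exactlyEqual = isequal(test1, test2);
closeEnough  = maxDiff <= tolUlps * scale;

fprintf('isequal: %d, within %d eps: %d (ratio %.2f)\n', ...
        exactlyEqual, tolUlps, closeEnough, maxDiff / scale);
```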

Also, the vectors involved are small. The .mat file is 40 KB. I'm basically doing 16 64-point windowed Discrete Fourier Transforms on 1024 samples of data.

I really liked Jan's explanation because it made a lot of sense. I could see how multi-threading could break up the work differently in each case, resulting in a different final sum (with the .'* operation, I am doing a symmetric inner product/summation). But then, with maxNumCompThreads(1), the issue remains. See the second image.

I had been debugging something else when I stumbled across this. I don't think this is the actual problem with my larger code (I'm not branching on isequal results for arrays or anything), but it threw me for a loop because it's so weird. As I mentioned, the kicker is that it works on other machines. I guess I will do a bug report. And return to debugging.

image2

https://docs.google.com/open?id=0B8DYPdWIOdpyNDZqdVpBaUlfUUE

Andrew
