Backslash does not provided the solution with the smallest 2-norm

32 visualizaciones (últimos 30 días)
I was debugging a constraint solver when I encountered a problem with the backslash operator in MATLAB.
I have isolated the problem into the attached minimal working example. The issue occurs when solving a consistent but under-determined linear system Ax=b. In this case, we expect that MATLAB returns the solution x which has the smallest 2-norm, right? I believe that the attached script demonstrates that this is not the case and that the computed result depends on whether A is treated as a sparse or a dense matrix.
Would you please take a look and see if I have missed something or if this is indeed an issue that should be fixed?
Kind regards
Carl Christian K. Mikkelsen.
% The backslash operator and underdetermined linear systems
%
% The result of A\b depends on whether A is dense or sparse and
% the computed results do not minimize the 2-norm over the set
% of solutions of Ax=b.
%
% PROGRAMMING by Carl Christian Kjelgaard Mikkelsen (spock@cs.umu.se)
% 2022-08-04 Initial programming and testing
% Define a wide matrix
A=[1 1 0; 0 1 1];
% Define a solution
t=(1:3)';
% Define the right hand side so that Ax=b is consistent
b=A*t;
% Solve the underdetermined linear equation Ax=b
% using a dense representation of A
x=A\b;
% Solve the underdetermined linear equation Ax=b
% using a sparse representation of A
y=sparse(A)\b;
% Solve the underdetermined linear equation Ax=b
% in the linear least squares sense
z=A'*((A*A')\b);
% Compute the residuals
rx=b-A*x;
ry=b-A*y;
rz=b-A*z;
% Compute the 2-norm of the solutions
nx=norm(x);
ny=norm(y);
nz=norm(z);
% Display the results
fprintf('The solutions\n');
The solutions
display([x y z]);
-2.0000 3.0000 0.3333 5.0000 0 2.6667 0 5.0000 2.3333
fprintf('The residuals\n');
The residuals
display([rx ry rz]);
1.0e-14 * 0.1776 0 0.0444 0.1776 0 -0.1776
fprintf('The norm of the solutions\n');
The norm of the solutions
display([nx ny nz]);
5.3852 5.8310 3.5590
  15 comentarios
Bruno Luong
Bruno Luong el 5 de Ag. de 2022
"a solution" and "the solution" are not gramatically equivalent."
Granted, but they are not exclusive.
Given that
  • "the solution" (in a proper context) is "a solution".
They can perfectly both used, without any contradiction. So I don't see the reason they cannot be coexist.
Les Beckham
Les Beckham el 5 de Ag. de 2022
Holy cow, guys. This has pretty much degenerated into "that depends on what the definition of "is" is". Let it go, already. Have a good night. :)

Iniciar sesión para comentar.

Respuesta aceptada

Matt J
Matt J el 4 de Ag. de 2022
Editada: Matt J el 4 de Ag. de 2022
In this case, we expect that MATLAB returns the solution x which has the smallest 2-norm, right?
No, the backslash operator will use a QR-solver to produce a solution in that case. This won't necessarily be the least 2-norm solution.
  12 comentarios
Carl Christian Kjelgaard Mikkelsen
@Matt J No, I am afraid that I cannot agree with you. I believe that we have established that the documentation of backslash can be improved to clarify which problem is being solved and that the terminology is not consistent across the international community or even within the USA, see the link to Wiscon state university that I provided above.
I have already explained to @Steven Lord how the current documentation can be read to support my interpretation. I have also stated clearly that if the desired behavior is to produce different solution for the same matrix depending on the data-type, then this should figure prominently in the documentation.
Matt J
Matt J el 5 de Ag. de 2022
Editada: Matt J el 5 de Ag. de 2022
@Carl Christian Kjelgaard Mikkelsen Even if you disagree or think the documentation should be modified, your question has been answered right?
It should now be clear that A\b purports to do no more than to minimize . When there exists more than one solution to this problem, there is no commitment on the MathWorks' part as to which of the non-unique solutions you will get. Furthermore, there is no commitment that the solutions will be the same from full-to-sparse or from computer-to-computer. None of this is anything the MathWorks regards as a bug.

Iniciar sesión para comentar.

Más respuestas (2)

Bruno Luong
Bruno Luong el 4 de Ag. de 2022
Editada: Bruno Luong el 4 de Ag. de 2022
Few other methods to get least-norm solution
A = rand(3,6)
A = 3×6
0.2748 0.9301 0.9530 0.4196 0.5850 0.3316 0.4791 0.8315 0.6904 0.6636 0.2678 0.6999 0.4191 0.4202 0.0789 0.2761 0.9976 0.3208
b = rand(3,1)
b = 3×1
0.4586 0.9898 0.8222
% Methode 1, Christine, recommended
x = lsqminnorm(A,b)
x = 6×1
0.5627 0.0942 -0.3752 0.5117 0.2032 0.7243
% Method 2, Pseudo inverse
x = pinv(A)*b
x = 6×1
0.5627 0.0942 -0.3752 0.5117 0.2032 0.7243
% Method 3, BS on KKT
[m,n] = size(A);
y=[eye(n), A'; A, zeros(m)] \ [zeros(n,1); b];
x = y(1:n)
x = 6×1
0.5627 0.0942 -0.3752 0.5117 0.2032 0.7243
% Method 4, QR on A'
[Q,R,p] = qr(A',0);
x = Q*(R'\b(p,:))
x = 6×1
0.5627 0.0942 -0.3752 0.5117 0.2032 0.7243
  2 comentarios
Carl Christian Kjelgaard Mikkelsen
Thank you for reading and responding to my question.
Please note that my issue is not the problem of computing the least squares solution of a consistent underdetermined linear system, but the fact that the backslash operator provides very different results depending on the matrix representation and that the computed results do not minimize the 2-norm over the set of solutions.
Bruno Luong
Bruno Luong el 4 de Ag. de 2022
Editada: Bruno Luong el 4 de Ag. de 2022
"Please note that my issue is not the problem of computing the least squares solution"
I think you need to reset the definition of what is "least squares solution". I guess all the confusion is that you associate with minimum norm solusion, which is not in general.
I know, and guess most of people working sometime with MATLAB also know it.

Iniciar sesión para comentar.


John D'Errico
John D'Errico el 5 de Ag. de 2022
Editada: John D'Errico el 5 de Ag. de 2022
It seems the gist of your question comes down to the idea that for a wide matrix, MATLAB should (in your eyes only) always produce a solution that minimizes the norm of x. For square matrices, or matrices that are taller than wide, it should do something else? That alone seems highly inconsistent to me.
So first, where have you ever seen the claim that this is true? No place in the documentation is this claimed. I even took a quick look. Maybe you have confused the idea of a solution that minimizes the norm of the residiuals to a solution that minimizes the norm of x itself. We can't really know where you got the idea.
Should something be fixed? Of course not! MATLAB already provides multiple solutions that yield the minimum norm of x when they are used.
A = [1 2 3;4 5 7];
b = [2;3];
A\b
ans = 3×1
-1.0000 0 1.0000
So backslash produces a basic solution, as expected for the underdetermined case, with one element exactly zero. For a solution that is minimum norm in x, we already have:
lsqminnorm(A,b)
ans = 3×1
-1.0571 0.2857 0.8286
pinv(A)*b
ans = 3×1
-1.0571 0.2857 0.8286
lsqr(A,b)
lsqr converged at iteration 2 to a solution with relative residual 1.7e-14.
ans = 3×1
-1.0571 0.2857 0.8286
  5 comentarios
John D'Errico
John D'Errico el 6 de Ag. de 2022
The only place I can see where you claim the documentation says the solution is a minimum norm solution is here:
"The key is the use of the article "the" as in "the solution". It implies uniqueness. Therefore, the reader will believe that x=A\b minimizes 2-norm of the residual when A is tall matrix of full column rank and that x=A\b has the smallest 2-norm of all the solutions of Ax=b when A is a wide matrix of full row rank. Why? Because these two problems connected to the 2-norm have unique solutions."
NO place in the documentation for backslash has it said anything about minimum norm for x. You have made a jump to an unfounded conclusion from nothing more than their use of the word "THE", to decide that this implies your version of what you want to be true.
The problem is, there are certainly other ways to choose a UNIQUE solution. For example, why assume a minimum 2-norm?
And to jump to a conclusion from the inclusion of a three letter word in the documentation is a bit strange. Surely if they were going to SPECIFICALLY do something different for one shape matrix than for another shape matrix, they would have said something more than just an obique reference with the word "the"?
We see this line:
"If A is an M-by-N matrix with M < or > N and B is a column vector with M components, or a matrix with several such columns then X = A\B is the solution in the least squares sense to the under- or overdetermined system of equations A*X = B"
The phrase "In the least squares sense" means a solution that minimizes norm(A*X-b). It says nothing about norm(x).
Honestly, I think you are grasping at straws here, based only on the use of the word "the".
Bruno Luong
Bruno Luong el 6 de Ag. de 2022
" grasping at straws"
I disagree, he only grasping at the straw. ;-)

Iniciar sesión para comentar.

Categorías

Más información sobre Logical en Help Center y File Exchange.

Productos


Versión

R2022a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by