Backslash does not provided the solution with the smallest 2-norm

18 views (last 30 days)
I was debugging a constraint solver when I encountered a problem with the backslash operator in MATLAB.
I have isolated the problem into the attached minimal working example. The issue occurs when solving a consistent but under-determined linear system Ax=b. In this case, we expect that MATLAB returns the solution x which has the smallest 2-norm, right? I believe that the attached script demonstrates that this is not the case and that the computed result depends on whether A is treated as a sparse or a dense matrix.
Would you please take a look and see if I have missed something or if this is indeed an issue that should be fixed?
Kind regards
Carl Christian K. Mikkelsen.
% The backslash operator and underdetermined linear systems
%
% The result of A\b depends on whether A is dense or sparse and
% the computed results do not minimize the 2-norm over the set
% of solutions of Ax=b.
%
% PROGRAMMING by Carl Christian Kjelgaard Mikkelsen (spock@cs.umu.se)
% 2022-08-04 Initial programming and testing
% Define a wide matrix
A=[1 1 0; 0 1 1];
% Define a solution
t=(1:3)';
% Define the right hand side so that Ax=b is consistent
b=A*t;
% Solve the underdetermined linear equation Ax=b
% using a dense representation of A
x=A\b;
% Solve the underdetermined linear equation Ax=b
% using a sparse representation of A
y=sparse(A)\b;
% Solve the underdetermined linear equation Ax=b
% in the linear least squares sense
z=A'*((A*A')\b);
% Compute the residuals
rx=b-A*x;
ry=b-A*y;
rz=b-A*z;
% Compute the 2-norm of the solutions
nx=norm(x);
ny=norm(y);
nz=norm(z);
% Display the results
fprintf('The solutions\n');
The solutions
display([x y z]);
-2.0000 3.0000 0.3333 5.0000 0 2.6667 0 5.0000 2.3333
fprintf('The residuals\n');
The residuals
display([rx ry rz]);
1.0e-14 * 0.1776 0 0.0444 0.1776 0 -0.1776
fprintf('The norm of the solutions\n');
The norm of the solutions
display([nx ny nz]);
5.3852 5.8310 3.5590
  15 Comments
Les Beckham
Les Beckham on 5 Aug 2022
Holy cow, guys. This has pretty much degenerated into "that depends on what the definition of "is" is". Let it go, already. Have a good night. :)

Sign in to comment.

Accepted Answer

Matt J
Matt J on 4 Aug 2022
Edited: Matt J on 4 Aug 2022
In this case, we expect that MATLAB returns the solution x which has the smallest 2-norm, right?
No, the backslash operator will use a QR-solver to produce a solution in that case. This won't necessarily be the least 2-norm solution.
  12 Comments
Matt J
Matt J on 5 Aug 2022
Edited: Matt J on 5 Aug 2022
@Carl Christian Kjelgaard Mikkelsen Even if you disagree or think the documentation should be modified, your question has been answered right?
It should now be clear that A\b purports to do no more than to minimize . When there exists more than one solution to this problem, there is no commitment on the MathWorks' part as to which of the non-unique solutions you will get. Furthermore, there is no commitment that the solutions will be the same from full-to-sparse or from computer-to-computer. None of this is anything the MathWorks regards as a bug.

Sign in to comment.

More Answers (2)

Bruno Luong
Bruno Luong on 4 Aug 2022
Edited: Bruno Luong on 4 Aug 2022
Few other methods to get least-norm solution
A = rand(3,6)
A = 3×6
0.2748 0.9301 0.9530 0.4196 0.5850 0.3316 0.4791 0.8315 0.6904 0.6636 0.2678 0.6999 0.4191 0.4202 0.0789 0.2761 0.9976 0.3208
b = rand(3,1)
b = 3×1
0.4586 0.9898 0.8222
% Methode 1, Christine, recommended
x = lsqminnorm(A,b)
x = 6×1
0.5627 0.0942 -0.3752 0.5117 0.2032 0.7243
% Method 2, Pseudo inverse
x = pinv(A)*b
x = 6×1
0.5627 0.0942 -0.3752 0.5117 0.2032 0.7243
% Method 3, BS on KKT
[m,n] = size(A);
y=[eye(n), A'; A, zeros(m)] \ [zeros(n,1); b];
x = y(1:n)
x = 6×1
0.5627 0.0942 -0.3752 0.5117 0.2032 0.7243
% Method 4, QR on A'
[Q,R,p] = qr(A',0);
x = Q*(R'\b(p,:))
x = 6×1
0.5627 0.0942 -0.3752 0.5117 0.2032 0.7243
  2 Comments
Bruno Luong
Bruno Luong on 4 Aug 2022
Edited: Bruno Luong on 4 Aug 2022
"Please note that my issue is not the problem of computing the least squares solution"
I think you need to reset the definition of what is "least squares solution". I guess all the confusion is that you associate with minimum norm solusion, which is not in general.
I know, and guess most of people working sometime with MATLAB also know it.

Sign in to comment.


John D'Errico
John D'Errico on 5 Aug 2022
Edited: John D'Errico on 5 Aug 2022
It seems the gist of your question comes down to the idea that for a wide matrix, MATLAB should (in your eyes only) always produce a solution that minimizes the norm of x. For square matrices, or matrices that are taller than wide, it should do something else? That alone seems highly inconsistent to me.
So first, where have you ever seen the claim that this is true? No place in the documentation is this claimed. I even took a quick look. Maybe you have confused the idea of a solution that minimizes the norm of the residiuals to a solution that minimizes the norm of x itself. We can't really know where you got the idea.
Should something be fixed? Of course not! MATLAB already provides multiple solutions that yield the minimum norm of x when they are used.
A = [1 2 3;4 5 7];
b = [2;3];
A\b
ans = 3×1
-1.0000 0 1.0000
So backslash produces a basic solution, as expected for the underdetermined case, with one element exactly zero. For a solution that is minimum norm in x, we already have:
lsqminnorm(A,b)
ans = 3×1
-1.0571 0.2857 0.8286
pinv(A)*b
ans = 3×1
-1.0571 0.2857 0.8286
lsqr(A,b)
lsqr converged at iteration 2 to a solution with relative residual 1.7e-14.
ans = 3×1
-1.0571 0.2857 0.8286
  5 Comments
Bruno Luong
Bruno Luong on 6 Aug 2022
" grasping at straws"
I disagree, he only grasping at the straw. ;-)

Sign in to comment.

Products


Release

R2022a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by