Most Efficient Way to Construct the Matrices to Extract the Lower and Upper Triangle from a Vectorized Matrix

Question

Royi Avital el 20 de Abr. de 2020

1
Enlazar

Enlace directo a esta pregunta

https://la.mathworks.com/matlabcentral/answers/519408-most-efficient-way-to-construct-the-matrices-to-extract-the-lower-and-upper-triangle-from-a-vectoriz

Comentada: Royi Avital el 24 de Abr. de 2020

Respuesta aceptada: James Tursa

Abrir en MATLAB Online

Given a matrix X and its vector form

I am after the most efficient way to build the matrices L and U which extracts the lower and upper triangle from X.

So in MATLAB code it would be something like that:

clear();
numRows = 3;
numCols = numRows;
mX = randn(numRows, numCols);
vX = mX(:);
% Lower Triangle are indices 2, 3, 6
mL = [  0, 1, 0, 0, 0, 0, 0, 0, 0   ; ...
        0, 0, 1, 0, 0, 0, 0, 0, 0   ; ...
        0, 0, 0, 0, 0, 1, 0, 0, 0   ];
% Upper Triangle are indices 4, 7, 8
mU = [  0, 0, 0, 1, 0, 0, 0, 0, 0   ; ...
        0, 0, 0, 0, 0, 0, 1, 0, 0   ; ...
        0, 0, 0, 0, 0, 0, 0, 1, 0   ];
assert(isequal(mL * vX, mX(logical(tril(mX, -1)))));
assert(isequal(mU * vX, mX(logical(triu(mX, 1)))));

I am after sparse represenation of mU and mL in the most efficient way.

My current implementation is given by:

function [ mLU ] = GenerateTriangleExtractorMatrix( numRows, triangleFlag, diagFlag )
EXTRACT_LOWER_TRIANGLE = 1;
EXTRACT_UPPER_TRIANGLE = 2;
INCLUDE_DIAGONAL = 1;
EXCLUDE_DIAGONAL = 2;
switch(diagFlag)
    case(INCLUDE_DIAGONAL)
        numElements = 0.5 * numRows * (numRows + 1);
        diagIdx = 0;
    case(EXCLUDE_DIAGONAL)
        numElements = 0.5 * (numRows - 1) * numRows;
        diagIdx = 1;
end
vJ = zeros(numElements, 1);
if(triangleFlag == EXTRACT_LOWER_TRIANGLE)
    elmntIdx = 0;
    for jj = 1:numRows
        for ii = (jj + diagIdx):numRows
            elmntIdx = elmntIdx + 1;
            vJ(elmntIdx) = ((jj - 1) * numRows) + ii;
        end
    end
elseif(triangleFlag == EXTRACT_UPPER_TRIANGLE)
    elmntIdx = numElements + 1;
    for jj = numRows:-1:1
        for ii = (jj - diagIdx):-1:1
            elmntIdx = elmntIdx - 1;
            vJ(elmntIdx) = ((jj - 1) * numRows) + ii;
        end
    end
end
mLU = sparse(1:numElements, vJ, 1, numElements, numRows * numRows, numElements);
end

Is there a more efficient way to generate vJ without extensive allocation of memory (In order to allow generating really large matrices)?

Thank You.

24 comentarios
Mostrar 22 comentarios más antiguosOcultar 22 comentarios más antiguos

Matt J el 23 de Abr. de 2020

Editada: Matt J el 23 de Abr. de 2020

Abrir en MATLAB Online

@Matt, there are many cases for using those matrices.

I can't think of any. You can pursue this for hypothetical interest if you want, of course.

Regarding fmincon(), Solving problem which have dedicate solver with general solver is usually a really bad chice.

The situations when that is true are those where calculating the objective and derivatives are faster in matrix form than in operator form. For large problems that won't always be the case, because the computational cost of implementing things in matrix form can start to outweigh the benefits of using a specialized algorithm.

Not to speak defining Linear Constarint in the form of Non Linear (Think time spent on calculating the Derivataive when it is so well defined).

I don't think mU and mL are helpful for defining either linear or nonlinear constraints. A linear constraint on the lower triangle of your unknown matrix X will always be of the form sum(T.*X, 'all')<=b, where T is some lower triangular matrix that you know in advance. The matrix form of the constraint gradient is simply T(:), which doesn't require mL at all to set up.

For non-linear constraints c(mL*X)<=0 on the lower triangular part of X, the gradient can be expressed mL.'*gradc(mL*X), but this could implemented efficeintly and without mL as follows:

B=B=tril(true(numRows),-1);
Bd=double(B);
g=Bd;
g(B)=gradc(X(B));

So, for this, you really only need to pre-compute B and Bd, which can be done with much less time and memory allocation than mL:

N=3000;
tic;
B=tril(true(N),-1);
Bd=double(B);
toc
%Elapsed time is 0.056492 seconds.
tic;
mL=GenerateTriangleExtractorMatrix( N, 1, 2);
toc
%Elapsed time is 0.264385 seconds.

>> whos B Bd mL
  Name   Size                Kilobytes     Class     Attributes
                                                               
  B      3000x3000                8790     logical             
  Bd     3000x3000               70313     double              
  mL     4498500x9000000        140602     double    sparse    

Matt J el 23 de Abr. de 2020

But that would mean your constraints are of the form mL*X(:)<=b. But since each row of mL contains only a single non-zero element, this means the constraint is equivalent to a simple bound X(j)<=b. In Matlab, you would never have to construct a matrix to represent such a constraint. You would use the vector input arguments lb and ub to specify those. I assume Gurobi has something similar.

Royi Avital el 23 de Abr. de 2020

@Matt, I know that. Whenever I can use other features of the solver I user. I have cases I need those extractors in Matrix Form. I appericiate the dialogue. I think other who will read it will gain something. I still hope someone will bring a different point of view to the pattern of vJ. Though I guess @James' solution as practically as good as it gets (Also appericate if there is something to make it even faster).

Iniciar sesión para comentar.

Iniciar sesión para responder a esta pregunta.

Answer 1

James Tursa el 22 de Abr. de 2020

2
Enlazar

Enlace directo a esta respuesta

https://la.mathworks.com/matlabcentral/answers/519408-most-efficient-way-to-construct-the-matrices-to-extract-the-lower-and-upper-triangle-from-a-vectoriz#answer_427698

Editada: James Tursa el 22 de Abr. de 2020

Abrir en MATLAB Online

Here is a mex routine that generates the sparse double matrices mL and mU directly, so no wasted memory in creating them. Seems to run about 3x-5x faster than m-code for somewhat large sizes.

/* S = GenerateTriangleExtractorMatrixMex(numRows,triangleFlag,diagFlag)
 *
 * S = double sparse matrix
 * numRows = integer > 0
 * triangleFlag = 1 , extract lower triangle
 *                2 , extract upper triangle
 * diagFlag = 1 , include diagonal
 *            2 , exclude diagonal
 * where
 *
 * M = an numRows X numRows matrix of non-zero terms
 * assert(isequal(S * M(:), mX(logical(tril(M, -1))))); % for lower
 * assert(isequal(S * M(:), mX(logical(triu(M,  1))))); % for upper
 *
 * Programmer: James Tursa
 * Date: 2020-April-22
*/
        
#include "mex.h"
void mexFunction( int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    mwSize numRows, triangleFlag, diagFlag, numElements;
    mwIndex *Ir, *Jc;
    mwIndex i, j, k, m;
    double *pr;
    
    if( nrhs != 3 || !mxIsNumeric(prhs[0]) || !mxIsNumeric(prhs[1]) || !mxIsNumeric(prhs[2]) ||
        mxGetNumberOfElements(prhs[0]) != 1 || mxGetNumberOfElements(prhs[1]) != 1 ||
        mxGetNumberOfElements(prhs[2]) != 1 ) {
        mexErrMsgTxt("Need three numeric scalar inputs");
    }
    if( nlhs > 1 ) {
        mexErrMsgTxt("Too many outputs");
    }
    numRows = mxGetScalar(prhs[0]);
    triangleFlag = mxGetScalar(prhs[1]);
    diagFlag = mxGetScalar(prhs[2]);
    if( numRows < 1 ) {
        mexErrMsgTxt("Invalid numRows, should be > 0");
    }
    if( triangleFlag != 1 && triangleFlag != 2 ) {
        mexErrMsgTxt("Invalid triangleFlag, should be 1 or 2");
    }
    if( diagFlag != 1 && diagFlag != 2 ) {
        mexErrMsgTxt("Invalid diagFlag, should be 1 or 2");
    }
    if( diagFlag == 1 ) {
        numElements = numRows * (numRows + 1) / 2; /* include diagonal */
    } else {
        numElements = (numRows - 1) * numRows / 2; /* exclude diagonal */
    }
    plhs[0] = mxCreateSparse(numElements, numRows*numRows, numElements, mxREAL);
    pr = (double *) mxGetData(plhs[0]);
    Ir = mxGetIr(plhs[0]);
    Jc = mxGetJc(plhs[0]);
    Jc[0] = 0;
    diagFlag--;
    k = 0;
    m = 1;
    if( triangleFlag == 1 ) { /* Lower */
        for( j=0; j<numRows; j++ ) {
            for( i=0; i<numRows; i++ ) {
                if( i >= j+diagFlag ) {
                    *pr++ = 1.0;
                    *Ir++ = k++;
                    Jc[m] = Jc[m-1] + 1;
                } else {
                    Jc[m] = Jc[m-1];
                }
                m++;
            }
        }
    } else { /* Upper */
        for( j=0; j<numRows; j++ ) {
            for( i=0; i<numRows; i++ ) {
                if( i+diagFlag <= j ) {
                    *pr++ = 1.0;
                    *Ir++ = k++;
                    Jc[m] = Jc[m-1] + 1;
                } else {
                    Jc[m] = Jc[m-1];
                }
                m++;
            }
        }
    }
}

You mex the routine as follows (you need a supported C compiler installed):

mex GenerateTriangleExtractorMatrixMex.c

And some test code:

% GenerateTriangleExtractorMatrix_test.m
n = 300;
disp('m-code timing')
tic
GenerateTriangleExtractorMatrix(10000,1,1);
toc
disp('mex code timing')
tic
GenerateTriangleExtractorMatrixMex(10000,1,1);
toc
for k=1:n
    numRows = ceil(rand*5000+100);
    numCols = numRows;
    triangleFlag = (rand<0.5) + 1;
    diagFlag = (rand<0.5) + 1;
    Mm = GenerateTriangleExtractorMatrix(numRows,triangleFlag,diagFlag);
    Mx = GenerateTriangleExtractorMatrixMex(numRows,triangleFlag,diagFlag);
    if( ~isequal(Mm,Mx) )
        error('Not equal');
    end
end
disp('Random tests passed')

With a sample run:

>> GenerateTriangleExtractorMatrix_test
m-code timing
Elapsed time is 9.964882 seconds.
mex code timing
Elapsed time is 1.901741 seconds.
Random tests passed

4 comentarios
Mostrar 2 comentarios más antiguosOcultar 2 comentarios más antiguos

Royi Avital el 22 de Abr. de 2020

@James, I meant I want to use all the abstractions of the MATLAB C API. I just want to use it in my own C files. Not for MEX but for general computing. Yet I guess MATLAB blocks that kind of use.

Royi Avital el 23 de Abr. de 2020

Editada: Royi Avital el 23 de Abr. de 2020

Abrir en MATLAB Online

By the way, I tried optimizing the code:

	if( triangleFlag == 1 ) { // Lower Triangle
		for( jj = 1; jj < numRows + 1; jj++ ) {
			for( ii = 1; ii < jj + diagFlag; ii++ ) {
				ll++;
				Jc[ll] = Jc[ll - 1];
			}
			for( ii = jj + diagFlag; ii < numRows + 1; ii++ ) {
				ll++;
				Jc[ll] = Jc[ll - 1] + 1;
				vV[kk] = 1.0;
				Ir[kk] = kk;
				kk++;
			}
		}
	} else { // Upper Triangle
		for( jj = 1; jj < numRows + 1; jj++ ) {
			for( ii = 1; ii < jj + 1 - diagFlag; ii++ ) {
				ll++;
				Jc[ll] = Jc[ll - 1] + 1;
				vV[kk] = 1.0;
				Ir[kk] = kk;
				kk++;
			}
			for( ii = jj + 1 - diagFlag; ii < numRows + 1; ii++ ) {
				ll++;
				Jc[ll] = Jc[ll - 1];
			}
		}
	}

But for some reason even removing the branching inside the loop didn't improve results.

Really Nice! If nothing comes up I will mark this as an answer. Thank You!

Iniciar sesión para comentar.

Answer 2

Matt J el 23 de Abr. de 2020

1
Enlazar

Enlace directo a esta respuesta

https://la.mathworks.com/matlabcentral/answers/519408-most-efficient-way-to-construct-the-matrices-to-extract-the-lower-and-upper-triangle-from-a-vectoriz#answer_428017

Editada: Matt J el 23 de Abr. de 2020

Abrir en MATLAB Online

Another approach to consider is to use my MatrixObj class

https://www.mathworks.com/matlabcentral/fileexchange/26611-on-the-fly-definition-of-custom-matrix-objects

to construct an object that has the same effect as the operations mL*X and mL.'*Y, but doesn't require you to actually build the matrix,

    N=5000;
    
    tic;
     mL0=GenerateTriangleExtractorMatrix( N, 1, 2);
    toc
    %Elapsed time is 0.678702 seconds.
         
    tic;
    
        B=tril(true(N),-1);
        Bd=double(B(:));
        
        mL=MatrixObj;
        mL.Params.B=B;
        mL.Params.Bd=Bd;
        mL.Ops.mtimes=@(obj,z) z(obj.Params.B);
        mL.Trans.mtimes=@mtimesT;
    
    toc;
    %Elapsed time is 0.086228 seconds.
    
    function out=mtimesT(obj,z) 
        
        out=obj.Params.Bd;
        out(obj.Params.B)=z;
        
    end

In addition to requiring less time to construct, you can verify that it gives the same results as multiplications with mL and mL.',

        >> X=rand(N^2,1);   isequal(mL0.'*(mL0*X),mL.'*(mL*X))
        
        ans =
    
          logical
        
           1

but with considerably less memory consumption:

  >> whos mL mL0
  
  Name   Size                  Kilobytes     Class       Attributes
                                                                   
  mL     1x1                      219739     MatrixObj             
  mL0    12497500x25000000        390586     double      sparse    

1 comentario
Mostrar -1 comentarios más antiguosOcultar -1 comentarios más antiguos

Royi Avital el 24 de Abr. de 2020

This is really nice. Thank you for the effort!

Iniciar sesión para comentar.

Answer 3

Royi Avital el 21 de Abr. de 2020

0
Enlazar

Enlace directo a esta respuesta

https://la.mathworks.com/matlabcentral/answers/519408-most-efficient-way-to-construct-the-matrices-to-extract-the-lower-and-upper-triangle-from-a-vectoriz#answer_427294

Abrir en MATLAB Online

My current solution:

function [ mLU ] = GenerateTriangleExtractorMatrix( numRows, triangleFlag, diagFlag )
EXTRACT_LOWER_TRIANGLE = 1;
EXTRACT_UPPER_TRIANGLE = 2;
INCLUDE_DIAGONAL = 1;
EXCLUDE_DIAGONAL = 2;
switch(diagFlag)
    case(INCLUDE_DIAGONAL)
        numElements = 0.5 * numRows * (numRows + 1);
        diagIdx = 0;
    case(EXCLUDE_DIAGONAL)
        numElements = 0.5 * (numRows - 1) * numRows;
        diagIdx = 1;
end
vJ = zeros(numElements, 1);
if(triangleFlag == EXTRACT_LOWER_TRIANGLE)
    elmntIdx = 0;
    for jj = 1:numRows
        for ii = (jj + diagIdx):numRows
            elmntIdx = elmntIdx + 1;
            vJ(elmntIdx) = ((jj - 1) * numRows) + ii;
        end
    end
elseif(triangleFlag == EXTRACT_UPPER_TRIANGLE)
    elmntIdx = numElements + 1;
    for jj = numRows:-1:1
        for ii = (jj - diagIdx):-1:1
            elmntIdx = elmntIdx - 1;
            vJ(elmntIdx) = ((jj - 1) * numRows) + ii;
        end
    end
end
mLU = sparse(1:numElements, vJ, 1, numElements, numRows * numRows, numElements);
end

I like the memory allocation is kept to a minimum.

I wonder if there is a more efficient way to generate vJ. It is trivial to remove the inner loop and just count the number of elements yet in MATLAB it will mean each iteration will allocate memory (As we don't have iterators).

2 comentarios
Mostrar NingunoOcultar Ninguno

Tommy el 21 de Abr. de 2020

The two methods are fairly similar - I also like that yours minimizes memory allocation. I ran a few simple fun tests:

I didn't dare try higher than 20,000 for numRows. It seems that your code may possibly perform better at higher values of numRows. In the second case (calculating both the upper and lower triangles) I had your code running both sets of for loops, one after the other (shown in red). In green is the result from your code if only the first set of loops runs, and you recognize that vJ for one triangle is easy to determine if you have vJ for the other triangle (N^2+1-flip(vJ)). So the only thing I'll conclude from this is, if you will eventually calculate both the lower and upper triangle matrices for a given size, it might be better to calculate them together and only find vJ once. I suppose it depends on how expensive N^2+1-flip(vJ) is.

Royi Avital el 21 de Abr. de 2020

@Tommy, Really liked your analysis. Yes, when dealing with sparse matrices the whole point it making sure allocation is kept to minimum. I agree if one wants both, it is better to do the trick you mentioned.

Let's see if someone can think on a different pattern to populate vJ which is more efficient.

Iniciar sesión para comentar.

Most Efficient Way to Construct the Matrices to Extract the Lower and Upper Triangle from a Vectorized Matrix

24 comentarios
Mostrar 22 comentarios más antiguosOcultar 22 comentarios más antiguos

Respuesta aceptada

4 comentarios
Mostrar 2 comentarios más antiguosOcultar 2 comentarios más antiguos

Más respuestas (2)

1 comentario
Mostrar -1 comentarios más antiguosOcultar -1 comentarios más antiguos

2 comentarios
Mostrar NingunoOcultar Ninguno

Ver también

Categorías

Etiquetas

Productos

Community Treasure Hunt

Most Efficient Way to Construct the Matrices to Extract the Lower and Upper Triangle from a Vectorized Matrix

24 comentarios Mostrar 22 comentarios más antiguosOcultar 22 comentarios más antiguos

Respuesta aceptada

4 comentarios Mostrar 2 comentarios más antiguosOcultar 2 comentarios más antiguos

Más respuestas (2)

1 comentario Mostrar -1 comentarios más antiguosOcultar -1 comentarios más antiguos

2 comentarios Mostrar NingunoOcultar Ninguno

Ver también

Categorías

Etiquetas

Productos

Community Treasure Hunt

24 comentarios
Mostrar 22 comentarios más antiguosOcultar 22 comentarios más antiguos

4 comentarios
Mostrar 2 comentarios más antiguosOcultar 2 comentarios más antiguos

1 comentario
Mostrar -1 comentarios más antiguosOcultar -1 comentarios más antiguos

2 comentarios
Mostrar NingunoOcultar Ninguno