Automatic sparsification algorithm and zero-threshold
MATLAB uses an automatic sparsification algorithm to convert an array from double to sparse, and also to propagate sparsity during arithmetic operations. There is a sparsity threshold value that decides whether an element should be regarded as 0. Unfortunately, this threshold apparently cannot be changed by the user. It is also extremely small: ~1e-324 (even smaller than realmin ≈ 1e-308).
e.g.,
A = bucky; % non-zeros are 1
subplot(131), spy(A), title('A')
subplot(132), spy(A*1e-323), title('A*1e-323')
subplot(133), spy(A*1e-324), title('A*1e-324')
This affects the propagation of sparsity through arithmetic operations on the initial sparse array. Although some resulting arrays are practically sparse, with many values smaller than eps, those values are stored as non-zeros and the memory requirements explode.
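For instance (a minimal sketch with made-up values; the tolerance `tol` here is an arbitrary choice, not a MATLAB setting), entries far below eps are still stored as explicit non-zeros, although they can be removed by hand:

```matlab
% Tiny entries survive in sparse storage no matter how small they are
S = sparse(1:3, 1:3, [1, 1e-20, 1e-320]);
nnz(S)                 % 3 -- every nonzero is kept, however tiny

% Manual clean-up with a user-chosen tolerance (assigning 0 to sparse
% elements removes them from storage; the logical mask is formed densely)
tol = eps;
S(abs(S) < tol) = 0;
nnz(S)                 % 1 -- only the entry above tol remains
```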
So I guess my question is the following: is there any way to change that tiny threshold?
Thank you for your time, Petros
9 comments
Petros Bogiatzis
on 20 Jun 2014
I don't think there's a general way. Normally, you wouldn't know in advance that the solution is sparse. If you did know that, then it means you have some special foreknowledge due to the structure of the problem. You would exploit that special structure to get the solution in an indirect way.
Petros Bogiatzis
on 20 Jun 2014
You don't need to know in advance whether the solution is sparse; you just need to discard, on the fly, values smaller than a certain threshold during the calculations.
You have to know in advance that the (post-truncation) result has a high density of zeros in order for the truncation you're talking about to have any benefit. If you're wrong, then representing the result in sparse form will consume roughly 3 times as much memory as full form. Representing it in full form will save you no memory over the non-truncated result and, if truncation is desired, you may as well just post-truncate.
In any case, I see no point in storing values smaller than the floating-point precision (and spending time considering them in calculations).
Keep in mind that the relevant threshold will depend on how small the non-truncated matrix values are. For example I guess the difference between eye(N) and eye(N)+eps is insignificant, but the difference between eye(N)*1e-10 and eye(N)*1e-10+eps is more significant. You would have to have foreknowledge about the expected size of your resulting values to select an appropriate threshold.
The value of N is also important. sum(eye(N)+eps) can be significantly different from sum(eye(N)) for large N. So, whether the influence of the small entries can be ignored depends on both dimension and the subsequent operations that you plan to do.
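A quick illustration of that dimension dependence (a sketch with an arbitrarily chosen N):

```matlab
% The accumulated effect of eps-sized entries grows linearly with N
N  = 5000;
s1 = sum(eye(N));        % exact column sums: all exactly 1
s2 = sum(eye(N) + eps);  % each column picks up roughly N*eps of extra mass
max(abs(s2 - s1))        % on the order of N*eps, here ~1e-12
```

So a truncation threshold that is harmless for small N can visibly change aggregate results as N grows.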
James Tursa
on 20 Jun 2014
@Petros: realmin = 2.2251e-308 is the smallest full precision double number in IEEE, but it is not the smallest number in IEEE double. Below realmin there is a range of denormalized numbers available with less precision (all the way down to only 1 bit of precision). The smallest denormalized number is 4.9407e-324. All of the normalized and denormalized numbers are non-zero, so they propagate in the sparse matrix calculations. I don't know how to change the behavior from strict non-zero to a tolerance. There is an FEX submission that can clean sparse matrices of small values here:
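The denormalized range described above can be seen directly (a small sketch):

```matlab
% Below realmin lies a range of denormal (subnormal) doubles
realmin                     % 2.2251e-308, smallest normalized double
realmin / 2                 % still nonzero: a denormal with reduced precision
smallest = realmin * 2^-52  % 4.9407e-324, the smallest positive denormal
smallest / 2                % underflows to exactly 0
```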
Petros Bogiatzis
on 24 Jun 2014
Answers (1)
About the only generic thing I can think of is to do the mldivide operation A\B column-by-column and apply the tolerance to each result. Then, at the very end, recompose the matrix:
[ma, na] = size(A);
[mb, nb] = size(B);
I = cell(1, nb); J = I; S = I;    % row indices, column indices, values per column
for j = 1:nb
    tmp = A \ B(:, j);            % solve one column at a time
    idx = find(abs(tmp) > tol).'; % tol is your chosen truncation tolerance
    I{j} = idx;
    J{j} = repmat(j, 1, length(idx));
    S{j} = tmp(idx).';
end
result = sparse([I{:}], [J{:}], [S{:}], na, nb);
There are potentially more efficient ways to do the intermediate mldivides A\B(:,j) by pre-decomposing A. For example, depending on A's structure, an LU decomposition may be appropriate, and that factorization can be computed once prior to the loop.
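A hedged sketch of that pre-factorization (assuming a square sparse A, and reusing the same tol, nb, I, J, S cells as in the loop above; lu's four-output sparse form with row and column permutations is used):

```matlab
% Factor A once: P*A*Q = L*U, then each column needs only two triangular solves
[L, U, P, Q] = lu(A);
for j = 1:nb
    tmp = Q * (U \ (L \ (P * B(:, j))));  % equivalent to A \ B(:,j), but cheaper
    idx = find(abs(tmp) > tol).';         % truncate and collect, as before
    I{j} = idx;
    J{j} = repmat(j, 1, length(idx));
    S{j} = tmp(idx).';
end
```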
Again, though, if you have enough foreknowledge about the solution to make such truncation both useful and safe, you should probably describe to us what it is about the problem structure that makes that foreknowledge possible. It's the kind of foreknowledge that often leads to good customized solutions.