Automatic sparsification algorithm and zero-threshold
MATLAB uses an automatic sparsification algorithm to convert an array from double to sparse, and also to propagate sparsity during arithmetic operations. There is a sparsity threshold value that decides whether an element should be regarded as 0. Unfortunately, this threshold apparently cannot be changed by the user. It is also extremely small: ~1e-324 (even smaller than realmin ≈ 1e-308).
e.g.,
A = bucky; % non-zeros are 1
subplot(131), spy(A), title('A')
subplot(132), spy(A*1e-323), title('A*1e-323')
subplot(133), spy(A*1e-324), title('A*1e-324')
This affects the propagation of sparsity through arithmetic operations on the initial sparse array. Although some resulting arrays are practically sparse, with many values smaller than eps, those values are stored as non-zeros and the memory requirements explode.
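For instance (a minimal sketch with made-up values; the tolerance `tol` here is an arbitrary choice, not a MATLAB setting), entries far below eps are still stored as explicit non-zeros, although they can be removed by hand:

```matlab
% Tiny entries survive in sparse storage no matter how small they are
S = sparse(1:3, 1:3, [1, 1e-20, 1e-320]);
nnz(S)                 % 3 -- every nonzero is kept, however tiny

% Manual clean-up with a user-chosen tolerance (assigning 0 to sparse
% elements removes them from storage; the logical mask is formed densely)
tol = eps;
S(abs(S) < tol) = 0;
nnz(S)                 % 1 -- only the entry above tol remains
```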
So I guess my question is the following: is there any way to change that tiny threshold?
Thank you for your time, Petros
9 comments
Petros Bogiatzis
on 20 Jun 2014
I don't think there's a general way. Normally, you wouldn't know in advance that the solution is sparse. If you did know that, then it means you have some special foreknowledge due to the structure of the problem. You would exploit that special structure to get the solution in an indirect way.
Petros Bogiatzis
on 20 Jun 2014
You don't need to know in advance whether the solution is sparse; you just need to discard, on the fly, values smaller than a certain threshold during the calculations.
You have to know in advance that the (post-truncation) result has a high density of zeros in order for the truncation you're talking about to have any benefit. If you're wrong, then representing the result in sparse form will consume roughly 3 times as much memory as full form. Representing it in full form will save you no memory over the non-truncated result and, if truncation is desired, you may as well just post-truncate.
In any case, I see no point in storing values smaller than the floating-point precision (and spending time considering them in calculations).
Keep in mind that the relevant threshold will depend on how small the non-truncated matrix values are. For example I guess the difference between eye(N) and eye(N)+eps is insignificant, but the difference between eye(N)*1e-10 and eye(N)*1e-10+eps is more significant. You would have to have foreknowledge about the expected size of your resulting values to select an appropriate threshold.
The value of N is also important. sum(eye(N)+eps) can be significantly different from sum(eye(N)) for large N. So, whether the influence of the small entries can be ignored depends on both dimension and the subsequent operations that you plan to do.
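A quick illustration of that dimension dependence (a sketch with an arbitrarily chosen N):

```matlab
% The accumulated effect of eps-sized entries grows linearly with N
N  = 5000;
s1 = sum(eye(N));        % exact column sums: all exactly 1
s2 = sum(eye(N) + eps);  % each column picks up roughly N*eps of extra mass
max(abs(s2 - s1))        % on the order of N*eps, here ~1e-12
```

So a truncation threshold that is harmless for small N can visibly change aggregate results as N grows.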
James Tursa
on 20 Jun 2014
@Petros: realmin = 2.2251e-308 is the smallest full precision double number in IEEE, but it is not the smallest number in IEEE double. Below realmin there is a range of denormalized numbers available with less precision (all the way down to only 1 bit of precision). The smallest denormalized number is 4.9407e-324. All of the normalized and denormalized numbers are non-zero, so they propagate in the sparse matrix calculations. I don't know how to change the behavior from strict non-zero to a tolerance. There is an FEX submission that can clean sparse matrices of small values here:
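The denormalized range described above can be seen directly (a small sketch):

```matlab
% Below realmin lies a range of denormal (subnormal) doubles
realmin                     % 2.2251e-308, smallest normalized double
realmin / 2                 % still nonzero: a denormal with reduced precision
smallest = realmin * 2^-52  % 4.9407e-324, the smallest positive denormal
smallest / 2                % underflows to exactly 0
```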
Petros Bogiatzis
on 24 Jun 2014
Answers (1)
About the only generic thing I can think of is to do the mldivide operation A\B column-by-column and apply the tolerance to each result. Then, at the very end, recompose the matrix:
[ma, na] = size(A);
[mb, nb] = size(B);
I = cell(1, nb); J = I; S = I;    % row indices, column indices, values per column
for j = 1:nb
    tmp = A \ B(:, j);            % solve one column at a time
    idx = find(abs(tmp) > tol).'; % tol is your chosen truncation tolerance
    I{j} = idx;
    J{j} = repmat(j, 1, length(idx));
    S{j} = tmp(idx).';
end
result = sparse([I{:}], [J{:}], [S{:}], na, nb);
There are potentially more efficient ways to do the intermediate mldivides A\B(:,j) by pre-decomposing A. For example, depending on A's structure, an LU decomposition may be appropriate, and that factorization can be computed once prior to the loop.
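A hedged sketch of that pre-factorization (assuming a square sparse A, and reusing the same tol, nb, I, J, S cells as in the loop above; lu's four-output sparse form with row and column permutations is used):

```matlab
% Factor A once: P*A*Q = L*U, then each column needs only two triangular solves
[L, U, P, Q] = lu(A);
for j = 1:nb
    tmp = Q * (U \ (L \ (P * B(:, j))));  % equivalent to A \ B(:,j), but cheaper
    idx = find(abs(tmp) > tol).';         % truncate and collect, as before
    I{j} = idx;
    J{j} = repmat(j, 1, length(idx));
    S{j} = tmp(idx).';
end
```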
Again, though, if you have enough foreknowledge about the solution to make such truncation both useful and safe, you should probably describe to us what it is about the problem structure that makes that foreknowledge possible. It's the kind of foreknowledge that often leads to good customized solutions.