Matlab Tools for Multiple Instruction Single Data (MISD)?

Hello, I have a problem with more than one million functions to evaluate, all different, on the same data set. This type of problem is classified as MISD (Multiple Instruction Single Data). Each mathematical function can be written as a MATLAB function with input and output arguments. What is the most efficient way to solve this type of problem with MATLAB?

2 comments

Jan on 8 May 2017
It depends on how the functions are stored. Please give us a small example.
Jan,
Thanks for your interest in the problem. For example, the main program declares the global variables:
  Name      Size            Bytes        Class     Attributes
  N1        1x1             8            double    global
  N6        1x1             8            double    global
  Px        82251x1         658008       double    global
  S         4496388x6       215826624    double    global
V = test0001;
For higher execution speed all the functions were explicitly generated:
% Function test0001
function vpe = test0001
global S Px N1 N6
vpe = -((S(Px,2)-N6+1).*(S(Px,2)-N6+2).*(S(Px,2)-N6+3))/6;
end


Answers (1)

Walter Roberson on 8 May 2017
Global variables are the slowest kind of variable. You would be better off parameterizing your functions.
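For instance, a minimal sketch of a parameterized form of the question's test0001 (the variable names mirror the whos listing above; this is an illustration, not the asker's actual generated code):

```matlab
function vpe = test0001(S, Px, N6)
% Same computation as the global-based version, but every input is an
% explicit parameter, which avoids the lookup cost of globals and lets
% parallel workers receive the data as arguments.
vpe = -((S(Px,2)-N6+1).*(S(Px,2)-N6+2).*(S(Px,2)-N6+3))/6;
end
```

It would then be called as V = test0001(S, Px, N6); with the data passed in directly.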

7 comments

Walter, thank you very much for the comment, but I would like guidance on how to solve MISD (Multiple Instruction Single Data) problems. We have a machine with 4000 processors (cores) and multiple GPUs, and I want to know if there is a strategy for this type of problem. GPUs are single instruction multiple data (SIMD), so we can only use the CPUs; however, I do need to know some parallel programming strategies for MISD with MATLAB.
Are the functions written in their own .m files, or are they anonymous functions, perhaps even created dynamically?
Are the same functions used over and over? If so, it might be practical to do some kind of transformation into another programming language that might be more efficient to execute.
Your sample function uses S(Px,2) multiple times, which appears to be taking a subset of S. Would all of the functions be taking the same subset, or would Px vary over the different functions? If it is the same subset, you could gain a lot of efficiency by extracting the subset before proceeding.
Is there a general pattern that the computations follow? If there is then that might make it easier to transform for higher efficiency.
I am thinking, for example, that maybe you could mechanically generate .cu files for GPU use; there would be overhead associated with that, but in some situations the overhead would be worth it.
By the way: none of the parallel processing facilities copy global variables.
Is it correct that what you show as global variables are essentially constants to the batch of computations?
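On the subset question above: if Px is the same for every function, the common subexpression could be hoisted out once before the batch runs, e.g. (a sketch using the question's variable names):

```matlab
s2 = S(Px,2) - N6;                  % extract the common subset once
vpe = -((s2+1).*(s2+2).*(s2+3))/6;  % each generated function then works on s2
```

The extraction cost is paid once instead of once per function.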
Walter,
a) I have tested both types of functions (anonymous and dynamically created).
b) Unfortunately, in a hypergeometric structure the functions are unique at each node, and I have 1114235 nodes, each with a different function, for a small problem.
c) Depending on the node (function type), different columns of the array S(1:n,1:m) are used.
d) All functions are different, each a combination of the previous layer.
e) Currently I am doing it with MATLAB: a code-generator program writes the 1114235 functions as text, starting from symbolic MATLAB.
By the way: none of the parallel processing facilities copy global variables.
D) OK, so do they use shared memory? Is that less efficient than passing parameters directly?
E) In the example, N1 and N6 are constants; S(Px,1:m) is the data to be analysed.
Global variables are not transferred at all; any global variable will show up as empty on the workers. Also, no shared memory is documented (if shared memory is used, it is an implementation detail that might not be the same on different operating systems or different hosts).
Walter Roberson on 9 May 2017
Edited: Walter Roberson on 9 May 2017
Under the circumstances you describe, I am suspecting that communications overhead would drown performance if you were to attempt to use one function per worker. I am suspecting that for efficiency it might make more sense to batch several computations into a single function, returning a matrix of results.
As you are calculating the functions symbolically, I would try creating a vector of symbolic expressions (for example, taking them 9689 at a time), passing that to matlabFunction and writing to a file (allowing optimization to occur); then, to run, use https://www.mathworks.com/help/distcomp/addattachedfiles.html and invoke the generated function on the workers.
Walter, many thanks for the advice. Do you have examples of similar code, or can you recommend a bibliography related to this subject?
I reiterate my gratitude. Sincerely, J.P.
If you have a vector of symbolic expressions, I would probably be tempted to factor() the length of the vector, reshape() with the last (largest) factor as the first dimension, then use matlabFunction on each column, asking to generate files. The purpose here is to batch enough work together to make it worth while to talk to a worker considering communications overhead.
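A sketch of that batching, assuming the 1114235 expressions are already collected in a symbolic column vector `exprs` built from scalar symbolic variables `s`, `px`, `n1`, `n6` (all of these names, and the exact 'Vars' list, are assumptions that depend on how the symbolic expressions were generated):

```matlab
syms s px n1 n6                   % the variables the expressions use
n = numel(exprs);                 % 1114235 = 5 * 23 * 9689
batch = max(factor(n));           % 9689, the largest prime factor
E = reshape(exprs, batch, []);    % 9689-by-115 array of expressions
for k = 1:size(E, 2)
    % one generated file per batch, each returning a 9689-element vector
    matlabFunction(E(:, k), 'File', sprintf('batch%04d', k), ...
                   'Vars', {s, px, n1, n6});
end
```

Each generated file then represents 9689 of the node functions, so one call to a worker does 9689 evaluations' worth of work.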
If the expressions are to be reused multiple times, leave the default matlabFunction optimization on; otherwise, the optimization phase might be more expensive than it is worth.
After that you can create workers, AddAttachedFiles, and parfeval() to invoke the functions on the current data.
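Concretely, the run step might look like the following (pool size, generated file names, and the data variables are assumptions for illustration):

```matlab
pool = parpool('local', 8);                              % size to the hardware
addAttachedFiles(pool, {'batch0001.m', 'batch0002.m'});  % ship generated files to workers
% queue one batch per worker; each call requests one output (a vector of results)
F = parfeval(pool, @batch0001, 1, Sdata, Pxdata, n1val, n6val);
results = fetchOutputs(F);                               % collect when done
```

In practice one would loop over all the generated batch functions, collecting an array of futures and fetching their outputs as they complete.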
Note: matlabFunction does not currently do well on vectors of functions whose components might require looping or piecewise(): in current implementations it tries to put 'if' statements inside the horzcat or vertcat operator. matlabFunction also currently does not do well on int() with a vector of locations to be integrated.
