How best to parallelize this simple code/algorithm?
4 visualizaciones (últimos 30 días)
Mostrar comentarios más antiguos
Problem Description: suppose that I have an input vector V (V = [v1,v2,...,vk]) with dyadic length (k=2^i, i>2). The algorithm involves the following two steps:
- Divide V into i sub-vectors of increasing dyadic lengths as follows: V = [v1,V_1,V_2,...,V_i]. For instance, if V = [v1,v2,...,v16] (k = 16, i = 4) then the sub vectors will be: V_1 = v2; V_2 = [v3,v4]; V_3 = [v5,v6,v7,v8]; and V_4 = [v9,v10,...,v16].
- For each of the sub-vectors with length greater than one, transform them using some function (call it f(x)). That is V'_j = f(V_j) (j = 2,3,...,i) where V' is the output vector for the algorithm. Note that, V' = [v1,v2,f(V_j)] for j = 2,3,...,i.
Problem Question (My Question): which is the best way to parallelize this algorithm code. Keep in mind that, processing of any sub-vector is independent from any other sub-vector. Here is the serial code for the algorithm:
V = rand(1,16); % The input vector
V_out = zeros(1,16); % The output vector initialized
V_out(1:2) = V(1:2); % The first two elements are assigned directly to the output
% The following is the loop for dealing with the sub-vectors
for k = 1:3 % loop over k = 1,2,...,i-1 (for this example, i = 4)
lower_index = 2^k + 1;
upper_index = 2^(k+1);
for jj = lower_index:1:upper_index
V_out(jj) = f(V(jj);
end
end
print(V_out);
0 comentarios
Respuestas (1)
Edric Ellis
el 15 de Oct. de 2015
For the code as written, since jj takes on each value in turn, you can simply flatten this into a single parfor loop, unless I'm missing something:
parfor jj = 3:(1 + 2^3)
V_out(jj) = f(V(jj));
end
0 comentarios
Ver también
Categorías
Más información sobre Graphics Performance en Help Center y File Exchange.
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!