SPMD Loop Iteration Job Not Submitting on MATLAB Distributed Computing Server

I have implemented SPMD on a loop that runs about 3000 iterations. I want to run the job on a MATLAB DCS cluster to see if I can get even better execution speeds. However, when I try to submit the job, I get the error: Unable to determine job requirements.
I suspect I am receiving this error because I didn't specify particular values for the input arguments when creating the task. But the values of the input variables change at every iteration, so I cannot supply a single fixed value.
How can I get my code to submit on the DCS server? My code is outlined below (OMP is a user-defined function written within the same MATLAB script):
c = parcluster('LegionProfile');
myJob = createCommunicatingJob(c, 'Type', 'SPMD');
num_workers = 24;
for iter = 1 : MAX_ITER
    for col = 1:n
        task = createTask(myJob, @OMP, 1, {D, Y(:,col), k(col)});
        submit(myJob);
        wait(myJob);
        X(:,col) = fetchOutputs(myJob);
    end
end

5 Comments

I'm a little confused here. A job of type 'SPMD' can have only a single task - so in your loop you should be doing something like:
for ...
    myJob = createCommunicatingJob(c, 'Type', 'SPMD');
    task = createTask(myJob, ...);
    submit(myJob);
    ...
end
I'm not sure quite where that error comes from. When you encounter it, could you please execute:
disp(MException.last)
and post the output?
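The one-task-per-SPMD-job structure suggested above can be sketched as follows. This is only an illustration under stated assumptions: the 'local' profile stands in for 'LegionProfile', `ompStandIn` is a hypothetical placeholder for the user-defined OMP function, and the small matrices are made up. It also assumes that fetchOutputs returns a cell array with one row per worker for a communicating job.

```matlab
% Assumption: the 'local' profile stands in for 'LegionProfile'.
c = parcluster('local');

% Hypothetical stand-in for the user-defined OMP function.
ompStandIn = @(D, y, k) D \ y;

% Small made-up inputs, for illustration only.
D = eye(4);
Y = magic(4);
n = size(Y, 2);
k = ones(1, n);
X = zeros(size(D, 2), n);

for col = 1:n
    % A fresh SPMD job per column, since each SPMD job holds a single task.
    myJob = createCommunicatingJob(c, 'Type', 'SPMD');
    myJob.NumWorkersRange = [1 1];   % one worker is enough for this toy task
    createTask(myJob, ompStandIn, 1, {D, Y(:, col), k(col)});
    submit(myJob);
    wait(myJob);
    out = fetchOutputs(myJob);       % cell array of task outputs
    X(:, col) = out{1};              % unwrap the first worker's result
    delete(myJob);                   % remove the finished job from the cluster
end
```

Note that every job submitted this way goes through the scheduler separately, so on a shared cluster the queueing overhead per column may outweigh the gain from running remotely.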
Thanks Edric for your suggestion. The output from disp(MException.last) is:
disp(MException.last)
MException with properties:
identifier: 'parallel:cluster:GenericSubmissionFailed'
message: [1x104 char]
cause: {[1x1 MException]}
stack: [3x1 struct]
Do you know what this means and how I can fix it? Thanks again for your help.
What version of MATLAB/PCT are you using? Could you also post the output of
getReport(MException.last)
In more recent MATLAB releases, the GenericSubmissionFailed error identifier should correspond to an error message something like
Job submission failed because the user supplied SubmitFcn (...) errored
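As a minimal sketch of how the nested error information can be captured without relying on MException.last, the snippet below wraps a failing call in try/catch and prints the full report plus each underlying cause. The error raised here is simulated: the identifiers `demo:submit` and `demo:scheduler` are made up for illustration, and in a real session the body of the try block would be the submit(myJob) call.

```matlab
try
    % Stand-in for submit(myJob): raise an error that carries a nested cause,
    % mimicking a submission failure reported by the scheduler.
    schedulerErr = MException('demo:scheduler', ...
        'Unable to determine job requirements.');
    submitErr = MException('demo:submit', 'Job submission failed.');
    submitErr = addCause(submitErr, schedulerErr);
    throw(submitErr);
catch err
    % Top-level identifier, then the full report including the causes.
    disp(err.identifier);
    disp(getReport(err, 'extended', 'hyperlinks', 'off'));

    % The scheduler's own message is usually in the cause list.
    for i = 1:numel(err.cause)
        disp(err.cause{i}.message);
    end
end
```

Catching the exception directly keeps the report tied to the failing call, whereas MException.last reflects whichever error was most recently uncaught at the command line.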
Hi Edric,
Thanks again for your contribution. I am using MATLAB/PCT R2013a. The output of getReport(MException.last) is shown below:
getReport(MException.last)
ans =
Error using parallel.Job/submit (line 304)
Job submission failed because the user supplied CommunicatingSubmitFcn (communicatingSubmitFcn) errored.
Error in MOD_OMP1 (line 146)
submit(myJob);
Error in Test1 (line 66)
[D_out, X_out] = MOD_OMP1(Y, r, s*ones(n,1), 'MAX_ITER', MAX_ITER, ...
Caused by:
Error using communicatingSubmitFcn (line 101)
Submit failed with the following message:
Unable to run job: Rejected by ucl_jsv4h Reason:Unable to determine job requirements.
Exiting.
Aha! The "Unable to determine job requirements" is coming from your underlying scheduling system. You'll probably need to work with your cluster admin to work out why that error is showing up.


Answers (1)

Take a look at this UCL FAQ as it mentions the same final error message you are encountering:
Does validation of your cluster profile pass successfully? If not, the issue is not with your particular script but with the setup of the cluster profile. Cluster profile validation is described here:


Asked: 17 Jun 2015

Commented: 19 Jun 2015
