Transfer Data with Job Methods and Properties
You can transfer data to a cloud cluster using the AttachedFiles
property of the parallel.Job (Parallel Computing Toolbox) object, as you do for other clusters. For
example:
Place all required executable and data files in the same folder.
Specify that folder in the AttachedFiles property of the job.
Submit your job. This transfers the files to the cloud and makes them available to the workers running on the cloud cluster.
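The steps above can be sketched with the job-based workflow as follows. This is a minimal sketch, not a definitive implementation: the folder name myJobFiles and the function processData are hypothetical placeholders, and the code assumes your default cluster profile points to a cloud cluster.

```matlab
% Assumption: the default cluster profile is a Cloud Center cluster.
c = parcluster;

% Create a job and attach a folder of files to it.
% "myJobFiles" is a hypothetical folder holding the required files.
job = createJob(c);
job.AttachedFiles = {'myJobFiles'};

% Add one task that calls a hypothetical function, processData,
% with one output argument and no input arguments.
createTask(job,@processData,1,{});

% Submit the job. The attached files are transferred to the workers.
submit(job);
wait(job);

% Collect the task outputs as a cell array.
results = fetchOutputs(job);
```

Attaching a whole folder keeps the job definition short when the code depends on many files; you can also list individual files instead.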
You can access your task or batch function results from the finished job by using the
fetchOutputs (Parallel Computing Toolbox) function. For batch jobs running on the cloud, you can also access the job workspace variables with the load (Parallel Computing Toolbox) function in your client session.
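As an illustration of loading workspace variables, the following sketch runs a script as a batch job and then loads its variables into the client session. The script name myAnalysisScript and the variable name result are hypothetical, and the code assumes your default cluster profile is a cloud cluster.

```matlab
% Assumption: the default cluster profile is a Cloud Center cluster.
c = parcluster;

% Run a hypothetical script, myAnalysisScript, as a batch job.
job = batch(c,"myAnalysisScript");
wait(job);

% Load one hypothetical variable, result, from the job workspace
% into the client workspace.
load(job,"result");

% Alternatively, load all job workspace variables into a structure
% so that they do not overwrite variables in the client workspace.
S = load(job);
```

Loading into a structure is the safer choice when the script and the client session might use the same variable names.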
This example shows how to run a batch job on a cluster in Cloud Center, using the function
divideData and data files stored on your local machine.
Load Data
Copy the data for this example to your current working folder by opening the supporting
function prepareSupportingFiles and using the code inside.
openExample("parallel/RunBatchJobAndAccessFilesFromWorkersExample", ...
    supportingFile="prepareSupportingFiles.m")
Your current working folder now contains four files: A.dat,
B1.dat, B2.dat, and B3.dat.
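Before submitting the job, you might want to confirm that the data files are present. The following is an optional sketch, not part of the original example, that checks the current folder for the four files named above.

```matlab
% File names listed in this example.
expected = ["A.dat","B1.dat","B2.dat","B3.dat"];

% isfile returns one logical value per file name.
present = isfile(expected);

if all(present)
    disp("All data files found.");
else
    % Report any files that are missing from the current folder.
    warning("Missing files: %s",strjoin(expected(~present),", "));
end
```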
Run Batch Job
Create and discover your Cloud Center profile in MATLAB. Specify this profile as your default cluster profile. For more details, see Create and Discover Clusters.
Create a cluster object using parcluster (Parallel Computing Toolbox).
c = parcluster;
Submit the divideData function as a batch job by using batch (Parallel Computing Toolbox). Use
the AttachedFiles name-value argument to transfer files from your local
machine to the workers. For example, use a parallel pool with three workers and offload the
computations in the divideData function.
filenames = "B" + string(1:3) + ".dat";
job = batch(c,@divideData,1,{}, ...
    Pool=3, ...
    AttachedFiles=filenames);
To block MATLAB until the job completes, use the wait (Parallel Computing Toolbox)
function on the job object.
wait(job);
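If you prefer not to block MATLAB indefinitely, a hedged alternative is to pass a timeout to wait and then inspect the State property of the job. The sketch below assumes job is the batch job object created above; the 600-second timeout is an arbitrary illustrative value.

```matlab
% Wait for the job to reach the finished state, for up to 600 seconds.
% The timeout value is an arbitrary choice for illustration.
finished = wait(job,"finished",600);

if finished
    disp("Job reached the finished state.");
else
    % The timeout elapsed first. Report the current state of the job.
    fprintf("Job is still in state: %s\n",job.State);
end
```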
Retrieve Results and Clean Up Data
To retrieve the results of a batch job, use the fetchOutputs (Parallel Computing Toolbox)
function. fetchOutputs returns a cell array containing the outputs of the
function run in the batch job. You can also access the job workspace variables with the
load (Parallel Computing Toolbox) function.
X = fetchOutputs(job)
X = 1×1 cell array
    {40×207 double}
When you have retrieved all the required outputs and do not need the job object anymore, delete it to clean up its data and avoid consuming resources unnecessarily.
delete(job)
clear job
For more details, see Run Batch Job and Access Files from Workers (Parallel Computing Toolbox). To learn more about sharing data with workers, see Pass Data to and from Worker Sessions (Parallel Computing Toolbox).