BATCH OUTPUT from SLURM

AR on 1 Nov 2019
Commented: Shenjie Zhou on 18 May 2020
Q: I ran my MATLAB script on SLURM in batch mode without errors, but I do not know how to access the 2 data files and 3 .fig files that my script should have generated. I assume they are somehow contained in the single output file, myfilename.out, that my SLURM job produced?
Any suggestions are appreciated, as I could not find anything via online searches that addressed this seemingly simple question.

Accepted Answer

AR on 10 Nov 2019
I figured out the problem, so for those experiencing something similar on their university compute clusters, see below. My university's cluster includes head nodes to run MATLAB in real time (mainly for testing/troubleshooting) and clustered nodes for batch processing. Since my program was too large (main function with 11 embedded functions, ~2000 lines of code), I had to use batch processing. I'm still optimizing how I run this, but hopefully the steps below may help those with questions similar to mine.
1. I had to insert my university cluster's directory in my MATLAB script to ensure my MATLAB output was written to it. I discovered that files could not be written to my home directory on the head node during batch processing, so they must first be written to my scratch directory on the batch compute cluster nodes. I modified my MATLAB script as shown below.
% Variables renamed from dir and ver so they do not shadow MATLAB's built-in dir and ver functions.
outDir='/scratch/myusername/'; % I had not previously defined a directory, so my MATLAB output was not being written
% on the batch processing cluster, yet the job still completed without errors
verTag='_1_';
%% File Names (for archiving results)
filename1 = [outDir,date,verTag,'Inputs_',instance,'.mat']; % instance is passed in from the SLURM script
filename4 = [outDir,date,verTag,'Groups_',instance,'.mat'];
filename5 = [outDir,date,verTag,'Var_',instance,'.mat'];
filename6a = [outDir,date,verTag,'SpikeFigs_',instance,'.fig'];
NOTE: I usually modify my code on my PC then upload via X-term to my University head node. I use MobaXterm and PuTTY.
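Since the root cause in step 1 was an output directory that could not be written to, it may help to test the directory before running a long job. The following is a minimal sketch, not something from my original workflow: check_writable is a hypothetical helper name, and the path in the example call is the placeholder from the script above. Because the head node and the compute nodes can have different permissions, the same check would probably need to be run from inside the batch job as well.

```shell
#!/bin/bash
# Hypothetical helper: confirm that a directory exists and accepts new files.
check_writable() {
    local target="$1"
    local probe
    if [ ! -d "$target" ]; then
        echo "missing: $target" >&2
        return 1
    fi
    # mktemp creates a real file, so success shows that writing is allowed.
    if ! probe=$(mktemp "$target/.write_test.XXXXXX"); then
        echo "not writable: $target" >&2
        return 1
    fi
    rm -f "$probe"
    echo "ok: $target"
}

# Example call (placeholder path, replace myusername):
# check_writable /scratch/myusername
```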
2. Log onto my university head node and change to my subfolder on this head node
$ cd ./Oct_2019_Models
3. Create/modify my SLURM batch script on my head node subfolder above. My first SLURM script below, still being improved.
#!/bin/sh
#SBATCH --job-name=myBatchJobname
#SBATCH --partition=all-LoPri
## NOTE: %u=userID, %x=jobName, %N=nodeID, %j=jobID, %A=arrayID, %a=arrayTaskID
#SBATCH --output=/scratch/%u/%x-%N-%j.out # Output file
#SBATCH --error=/scratch/%u/%x-%N-%j.err # Error file
#SBATCH --mail-type=BEGIN,END,FAIL # ALL,NONE,BEGIN,END,FAIL,REQUEUE,..
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=16
## Load the relevant modules needed for the job
module load MATLAB/v95/matlab_compiler_runtime
## Run your program or script
## Run your program or script, e.g. ./run_myCompiledMatlabScriptName.sh
## Below it is run for 3 instances (I declared an epoch instance vector in my MATLAB script).
## These instances run sequentially, as I've not yet figured out parallel computing via batch.
./run_myCompiledMatlabScriptName.sh $MATLAB_RUN 1
./run_myCompiledMatlabScriptName.sh $MATLAB_RUN 2
./run_myCompiledMatlabScriptName.sh $MATLAB_RUN 3
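The script above runs its three instances one after another. One commonly used way to run independent instances side by side is a SLURM job array, where each array task receives its own SLURM_ARRAY_TASK_ID. The sketch below is an untested assumption about how that could look here, not the method I used: it writes a hypothetical job file named myArrayJob.slurm, reuses the partition and module names from the script above, and assumes the three instances do not depend on each other. The %A and %a patterns from the NOTE line give each array task its own output file.

```shell
#!/bin/bash
# Write a hypothetical array-job script; the quoted 'EOF' keeps $VARS literal.
cat > myArrayJob.slurm <<'EOF'
#!/bin/sh
#SBATCH --job-name=myBatchJobname
#SBATCH --partition=all-LoPri
#SBATCH --array=1-3
#SBATCH --output=/scratch/%u/%x-%A_%a.out
#SBATCH --error=/scratch/%u/%x-%A_%a.err
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=16
module load MATLAB/v95/matlab_compiler_runtime
# Each array task runs one instance; SLURM sets SLURM_ARRAY_TASK_ID to 1, 2 or 3.
./run_myCompiledMatlabScriptName.sh $MATLAB_RUN $SLURM_ARRAY_TASK_ID
EOF
echo "Wrote myArrayJob.slurm; submit it with: sbatch myArrayJob.slurm"
```

Whether the three tasks actually start at the same time depends on the scheduler and on how many nodes are free.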
4. Load matlab on head node in background for my session and review my matlab script/function on my head node
$ module load matlab
5. Load matlab compiler on head node in background for my session and compile my matlab script/function
$ module load MATLAB/v95/matlab_compiler_runtime
$ mcc -v -R -nodisplay -R -singleCompThread -m myMatlabScriptName.m
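After compiling, it can be worth confirming that the files the batch script calls are really there before submitting. On Linux, mcc -m normally produces the application itself plus a wrapper named run_<application>.sh. The sketch below assumes that naming; check_compiled is a hypothetical helper, and the application name in the example is the placeholder used in the mcc command above.

```shell
#!/bin/bash
# Hypothetical helper: verify that the compiled app and its wrapper exist.
# Usage: check_compiled <appName> [directory]
check_compiled() {
    local app="$1"
    local dir="${2:-.}"
    local f
    for f in "$dir/$app" "$dir/run_$app.sh"; do
        if [ ! -x "$f" ]; then
            echo "missing or not executable: $f" >&2
            return 1
        fi
    done
    echo "compiled files present for $app"
}

# Example call (placeholder name from the mcc command):
# check_compiled myMatlabScriptName
```

Note that the wrapper name must match the name called in the SLURM script from step 3.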
6. Submit batch job to SLURM
$ sbatch mySlurmFileName.slurm
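If you want to look the job up later, it helps to keep the job ID that sbatch reports. A minimal sketch, assuming the --parsable option of sbatch is available on your cluster (it prints the job ID, optionally followed by a semicolon and the cluster name); submit_job is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: submit a job script and print only the job ID.
submit_job() {
    # --parsable prints "jobid" or "jobid;cluster"; cut keeps the first field.
    sbatch --parsable "$1" | cut -d';' -f1
}

# Example:
# jobid=$(submit_job mySlurmFileName.slurm)
# echo "Submitted job $jobid"
```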
7. Change directories from my head node to my $SCRATCH directory (the one used by the clustered compute nodes), check job completion, and confirm the MATLAB output files are in $SCRATCH
$ cd $SCRATCH
$ sacct -X
$ ls -l
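The completion check in step 7 can also be narrowed to a single job. The sketch below is an assumption about a convenient way to do this rather than what I ran: job_state is a hypothetical helper, and it relies on the sacct options -j (job ID), -X (allocation only), -n (no header), -P (unpadded output) and --format.

```shell
#!/bin/bash
# Hypothetical helper: print the state of one job, e.g. RUNNING or COMPLETED.
job_state() {
    # -X: allocation only (no job steps); -n: no header; -P: no column padding.
    # head keeps the first line in case several records are returned.
    sacct -j "$1" -X -n -P --format=State | head -n 1
}

# Example (replace 12345 with the ID reported by sbatch):
# job_state 12345
```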
8. Change directories back to my head node then copy matlab output files (see names above) to that subdirectory
$ cd /home/username/Oct_2019_Models
$ cp /scratch/username/myMatlabOutputFileName.mat myMatlabOutputFileName.mat
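With several .mat and .fig files per run, copying them one at a time gets tedious. The following is a minimal sketch that copies every .mat and .fig file found directly in a source directory; collect_outputs is a hypothetical helper, and the paths in the example are the placeholders from step 8. It assumes GNU find, as is typical on Linux clusters.

```shell
#!/bin/bash
# Hypothetical helper: copy all .mat and .fig files from one directory to another.
# Usage: collect_outputs <sourceDir> <destDir>
collect_outputs() {
    local src="$1"
    local dest="$2"
    mkdir -p "$dest" || return 1
    # Only files directly in src are copied; -p preserves timestamps.
    find "$src" -maxdepth 1 -type f \( -name '*.mat' -o -name '*.fig' \) \
        -exec cp -p {} "$dest"/ \;
}

# Example (placeholder paths from step 8):
# collect_outputs /scratch/username /home/username/Oct_2019_Models
```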
Hope this helps those with similar issues.
1 Comment
Shenjie Zhou on 18 May 2020
I had my MATLAB code write to the standard output file at each time step so that I could monitor the program's progress in batch mode, but it seems that all the information is written at once after the program has finished executing. Is there any way to have the output file updated as the program runs?
