User Guide (SLURM version)

Project Description (it)
Access / Login
In order to access the resources, you must be included in the LDAP database of the HPC management server. Requests for access or general assistance must be sent to: es_calcolo@unipr.it.
Once enabled, the login is done through SSH on the login host:
ssh <name.surname>@login.hpc.unipr.it
Password access is allowed only within the University network (160.78.0.0/16). Outside the University network you must either use the University VPN or log in with public key authentication.
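For convenience, the host and user name can be stored in the local SSH configuration, so that a short alias is enough to log in (a sketch; the alias and the user name are examples):
cat >> ~/.ssh/config <<'EOF'
Host hpc-unipr
    HostName login.hpc.unipr.it
    User name.surname
EOF
ssh hpc-unipr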
Password-less access between nodes
To use the cluster, password prompts between nodes must be eliminated by using public key authentication. Generate the key pair on login.hpc.unipr.it, without a passphrase, and add the public key to the authorization file (authorized_keys):
Key generation. Accept the defaults by pressing enter:
ssh-keygen -t rsa
Copy the public key into authorized_keys:
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
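If key-based login between nodes still asks for a password, the usual cause is too-permissive permissions on the .ssh directory; the standard fix (not specific to this cluster) is:
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys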
External access with public key authentication
The key pair must be generated with the SSH client. The private key should be protected by an appropriate passphrase (not mandatory, but recommended). The public key must be added to your authorized_keys file on the login host.
If you use the SSH client for Windows PuTTY (http://www.putty.org), you need to generate the public and private key pair with PuTTYgen and save them into a file. The private key must be included in the Putty (or WinSCP) configuration panel:
Configuration -> Connection -> SSH -> Auth -> Private key file for authentication
The public key must be included in the .ssh/authorized_keys file on login.hpc.unipr.it
Useful links for SSH client configuration: Linux, MacOS X, PuTTY, Windows SSH Secure Shell
The public key of the client (for example client_id_rsa.pub) must be appended to the file ~/.ssh/authorized_keys on the login host:
Copy the public key into authorized_keys:
cat client_id_rsa.pub >> ~/.ssh/authorized_keys
File transfer
SSH is the only protocol for external communication and can also be used for file transfer.
If you use a Unix-like client (Linux, MacOS X) you can use the command scp or sftp.
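For example, to upload or download a file (names and paths are placeholders):
scp data.tar.gz <name.surname>@login.hpc.unipr.it:           # upload to your home directory
scp <name.surname>@login.hpc.unipr.it:results.tar.gz .       # download to the current directory
sftp <name.surname>@login.hpc.unipr.it                        # interactive transfer session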
On Windows systems, the most used tool is WinSCP (https://winscp.net/eng/docs/introduction). During the installation of WinSCP it is possible to import Putty profiles.
SSH can also be used to mount a remote file-system using SshFS (see http://www.fis.unipr.it/dokuwiki/doku.php?id=calcoloscientifico:guidautente_slurm_en#sshfs)
Hardware
The current cluster is composed of the following computing nodes.
New computing nodes
- Cluster1 ( BDW)
- 8 nodes with 2 Intel Xeon E5-2683v4 (2x16 cores, 2.1GHz, 40MB smartcache), 128 GB RAM (E4)
- 9 nodes with 2 Intel Xeon E5-2680v4 (2x14 cores, 2.4GHz, 35MB smartcache), 128 GB RAM (DELL R730)
- 1 node with 2 Intel Xeon E5-2683v4 (2x16 cores, 2.1GHz, 40MB smartcache), 1024 GB RAM (E4 - FAT MEM)
- 1 node with 4 Intel Xeon E7-8880v4 (4x22 cores, 2.2GHz, 55MB smartcache), 512 GB RAM (HP - FAT CORES)
- Cluster2 ( GPU)
- 2 nodes with 2 Intel Xeon E5-2683v4 (2x16 cores, 2.1GHz), 128 GB RAM, 7 NVIDIA P100-PCIE-12GB GPUs (Pascal architecture)
- Cluster3 ( KNL)
- 4 nodes with 1 Intel Xeon PHI 7250 (1x68 cores, 1.4GHz, 16GB MCDRAM), 192 GB RAM.
Node details:
Node list - Usage (intranet only)
Peak performance (double precision):
- 1 BDW node -> 2x16 (cores) x 2.1 (GHz) x 16 (AVX2) = 1 TFlops, max memory bandwidth = 76.8 GB/s
- 1 P100 GPU -> 4.7 TFlops
- 1 KNL node -> 68 (cores) x 1.4 (GHz) x 32 (AVX512) = 3 TFlops, max memory bandwidth = 115.2 GB/s
Interconnection with Intel OmniPath
Peak performance:
Bandwidth: 100 Gb/s, Latency: 100 ns.
Software
The operating system for all types of nodes is CentOS 7.X.
Environment software (libraries, compilers and tools): List
Some software components must be loaded as modules before they can be used.
To list the available modules:
module avail
To load / unload a module (example: intel):
module load intel
module unload intel
To list the loaded modules:
module list
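Two further module commands that are often useful:
module show intel     # display what the module sets (paths, environment variables)
module purge          # unload all loaded modules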
Storage
The login node and computing nodes share the following storage areas:
Mount Point | Env. Var. | Backup | Quota | Note | Support |
---|---|---|---|---|---|
/hpc/home | $HOME | yes | 50 GB | Programs and data | SAN nearline |
/hpc/group (/hpc/account ?) | $GROUP | yes | 100 GB | Programs and data | SAN nearline |
/hpc/share | | | | Application software and databases | SAN nearline |
/hpc/scratch | $SCRATCH | no | 1? TB, max 1 month | Run-time data | SAN |
/hpc/archive | $ARCHIVE | no | | Archive | NAS/tape/cloud (1) |
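A common pattern is to stage run-time data on $SCRATCH inside the job and copy the results back at the end (a sketch; program and file names are examples):
mkdir -p "$SCRATCH/myrun"
cp input.dat "$SCRATCH/myrun"
cd "$SCRATCH/myrun"
./myprog input.dat
cp output.dat "$SLURM_SUBMIT_DIR"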
Acknowledgement
"This research benefits from the HPC (High Performance Computing) facility of the University of Parma, Italy."
Do not use the old sentence: "Part of this research is conducted using the High Performance Computing (HPC) facility of the University of Parma."
The authors are requested to communicate the references of their publications, which will be listed on the site.
Job Submission with Slurm
The queues are scheduled with Slurm Workload Manager.
Slurm Partitions
Cluster | Partition | Job resources | TIMELIMIT | Max running per user |
---|---|---|---|---|
BDW | bdw | 2-256 cores | 10-00:00:00 | |
KNL | knl | 2- cores | 10-00:00:00 | |
GPU | gpu | 1-10 GPU ?? | 0-24:00:00 | 6 |
 | vrt | 1 core | 10-00:00:00 | |
Global configurations:
- Global max running jobs per user: ??
- ..
- Other partitions can be defined for special needs (heterogeneous jobs, dedicated resources, ..)
Useful commands
Display the status of the queues in a synthetic way:
sinfo
Display the status of the individual queues in detail:
scontrol show partition
List of nodes and their status:
sinfo -N -l
Submission of a job:
srun <options>                 # interactive mode
sbatch <options> script.sh     # batch mode
squeue                         # display the jobs in the queue
sprio                          # show the dynamic priority
Main options
This option selects the partition (queue) to use:
-p <partition name> (The default partition is bdw ??)
Other options:
- -Nx: where x is the number of chunks (groups of cores on the same node)
- -ny: where y is the number of cores for each node (default 1)
- --gres=gpu:tesla:X: where X is the number of GPUs for each node (consumable resources)
- --mem=<size{units}>: requested memory per node
- --ntasks=Y: where Y is the number of MPI processes for each node
- --cpus-per-task=Z: where Z is the number of OpenMP threads for each process
- --exclusive: allocate nodes exclusively (not shared with other jobs)
Example of resource selection:
-p bdw -N1 -n2
-t <days-hours:minutes:seconds> Maximum execution time of the job. This value also determines the queue to be used. (Default: 0-00:72:00, to be verified)
Example:
-t 0-00:30:00
-A <account name>
--account=<account name>
Specifies the account to be charged for the use of resources. (Mandatory ??)
Example:
-A T_HPC17A
-oe
redirects the standard error to standard output.
--mail-user=<mail address>
The --mail-user option indicates one or more e-mail addresses, separated by commas, that will receive notifications from the queue manager.
--mail-type=<FAIL, BEGIN, END, NONE, ALL>
The --mail-type option indicates the events that trigger a notification:
- FAIL: notification in case of interruption of the job
- BEGIN: notification when job starts
- END: notification when job stops
- NONE: no notification
- ALL: all notifications
Example:
--mail-user=john.smith@unipr.it --mail-type=BEGIN,END
Priority
The priority (from queuing to execution) is defined dynamically by three parameters (they can be inspected with the commands shown after this list):
- Timelimit
- Aging (waiting time in partition)
- Fair share (amount of resources used in the last 14 days)
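The current values of these factors can be inspected with the standard Slurm commands (examples):
sprio -l                # priority components of pending jobs
sshare -u <username>    # fair-share usage of your account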
Advance reservation
It is possible to define an advance reservation for teaching activities or special requests.
Advance reservation policy: ToDo
To request a reservation, send an e-mail to es_calcolo@unipr.it.
Accounting
Reporting Example:
accbilling.sh -a <accountname> -s 2018-01-01 -e 2018-04-10
accbilling.sh -u <username> -s 2018-01-01 -e 2018-04-10
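Per-job details can also be obtained with the standard Slurm accounting command (a generic example):
sacct -u <username> -S 2018-01-01 -E 2018-04-10 --format=JobID,JobName,Partition,Elapsed,State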
Interactive jobs
To check the list of allocated resources, use interactive submission with srun. Once in interactive mode, the command echo $SLURM_JOB_NODELIST displays the list of allocated resources; the command squeue -al shows further details about them.
srun -N<nodes number> -n<cores number> -q <QOS> -C <node type> -t <wall time> -L <file system>
echo $SLURM_JOB_NODELIST
scontrol show job <jobID>
exit
Examples:
# 1 group (chunk) of 2 CPUs of type BDW and file system Scratch
srun -N1 -n2 -p bdw -L SCRATCH
# 2 chunks of 2 CPUs of type KNL and file system Scratch (they can stay on the same node)
srun -N2 -n2 -p knl -L SCRATCH
# The chunks must be on different nodes
srun -N2 -n2 -p knl --scatter
# 1 chunk with 2 GPUs on the GPU cluster
srun -N1 -p gpu --gres=gpu:2 -L SCRATCH
# 2 chunks, each with 2 GPUs, on different nodes
srun -N2 --gres=gpu:2 -p gpu --scatter
# --ntasks=Y defines how many MPI processes are activated for each chunk
srun -N2 -n1 --ntasks=1 -p bdw
Batch job
A shell script must be created that includes the SLURM options and the commands that must be executed on the nodes.
To submit the job and charge the related resources:
sbatch -A <account name> scriptname.sh
Each job is assigned a unique numeric identifier <Job Id>.
At the end of the execution the two files containing stdout and stderr will be created in the directory from which the job was submitted.
By default, the two files are named after the script with an additional extension:
Stdout: <script.sh>.o<job id>
Stderr: <script.sh>.e<job id>
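If explicit names are preferred, the output files can be set in the script; %j is replaced by the job id (a sketch):
#SBATCH -o myjob.%j.out
#SBATCH -e myjob.%j.err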
Serial jobs, compiler GNU
Compilation of the example mm.cpp for the calculation of the product of two matrices:
cp /hpc/share/samples/serial/mm.* .
g++ mm.cpp -o mm
Script mm.bash for the submission of the serial executable mm:
#!/bin/bash
# Request 1 chunk of 32 CPUs
#SBATCH -p bdw -N1 -n32
# Declare that the job will last at most 30 minutes (days-hours:minutes:seconds)
#SBATCH --time 0-00:30:00
# Charge resources to own account
#SBATCH $SBATCH_ACCOUNT
# Print the list of assigned nodes
echo $SLURM_JOB_NODELIST
# Enter the directory that contains the script
cd "$SLURM_SUBMIT_DIR"
# Execute the program
./mm
Submission:
sbatch mm.bash
To see the <job id> and the job state:
squeue
To cancel the job in progress:
scancel <Job id>
Serial jobs, compiler Intel
Compiling the cpi_mc.c example for the calculation of Pi:
cp /hpc/share/samples/serial/cpi/cpi_mc.c .
module load intel
icc cpi_mc.c -o cpi_mc_int
Script cpi_mc.bash for the submission of the serial executable cpi_mc_int:
#!/bin/bash
# Charge resources to own account
#SBATCH $SBATCH_ACCOUNT
# Print the list of assigned nodes
echo $SLURM_JOB_NODELIST
# Load the Intel compiler module
module load intel
# Enter the directory that contains the script
cd "$SLURM_SUBMIT_DIR"
# Execute the program
N=10000000
./cpi_mc_int -n $N
Submission:
sbatch cpi_mc.bash
Serial job, compiler PGI
Compiling the cpi_sqrt.c example for the computing of Pi:
cp /hpc/share/samples/serial/cpi/cpi_sqrt.c .
module load pgi
pgcc cpi_sqrt.c -o cpi_sqrt_pgi
Script cpi_sqrt_pgi.bash for the submission of the serial executable cpi_sqrt_pgi:
#!/bin/bash
# Default SLURM options. They can be omitted
#SBATCH -p bdw -N1 -n32
#SBATCH --time 0-00:30:00
# Charge resources to own account
#SBATCH $SBATCH_ACCOUNT
# Print the list of assigned nodes
echo $SLURM_JOB_NODELIST
module load pgi
# Enter the directory that contains the script
cd "$SLURM_SUBMIT_DIR"
N=10000000
./cpi_sqrt_pgi -n $N
sbatch cpi_sqrt_pgi.bash
Job OpenMP with GNU 4.8
cp /hpc/share/samples/omp/omp_hello.c .
gcc -fopenmp omp_hello.c -o omp_hello
Script omp_hello.bash with the request for 32 CPUs in exclusive use:
#!/bin/bash
#SBATCH -p bdw -N1 -n32
#SBATCH --exclusive
#SBATCH -t 0-00:30:00
#SBATCH $SBATCH_ACCOUNT
# Merge stderr with stdout
#SBATCH -oe
echo $SLURM_JOB_NODELIST
echo OMP_NUM_THREADS : $OMP_NUM_THREADS
cd "$SLURM_SUBMIT_DIR"
./omp_hello
Job OpenMP with Intel
module load intel
cp /hpc/share/samples/omp/mm/omp_mm.cpp .
Script mm_omp.bash with the request of 1 whole node with at least 32 cores:
#!/bin/bash
#SBATCH -p bdw_debug -N1 -n32
#SBATCH --time 0-00:30:00
#SBATCH -oe
#SBATCH --account=<account>
echo $SLURM_JOB_NODELIST
cd "$SLURM_SUBMIT_DIR"
module load intel
icpc -qopenmp omp_mm.cpp -o omp_mm
# To change the number of threads:
export OMP_NUM_THREADS=8
echo OMP_NUM_THREADS : $OMP_NUM_THREADS
./omp_mm
Job OpenMP with PGI
cp /hpc/share/samples/omp/mm/omp_mm.cpp .
Script omp_mm_pgi.bash. The BDW cluster consists of nodes with 32 cores.
The OMP_NUM_THREADS variable is by default equal to the number of cores. To use a different number of threads, set it with the --cpus-per-task option:
#!/bin/sh
#SBATCH -p bdw_debug -N1 -n32
#SBATCH --cpus-per-task=4
#SBATCH --time 0-00:30:00
#SBATCH -oe
echo $SLURM_JOB_NODELIST
cd "$SLURM_SUBMIT_DIR"
module load pgi
pgc++ -mp omp_mm.cpp -o omp_mm_pgi
echo OMP_NUM_THREADS : $OMP_NUM_THREADS
./omp_mm_pgi
sbatch -A <account name> omp_mm_pgi.bash
Job OpenMP with GNU 5.4
cp /hpc/share/samples/omp/cpi/* .
sbatch -A <account name> cpi2_omp.bash
python cpi2_omp.py
Job MPI, GNU OpenMPI
module load gnu openmpi
cp /hpc/share/samples/mpi/mpi_hello.c .
mpicc mpi_hello.c -o mpi_hello
Script mpi_hello.sh for using GNU OpenMPI:
#!/bin/bash
# 4 chunks of 16 CPUs each. Executes one MPI process for each CPU
#SBATCH -p bdw_debug -N4 -n16
#SBATCH -n 16
#SBATCH --time 0-00:30:00
#SBATCH -oe
echo "### SLURM_JOB_NODELIST ###"
echo $SLURM_JOB_NODELIST
echo "####################"
module load gnu openmpi
cd "$SLURM_SUBMIT_DIR"
mpirun ./mpi_hello
sbatch -A <account name> mpi_hello.sh
Job MPI with Intel MPI
module load intel intelmpi
which mpicc
cp /hpc/share/samples/mpi/mpi_mm.c .
mpicc mpi_mm.c -o mpi_mm_int
Script mpi_mm_int.sh for using Intel MPI:
#!/bin/sh
# 4 chunks of 16 CPUs each. Executes one MPI process for each CPU
#SBATCH -p bdw_debug -N4 -n16
#SBATCH -n 16
#SBATCH --time 0-00:30:00
#SBATCH -oe
echo "### SLURM_JOB_NODELIST ###"
echo $SLURM_JOB_NODELIST
echo "####################"
module load intel intelmpi
cd "$SLURM_SUBMIT_DIR"
mpirun ./mpi_mm_int
Job MPI with PGI
module load pgi openmpi
which mpicc
cp /hpc/share/samples/mpi/mpi_hello.c .
mpicc mpi_hello.c -o mpi_hello_pgi
Script mpi_hello_pgi.sh for using the PGI OpenMPI:
#!/bin/sh
#SBATCH -p bdw_debug -N4 -n16
#SBATCH --time 0-00:30:00
#SBATCH -oe
echo "### SLURM_JOB_NODELIST ###"
echo $SLURM_JOB_NODELIST
echo "####################"
module load cuda pgi openmpi
cd "$SLURM_SUBMIT_DIR"
mpirun --npernode 1 ./mpi_hello_pgi
Job MPI + OpenMP with GNU OpenMPI
module load gnu openmpi
cp -p /hpc/share/samples/mpi+omp/mpiomp_hello.c .
mpicc -fopenmp mpiomp_hello.c -o mpiomp_hello_gnu
Script mpiomp_hello_gnu for using GNU OpenMPI:
#!/bin/sh
# 4 chunks of 16 CPUs each, 1 MPI process for each chunk, 16 OpenMP threads per process
#SBATCH -p bdw_debug -N4 -n16
#SBATCH -n 4
# Number of OpenMP threads for each MPI process
#SBATCH --cpus-per-task=16
#SBATCH --time 0-00:30:00
#SBATCH -oe
echo "### SLURM_JOB_NODELIST ###"
echo $SLURM_JOB_NODELIST
echo "####################"
module load gnu openmpi
cd "$SLURM_SUBMIT_DIR"
mpirun ./mpiomp_hello_gnu
Job MPI + OpenMP with Intel MPI
module load intel intelmpi
cp /hpc/share/samples/mpi+omp/mpiomp_hello.c .
mpicc -qopenmp mpiomp_hello.c -o mpiomp_hello_int
#!/bin/sh
# 4 chunks of 16 CPUs each, 1 MPI process for each chunk, 16 OpenMP threads per process
#SBATCH -p bdw_debug -N4 -n16
#SBATCH -n 1
# Number of OpenMP threads for each MPI process
#SBATCH --cpus-per-task=16
#SBATCH --time 0-00:30:00
#SBATCH -oe
echo "### SLURM_JOB_NODELIST ###"
echo $SLURM_JOB_NODELIST
echo "####################"
module load intel intelmpi
cd "$SLURM_SUBMIT_DIR"
mpirun ./mpiomp_hello_int
Use of cluster KNL
The compiler to use is Intel.
The selection of the KNL cluster is done by specifying -p knl_<debug, pro ..> as required resources.
The maximum number of cores (ncpus) selectable per node is 68. Each physical core includes 4 virtual cores with hyperthreading technology, for a total of 272 per node.
#!/bin/sh
# 4 whole nodes. Executes one MPI process for each node and 128 threads per process
#SBATCH -p knl_debug -N4 -n1
#SBATCH -n 4
# Number of OpenMP threads for each MPI process
#SBATCH --cpus-per-task=128
#SBATCH --time 0-00:30:00
#SBATCH -oe
echo "### SLURM_JOB_NODELIST ###"
echo $SLURM_JOB_NODELIST
echo "####################"
module load intel intelmpi
cd "$SLURM_SUBMIT_DIR"
cp /hpc/share/samples/mpi+omp/mpiomp_hello.c .
mpicc -qopenmp mpiomp_hello.c -o mpiomp_hello_knl
mpirun ./mpiomp_hello_knl
Use of cluster GPU
The GPU cluster consists of 2 machines with 7 GPUs each. The GPUs of a single machine are identified by an integer ID that goes from 0 to 6.
The compiler to use is nvcc:
Compilation example:
cp /hpc/share/samples/cuda/hello_cuda.cu .
module load cuda
nvcc hello_cuda.cu -o hello_cuda
The GPU cluster selection is done by specifying -p gpu_<debug, pro, ..> and --gres=gpu:<1-7> among the required resources.
Example of submission on 1 of the 7 GPUs available on a single node of the GPU cluster:
#!/bin/sh
# 1 node with 1 GPU
#SBATCH -p gpu_debug -N1
#SBATCH --gres=gpu:tesla:1
#SBATCH --time 0-00:30:00
#SBATCH -oe
echo "### SLURM_JOB_NODELIST ###"
echo $SLURM_JOB_NODELIST
echo "####################"
module load cuda
cd "$SLURM_SUBMIT_DIR"
./hello_cuda
Example of submission of the N-BODY benchmark on all 7 GPUs available in a single node of the GPU cluster:
#!/bin/sh
# 1 node with 7 GPUs
#SBATCH -p gpu_debug -N1
#SBATCH --gres=gpu:tesla:7
#SBATCH --time 0-00:30:00
#SBATCH -oe
echo "### SLURM_JOB_NODELIST ###"
echo $SLURM_JOB_NODELIST
echo "####################"
module load cuda
cd "$SLURM_SUBMIT_DIR"
/hpc/share/tools/cuda-9.0.176/samples/5_Simulations/nbody/nbody -benchmark -numbodies 1024000 -numdevices=7
In the case of N-BODY, the number of GPUs to be used is specified with the -numdevices option (the specified value must not exceed the number of GPUs requested with the --gres option).
In general, the GPU IDs to be used are derived from the value of the CUDA_VISIBLE_DEVICES environment variable.
In the case of the last example we have:
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6
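Inside the job, the allocated GPUs can be checked by printing the variable or by running the deviceQuery sample (assuming it is built in the same CUDA samples tree used above):
echo "CUDA_VISIBLE_DEVICES = $CUDA_VISIBLE_DEVICES"
/hpc/share/tools/cuda-9.0.176/samples/1_Utilities/deviceQuery/deviceQuery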
Teamwork
To share files among the members of a group, the interactive and batch cases must be distinguished.
In interactive mode on the login node, the newgrp command changes the primary group and therefore the group ownership of new files:
newgrp <groupname>
The newgrp command on the HPC cluster also automatically enters the group directory (/hpc/group/<groupname>):
cd "$GROUP"
In the Batch mode you must indicate the group to be used with the following directive:
#SBATCH --account=<account>
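New files created by a batch job belong to the user's primary group; a minimal sketch for leaving the results readable and writable by the project group (directory, program and group names are examples):
umask 007                          # new files get group read/write permission
OUTDIR="$GROUP/myproject"          # shared area of the group
mkdir -p "$OUTDIR"
./myprog > "$OUTDIR/results.dat"
chgrp -R <groupname> "$OUTDIR"     # assign the files to the project group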
Scaling test
To launch a series of runs sequentially in the same job, for example to check the scaling of an algorithm:
cp /hpc/share/samples/serial/cpi/cpi_mc.c .
gcc cpi_mc.c -o cpi_mc
Script launch_single.sh
#!/bin/bash
cd "$SLURM_SUBMIT_DIR"
for N in $(seq 1000000 1000000 10000000)
do
    CMD="./cpi_mc -n $N"
    echo "# $CMD"
    eval $CMD >> cpi_mc_scaling.dat
done
sbatch -A <account name> launch_single.sh
The outputs of the different runs are written to the cpi_mc_scaling.dat file.
To generate a scaling plot we can use the python matplotlib library:
cp /hpc/share/samples/serial/cpi/cpi_mc_scaling.py .
python cpi_mc_scaling.py
Job Array
Using a single SLURM script it is possible to submit a set of jobs, which can run in parallel, specifying a different numerical parameter for each submitted job.
The --array option specifies the numeric sequence of parameters. At each run the value of the parameter is available in the $SLURM_ARRAY_TASK_ID variable.
Example:
Starts N jobs for the computation of Pi, with the number of intervals increasing from 100000 to 900000 in steps of 10000:
cp /hpc/share/samples/serial/cpi/cpi_mc.c .
gcc cpi_mc.c -o cpi_mc
Script slurm_launch_parallel.sh
#!/bin/sh
#SBATCH --array=100000-900000:10000
cd "$SLURM_SUBMIT_DIR"
CMD="./cpi_mc -n ${SLURM_ARRAY_TASK_ID}"
echo "# $CMD"
eval $CMD
sbatch -A <account name> slurm_launch_parallel.sh
Gather the outputs:
grep -vh '^#' slurm_launch_parallel.sh.o*.*
Job MATLAB
Execution of a MATLAB serial program
cp /hpc/share/samples/matlab/pi_greco.m .
Script matlab.sh
#!/bin/sh
#SBATCH -p bdw_debug -N1 -n1
#SBATCH --time 0-00:30:00
cd "$SLURM_SUBMIT_DIR"
module load matlab
matlab -nodisplay -r pi_greco
sbatch -A <account name> matlab.sh
Execution of a parallel job with MATLAB
cp /hpc/share/samples/matlab/pi_greco_parallel.m .
Script matlab_parallel.sh. The MATLAB version installed on the cluster allows using at most the cores of a single node; this example requests 4 cores.
#!/bin/sh
#SBATCH -p bdw_debug -N1 -n4
#SBATCH --time 0-00:30:00
cd "$SLURM_SUBMIT_DIR"
module load matlab
matlab -nodisplay -r pi_greco_parallel
sbatch -A <account name> matlab_parallel.sh
Execution of a program MATLAB on GPU
cp /hpc/share/samples/matlab/matlabGPU.m .    # ---- to do ----
Script matlabGPU.sh
#!/bin/bash
#SBATCH -p gpu_debug -N1 -n1
#SBATCH --gres=gpu:1
#SBATCH --time 0-00:30:00
cd "$SLURM_SUBMIT_DIR"
module load matlab cuda
matlab -nodisplay -r matlabGPU
sbatch -A <account name> matlabGPU.sh
Job MPI Crystal14
Script crystal14.sh for submitting the MPI version of Crystal14. It requests 4 nodes with 8 cores each and starts 8 MPI processes per node:
#!/bin/sh
# Job name
#SBATCH --job-name="crystal14"
# Resource request
#SBATCH -p bdw_debug -N4 -n8
#SBATCH -n8
#SBATCH --time 0-168:00:00
# input files directory
CRY14_INP_DIR='input'
# output files directory
CRY14_OUT_DIR='output'
# input files prefix
CRY14_INP_PREFIX='test'
# input wave function file prefix
CRY14_F9_PREFIX='test'
source /hpc/share/applications/crystal14
We recommend creating a folder for each simulation. Each folder must contain a copy of the crystal14.sh script (see the sketch after the list below).
- CRY14_INP_DIR: the input file or files must be in the 'input' subfolder of the current directory. To use the current directory, comment the line with the definition of the CRY14_INP_DIR variable. To change subfolder, change the value of the CRY14_INP_DIR variable.
- CRY14_OUT_DIR: the output files will be created in the 'output' subfolder of the current folder. To use the current directory, comment the line with the definition of the CRY14_OUT_DIR variable. To change subfolder modify the value of the variable CRY14_OUT_DIR.
- CRY14_INP_PREFIX: the file or input files have a prefix that must coincide with the value of the CRY14_INP_PREFIX variable. The string 'test' is purely indicative and does not correspond to a real case.
- CRY14_F9_PREFIX: the input file, with extension 'F9', is the result of a previous processing and must coincide with the value of the variable CRY14_F9_PREFIX. The string 'test' is purely indicative and does not correspond to a real case.
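A possible per-simulation layout, following the recommendation above (the folder and input names are purely indicative, like the 'test' prefix):
mkdir sim01
cd sim01
cp ../crystal14.sh .
mkdir input output
cp /path/to/test.d12 input/    # input file(s) with the CRY14_INP_PREFIX prefix
sbatch ./crystal14.sh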
The crystal14.sh script includes, in turn, the system script /hpc/software/bin/hpc-pbs-crystal14. The latter cannot be changed by the user.
Submission of the shell script
Navigate to the folder containing crystal14.sh and run the following command to submit the script to the job scheduler:
sbatch ./crystal14.sh
Analysis of files produced by Crystal14 during job execution
During execution of the job a temporary tmp folder is created, which contains two files:
nodes.par
machines.LINUX
The nodes.par file contains the names of the nodes that participate in the parallel computation.
The machines.LINUX file contains the names of the nodes that participate in the parallel computation, with a multiplicity equal to the number of MPI processes started on the node.
To locate the temporary folders produced by Crystal14 during the execution of the job, run the following command directly from the login node:
eval ls -d1 /hpc/node/wn{$(seq -s, 81 95)}/$USER/crystal/* 2>/dev/null
To check the contents of the files produced by Crystal14 during the execution of the job, the user can move to one of the folders highlighted by the previous command.
At the end of the execution of the job, the two files machines.LINUX and nodes.par are deleted. The temporary folder tmp is deleted only if it is empty.
It is therefore not necessary to log in with SSH to the nodes participating in the processing to check the contents of the files produced by Crystal14.
Job Gromacs
To define the GMXLIB environment variable, add the following lines to the file $HOME/.bash_profile:
GMXLIB=$HOME/gromacs/top
export GMXLIB
The path $HOME/gromacs/top is purely indicative. Modify it according to your preferences.
Job Gromacs OpenMP
Script mdrun_omp.sh to exclusively request a node with 32 cores and start 16 OpenMP threads:
#!/bin/sh
#SBATCH -p bdw_debug -N1 -n32
# Number of OpenMP threads
#SBATCH --cpus-per-task=16
#SBATCH --exclusive
#SBATCH --time 0-24:00:00
test "$SLURM_ENVIRONMENT" = 'SLURM_BATCH' || exit
cd "$SLURM_SUBMIT_DIR"
module load gnu openmpi
source '/hpc/share/applications/gromacs/5.1.4/mpi_bdw/bin/GMXRC'
gmx mdrun -s topology.tpr -pin on
Job Gromacs MPI and OpenMP
Script mdrun_mpi_omp.sh to exclusively request nodes with 32 cores and start 8 MPI processes (the number of OpenMP threads is calculated automatically):
#!/bin/sh
#SBATCH -p bdw_debug -N2 -n32
#SBATCH -n 8
#SBATCH --exclusive
#SBATCH --time 0-24:00:00
test "$SLURM_ENVIRONMENT" = 'SLURM_BATCH' || exit
cd "$SLURM_SUBMIT_DIR"
module load gnu openmpi
source '/hpc/share/applications/gromacs/5.1.4/mpi_bdw/bin/GMXRC'
# OpenMP threads per MPI process = cores per node / MPI processes per node
NNODES=$SLURM_JOB_NUM_NODES
NPROC=$SLURM_NTASKS
export OMP_NUM_THREADS=$((SLURM_CPUS_ON_NODE/(NPROC/NNODES)))
mpirun gmx mdrun -s topology.tpr -pin on
Job Abaqus
Job Abaqus MPI
Example script abaqus.sh to run Abaqus on 1 node, 32 cores, 0 GPUs:
#!/bin/bash
# walltime --time : estimated execution time, max 240 hours (better an estimate in excess than the effective time)
#SBATCH -p bdw_debug -N1 -n32
#SBATCH --time 0-240:00:00
echo $SLURM_JOB_NODELIST
# Modules needed for the execution of Abaqus
module load gnu intel openmpi
cd "$SLURM_SUBMIT_DIR"
abaqus j=testverita cpus=32    # j=<filename>.inp
Job Abaqus MPI with GPU
Example script abaqus-gpu.sh to run Abaqus on 1 node, 6 cores, 1 GPU:
#!/bin/bash
# walltime --time : estimated running time, max 240 hours (better an estimate slightly higher than the actual time)
#SBATCH -p gpu_debug -N1 -n6
#SBATCH --gres=gpu:1
#SBATCH --time 0-00:30:00
echo $SLURM_JOB_NODELIST
# Modules needed for the execution of Abaqus
module load gnu intel openmpi cuda
cd "$SLURM_SUBMIT_DIR"
abaqus j=testverita cpus=6 gpus=1    # j=<filename>.inp
SSHFS
To exchange data with a remote machine running an SSH server, you can use SSHFS.
SSHFS is a file system for Unix-like operating systems (MacOS X, Linux, BSD). It allows you to locally mount a folder located on a host running an SSH server, and it is based on the FUSE kernel module.
Currently it is only installed on login.pr.infn.it. Alternatively it can be installed on the remote Linux machine to access data on the cluster.
To use it:
mkdir remote                                            # create the mount point
sshfs <remote-user>@<remote-host>:<remote-dir> remote   # mount the remote file-system
df -h                                                   # list the mounted file systems
ls remote/
fusermount -u remote                                    # unmount the file system
VTune
VTune is a performance profiler from Intel and is available on the HPC cluster.
General information from Intel: https://software.intel.com/en-us/get-started-with-vtune-linux-os
Local Guide vtune (work in progress)
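A minimal command-line profiling sketch (the module name is assumed; amplxe-cl is the VTune command-line collector of this VTune generation):
module load vtune                              # assuming the module is published under this name
amplxe-cl -collect hotspots -r vtune_mm ./mm   # profile the mm example compiled above
amplxe-cl -report summary -r vtune_mm          # text summary of the collected data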