Slurm for Users
Partitions
Partition organization at CINECA:
( https://wiki.u-gov.it/confluence/display/SCAIUS/Migration+from+PBS+to+SLURM+@+Cineca )
sinfo
PARTITION     AVAIL  TIMELIMIT   NODES  STATE  NODELIST
bdw_usr_dbg   up     30:00       26     idle   r033c01s[01-04,08-12],r033c02s[01-12],r033c03s[01-05]
bdw_usr_prod  up     1-00:00:00  26     idle   r033c01s[01-04,08-12],r033c02s[01-12],r033c03s[01-05]
knl_usr_dbg   up     30:00       32     idle   r067c01s[01-04],r067c02s[01-04],r067c03s[01-04],r067c04s[01-04],..
knl_usr_prod  up     1-00:00:00  32     idle   r067c01s[01-04],r067c02s[01-04],r067c03s[01-04],r067c04s[01-04],..
skl_usr_dbg   up     30:00       31     idle   r149c09s[01-04],r149c10s[01-04],r149c11s[01-04],r149c12s[01-04],..
..
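To request one of these partitions in a batch job, pass it with -p/--partition. A minimal job-script sketch, assuming the INF18_neumatt account used in the srun examples below (job name, time limit, and executable are placeholders):

#!/bin/bash
#SBATCH --job-name=test_bdw        # placeholder job name
#SBATCH --partition=bdw_usr_dbg    # one of the partitions listed above
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=4
#SBATCH --time=00:10:00            # must stay within the partition TIMELIMIT
#SBATCH --account=INF18_neumatt    # account as in the srun examples below
srun ./my_app                      # placeholder executable

Submit with: sbatch job.sh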
Possible queues on the E4 cluster:
bdw_dbg, bdw_prod, ...
User commands
squeue -u ralfieri
 JOBID  PARTITION  NAME  USER      ST  TIME  NODES  NODELIST(REASON)
108489  bdw_usr_d  BDW   ralfieri  PD  0:00  1      (None)

Job state codes:
R  - Job is running on compute nodes
PD - Job is waiting on compute nodes
CG - Job is completing
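To inspect a single job in more detail, the job ID from the listing can be passed to squeue or scontrol:

squeue -j 108489 -l       # long format for one job
scontrol show job 108489  # full job record (resources, reason, submit dir)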
scancel -u ralfieri # cancel all of ralfieri's jobs
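scancel also accepts a job ID or filters; a few variants, using the job ID and partition names from the examples above:

scancel 108489                      # cancel a single job by ID
scancel -u ralfieri -p bdw_usr_dbg  # cancel ralfieri's jobs in one partition
scancel -u ralfieri -t PENDING      # cancel only jobs still waiting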
Interactive jobs:
srun -N1 -n1 -A INF18_neumatt --pty bash
srun -N1 -n68 -A INF18_teongrav_0 -p knl_usr_dbg --pty bash
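An alternative pattern uses salloc to hold an allocation and launch several job steps inside it; a sketch assuming the same account and a placeholder executable:

salloc -N1 -n4 -A INF18_neumatt -p bdw_usr_dbg
srun ./my_mpi_app   # runs inside the allocation obtained by salloc
exit                # release the allocation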
env | grep SLURM
SLURM_CHECKPOINT_IMAGE_DIR=/var/slurm/checkpoint
SLURM_NODELIST=r000u10l05
SLURM_JOB_NAME=bash
SLURMD_NODENAME=r000u10l05
SLURM_TOPOLOGY_ADDR=r000u10l05
SLURM_PRIO_PROCESS=0
SLURM_SRUN_COMM_PORT=37737
SLURM_JOB_QOS=normal
SLURM_PTY_WIN_ROW=24
SLURM_TOPOLOGY_ADDR_PATTERN=node
SLURM_CPU_BIND_VERBOSE=quiet
SLURM_CPU_BIND_LIST=0x000000002
SLURM_NNODES=1
SLURM_STEP_NUM_NODES=1
SLURM_JOBID=108468
SLURM_NTASKS=1
SLURM_LAUNCH_NODE_IPADDR=10.27.0.108
SLURM_STEP_ID=0
SLURM_STEP_LAUNCHER_PORT=37737
SLURM_TASKS_PER_NODE=1
SLURM_WORKING_CLUSTER=marconi:10.27.0.117:6817:8192
SLURM_JOB_ID=108468
SLURM_JOB_USER=ralfieri
SLURM_STEPID=0
SLURM_SRUN_COMM_HOST=10.27.0.108
SLURM_CPU_BIND_TYPE=mask_cpu:
SLURM_PTY_WIN_COL=188
SLURM_UMASK=0022
SLURM_JOB_UID=26362
SLURM_NODEID=0
SLURM_SUBMIT_DIR=/marconi/home/userexternal/ralfieri/gravity/parma
SLURM_TASK_PID=26529
SLURM_NPROCS=1
SLURM_CPUS_ON_NODE=1
SLURM_DISTRIBUTION=cyclic
SLURM_PROCID=0
SLURM_JOB_NODELIST=r000u10l05
SLURM_PTY_PORT=42411
SLURM_LOCALID=0
SLURM_JOB_GID=25200
SLURM_JOB_CPUS_PER_NODE=1
SLURM_CLUSTER_NAME=marconi
SLURM_GTIDS=0
SLURM_SUBMIT_HOST=r000u08l03
SLURM_JOB_PARTITION=bdw_all_serial
SLURM_STEP_NUM_TASKS=1
SLURM_JOB_ACCOUNT=inf18_neumatt
SLURM_JOB_NUM_NODES=1
SLURM_STEP_TASKS_PER_NODE=1
SLURM_STEP_NODELIST=r000u10l05
SLURM_CPU_BIND=quiet,mask_cpu:0x000000002
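These variables are also available inside batch scripts; a minimal sketch that reads a few of the variables shown above (partition, account, and executable are placeholders):

#!/bin/bash
#SBATCH -N 2
#SBATCH -n 8
#SBATCH -p bdw_usr_prod
#SBATCH -A INF18_neumatt
echo "Job $SLURM_JOB_ID on $SLURM_JOB_NUM_NODES nodes: $SLURM_JOB_NODELIST"
cd "$SLURM_SUBMIT_DIR"             # directory where sbatch was invoked
srun -n "$SLURM_NTASKS" ./my_app   # placeholder executable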
sinfo -d reports only dead (non-responding) nodes:

sinfo -d
PARTITION        AVAIL  TIMELIMIT   NODES  STATE   NODELIST
bdw_all_serial*  up     4:00:00     0      n/a
bdw_meteo_prod   up     1-00:00:00  0      n/a
bdw_usr_dbg      up     30:00       0      n/a
bdw_usr_prod     up     1-00:00:00  0      n/a
bdw_fua_gwdbg    up     30:00       0      n/a
bdw_fua_gw       up     2-00:00:00  0      n/a
knl_usr_dbg      up     30:00       0      n/a
knl_usr_prod     up     1-00:00:00  3      drain*  r091c11s03,r096c18s01,r106c16s03
knl_usr_prod     up     1-00:00:00  1      down*   r103c11s01
knl_fua_prod     up     1-00:00:00  2      drain*  r086c06s01,r108c17s02
skl_fua_dbg      up     2:00:00     1      drain*  r129c01s01
skl_fua_prod     up     1-00:00:00  8      drain*  r130c02s04,r134c11s02,r136c08s04,r136c09s04,r144c14s01,r147c09s[01-02],r148c02s01
skl_fua_prod     up     1-00:00:00  1      down*   r130c10s02
skl_usr_dbg      up     30:00       0      n/a
skl_usr_prod     up     1-00:00:00  6      drain*  r165c14s04,r166c15s[01-04],r171c05s03
skl_usr_prod     up     1-00:00:00  1      down*   r163c17s03
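To check a single partition or individual nodes, sinfo's filter options can be combined (partition name taken from the table above):

sinfo -p knl_usr_prod        # summary for one partition
sinfo -p knl_usr_prod -N -l  # one line per node, long format
sinfo -R                     # reasons why nodes are drained or down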