
Slurm for Users

Partitions

Partition organization at CINECA:

( https://wiki.u-gov.it/confluence/display/SCAIUS/Migration+from+PBS+to+SLURM+@+Cineca )

sinfo
PARTITION       AVAIL  TIMELIMIT  NODES  STATE NODELIST
bdw_usr_dbg       up      30:00     26   idle r033c01s[01-04,08-12],r033c02s[01-12],r033c03s[01-05]
bdw_usr_prod      up 1-00:00:00     26   idle r033c01s[01-04,08-12],r033c02s[01-12],r033c03s[01-05]
knl_usr_dbg       up      30:00     32   idle r067c01s[01-04],r067c02s[01-04],r067c03s[01-04],r067c04s[01-04],..
knl_usr_prod      up 1-00:00:00     32   idle r067c01s[01-04],r067c02s[01-04],r067c03s[01-04],r067c04s[01-04],..
skl_usr_dbg       up      30:00     31   idle r149c09s[01-04],r149c10s[01-04],r149c11s[01-04],r149c12s[01-04],..
..

Possible queues on the E4 cluster:

bdw_dbg, bdw_prod, ...
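
The partitions actually defined on the cluster you are logged into can be listed with a custom sinfo format; the field selection below is just one possible choice (partition, time limit, node count, node state):

sinfo -o "%P %l %D %t"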

User commands

squeue -u ralfieri

 JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
108489 bdw_usr_d      BDW ralfieri PD       0:00      1 (None)
R  - Job is running on compute nodes
PD - Job is pending (waiting for an allocation of compute nodes)
CG - Job is completing
scancel -u ralfieri # cancel all of ralfieri's jobs
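
A single job can also be cancelled by its job ID (here the ID from the squeue example above):

scancel 108489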

Interactive jobs:

srun -N1 -n1   -A INF18_neumatt --pty bash
srun -N1 -n68  -A INF18_teongrav_0 -p knl_usr_dbg --pty bash
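
A wall-clock limit can also be requested explicitly; the account name and limits below are illustrative placeholders:

srun -N1 -n1 -t 30:00 -p bdw_usr_dbg -A <account> --pty bash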
Inside the job, Slurm exports a number of environment variables:

env | grep SLURM
SLURM_CHECKPOINT_IMAGE_DIR=/var/slurm/checkpoint
SLURM_NODELIST=r000u10l05
SLURM_JOB_NAME=bash
SLURMD_NODENAME=r000u10l05
SLURM_TOPOLOGY_ADDR=r000u10l05
SLURM_PRIO_PROCESS=0
SLURM_SRUN_COMM_PORT=37737
SLURM_JOB_QOS=normal
SLURM_PTY_WIN_ROW=24
SLURM_TOPOLOGY_ADDR_PATTERN=node
SLURM_CPU_BIND_VERBOSE=quiet
SLURM_CPU_BIND_LIST=0x000000002
SLURM_NNODES=1
SLURM_STEP_NUM_NODES=1
SLURM_JOBID=108468
SLURM_NTASKS=1
SLURM_LAUNCH_NODE_IPADDR=10.27.0.108
SLURM_STEP_ID=0
SLURM_STEP_LAUNCHER_PORT=37737
SLURM_TASKS_PER_NODE=1
SLURM_WORKING_CLUSTER=marconi:10.27.0.117:6817:8192
SLURM_JOB_ID=108468
SLURM_JOB_USER=ralfieri
SLURM_STEPID=0
SLURM_SRUN_COMM_HOST=10.27.0.108
SLURM_CPU_BIND_TYPE=mask_cpu:
SLURM_PTY_WIN_COL=188
SLURM_UMASK=0022
SLURM_JOB_UID=26362
SLURM_NODEID=0
SLURM_SUBMIT_DIR=/marconi/home/userexternal/ralfieri/gravity/parma
SLURM_TASK_PID=26529
SLURM_NPROCS=1
SLURM_CPUS_ON_NODE=1
SLURM_DISTRIBUTION=cyclic
SLURM_PROCID=0
SLURM_JOB_NODELIST=r000u10l05
SLURM_PTY_PORT=42411
SLURM_LOCALID=0
SLURM_JOB_GID=25200
SLURM_JOB_CPUS_PER_NODE=1
SLURM_CLUSTER_NAME=marconi
SLURM_GTIDS=0
SLURM_SUBMIT_HOST=r000u08l03
SLURM_JOB_PARTITION=bdw_all_serial
SLURM_STEP_NUM_TASKS=1
SLURM_JOB_ACCOUNT=inf18_neumatt
SLURM_JOB_NUM_NODES=1
SLURM_STEP_TASKS_PER_NODE=1
SLURM_STEP_NODELIST=r000u10l05
SLURM_CPU_BIND=quiet,mask_cpu:0x000000002
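
Many of these variables are also useful inside batch scripts. Below is a minimal sbatch sketch; the account, partition and executable names are placeholders to be adapted:

#!/bin/bash
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -t 00:30:00
#SBATCH -A <account>
#SBATCH -p bdw_usr_dbg
# Report where the job landed, using variables set by Slurm
echo "Job $SLURM_JOB_ID running $SLURM_NTASKS task(s) on $SLURM_JOB_NODELIST"
srun ./my_program   # hypothetical executable

Submit it with sbatch job.sh and check its state with squeue.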
Nodes that are currently unavailable (drained or down) can be listed with:

sinfo -d

bdw_all_serial*    up    4:00:00      0    n/a
bdw_meteo_prod     up 1-00:00:00      0    n/a
bdw_usr_dbg        up      30:00      0    n/a
bdw_usr_prod       up 1-00:00:00      0    n/a
bdw_fua_gwdbg      up      30:00      0    n/a
bdw_fua_gw         up 2-00:00:00      0    n/a
knl_usr_dbg        up      30:00      0    n/a
knl_usr_prod       up 1-00:00:00      3 drain* r091c11s03,r096c18s01,r106c16s03
knl_usr_prod       up 1-00:00:00      1  down* r103c11s01
knl_fua_prod       up 1-00:00:00      2 drain* r086c06s01,r108c17s02
skl_fua_dbg        up    2:00:00      1 drain* r129c01s01
skl_fua_prod       up 1-00:00:00      8 drain* r130c02s04,r134c11s02,r136c08s04,r136c09s04,r144c14s01,r147c09s[01-02],r148c02s01
skl_fua_prod       up 1-00:00:00      1  down* r130c10s02
skl_usr_dbg        up      30:00      0    n/a
skl_usr_prod       up 1-00:00:00      6 drain* r165c14s04,r166c15s[01-04],r171c05s03
skl_usr_prod       up 1-00:00:00      1  down* r163c17s03
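
In this output, drain means the node is not accepting new jobs, down means the node is out of service, and a trailing * indicates the node is not responding. The reason each node was drained or set down can be shown with:

sinfo -R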