HP XC System 3.x Software User Guide, Page 87

Pseudo-parallel job
A job that requests only one slot but specifies any of these constraints:
mem
tmp
nodes=1
mincpus > 1
Pseudo-parallel jobs are allocated one node for their exclusive use.
NOTE: Do NOT rely on this feature to provide node-level allocation
for small jobs in job scripts. Use the SLURM[nodes] specification instead,
along with the mem, tmp, and mincpus allocation options.
LSF-HPC treats this job type as a parallel job because the job requests
explicit node resources. LSF-HPC does not monitor these additional
resources, so it cannot schedule any other jobs on the node without risking
resource contention. LSF-HPC therefore allocates an appropriate whole
node for exclusive use by the serial job, in the same manner as it does for
parallel jobs; hence the name “pseudo-parallel”.

Parallel job
A job that requests more than one slot, regardless of any other constraints.
Parallel jobs are allocated up to the maximum number of nodes given
by the following specifications:
SLURM[nodes=min-max] (if specified)
SLURM[nodelist=node_list] (if specified)
bsub -n
Parallel jobs and serial jobs cannot run on the same node.

Small job
A parallel job that can potentially fit into a single node and does not
explicitly request more than one node (through a SLURM[nodes] or
SLURM[nodelist] specification). LSF-HPC tries to allocate a single
node for a small job.
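
The job-type definitions above can be read as a small decision procedure. The following Python sketch is an illustration only, not LSF-HPC code. The JobRequest class, the classify function, and the cores_per_node parameter are hypothetical names introduced here. It also assumes that a one-slot job with none of the listed constraints is a serial job, and that "explicitly requesting more than one node" means a minimum node count above one or a node list naming more than one host.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class JobRequest:
    """Hypothetical summary of a submission; not an LSF-HPC data structure."""
    slots: int = 1                    # number of slots requested (bsub -n)
    mem: Optional[int] = None         # mem constraint, if given
    tmp: Optional[int] = None         # tmp constraint, if given
    mincpus: Optional[int] = None     # mincpus constraint, if given
    nodes_min: Optional[int] = None   # lower bound of SLURM[nodes=min-max]
    nodes_max: Optional[int] = None   # upper bound of SLURM[nodes=min-max]
    nodelist: Tuple[str, ...] = ()    # hosts named in SLURM[nodelist=...]


def classify(job: JobRequest, cores_per_node: int) -> str:
    """Classify a request using the definitions in this section.

    cores_per_node is an assumed site value used to decide whether a
    parallel job "can potentially fit into a single node".
    """
    if job.slots > 1:
        # Assumption: min nodes > 1, or more than one named host, counts
        # as explicitly requesting more than one node.
        wants_many_nodes = (job.nodes_min or 0) > 1 or len(job.nodelist) > 1
        if job.slots <= cores_per_node and not wants_many_nodes:
            return "small"  # a small job is still a parallel job
        return "parallel"

    # One slot: any of the four listed constraints makes it pseudo-parallel.
    exactly_one_node = job.nodes_min == 1 and job.nodes_max in (None, 1)
    if (job.mem is not None or job.tmp is not None
            or exactly_one_node or (job.mincpus or 0) > 1):
        return "pseudo-parallel"

    # Assumption: anything else with one slot is a serial job.
    return "serial"
```

For example, on an assumed four-core node, a one-slot request with a mem constraint classifies as pseudo-parallel, while a four-slot request with no node specification classifies as a small job.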
10.5 Using LSF-HPC Integrated with SLURM in the HP XC Environment
This section provides additional information about using LSF-HPC in the HP XC environment.
10.5.1 Useful Commands
The following commands are useful when working with LSF-HPC integrated with SLURM:
Use the bjobs -l and bhist -l commands to see the components of the actual SLURM allocation command.
Use the bkill command to kill jobs.
Use the bjobs command to monitor job status in LSF-HPC integrated with SLURM.
Use the bqueues command to list the configured job queues in LSF-HPC integrated with SLURM.
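
For scripted use, the commands listed above can be wrapped in a small helper. This is a minimal Python sketch, assuming the LSF commands are on the PATH of the machine where it runs. The function names lsf_command and run_lsf and the action keywords are hypothetical and are not part of LSF-HPC.

```python
import subprocess
from typing import List, Optional


def lsf_command(action: str, job_id: Optional[int] = None) -> List[str]:
    """Build the argument list for one of the commands named in this section."""
    if action == "queues":
        return ["bqueues"]                       # list configured queues
    if action == "status":
        # bjobs with no job ID reports the caller's jobs
        return ["bjobs"] + ([str(job_id)] if job_id is not None else [])

    per_job = {
        "detail": ["bjobs", "-l"],               # long listing of one job
        "history": ["bhist", "-l"],              # long history of one job
        "kill": ["bkill"],                       # kill one job
    }
    if action not in per_job:
        raise ValueError(f"unknown action: {action!r}")
    if job_id is None:
        raise ValueError(f"action {action!r} needs a job ID")
    return per_job[action] + [str(job_id)]


def run_lsf(action: str, job_id: Optional[int] = None) -> str:
    """Run the command and return its standard output as text."""
    result = subprocess.run(
        lsf_command(action, job_id),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Building the argument list separately from running it keeps the helper easy to check on a machine without LSF installed.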
10.5.2 Job Startup and Job Control
When LSF-HPC starts a SLURM job, it sets SLURM_JOBID to associate the job with the SLURM allocation.
While a job is running, all operating-system-enforced resource limits that LSF-HPC supports are applied,
including the core limit, CPU time limit, data limit, file size limit, memory limit, and stack limit. If the user
kills a job, LSF-HPC propagates signals to the entire job, including the job file running on the local node and
all tasks running on remote nodes.
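
Because LSF-HPC sets SLURM_JOBID for the job, a job script can read it to find the allocation it is running in. The following Python sketch shows one way to do that. The function name slurm_job_id is hypothetical, and treating the value as a decimal integer is an assumption.

```python
import os
from typing import Mapping, Optional


def slurm_job_id(environ: Optional[Mapping[str, str]] = None) -> Optional[int]:
    """Return the SLURM job ID from the environment, or None if absent.

    An explicit mapping can be passed in place of os.environ, which
    makes the function easy to exercise outside a real job.
    """
    env = os.environ if environ is None else environ
    value = env.get("SLURM_JOBID", "")
    # Assumption: the ID is a plain decimal number.
    return int(value) if value.isdigit() else None
```

Outside an LSF-HPC job the variable is normally unset, so the function returns None rather than raising an error.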
10.5.3 Preemption
LSF-HPC uses the SLURM "node share" feature to facilitate preemption. When a low-priority job is
preempted, its processes are suspended on the allocated nodes, and LSF-HPC places the high-priority job on
the same nodes. After the high-priority job completes, LSF-HPC resumes the suspended low-priority jobs.
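
The suspend, run, and resume sequence described above can be pictured with a toy model. This Python sketch only mirrors the order of events for a single node. It does not call LSF-HPC or SLURM, and the Node class and the preempt and complete functions are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Toy model of one node: jobs are just names in two lists."""
    running: List[str] = field(default_factory=list)
    suspended: List[str] = field(default_factory=list)


def preempt(node: Node, high_priority_job: str) -> None:
    """Suspend everything running on the node, then start the new job."""
    node.suspended.extend(node.running)
    node.running = [high_priority_job]


def complete(node: Node, job: str) -> None:
    """Finish a running job; resume suspended jobs once the node is free."""
    node.running.remove(job)
    if not node.running:
        node.running.extend(node.suspended)
        node.suspended.clear()
```

In this model, preempting a node that runs one low-priority job leaves that job in the suspended list until the high-priority job completes, after which it returns to the running list.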