COMDIRAC Job Management
COMDIRAC job management commands are designed to ease day-to-day use of the DIRAC Workload Management System (WMS). Many of these commands are inspired by cluster batch systems (such as Torque or Slurm).
After Session Initialization, you have a valid local proxy that allows you to connect to DIRAC servers.
The dsub command has different semantics, as described below. Let's submit your first COMDIRAC job with the dsub command:
$ dsub /bin/hostname
6723938
The dsub command has done two things:
- built a JDL-formatted file describing your job
- submitted this JDL to the DIRAC WMS, along with the needed files (the executable, possibly an input sandbox, ...)
When the job is submitted, dsub prints its JobID on the terminal.
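Because dsub prints the JobID on standard output, a script can capture it for later use. The following is a minimal sketch, assuming that a successful submission prints a single numeric JobID as in the example above. The submit_job function and the DSUB_CMD variable are hypothetical helpers introduced here for illustration; they are not part of COMDIRAC.

```shell
# DSUB_CMD is a hypothetical override hook (handy for dry runs and tests).
DSUB_CMD=${DSUB_CMD:-dsub}

# Submit a job and print its JobID, or fail if the output is not a number.
# Assumption: dsub prints exactly one numeric JobID on success.
submit_job() {
    job_id=$("$DSUB_CMD" "$@") || return 1
    case $job_id in
        ''|*[!0-9]*)
            echo "unexpected dsub output: $job_id" >&2
            return 1
            ;;
    esac
    echo "$job_id"
}

# Typical use (requires a valid proxy):
#   id=$(submit_job /bin/hostname) && echo "submitted job $id"
```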
You can ask the server to print a list of active jobs with the dstat command:
$ dstat
JobID Owner JobName OwnerGroup JobGroup Site Status MinorStatus SubmissionTime
------- ----- ------- ---------------- -------- ---- ------- ---------------------- -------------------
6723938 pgay Unknown frangrilles_user NoGroup ANY Waiting Pilot Agent Submission 2013-10-26 15:44:31
Here, we see the job we just submitted, the only active one, described by a list of fields.
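The table printed by dstat can be reduced to the fields a script usually needs. The sketch below assumes the column layout shown above and assumes that the first seven columns never contain spaces, so that Status is the seventh whitespace-separated field (MinorStatus, which follows it, may contain spaces). The job_statuses function is a hypothetical helper, not a COMDIRAC command.

```shell
# Read dstat output on standard input and print "JobID Status" per job.
# Header and separator lines are skipped because their first field is
# not purely numeric.
job_statuses() {
    awk '$1 ~ /^[0-9]+$/ { print $1, $7 }'
}

# Typical use (requires a valid proxy):
#   dstat | job_statuses
```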
If you don't see any job in the list, it is probably because the job we just launched has already finished; since we sent a fairly simple command, it should finish quickly. To print inactive jobs too, add the -a option to dstat:
$ dstat -a
JobID Owner JobName OwnerGroup JobGroup Site Status MinorStatus SubmissionTime
------- ----- ------- ---------------- -------- --------------- ------ ------------------ -------------------
6723823 pgay Unknown frangrilles_user NoGroup LCG.LPNHE.fr Done Execution Complete 2013-10-26 15:39:44
6723938 pgay Unknown frangrilles_user NoGroup LCG.DATAGRID.fr Done Execution Complete 2013-10-26 15:44:31
Here, we see a list of recent jobs (10 days by default, but you can change this with the --jobDate command line option).
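Since dstat -a also lists finished jobs, it can be polled until a given job reaches a final state. This is a rough sketch under several assumptions: the column layout is the one shown above, Done and Failed are treated as the final states (Failed does not appear in the examples here), and the job is recent enough to be listed. The names wait_for_job, DSTAT_CMD, POLL_SECONDS and MAX_POLLS are hypothetical.

```shell
DSTAT_CMD=${DSTAT_CMD:-dstat}       # hypothetical override hook
POLL_SECONDS=${POLL_SECONDS:-60}    # delay between two polls
MAX_POLLS=${MAX_POLLS:-60}          # give up after this many polls

# Wait for job $1: returns 0 on Done, 1 on Failed, 2 if we gave up.
wait_for_job() {
    tries=0
    while [ "$tries" -lt "$MAX_POLLS" ]; do
        # Status is assumed to be the 7th whitespace-separated column.
        status=$("$DSTAT_CMD" -a | awk -v id="$1" '$1 == id { print $7 }')
        case $status in
            Done)   return 0 ;;
            Failed) return 1 ;;
        esac
        tries=$((tries + 1))
        sleep "$POLL_SECONDS"
    done
    return 2
}
```

Polling with a generous interval keeps the load on the DIRAC servers low; the bounded number of polls avoids waiting forever on a job that has dropped out of the listing.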
Back to the dsub command. In the previous example, we used /bin/hostname as an argument. The first argument to dsub is the executable, optionally followed by the executable's arguments. There are a few things to note about the executable and its arguments:
- dsub will send the executable file within the InputSandbox if it is a relative path (not beginning with /), or if the --ForceExecUpload flag is used. Otherwise, an absolute path indicates that the command installed at that path on the Grid Worker Node will be used, whether or not it is also present locally.
- You may want to issue a command with arguments that begin with - or --. Without precaution, dsub will try to parse them as its own arguments. To avoid this, place -- alone before the executable's options on the command line. Example:
$ dsub /bin/hostname -f
Error when parsing command line arguments: option -f not recognized
Submit jobs to DIRAC WMS
Usage:
dsub [option|cfgfile] [executable [--] [arguments...]]
Arguments:
... snip ...
$
$
$
$ dsub /bin/hostname -- -f
6726366
- If you don't supply an executable as an argument on the command line and no Executable field can be found in the provided or autogenerated JDL, dsub will read the content of a script from standard input:
$ dsub
echo Hello from `/bin/hostname -f`
... finish with <Ctrl+D> ...
6728192
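The last two points lend themselves to small wrappers. The sketch below always places -- before the executable's arguments, and pipes an inline script to dsub on standard input. The functions dsub_exec and dsub_inline and the DSUB_CMD variable are hypothetical helpers, not COMDIRAC commands.

```shell
DSUB_CMD=${DSUB_CMD:-dsub}   # hypothetical override hook

# Submit an executable with its arguments. "--" is inserted so that
# options such as -f reach the executable instead of being parsed by dsub.
dsub_exec() {
    exe=$1
    shift
    if [ "$#" -eq 0 ]; then
        "$DSUB_CMD" "$exe"
    else
        "$DSUB_CMD" "$exe" -- "$@"
    fi
}

# Submit an inline script by feeding it to dsub on standard input,
# which replaces typing the script and finishing with Ctrl+D.
dsub_inline() {
    printf '%s\n' "$1" | "$DSUB_CMD"
}

# Typical use (requires a valid proxy):
#   dsub_exec /bin/hostname -f
#   dsub_inline 'echo Hello from `/bin/hostname -f`'
```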
There are many options to the dsub command. The most useful are probably --JobName, --JobGroup, --Site and --Parametric. As usual, you can check all available options by issuing the command dsub -h.
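As an illustration, two of these options can be combined in a small wrapper. This sketch assumes that --JobName and --JobGroup each take a value written in the --Option=value form; check dsub -h for the exact syntax on your installation. The dsub_named function and the DSUB_CMD variable are hypothetical.

```shell
DSUB_CMD=${DSUB_CMD:-dsub}   # hypothetical override hook

# Submit a job with an explicit name and group.
# Usage: dsub_named <name> <group> <executable> [-- arguments...]
# Assumption: both options accept the --Option=value form.
dsub_named() {
    name=$1
    group=$2
    shift 2
    "$DSUB_CMD" --JobName="$name" --JobGroup="$group" "$@"
}

# Typical use (requires a valid proxy):
#   dsub_named hostname_test mygroup /bin/hostname -- -f
```

Options are placed before the executable, following the usage line printed by dsub.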
You can also specify your own JDL file (or even inline JDL) with the --JDL option. Additionally, if the default generated JDL doesn't fit your needs, you can customize it by placing your own version in a profile of your COMDIRAC configuration file, using the jdl variable. Here is an example where all submitted jobs are limited to ten hours of CPU time (36000 seconds) by default:
[frangrilles_user]
group_name = frangrilles_user
jdl = [CPUTime = 36000; OutputSandbox = {"std.out","std.err"};]
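The CPUTime value above is expressed in seconds (36000 seconds for ten hours). A small sketch for producing such a line from a number of hours follows; hours_to_cputime and jdl_line are hypothetical helpers, and the OutputSandbox content is simply copied from the example.

```shell
# Convert a whole number of hours into seconds for CPUTime.
hours_to_cputime() {
    echo $(( $1 * 3600 ))
}

# Print a jdl line for a profile, mirroring the example above.
jdl_line() {
    printf 'jdl = [CPUTime = %s; OutputSandbox = {"std.out","std.err"};]\n' \
        "$(hours_to_cputime "$1")"
}

jdl_line 10
```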
When you submit a job with an interpreted script (shell, Python, etc.), you can insert JDL directives directly inside your script with specially formatted comment lines. The dsub command will parse the lines beginning with "#JDL " (notice the white space after JDL) and include the rest of each such line in the DIRAC JDL generated for the job. For example:
#! /bin/bash
#JDL StdError = std_%s.err
#JDL StdOutput = std_%s.out
#JDL OutputSandbox={"std_%s.out","std_%s.err"}
#JDL Arguments="%s"
#JDL JobName = param_%s
#JDL JobGroup = myparametric
#JDL ParameterStart = 1
#JDL ParameterStep = 1
#JDL Parameters = 2
echo command line: \"$@\"
/bin/hostname
/bin/date
This is especially useful as it allows you to store your job configuration within the payload script itself.
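To preview which directives a script carries before submitting it, the "#JDL " lines can be extracted by hand. The sketch below only imitates the parsing described above and is not dsub's actual implementation. The second function illustrates how a %s placeholder would expand for each parameter value, assuming integer values computed as ParameterStart + i * ParameterStep for i from 0 to Parameters - 1; this is an assumption about DIRAC parametric jobs, not something stated here. Both function names are hypothetical.

```shell
# Print the JDL fragment carried by the "#JDL " comment lines of a
# script read from standard input (the prefix and its space are removed).
extract_jdl() {
    sed -n 's/^#JDL //p'
}

# Expand a "%s" template once per parameter value.
# Arguments: template, ParameterStart, ParameterStep, Parameters (count).
# Assumption: integer parameters, value = start + i * step.
expand_parametric() {
    template=$1
    start=$2
    step=$3
    count=$4
    i=0
    while [ "$i" -lt "$count" ]; do
        value=$(( start + i * step ))
        printf '%s\n' "$template" | sed "s/%s/$value/g"
        i=$(( i + 1 ))
    done
}

# Typical use:
#   extract_jdl < myscript.sh
#   expand_parametric 'param_%s' 1 1 2
```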
Once the job is finished, you can easily get its output from the DIRAC server:
$ doutput 6726366
#
$ cat 6726366/std.out
grid103.lal.in2p3.fr
You can alternatively download the output from all jobs in a group (JobGroup JDL directive) with the command:
doutput -g <group name>
In this mode, if all jobs in the group produce output files with distinct names, it is generally more convenient not to have one directory per retrieved job. For this, add the -n flag to download all output files into the current directory.
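Because -n downloads into the current directory, it can help to run the retrieval from a dedicated directory. A minimal sketch, assuming -g and -n can be combined as described above; fetch_group_output and DOUTPUT_CMD are hypothetical names.

```shell
DOUTPUT_CMD=${DOUTPUT_CMD:-doutput}   # hypothetical override hook

# Download the output files of every job in a group into one directory.
# Usage: fetch_group_output <group name> <destination directory>
fetch_group_output() {
    group=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$dest" && "$DOUTPUT_CMD" -g "$group" -n )
}

# Typical use (requires a valid proxy):
#   fetch_group_output myparametric ./results
```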
Several additional commands are available:
- dinput - retrieve the InputSandbox (and optionally the JDL file) of a given job
- dlogging - print the status history of a given job
- dkill - delete a job