GLite Job Management
Lectured by

Jingya You
Academia SINICA Grid Computing
Taiwan
jingya.you@twgrid.org
Objective
This tutorial takes you through the stages of running simple jobs on the grid. Before continuing, make sure you have a valid proxy; if you do not, create one as described in the Create VOMS proxy section below.
Simple Job Submission: glite-wms commands
- Simple JDL
- Credentials delegation
- Job List Match
- Job Submission
- Job Status
- Job Output
- Job cancel
- References
Log in to the GILDA User Interface
Please log in to the GILDA User Interface to carry out this practical training on job submission.
The address of the GILDA User Interface is: ui.euag.org
Username: UPM01 to UPM40
Password: <Your OS Password>
GRID pass phrase: <Your PassPhrase>
Simple JDL
To submit a job to the Workload Management System (WMS), you write a text file in the Job Description Language (JDL). The JDL file describes the job and its requirements.
Here is a minimal example of a JDL file that runs a simple job on the grid:
[UPM04@ui ~]$ cat hostname.jdl
Type = "Job";
JobType = "Normal";
Executable = "/bin/hostname";
StdOutput = "hostname.out";
StdError = "hostname.err";
OutputSandbox = {"hostname.err","hostname.out"};
Arguments = "-f";
ShallowRetryCount = 3;
The Executable attribute specifies the command to be run on the Worker Node, and Arguments passes -f to it so that hostname prints the fully qualified host name. The OutputSandbox attribute lists the files to be copied back after the job has run; normally these are the files to which the standard output and standard error streams are redirected, whose names are set by the StdOutput and StdError attributes respectively. ShallowRetryCount sets how many times the job may be retried in case of failure.
JDL is described more fully in the gLite JDL attributes documentation.
The examples below assume that you have this hostname JDL file; please copy the listing above to create your own hostname.jdl.
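One way to create the file without opening an editor is a shell here-document. This is only a convenience sketch: it writes the same eight attributes as the listing above into hostname.jdl in the current directory, overwriting any file of that name.

```shell
# Write the tutorial's JDL to hostname.jdl in the current directory.
# The quoted 'EOF' delimiter stops the shell from expanding anything
# inside the listing, so the text is written exactly as typed.
cat > hostname.jdl <<'EOF'
Type = "Job";
JobType = "Normal";
Executable = "/bin/hostname";
StdOutput = "hostname.out";
StdError = "hostname.err";
OutputSandbox = {"hostname.err","hostname.out"};
Arguments = "-f";
ShallowRetryCount = 3;
EOF

# Print the file for a quick visual check.
cat hostname.jdl
```

Each JDL attribute ends with a semicolon, so a missing one is the first thing to look for if the file is later rejected.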
Create VOMS proxy
Users who already belong to a group, or who have already been assigned a role, can request these attributes when creating the proxy with the voms-proxy-init command. The VOMS server signs this information and inserts it into the proxy as an Attribute Certificate (AC); grid resources can then parse it and grant the user the expected rights.
[UPM04@ui ~]$ voms-proxy-init --voms gilda
Cannot find file or dir: /home/UPM04/.glite/vomses
Enter GRID pass phrase:
Your identity: /C=IT/O=GILDA/OU=Personal Certificate/L=Kuala Lumpur/CN=KUALALUMPUR04
Creating temporary proxy ..................................................... Done
Contacting voms.ct.infn.it:15001 [/C=IT/O=INFN/OU=Host/L=Catania/CN=voms.ct.infn.it] "gilda" Done
Creating proxy................................................................................... Done
Your proxy is valid until Fri Jul 31 16:55:04 2009
Credentials Delegation
The commands shown on this page use WMProxy, a service that interacts with the WMS on your behalf. Under the WMProxy authentication model, a valid proxy is not enough on its own: you must also delegate your credentials to the WMProxy server.
You can choose the identifier to be associated with the delegated proxy by using the --delegationid option (-d for short):
glite-wms-job-delegate-proxy -d myfirstdelegationid
With the -d option the delegation is created under the given name and kept, so that subsequent invocations of glite-wms-job-submit and glite-wms-job-list-match can reuse it instead of delegating a new proxy. To do so, pass the same delegation name to those commands with their -d option.
Alternatively, the -a option causes a delegated proxy to be established automatically. With -a you do not need to run glite-wms-job-delegate-proxy -d first, but you must pass -a on every invocation of glite-wms-job-submit and glite-wms-job-list-match. Heavy use of this option is not recommended: it delegates a new proxy for each command issued, and delegation is a time-consuming operation, so it is better to delegate once with glite-wms-job-delegate-proxy and reuse that delegation.
To continue this tutorial, create a delegation to WMProxy using your username as the identifier; you can read it from the environment variable $USER.
[UPM04@ui ~]$ echo $USER
UPM04
[UPM04@ui ~]$ glite-wms-job-delegate-proxy -d $USER
Connecting to the service https://wms.euag.org:7443/glite_wms_wmproxy_server
================== glite-wms-job-delegate-proxy Success ==================
Your proxy has been successfully delegated to the WMProxy:
https://wms.euag.org:7443/glite_wms_wmproxy_server
with the delegation identifier: UPM04
==========================================================================
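The delegate-once pattern can be sketched as a small shell helper. The names run, delegate_then_match, DELEGATION_ID and DRY_RUN are hypothetical and not part of gLite; only the two glite-wms commands and their -d option come from this tutorial. By default the helper only prints the commands it would run, and it executes them when DRY_RUN is set to 0.

```shell
# Delegation identifier reused by every command; defaults to the login name.
# "tutorial" is an arbitrary fallback used only when $USER is unset.
DELEGATION_ID="${USER:-tutorial}"

# Print a command, and execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Delegate a proxy once, then reuse the same identifier for list-match.
# $1 is the JDL file to check.
delegate_then_match() {
    jdl_file=$1
    run glite-wms-job-delegate-proxy -d "$DELEGATION_ID" &&
    run glite-wms-job-list-match -d "$DELEGATION_ID" "$jdl_file"
}
```

Calling delegate_then_match hostname.jdl prints the two command lines; with DRY_RUN=0 it runs them, and the second command is skipped if the delegation fails.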
Job List Match
A JDL file describes a job that can be run. Before submitting the job, it is useful to check which Computing Elements (CEs) are able to accept it. Do this with the command glite-wms-job-list-match. Its -d option specifies the delegation identifier you created earlier; since we created it from the username (taken from $USER), that is the value we pass.
[UPM04@ui ~]$ glite-wms-job-list-match -d $USER hostname.jdl
Connecting to the service https://wms.euag.org:7443/glite_wms_wmproxy_server
==========================================================================
COMPUTING ELEMENT IDs LIST
The following CE(s) matching your job requirements have been found:
*CEId*
- ce.euag.org:2119/jobmanager-lcgpbs-gilda
==========================================================================
A non-empty list of Computing Elements confirms that the JDL syntax is correct and that the job can run on one or more of the CEs shown.
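If you want only the CE identifiers, for example to count them, the command output can be filtered. The sketch below assumes that each matching CE is printed on its own line as a dash followed by the identifier, possibly indented; the function name extract_ce_ids is hypothetical.

```shell
# Read glite-wms-job-list-match output on stdin and print one CE ID per line.
# Assumption: each CE appears on its own line as "- <CEId>", possibly indented.
# Separator lines made only of dashes do not match, because the pattern
# requires whitespace right after the first dash.
extract_ce_ids() {
    sed -n 's/^[[:space:]]*-[[:space:]]\{1,\}\([^[:space:]]\{1,\}\)[[:space:]]*$/\1/p'
}

# Intended use on the User Interface (not executed here):
#   glite-wms-job-list-match -d "$USER" hostname.jdl | extract_ce_ids
```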
Job Submission
A simple job can be submitted with the command: glite-wms-job-submit -d delegationId -o jobidfile jdlname
[UPM04@ui ~]$ glite-wms-job-submit -d $USER -o jobid hostname.jdl
Connecting to the service https://wms.euag.org:7443/glite_wms_wmproxy_server
====================== glite-wms-job-submit Success ======================
The job has been successfully submitted to the WMProxy
Your job identifier is:
https://wms.euag.org:9000/tcl6gKsrMJm8TReyXcc7tQ
The job identifier has been saved in the following file:
/home/UPM04/jobid
==========================================================================
The file /home/UPM04/jobid, named with the -o option, receives the jobID(s) returned by the submission. If another job is submitted with the same file (by repeating the submission line), its jobID is appended to it. Try it yourself.
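To see how many jobs a jobID file currently holds, you can filter it with grep. This sketch assumes that every job identifier sits on its own line and starts with https://, as the identifiers in this tutorial do; any other line, such as a header comment, is skipped. The function names are hypothetical.

```shell
# Print the job identifiers stored in a jobID file, one per line.
# Assumption: identifiers are the lines that start with "https://".
list_job_ids() {
    grep '^https://' "$1"
}

# Print how many identifiers the jobID file contains.
count_job_ids() {
    grep -c '^https://' "$1"
}

# Intended use after a few submissions (not executed here):
#   count_job_ids jobid
```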
Job Status
To check the status of a job, use the command glite-wms-job-status. It queries the LB (Logging and Bookkeeping) service for the status of any job whose jobID is present in the specified file, in this case jobid:
[UPM04@ui ~]$ glite-wms-job-status -i jobid
------------------------------------------------------------------
1 : https://wms.euag.org:9000/0GSFe8gObmvvyOJOvWx2-A
2 : https://wms.euag.org:9000/DNfFrL9cxnTTHzl9edVCJg
3 : https://wms.euag.org:9000/tcl6gKsrMJm8TReyXcc7tQ
a : all
q : quit
------------------------------------------------------------------
Choose one or more jobId(s) in the list - [1-3]all:3
*************************************************************
BOOKKEEPING INFORMATION:
Status info for the Job : https://wms.euag.org:9000/tcl6gKsrMJm8TReyXcc7tQ
Current Status: Running
Status Reason: Job successfully submitted to Globus
Destination: ce.euag.org:2119/jobmanager-lcgpbs-gilda
Submitted: Fri Jul 31 05:20:54 2009 UTC
*************************************************************
Based on the content of the jobid file, the command lists the corresponding submitted jobs and lets you query the status of one, several, or all of them. The -i option specifies the file from which the command takes the jobID(s) to be inspected. In the example above the third job was selected, and it is reported as Running; the Destination field shows the CE selected for it. Alternatively, the same command can be issued by specifying the jobID(s) directly, as in the following case:
[UPM04@ui ~]$ glite-wms-job-status https://wms.euag.org:9000/tcl6gKsrMJm8TReyXcc7tQ
*************************************************************
BOOKKEEPING INFORMATION:
Status info for the Job : https://wms.euag.org:9000/tcl6gKsrMJm8TReyXcc7tQ
Current Status: Running
Status Reason: Job successfully submitted to Globus
Destination: ce.euag.org:2119/jobmanager-lcgpbs-gilda
Submitted: Fri Jul 31 05:20:54 2009 UTC
*************************************************************
Note that this command doesn't require a delegation identifier to be specified.
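When the status is needed inside a script, the Current Status field can be pulled out of the report. The sketch below assumes the field is printed on its own line as "Current Status:" followed by the state, as in the output shown in this section; the function name extract_status is hypothetical.

```shell
# Read glite-wms-job-status output on stdin and print the value of every
# "Current Status:" line, with surrounding whitespace removed.
extract_status() {
    sed -n 's/^[[:space:]]*Current Status:[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*$/\1/p'
}

# Intended use on the User Interface (not executed here):
#   glite-wms-job-status https://wms.euag.org:9000/tcl6gKsrMJm8TReyXcc7tQ | extract_status
```

Passing the jobID directly, rather than using -i, avoids the interactive selection menu, which makes this form the easier one to use in a pipeline.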
Job Output
When glite-wms-job-status reports that a job has completed successfully, its output can be retrieved with the command glite-wms-job-output. You do not need to specify a delegation identifier.
[UPM04@ui ~]$ glite-wms-job-output -i jobid
------------------------------------------------------------------
1 : https://wms.euag.org:9000/0GSFe8gObmvvyOJOvWx2-A
2 : https://wms.euag.org:9000/DNfFrL9cxnTTHzl9edVCJg
3 : https://wms.euag.org:9000/tcl6gKsrMJm8TReyXcc7tQ
a : all
q : quit
------------------------------------------------------------------
Choose one or more jobId(s) in the list - [1-3]all (use , as separator or - for a range): 3
Connecting to the service https://wms.euag.org:7443/glite_wms_wmproxy_server
================================================================================
JOB GET OUTPUT OUTCOME
Output sandbox files for the job:
https://wms.euag.org:9000/tcl6gKsrMJm8TReyXcc7tQ
have been successfully retrieved and stored in the directory:
/tmp/jobOutput/UPM04_tcl6gKsrMJm8TReyXcc7tQ
================================================================================
In order to inspect the job output, list the files in the indicated directory and show the content of the output file(s).
[UPM04@ui ~]$ cd /tmp/jobOutput/UPM04_tcl6gKsrMJm8TReyXcc7tQ
[UPM04@ui UPM04_tcl6gKsrMJm8TReyXcc7tQ]$ ls -la
total 12
drwxr-xr-x 2 UPM04 UPM04 4096 Jul 31 05:51 .
drwxrwxrwt 8 root root 4096 Jul 31 05:51 ..
-rw-rw-r-- 1 UPM04 UPM04 0 Jul 31 05:51 hostname.err
-rw-rw-r-- 1 UPM04 UPM04 13 Jul 31 05:51 hostname.out
[UPM04@ui UPM04_tcl6gKsrMJm8TReyXcc7tQ]$ cat hostname.out
wnc.euag.org
[UPM04@ui UPM04_tcl6gKsrMJm8TReyXcc7tQ]$
The output directory can be chosen with the --dir option:
[UPM04@ui ~]$ glite-wms-job-output -i jobid --dir jobdir
------------------------------------------------------------------
1 : https://wms.euag.org:9000/0GSFe8gObmvvyOJOvWx2-A
2 : https://wms.euag.org:9000/DNfFrL9cxnTTHzl9edVCJg
3 : https://wms.euag.org:9000/tcl6gKsrMJm8TReyXcc7tQ
a : all
q : quit
------------------------------------------------------------------
Choose one or more jobId(s) in the list - [1-3]all (use , as separator or - for a range): 3
Connecting to the service https://wms.euag.org:7443/glite_wms_wmproxy_server
================================================================================
JOB GET OUTPUT OUTCOME
Output sandbox files for the job:
https://wms.euag.org:9000/tcl6gKsrMJm8TReyXcc7tQ
have been successfully retrieved and stored in the directory:
/home/UPM04/jobdir
================================================================================
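Listing the chosen directory shows the retrieved files. The listing below was taken from a different job, so the user and file names differ from the hostname example:
$ ls -la jobdir/
total 12
drwxr-xr-x 2 giorgio users 112 Jun 16 09:49 .
drwx------ 87 giorgio users 8536 Jun 16 09:49 ..
-rw-r--r-- 1 giorgio users 0 Jun 16 09:49 testsandbox.err
-rw-r--r-- 1 giorgio users 405 Jun 16 09:49 testsandbox.out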
Job cancel
If anything goes wrong, a job can be cancelled with the command glite-wms-job-cancel. The -i option is available here too, and is especially useful for cancelling several jobs with a single command:
[UPM04@ui ~]$ glite-wms-job-cancel -i jobid
------------------------------------------------------------------
1 : https://wms.euag.org:9000/0GSFe8gObmvvyOJOvWx2-A
2 : https://wms.euag.org:9000/DNfFrL9cxnTTHzl9edVCJg
3 : https://wms.euag.org:9000/tcl6gKsrMJm8TReyXcc7tQ
a : all
q : quit
------------------------------------------------------------------
Choose one or more jobId(s) in the list - [1-3]all (use , as separator or - for a range): 3
Are you sure you want to remove specified job(s) [y/n]y : y
Error - Cancel not allowed
Current Job Status is Cleared
In this case the job was not cancelled because it had already completed successfully; the status Cleared shows that its output had already been retrieved.
