Transformation System Tutorial
This tutorial illustrates the main concepts needed to use the Transformation System. The Transformation System documentation is available at:
http://dirac.readthedocs.io/en/latest/AdministratorGuide/Systems/Transformation/index.html
The workflow to be executed in this tutorial is based on the mandelbrot application to create bitmap images:
https://github.com/bregeon/mandel4ts
The workflow is composed of several steps and the final result is a bitmap image of size (7680, 4320) pixels. Each step of the workflow is realized by a transformation, which consists of several jobs. Here is a brief description of the different steps.
1. The first transformation (image slices production) creates several jobs, each one producing an image slice of 200 lines. In order to produce the whole image (4320 lines), 21 jobs are needed. These jobs execute the mandelbrot application with identical parameters except for the line number parameter -L, which varies from 1 to 21:
./mandelbrot.py -P 0.0005 -M 1000 -L 00i -N 200
Each job produces a data_00i*200.txt file, which is saved on the File Catalog.
2. The second transformation (image slices merging) merges the results of the first transformation, grouping files by 7 and producing 3 merged files:
./merge_data.py
Each job produces a merged_data_*.txt file.
3. The third transformation (image building) produces the final bitmap image starting from the merged files produced in the previous step:
./build_merged_img.py
4. Finally, the fourth transformation (removal) removes the intermediate files produced by the transformations in steps 1 and 2.
For this tutorial we will use a Testbed DIRAC instance. To access this instance, follow the instructions below. The link to the web portal for Job and Transformation monitoring is:
https://cctbdirac01.in2p3.fr/DIRAC/
wget --no-check-certificate https://github.com/DIRACGrid/DIRAC/raw/master/Core/scripts/dirac-install.py
python dirac-install.py -r v6r19p20 -v --no-lcg-bundle
source bashrc (or source cshrc)
dirac-proxy-init -x
dirac-configure -S Dirac-Test -C dips://cctbdirac01.in2p3.fr:9135/Configuration/Server
dirac-proxy-init
Retrieve the scripts to create the different transformations from:
git clone https://github.com/arrabito/DIRAC_TS_Tutorial
Before creating the actual transformations you can submit a simple mandelbrot job and inspect the result:
python submit_wms.py 1
This job is similar to those that will be created by the first transformation. Now you can start creating the transformations to execute the whole workflow. Note that, thanks to the data-driven mechanism, you can create the first 3 transformations all together, without waiting for the previous one to be completed. Only the last one, which removes all the intermediate data, must be launched after the third transformation is completed (all tasks 'Done').
Note that you should slightly customize the submission scripts submit_ts_step%i.py, setting the owner variable to your 'dirac username', e.g.:
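The data-driven behaviour described above can be pictured with a toy model. This is only a sketch in plain Python: the class, its methods and the slice labels are hypothetical and are not part of the DIRAC API, and the real system decides how files are grouped into tasks through its own plugins. The point it illustrates is that a transformation can exist before its input does, and that tasks appear as input files become available.

```python
# Toy model of data-driven task creation; none of these names are DIRAC APIs.
class ToyTransformation:
    def __init__(self, name, group_size):
        self.name = name
        self.group_size = group_size
        self.pending = []  # input files not yet assigned to a task
        self.tasks = []    # each task is a list of input files

    def add_file(self, lfn):
        """Attach a newly registered file; create a task once a group is full."""
        self.pending.append(lfn)
        if len(self.pending) >= self.group_size:
            self.tasks.append(self.pending[:self.group_size])
            self.pending = self.pending[self.group_size:]


# The merging transformation is created before any slice file exists.
merging = ToyTransformation("merging", group_size=7)

# Slice files show up one at a time as jobs of the first transformation finish.
# The labels are placeholders, not the real file names.
for i in range(1, 22):
    merging.add_file("slice_%02d" % i)

print(len(merging.tasks), len(merging.pending))
```

With 21 input files and groups of 7, the toy transformation ends up with the 3 merging tasks mentioned in the workflow description.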
########################################
# Modify here with your dirac username
owner = 'larrabito'
########################################
This is simply necessary to distinguish the data produced by the different participants during the tutorial. For the same reason, we have introduced a special "owner" metadata field to be used in the File Catalog queries associated with the transformations.
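To see why the "owner" field keeps the participants' data apart, here is a minimal sketch of metadata matching in plain Python. It is a simplified model, not the File Catalog implementation: the matches function, the catalog dictionary, the file names, the second username and the "merged" value are all placeholders. Only the query keys (application, image_type, owner) and the use of json.dumps come from the tutorial's submission scripts.

```python
import json


def matches(file_metadata, metaquery):
    """Return True if every key/value pair of the query is present in the file metadata."""
    return all(file_metadata.get(key) == value for key, value in metaquery.items())


owner = "larrabito"  # example username from the tutorial

# Query serialized the same way as in the submission scripts.
input_metaquery = json.dumps(
    {"application": "mandelbrot", "image_type": "raw", "owner": owner}
)

# Hypothetical catalog content: file names, the second owner and the
# "merged" value are placeholders chosen for illustration.
catalog = {
    "file_a.txt": {"application": "mandelbrot", "image_type": "raw", "owner": owner},
    "file_b.txt": {"application": "mandelbrot", "image_type": "raw", "owner": "someone_else"},
    "file_c.txt": {"application": "mandelbrot", "image_type": "merged", "owner": owner},
}

query = json.loads(input_metaquery)
selected = sorted(name for name, meta in catalog.items() if matches(meta, query))
print(selected)
```

Only the file whose metadata agree with the query on all three keys is selected, so files registered by another participant are left out.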
-
Edit submit_ts_step1.py, change the owner variable and look at the different sections (Job description, Transformation definition and submission). Observe the metadata characterising the output data:
outputMetadata = json.dumps( {"application":"mandelbrot","image_type":"raw","owner":owner} )
-
Submit the transformation:
python submit_ts_step1.py
-
You will be prompted for a name for the transformation. For practical reasons during the tutorial session, prepend your initials to the name of the transformation, e.g.:
la_trans_step1
-
Go to the TransformationMonitor on the web portal: https://cctbdirac01.in2p3.fr/DIRAC/. You should see your transformation (and also those of the other participants). The transformation has been created, but no jobs are associated with it yet. Click on the transformation and select Action/Extend in the context menu. Here you can choose how many jobs your transformation will be composed of, so extend the transformation by 21. Observe the status changes in the different columns of your transformation (refresh by clicking the Submit button). When tasks are in Submitted status, you can also click on Show Jobs to display the individual jobs. Note that since jobs are submitted with the Production Shifter identity, you should remove the 'Owner' selection from the JobMonitor to display the jobs.
-
Edit submit_ts_step2.py and observe how input data are attached to the transformation:
inputMetaquery = json.dumps( {"application":"mandelbrot","image_type":"raw","owner":owner} )
t.setFileMask(inputMetaquery)
Observe also which metadata are attached to the output data.
-
Submit the transformation:
python submit_ts_step2.py
-
Edit submit_ts_step3.py and observe how input data are attached to the transformation:
inputMetaquery = json.dumps( {"application":"mandelbrot","image_type":"raw","owner":owner} )
t.setFileMask(inputMetaquery)
Observe also which metadata are attached to the output data.
-
Submit the transformation:
python submit_ts_step3.py
-
Monitor the progress of the 3 transformations from the TransformationMonitor (refresh by clicking the Submit button). You can also browse the File Catalog to look at your produced files:
FC:/> ls /vo.france-grilles.fr/user/l/larrabito/mandelbrot/images/
-
When the image building transformation (step 3) is completed, you can retrieve the final image:
dirac-dms-get-file /vo.france-grilles.fr/user/[initial]/[owner]/mandelbrot/images/final/merged_image.bmp
-
Edit submit_ts_step4.py and observe the Type and the Body of the transformation. In this case, no jobs are submitted to the WMS, but rather removal requests are submitted to the Request Management System.
-
Submit the transformation:
python submit_ts_step4.py
-
When this transformation is completed (all tasks 'Done'), check the File Catalog again.
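The catalog paths used in the steps above follow the pattern /vo.france-grilles.fr/user/[initial]/[owner]/mandelbrot/images, where [initial] is the first letter of the username. As a convenience, the small sketch below builds these paths from the owner variable. The two helper functions are hypothetical and are not part of the tutorial scripts or of DIRAC.

```python
def user_image_dir(owner, vo="vo.france-grilles.fr"):
    """Build the catalog directory /<vo>/user/<initial>/<owner>/mandelbrot/images."""
    return "/%s/user/%s/%s/mandelbrot/images" % (vo, owner[0], owner)


def final_image_lfn(owner):
    """Path of the final bitmap, as passed to dirac-dms-get-file in the tutorial."""
    return user_image_dir(owner) + "/final/merged_image.bmp"


# Example with the username used throughout the tutorial.
print(user_image_dir("larrabito"))
print(final_image_lfn("larrabito"))
```

The first path is the directory to list in the File Catalog before and after the removal transformation, and the second is the argument to give to dirac-dms-get-file.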