
Transformation System Tutorial

arrabito edited this page May 17, 2018 · 60 revisions

This tutorial illustrates the main concepts needed to use the Transformation System. The reference documentation is available at:

http://dirac.readthedocs.io/en/latest/AdministratorGuide/Systems/Transformation/index.html

1. Workflow description

The workflow to be executed in this tutorial is based on the mandelbrot application to create bitmap images:

https://github.com/bregeon/mandel4ts

The workflow is composed of several steps and the final result is a bitmap image of size (7680, 4320) pixels. Each step of the workflow is realized by a transformation, which consists of several jobs. Here is a brief description of the different steps.

1. The first transformation (image slices production) creates several jobs, each one producing an image slice of 200 lines. In order to produce the whole image (4320 lines), 21 jobs are needed. These jobs execute the mandelbrot application with identical parameters, except for the line number parameter -L, which varies from 1 to 21:

./mandelbrot.py -P 0.0005 -M 1000 -L 00i -N 200

Each job produces a data_00i*200.txt file which is saved on the File Catalog.
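
To make the parameter scan concrete, the following minimal sketch builds the 21 command lines that differ only in the -L value. It is not taken from the tutorial scripts: the helper name slice_command is hypothetical, and the three-digit zero padding of -L is an assumption, since the tutorial only writes the value as 00i.

```python
# Sketch: build the command line of each slice job of the first transformation.
# Assumption: the -L value is zero-padded to three digits; the tutorial only
# shows it as "00i", so the real padding may differ.

N_JOBS = 21            # number of jobs stated in the tutorial
LINES_PER_SLICE = 200  # value passed to -N


def slice_command(i):
    """Return the mandelbrot command line for slice number i (1-based)."""
    return "./mandelbrot.py -P 0.0005 -M 1000 -L {0:03d} -N {1}".format(
        i, LINES_PER_SLICE)


# One command per job; only the -L value changes from job to job.
commands = [slice_command(i) for i in range(1, N_JOBS + 1)]

for cmd in commands:
    print(cmd)
```

In the tutorial itself this loop is not written by hand: the Transformation System creates one job per task when the transformation is extended.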

2. The second transformation (image slices merging) merges the results of the first transformation in groups of 7 files, producing 3 merged files:

./merge_data.py 

Each job produces a merged_data_*.txt file.
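
The grouping arithmetic (21 input files, 7 per task, 3 merge tasks) can be illustrated with a few lines of plain Python. This is only a conceptual sketch of grouping by size, not the Transformation System implementation; the function group_files and the file names are hypothetical.

```python
def group_files(file_names, group_size):
    """Split a list of file names into consecutive groups of group_size."""
    return [file_names[i:i + group_size]
            for i in range(0, len(file_names), group_size)]


# Hypothetical names standing in for the 21 slice files of step 1.
slices = ["slice_{0:02d}.txt".format(i) for i in range(1, 22)]

# Each group becomes the input of one merge task.
groups = group_files(slices, 7)

for n, group in enumerate(groups, 1):
    print("merge task {0}: {1} input files".format(n, len(group)))
```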

3. The third transformation (image building) produces the final bitmap image starting from the merged files produced in the previous step:

./build_merged_img.py

4. Finally, the fourth transformation (removal) removes the intermediate files produced by the transformations in steps 1 and 2.

2. Practical information

For this tutorial we will use a Testbed DIRAC instance. To access this instance, follow the instructions below. The link to the web portal for Job and Transformation monitoring is:

https://cctbdirac01.in2p3.fr/DIRAC/

2.1 Client installation

wget --no-check-certificate https://github.com/DIRACGrid/DIRAC/raw/master/Core/scripts/dirac-install.py
python dirac-install.py -r v6r19p20 -v --no-lcg-bundle
source bashrc (or source cshrc)
dirac-proxy-init -x
dirac-configure -S Dirac-Test -C dips://cctbdirac01.in2p3.fr:9135/Configuration/Server 
dirac-proxy-init 

2.2 Transformations creation and monitoring

Retrieve the scripts to create the different transformations from:

git clone https://github.com/arrabito/DIRAC_TS_Tutorial

Before creating the actual transformations you can submit a simple mandelbrot job and inspect the result:

python submit_wms.py 1

This job is similar to those that will be created by the first transformation. Now you can start creating the transformations to execute the whole workflow. Thanks to the data-driven mechanism, you can create the first 3 transformations all together, without waiting for the previous one to complete. Only the last one, which removes all the intermediate data, must be launched after the third transformation has completed (all tasks 'Done'). Before submitting, customize the submission scripts submit_ts_step%i.py slightly by setting the owner variable to your 'dirac username', e.g.:

########################################
# Modify here with your dirac username 
owner = 'larrabito'
########################################

This is needed only to distinguish the data produced by the different participants during the tutorial. For the same reason, we have introduced a special "owner" metadata field, which is used in the File Catalog queries associated with the transformations.
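
The role of the "owner" key can be illustrated with a small in-memory model of a metadata query. This is only a sketch of the matching idea and does not use the DIRAC File Catalog API; the matches function, the file paths and the user names alice and bob are all hypothetical. The query dictionary has the same shape as the one the submission scripts serialize with json.dumps.

```python
import json


def matches(file_metadata, query):
    """Return True if the file carries every key/value pair of the query."""
    return all(file_metadata.get(key) == value
               for key, value in query.items())


# Hypothetical catalog content: two participants produced similar files.
catalog = {
    "/hypothetical/alice/data_a.txt":
        {"application": "mandelbrot", "image_type": "raw", "owner": "alice"},
    "/hypothetical/alice/merged_a.txt":
        {"application": "mandelbrot", "image_type": "merged", "owner": "alice"},
    "/hypothetical/bob/data_b.txt":
        {"application": "mandelbrot", "image_type": "raw", "owner": "bob"},
}

# The query is serialized to JSON, as in the submission scripts, and decoded
# again here only to perform the in-memory comparison.
owner = "alice"
query_json = json.dumps(
    {"application": "mandelbrot", "image_type": "raw", "owner": owner})
query = json.loads(query_json)

selected = sorted(lfn for lfn, meta in catalog.items()
                  if matches(meta, query))
print(selected)
```

Without the "owner" key, the same query would also select the file of the other participant.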

2.2.1 Image slices production

  • Edit submit_ts_step1.py, change the owner variable and look at the different sections (Job description, Transformation definition and submission). Observe the metadata characterising the output data:

    outputMetadata = json.dumps( {"application":"mandelbrot","image_type":"raw","owner":owner} )
    
  • Submit the transformation:

    python submit_ts_step1.py 
    
  • You will be prompted for a name for the transformation. For practical reasons during the tutorial session, prepend your 'initials' to the name of the transformation, e.g.:

    la_trans_step1
    
  • Go to the TransformationMonitor on the web portal: https://cctbdirac01.in2p3.fr/DIRAC/. You should see your transformation (and also those of the other participants). The transformation has been created, but no jobs are associated with it yet. Click on the transformation and select Action/Extend from the context menu. Here you can choose how many jobs your transformation will consist of; extend the transformation by 21. Observe how the values in the different columns of your transformation change (refresh by clicking the Submit button). When tasks are in the Submitted status, you can also click Show Jobs to display the individual jobs. Note that, since jobs are submitted with the Production Shifter identity, you must remove the 'Owner' selection from the JobMonitor to display them.

2.2.2 Image slices merging

  • Edit submit_ts_step2.py and observe how input data are attached to the transformation:

    inputMetaquery = json.dumps( {"application":"mandelbrot","image_type":"raw","owner":owner} )
    t.setFileMask(inputMetaquery)
    

    and which metadata are attached to the output data.

  • Submit the transformation:

    python submit_ts_step2.py 
    

2.2.3 Image building

  • Edit submit_ts_step3.py and observe how input data are attached to the transformation:

    inputMetaquery = json.dumps( {"application":"mandelbrot","image_type":"raw","owner":owner} )
    t.setFileMask(inputMetaquery) 
    

    and which metadata are attached to the output data.

  • Submit the transformation:

    python submit_ts_step3.py 
    

2.2.4 Monitoring

  • Monitor the progress of the 3 transformations from the TransformationMonitor (refresh by clicking the Submit button). You can also browse the File Catalog to look at the files you have produced:

    FC:/> ls /vo.france-grilles.fr/user/l/larrabito/mandelbrot/images/
    
  • When the Image building transformation (step 3) is completed, you can retrieve the final image:

    dirac-dms-get-file /vo.france-grilles.fr/user/[initial]/[owner]/mandelbrot/images/final/merged_image.bmp
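
The [initial] and [owner] placeholders in the path above can be filled in with a small helper. This is a sketch under one assumption: that [initial] is the first letter of the owner name, as the example path /vo.france-grilles.fr/user/l/larrabito/ suggests. The function name final_image_lfn is hypothetical.

```python
def final_image_lfn(owner, vo="vo.france-grilles.fr"):
    """Build the catalog path of the final image for a given owner.

    Assumption: the [initial] directory is the first letter of the owner.
    """
    if not owner:
        raise ValueError("owner must be a non-empty string")
    return ("/{vo}/user/{initial}/{owner}"
            "/mandelbrot/images/final/merged_image.bmp").format(
                vo=vo, initial=owner[0], owner=owner)


# Print the full command line to run for the example owner of this tutorial.
print("dirac-dms-get-file " + final_image_lfn("larrabito"))
```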
    

2.2.5 Remove intermediate files

  • Edit submit_ts_step4.py and observe the Type and the Body of the transformation. In this case, no jobs are submitted to the WMS, but rather removal requests are submitted to the Request Management System.

  • Submit the transformation:

    python submit_ts_step4.py 
    
  • When this transformation is completed (all tasks 'Done'), check the File Catalog again.
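
The completion condition used in this tutorial (all tasks 'Done') can be written as a one-line check. This is a minimal sketch on a plain list of status strings, which you would copy by hand from the TransformationMonitor; it does not query DIRAC, and the function name all_tasks_done is hypothetical.

```python
def all_tasks_done(task_statuses):
    """Return True only if there is at least one task and all are 'Done'."""
    return bool(task_statuses) and all(
        status == "Done" for status in task_statuses)


# Example: the three merge tasks, one of them still in the Submitted status.
print(all_tasks_done(["Done", "Done", "Submitted"]))
print(all_tasks_done(["Done", "Done", "Done"]))
```

An empty list is treated as not completed, which matches the situation of a transformation that has been created but has no tasks yet.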
