DIRAC v6r12
The main point of this version is the introduction of a new type of pilot that is, for the most part, an implementation of the points discussed in https://github.com/DIRACGrid/DIRAC/wiki/Pilots-2.0:-generic,-configurable-pilots. These changes will be transparent to VOs. Several changes have also been made to the Data Management system.
If your VO only uses Grid resources, the pilots are only sent by SiteDirector and TaskQueueDirector agents, and you don't plan to have any specific pilot behaviour, you can stop reading here: you will not notice any difference between the new pilot and the old one.
If instead you want, for example, to install DIRAC in a different way, or you want your pilot to perform some VO-specific action, you should carefully read RFC 18 and what follows. You should also keep reading if your resources include IAAS and IAAC types of resources, like Virtual Machines.
The files to consider are in https://github.com/DIRACGrid/DIRAC/tree/rel-v6r12/WorkloadManagementSystem/PilotAgent. The main file to look at is https://github.com/DIRACGrid/DIRAC/blob/rel-v6r12/WorkloadManagementSystem/PilotAgent/dirac-pilot.py, which also contains a good explanation of how the system works.
The system works with "commands", as explained in the RFC. Any command can be added. Note that if your command is executed before the "InstallDIRAC" command, DIRAC functionality won't be available to it.
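The ordering constraint can be illustrated with a minimal sketch. It is not the real pilot code: the `PilotContext` class, the `execute(ctx)` signature and the `PrepareScratchArea` and `VOSpecificSetup` commands are hypothetical names invented for this example, and the `InstallDIRAC` class here is only a stand-in that flips a flag.

```python
class PilotContext(object):
    """Hypothetical shared state handed from one command to the next."""

    def __init__(self):
        self.dirac_installed = False
        self.log = []


class PrepareScratchArea(object):
    """Hypothetical command that is safe to run before DIRAC is installed."""

    def execute(self, ctx):
        # Only the standard library can be used here: DIRAC is not there yet.
        ctx.log.append("PrepareScratchArea")


class InstallDIRAC(object):
    """Stand-in for the real InstallDIRAC command: it only flips a flag."""

    def execute(self, ctx):
        ctx.dirac_installed = True
        ctx.log.append("InstallDIRAC")


class VOSpecificSetup(object):
    """Hypothetical VO command that relies on DIRAC being installed."""

    def execute(self, ctx):
        if not ctx.dirac_installed:
            raise RuntimeError("VOSpecificSetup must run after InstallDIRAC")
        ctx.log.append("VOSpecificSetup")


def run_commands(commands):
    """Execute the commands in the given order, sharing one context."""
    ctx = PilotContext()
    for command in commands:
        command.execute(ctx)
    return ctx
```

Running `[PrepareScratchArea(), InstallDIRAC(), VOSpecificSetup()]` succeeds, while placing `VOSpecificSetup()` before `InstallDIRAC()` raises an error, which is the situation the paragraph above warns about.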
We have introduced a special command named "GetPilotVersion" in https://github.com/DIRACGrid/DIRAC/blob/rel-v6r12/WorkloadManagementSystem/PilotAgent/pilotCommands.py that you should use, and possibly extend, if you want to send/start pilots that don't know beforehand which (VO)DIRAC version they are going to install. In this case, you have to provide a freely accessible JSON file that contains the pilot version. This is typically the case for VMs in IAAS and IAAC.
Beware that sending pilots containing a specific list of commands via SiteDirector agents requires a SiteDirector extension.
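As a rough illustration of the idea, the sketch below reads a version string from a JSON document. The layout of the file (a single top-level "Version" key) and the `read_pilot_version` helper are assumptions made for this example; check pilotCommands.py for the layout that "GetPilotVersion" actually expects.

```python
import json


def read_pilot_version(json_text, key="Version"):
    """Return the value stored under `key` in a JSON document.

    The single top-level key is an assumed layout, not the documented one.
    """
    data = json.loads(json_text)
    if key not in data:
        raise KeyError("no %r entry in the pilot version file" % key)
    return data[key]


# In a real pilot the text would be downloaded from the freely accessible
# location; a literal string keeps this sketch self-contained.
example_file_content = '{"Version": "v6r12"}'
pilot_version = read_pilot_version(example_file_content)
```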
There are a number of changes in the DB classes of the WMS, especially concerning JobDB. These changes are not strictly necessary, and were actually introduced in patch release v6r11p14, together with some changes at the code level, with PR https://github.com/DIRACGrid/DIRAC/pull/2093. They are nevertheless highly recommended, and will make your DB react faster and be more reliable. We recommend that you update the MySQL schema according to the py and JDL DB files in DIRAC.WorkloadManagementSystem.DB, starting from JobDB. Note: some tables have been dropped. If your extension needs them, please open a GitHub issue.
It is now possible to set the delay after which jobs in final states are removed from the WMS, via the JobCleaningAgent CS parameters RemoveStatusDelay/Done, RemoveStatusDelay/Killed and RemoveStatusDelay/Failed (the default is 7 days).
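The effect of these parameters can be sketched as follows. This is not the JobCleaningAgent code: the `is_removable` helper is hypothetical, and it assumes that the delays are expressed in days, in line with the 7-day default mentioned above.

```python
from datetime import datetime, timedelta

FINAL_STATES = ("Done", "Killed", "Failed")
DEFAULT_DELAY_DAYS = 7  # default stated in these release notes


def is_removable(status, last_update, now, delays_in_days=None):
    """Tell whether a job has been in a final state for long enough.

    `delays_in_days` mimics the RemoveStatusDelay/<Status> options;
    any final status missing from it falls back to the default.
    """
    if status not in FINAL_STATES:
        return False
    delays = delays_in_days or {}
    days = delays.get(status, DEFAULT_DELAY_DAYS)
    return now - last_update >= timedelta(days=days)


now = datetime(2015, 1, 10)
done_nine_days_ago = is_removable("Done", datetime(2015, 1, 1), now)
failed_with_longer_delay = is_removable(
    "Failed", datetime(2015, 1, 1), now, {"Failed": 14}
)
```

With these values a job that has been "Done" for nine days is removable under the default, while a "Failed" job of the same age is kept when its delay is raised to 14 days.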
As visible in https://github.com/DIRACGrid/DIRAC/pull/1983, some fixes and improvements of the DFC require the tables to be INNODB. It is thus necessary to update your DB so that all the tables use that engine (ALTER TABLE myTable ENGINE = INNODB;).
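When many tables are involved, the statements can be generated rather than typed by hand. The sketch below only builds the SQL text; the `innodb_statements` helper and the table names are placeholders for this example, and the output should be reviewed before being run against your own database.

```python
import re

# Accept only plain identifiers so that nothing unexpected ends up in the SQL.
_IDENTIFIER = re.compile(r"^[A-Za-z0-9_]+$")


def innodb_statements(table_names):
    """Build one ALTER TABLE ... ENGINE = INNODB statement per table name."""
    statements = []
    for name in table_names:
        if not _IDENTIFIER.match(name):
            raise ValueError("unexpected table name: %r" % name)
        statements.append("ALTER TABLE `%s` ENGINE = INNODB;" % name)
    return statements


# Placeholder names: replace them with the tables of your own DB.
for statement in innodb_statements(["myTable", "myOtherTable"]):
    print(statement)
```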
As committed within https://github.com/DIRACGrid/DIRAC/pull/1950 there is a new field in the DowntimeCache table: 'GOCDBServiceType' : 'VARCHAR(32) NOT NULL'
As committed within https://github.com/fstagni/DIRAC/commit/23c0c741014e0589fcdbd6ba17dabfdf3558c8e4, SystemLoggingDB.FixedTextMessages.FixedString was changed to VARCHAR(767).
Since v6r12, each accounting type can be stored in a different DB. By default, the data of all accounting types is stored in the database defined under /Systems/Accounting/Instance/Databases/AccountingDB. To store the data of one type (say WMSHistory) in a different database, define the location of that database under the Databases section. Then define /Systems/Accounting/Instance/Databases/MultiDB and set an option whose name is the type name and whose value points to the database to use. For instance:
Systems
{
  Accounting
  {
    Development
    {
      Databases
      {
        AccountingDB
        {
          Host = localhost
          User = dirac
          Password = dirac
          DBName = accounting
        }
        Acc2
        {
          Host = somewhere.internet.net
          User = dirac
          Password = dirac
          DBName = infernus
        }
        MultiDB
        {
          WMSHistory = Acc2
        }
      }
    }
  }
}
With the previous configuration, all accounting data will be stored in and retrieved from the usual database, except for the WMSHistory type, which will be stored in and retrieved from the Acc2 database.
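The lookup rule described above amounts to a mapping with a default. The following sketch is not the Accounting system code: the `database_for_type` helper is hypothetical and the MultiDB section is represented by a plain dictionary.

```python
def database_for_type(type_name, multi_db, default="AccountingDB"):
    """Return the name of the database section used for an accounting type.

    `multi_db` stands for the content of the MultiDB section; types that
    are not listed there use the default AccountingDB.
    """
    return multi_db.get(type_name, default)


# Mirrors the MultiDB section of the example configuration.
multi_db = {"WMSHistory": "Acc2"}

wms_history_db = database_for_type("WMSHistory", multi_db)
other_type_db = database_for_type("Job", multi_db)
```

Here `wms_history_db` is "Acc2", while any type that is not listed in MultiDB (the "Job" type is used as an example) resolves to "AccountingDB".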
There is a new RequestOperation to be added, so in the list of OperationHandlers found in the CS you should add:
SetFileStatus
{
  Location = DIRAC/TransformationSystem/Agent/RequestOperations/SetFileStatus
  MaxAttempts = 256
}
The following agents/executors should be run with CSEC_MECH=ID (add it to the "run" file):
- All Optimizers
- All WorkflowTask and RequestTask agents
- TransformationAgent
- ValidateOutputDataAgent
This is required after the changes made in https://github.com/DIRACGrid/DIRAC/pull/2199.