SPDX-License-Identifier: Apache-2.0
Today's software projects often make use of large amounts of Open Source software. Complying with the license obligations of the components used is a prerequisite for every such project. This results in different requirements that the project might need to fulfill. Those requirements can be grouped into two main categories:
-
Things that need to be done to actually fulfill license obligations
-
Things that need to be done to monitor / report fulfillment of license obligations
Most of the above activities share common points:
-
The need to have an inventory of used (open source) components and their licenses
-
Some rule based evaluation and reporting based on this inventory
These tasks look easy, but they can become complex for several reasons:
-
The number of open source components might be quite large (far more than 100 for a typical web application based on state-of-the-art programming frameworks)
-
Agile development and rapid changes of used components result in frequent changes of the inventory
-
Open Source usage scenarios and license obligations might be OK in one context (e.g. in the relation between a software developer and their client) but might be completely unacceptable in another context (e.g. when the client distributes the same software to end customers)
-
Legal interpretations of license conditions often differ from organisation to organisation and result in different compliance rules to be respected.
-
License information for components is often not available in a standardized form which would allow automatic processing
-
Tools for supporting the license management processes are often specific to a technology or build tool and do not support all aspects of OSS license management.
Of course there are specific commercial tool suites which address the IP rights and license domain. But due to high complexity and license costs those tools are out of reach for most projects - at least for permanent use.
Solicitor tries to address some of the issues highlighted above. In its initial version it is a tool for programmatically executing a process which was originally defined as an Excel-supported manual process.
When running Solicitor three subsequent processing steps are executed:
-
Creating an initial component and license inventory based on technology specific input files
-
Rule based normalization and evaluation of licenses
-
Generation of output documents
Warning
|
Solicitor comes with a set of sample rules for the normalization and evaluation of licenses.
Even though these included rules are not "intentionally wrong" they are only samples and you should never rely on these builtin rules without checking and possibly modifying their content and consulting your lawyer.
Solicitor is a tool for technically supporting the management of OSS licenses within your project.
Solicitor neither gives legal advice nor is a replacement for a lawyer.
|
The Solicitor code and accompanying resources (including this user guide) as stored in the Git repository https://github.com/devonfw/solicitor are licensed as Open Source under the Apache 2 license (https://www.apache.org/licenses/LICENSE-2.0).
Important
|
Specifically observe the "Disclaimer of Warranty" and "Limitation of Liability" which are part of the license. |
Important
|
The executable JAR file which is created by the Maven based build process includes numerous other Open Source components which are subject to different Open Source licenses. Any distribution of the Solicitor executable JAR file needs to comply with the license conditions of all those components.
If you are running Solicitor from the executable JAR you might use the -eug option to store detailed license information as file solicitor_licenseinfo.html in your current working directory (together with a copy of this user guide).
|
The following picture shows a business-oriented view of Solicitor.
Raw data about the components and attached licenses within an application is gathered by scanning with technology and build chain specific tools. This happens outside Solicitor.
The import step reads this data and transforms it into a common technology independent internal format.
In the normalization step the license information is completed and unified. Information not contained in the raw data is added. Where possible the applicable licenses are expressed by SPDX-IDs.
Many open source components are available via multi licensing models. Within qualification the finally applicable licenses are selected.
In the legal assessment the compliance of applicable licenses is checked based on generic rules defined in company-wide policies and possibly project-specific extensions. Defining those rules is considered "legal advice" and possibly needs to be done by lawyers who are authorized to do so. For this step Solicitor only provides a framework / tool to support the process but does not deliver any predefined rules.
The final export step produces documents based on the internal data model. This might be the list of licenses to be forwarded to the customer or a license compliance report. Data might also be fed into other systems.
A more technically oriented view of Solicitor is given below.
There are three major technical components: The reader and writer components perform the import and export of data. The business logic (normalization, qualification and legal assessment) is executed by a rule engine. Rules are mainly defined via decision tables. Solicitor comes with a starting set of rules for normalization and qualification, but these rulesets need to be extended within the projects. Rules for legal evaluation need to be completely defined by the user.
Solicitor works without additional persisted data: when executed, it generates the output directly from the input data it reads, after processing the business rules.
The internal business data model consists of 6 entities:
-
ModelRoot: root object of the business data model which holds metadata about the data processing
-
Engagement: the master data of the overall project
-
Application: a deliverable within the Engagement
-
ApplicationComponent: component within an Application
-
RawLicense: license info attached to an ApplicationComponent as it is read from the input data
-
NormalizedLicense: license info attached to an ApplicationComponent, processed by the business rules
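The containment relationships between these entities can be sketched with a few Python dataclasses. This is only an illustration of the structure described above, not Solicitor's actual (Java) implementation: the snake_case field names, the reduced set of properties, and the assumption that a ModelRoot holds exactly one Engagement are choices made for this sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RawLicense:
    # License info exactly as read from the input data
    declared_license: str
    license_url: str = ""


@dataclass
class NormalizedLicense:
    # License info after processing by the business rules
    declared_license: str
    normalized_license: str = ""
    effective_normalized_license: str = ""


@dataclass
class ApplicationComponent:
    group_id: str
    artifact_id: str
    version: str
    raw_licenses: List[RawLicense] = field(default_factory=list)
    normalized_licenses: List[NormalizedLicense] = field(default_factory=list)


@dataclass
class Application:
    application_name: str
    components: List[ApplicationComponent] = field(default_factory=list)


@dataclass
class Engagement:
    engagement_name: str
    applications: List[Application] = field(default_factory=list)


@dataclass
class ModelRoot:
    model_version: int
    engagement: Engagement  # assumption: one engagement per model


def count_raw_licenses(root: ModelRoot) -> int:
    """Walk the hierarchy and count all RawLicense entries."""
    return sum(
        len(component.raw_licenses)
        for application in root.engagement.applications
        for component in application.components
    )


# Hypothetical sample data for illustration only
component = ApplicationComponent(
    "org.example", "demo-lib", "1.0",
    raw_licenses=[RawLicense("Apache 2.0")],
    normalized_licenses=[NormalizedLicense("Apache 2.0", "Apache-2.0", "Apache-2.0")],
)
root = ModelRoot(1, Engagement("devonfw", [Application("Devon4J", [component])]))
```

Every license entry is reachable by walking from the root through engagement, application and component, which is the same path the property tables follow.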
Property | Type | Description |
---|---|---|
modelVersion | int | version number of the data model |
executionTime | String | timestamp when the data was processed |
solicitorVersion | String | Solicitor version which processed the model |
solicitorGitHash | String | buildnumber / GitHash of the Solicitor build |
solicitorBuilddate | String | build date of the Solicitor build |
extensionArtifactId | String | artifactId of the active Solicitor Extension ("NONE" if no extension) |
extensionVersion | String | Version of the active Extension (or "NONE") |
extensionGitHash | String | Buildnumber / GitHash of the Extension (or "NONE") |
Property | Type | Description |
---|---|---|
engagementName | String | the engagement name |
engagementType | EngagementType | the engagement type; possible values: INTERN, EXTERN |
clientName | String | name of the client |
goToMarketModel | GoToMarketModel | the go-to-market-model; possible values: LICENSE |
contractAllowsOss | boolean | does the contract explicitly allow OSS? |
ossPolicyFollowed | boolean | is the company's OSS policy followed? |
customerProvidesOss | boolean | does the customer provide the OSS? |
Property | Type | Description |
---|---|---|
applicationName | String | the name of the application / deliverable |
releaseId | String | version identifier of the application |
releaseDate | String | release date of the application |
sourceRepo | String | URL of the source repo of the application (should be a URL) |
programmingEcosystem | String | programming ecosystem (e.g. Java8; Android/Java, iOS / Objective C) |
Property | Type | Description |
---|---|---|
usagePattern | UsagePattern | possible values: DYNAMIC_LINKING, STATIC_LINKING, STANDALONE_PRODUCT |
ossModified | boolean | is the OSS modified? |
ossHomepage | String | URL of the OSS homepage |
groupId | String | component identifier: maven group |
artifactId | String | component identifier: maven artifactId |
version | String | component identifier: Version |
repoType | String | component identifier: RepoType |
Property | Type | Description |
---|---|---|
declaredLicense | String | name of the declared license |
licenseUrl | String | URL of the declared license |
trace | String | detail info of history of this data record |
specialHandling | boolean | (for controlling rule processing) |
Property | Type | Description |
---|---|---|
declaredLicense | String | name of the declared license (copied from RawLicense) |
licenseUrl | String | URL of the declared license (copied from RawLicense) |
declaredLicenseContent | String | resolved content of licenseUrl |
normalizedLicenseType | String | type of the license, see License types |
normalizedLicense | String | name of the license in normalized form (SPDX-Id) or special "pseudo license id", see Pseudo License Ids |
normalizedLicenseUrl | String | URL pointing to a normalized form of the license |
effectiveNormalizedLicenseType | String | type of the effective license, see License types |
effectiveNormalizedLicense | String | effective normalized license (SPDX-Id) or "pseudo license id"; this is the information after selecting the right license in case of multi licensing or any license override due to a component being redistributed under a different license |
effectiveNormalizedLicenseUrl | String | URL pointing to the effective normalized license |
effectiveNormalizedLicenseContent | String | resolved content of effectiveNormalizedLicenseUrl |
legalPreApproved | String | indicates whether the license is pre-approved based on company standard policy |
copyLeft | String | indicates the type of copyleft of the license |
licenseCompliance | String | indicates if the license is compliant according to the default company policy |
licenseRefUrl | String | URL to the reference license information (TBD) |
licenseRefContent | String | resolved content of licenseRefUrl |
includeLicense | String | does the license require including the license text? |
includeSource | String | does the license require delivering the source code of the OSS component? |
reviewedForRelease | String | for which release was the legal evaluation done? |
comments | String | comments on the component/license (mainly as input to legal) |
legalApproved | String | indicates whether this usage is legally approved |
legalComments | String | comments from legal, possibly indicating additional conditions to be fulfilled |
trace | String | detail info of history of this data record (rule executions) |
guessedLicenseUrl | String | guessed (possibly improved) URL of the effective normalized license |
guessedLicenseUrlAuditInfo | String | audit info which documents how the guessedLicenseUrl was guessed |
guessedLicenseContent | String | resolved content of guessedLicenseUrl |
For the mechanism how Solicitor resolves the content of URLs and how the result might be influenced see Resolving of License URLs.
For a description of the URL guessing mechanism see Guessing of license URLs.
Defines the type of license
-
OSS-SPDX - An OSS license which has a corresponding SPDX-Id
-
OSS-OTHER - An OSS license which has no SPDX-Id
-
COMMERCIAL - Commercial (non OSS) license; this might also include code which is owned by the project
-
UNKNOWN - License is unknown
-
IGNORED - license will be ignored (non selected license in multi licensing case; only to be used as "Effective Normalized License Type")
A "normalized" license id might be either a SPDX-Id or a "pseudo license id" which is used to indicate a specific situation. The following pseudo license ids are used:
-
OSS specific - a nonstandard OSS license which could not be mapped to a SPDX-Id
-
PublicDomain - any form of public domain which is not represented by an explicit SPDX-Id
-
Ignored - license will be ignored (non selected license in multi licensing case; only to be used as "Effective Normalized License")
-
NonOSS - commercial license, not OSS
Solicitor is a standalone Java (Spring Boot) application. Prerequisite for running it is an existing Java 8 or 11 runtime environment. If you do not yet have the Solicitor executable JAR (solicitor.jar), you need to build it as described on the project GitHub homepage https://github.com/devonfw/solicitor .
Solicitor is executed with the following command:
java -jar solicitor.jar -c <configfile>
where <configfile> is to be replaced by the location of the Project Configuration File.
To get a first idea on what Solicitor does you might call
java -jar solicitor.jar -c classpath:samples/solicitor_sample.cfg
This executes Solicitor with the default configuration on its own list of internal components and produces sample output.
To get an overview of the available command line options use
java -jar solicitor.jar -h
For unique addressing of resources to be read (configuration files, input data, rule templates and decision tables) Solicitor makes use of the Spring ResourceLoader functionality, see https://docs.spring.io/spring-framework/docs/current/spring-framework-reference/core.html#resources-resourceloader . This allows loading from the classpath, the filesystem or even via HTTP GET.
If you want to reference a file in the filesystem you need to write it as follows: file:path/to/file.txt
Note that this only applies to resources being read. Output files are addressed without that prefix.
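To make the prefix convention concrete, the following Python sketch splits a resource location into its prefix and the remaining path. It is plain string handling written for illustration; it does not reproduce the Spring ResourceLoader, and the function name `split_resource_location` is hypothetical.

```python
from typing import Optional, Tuple


def split_resource_location(location: str) -> Tuple[Optional[str], str]:
    """Split a location into (kind, rest).

    kind is "classpath", "file" or "url"; None means no known prefix,
    which is how output files are written in the configuration.
    """
    if location.startswith("classpath:"):
        return "classpath", location[len("classpath:"):]
    if location.startswith("file:"):
        return "file", location[len("file:"):]
    if location.startswith(("http:", "https:")):
        # keep the complete URL, it is needed as a whole for the request
        return "url", location
    return None, location


for example in (
    "classpath:samples/solicitor_sample.cfg",
    "file:path/to/file.txt",
    "OSS-Inventory-devonfw.xlsx",
):
    print(example, "->", split_resource_location(example))
```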
The project configuration of Solicitor is done via a configuration file in JSON format. This configuration file defines the engagements and applications master data, configures the readers for importing component and license information, references the business rules to be applied and defines the exports to be done.
The config file has the following skeleton:
{
  "version" : 1,
  "comment" : "Sample Solicitor configuration file",
  "engagementName" : "devonfw",    (1)
  . . .
  "applications" : [ ... ],    (2)
  "rules" : [ ... ],    (3)
  "writers" : [ ... ]    (4)
}
-
The leading data defines the engagement master data, see Header and Engagement Master Data
-
applications defines the applications within the engagement and configures the readers to import the component/license information, see Applications
-
rules references the rules to apply to the imported data, see Business Rules
-
writers configures how the processed data should be exported, see Writers / Reporting
Note
|
The following section describes all sections of the Solicitor configuration file format. Often the configuration of writers and especially rules will be identical for projects. To facilitate the project specific configuration setup Solicitor internally provides a base configuration which contains reasonable defaults for the rules and writers section. If the project specific configuration file omits the rules and/or writers sections then the corresponding settings from the base configuration will be taken. For details see Default Base Configuration.
|
Warning
|
If locations of files are specified within the configuration files as relative
pathnames then this is always evaluated relative to the current working directory (which
might differ from the location of the configuration file). If some file location
should be given relative to the location of the configuration file this might be done
using the special placeholder ${cfgdir} as described in the following.
|
Within certain parts of the configuration file (path and filenames) special placeholders might be used to parameterize the configuration. These areas are explicitly marked in the following description.
These placeholders are available:
-
${project} - A simplified project name (taking the engagement name, removing all non-word characters and converting to lowercase).
-
${cfgdir} - If the config file was loaded from the filesystem, this denotes the directory where the config file resides; otherwise it is "." (the current directory). This can be used to reference locations relative to the location of the config file.
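The placeholder rules can be sketched in Python as follows. This is a reading of the description above, not Solicitor's code: it assumes "non-word characters" means everything outside `[A-Za-z0-9_]`, that `.` is the fallback value of `${cfgdir}`, and that paths use forward slashes. The function names are hypothetical.

```python
import re
from pathlib import PurePosixPath


def simplified_project_name(engagement_name: str) -> str:
    """Remove all non-word characters and convert to lowercase."""
    return re.sub(r"\W", "", engagement_name, flags=re.ASCII).lower()


def config_directory(cfg_location: str) -> str:
    """Directory of the config file if loaded from the filesystem, else '.'."""
    if cfg_location.startswith("file:"):
        return str(PurePosixPath(cfg_location[len("file:"):]).parent)
    return "."


def expand_placeholders(text: str, engagement_name: str, cfg_location: str) -> str:
    """Replace ${project} and ${cfgdir} in a path or filename."""
    return (
        text.replace("${project}", simplified_project_name(engagement_name))
        .replace("${cfgdir}", config_directory(cfg_location))
    )


print(expand_placeholders("${cfgdir}/input/${project}.xml", "devonfw", "file:conf/solicitor.cfg"))
```

With a config file loaded from `file:conf/solicitor.cfg`, a source given as `${cfgdir}/input/licenses.xml` would in this sketch resolve to `conf/input/licenses.xml`, independent of where the resulting relative path is later evaluated from.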
The leading section of the config file defines some metadata and the engagement master data.
"version" : 1,    (1)
"comment" : "Sample Solicitor configuration file",    (2)
"engagementName" : "devonfw",    (3)
"engagementType" : "INTERN",    (4)
"clientName" : "none",    (5)
"goToMarketModel" : "LICENSE",    (6)
"contractAllowsOss" : true,    (7)
"ossPolicyFollowed" : true,    (8)
"customerProvidesOss" : false,    (9)
-
version of the config file format (currently needs to be 1)
-
is a free text comment (no further function at the moment)
-
the engagement name (any string)
-
the engagement type; possible values: INTERN, EXTERN
-
name of the client (any string)
-
the go-to-market-model; possible values: LICENSE
-
does the contract explicitly allow OSS? (boolean)
-
is the company's OSS policy followed? (boolean)
-
does the customer provide the OSS? (boolean)
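Before handing a configuration file to Solicitor it can be useful to check the header values against the constraints listed above. The following Python sketch does that for a header given as a dictionary. It only encodes the constraints stated in this section; how Solicitor itself reacts to invalid values is not described here, and the function name `check_header` is hypothetical.

```python
import json

ENGAGEMENT_TYPES = {"INTERN", "EXTERN"}
GO_TO_MARKET_MODELS = {"LICENSE"}
BOOLEAN_KEYS = ("contractAllowsOss", "ossPolicyFollowed", "customerProvidesOss")

HEADER = {
    "version": 1,
    "comment": "Sample Solicitor configuration file",
    "engagementName": "devonfw",
    "engagementType": "INTERN",
    "clientName": "none",
    "goToMarketModel": "LICENSE",
    "contractAllowsOss": True,
    "ossPolicyFollowed": True,
    "customerProvidesOss": False,
}


def check_header(header: dict) -> list:
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    if header.get("version") != 1:
        problems.append("version currently needs to be 1")
    if header.get("engagementType") not in ENGAGEMENT_TYPES:
        problems.append("engagementType must be INTERN or EXTERN")
    if header.get("goToMarketModel") not in GO_TO_MARKET_MODELS:
        problems.append("goToMarketModel must be LICENSE")
    for key in BOOLEAN_KEYS:
        if not isinstance(header.get(key), bool):
            problems.append(key + " must be a boolean")
    return problems


print(check_header(HEADER))
print(json.dumps(HEADER, indent=2))
```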
Within this section the different applications (=deliverables) of the engagement are defined. Furthermore, for each application at least one reader needs to be defined which imports the component and license information.
"applications" : [ {
  "name" : "Devon4J",    (1)
  "releaseId" : "3.1.0-SNAPSHOT",    (2)
  "sourceRepo" : "https://github.com/devonfw/devon4j.git",    (3)
  "programmingEcosystem" : "Java8",    (4)
  "readers" : [ {    (5)
    "type" : "maven",    (6)
    "source" : "classpath:samples/licenses_devon4j.xml",    (7) (10)
    "usagePattern" : "DYNAMIC_LINKING",    (8)
    "repoType" : "maven"    (9)
  } ]
} ],
-
The name of the application / deliverable (any string)
-
Version identifier of the application (any string)
-
URL of the source repo of the application (string; should be a URL)
-
programming ecosystem (any string; e.g. Java8; Android/Java, iOS / Objective C)
-
multiple readers might be defined per application
-
the type of reader; for possible values see Reading License Information with Readers
-
location of the source file to read (ResourceLoader-URL)
-
usage pattern; possible values: DYNAMIC_LINKING, STATIC_LINKING, STANDALONE_PRODUCT
-
repoType: Repository to download the sources from: currently possible values: maven, npm; if omitted then "maven" will be taken as default
-
placeholder patterns might be used here
The different readers are described in chapter Reading License Information with Readers
Business rules are executed within a Drools rule engine. They are defined as a sequence of rule templates and corresponding XLS files which together represent decision tables.
"rules" : [ {
  "type" : "dt",    (1)
  "optional" : false,    (2)
  "ruleSource" : "classpath:samples/LicenseAssignmentSample.xls",    (3) (7)
  "templateSource" : "classpath:com/.../rules/rule_templates/LicenseAssignment.drt",    (4) (7)
  "ruleGroup" : "LicenseAssignment",    (5)
  "description" : "setting license in case that no one was detected"    (6)
},
. . .
,{
  "type" : "dt",
  "optional" : false,
  "ruleSource" : "classpath:samples/LegalEvaluationSample.xls",
  "templateSource" : "classpath:com/.../rules/rule_templates/LegalEvaluation.drt",
  "ruleGroup" : "LegalEvaluation",
  "description" : "final legal evaluation based on the rules defined by legal"
} ],
-
type of the rule; only possible value: dt, which stands for "decision table"
-
if set to true the processing of this group of rules will be skipped if the XLS with table data (given by ruleSource) does not exist; if set to false a missing XLS table will result in program termination
-
location of the tabular decision table data
-
location of the drools rule template to be used to define the rules together with the decision table data
-
id of the group of rules; used to reference it e.g. when doing logging
-
some textual description of the rule group
-
placeholder patterns might be used here
When running, Solicitor will execute the rules of each rule group separately and in the order given by the configuration. Only if there are no more rules to fire in a group Solicitor will move to the next rule group and start firing those rules.
Normally a project will only customize (part of) the data of the decision tables and thus will only change the ruleSource and the data in the XLS. All other configuration (the different templates and processing order) is part of the Solicitor application itself and should not be changed by end users.
See Working with Decision Tables and Standard Business Rules for further information on the business rules.
The writer configuration defines how the processed data will be exported and/or reported.
"writers" : [ {
  "type" : "xls",    (1)
  "templateSource" : "classpath:samples/Solicitor_Output_Template_Sample.xlsx",    (2) (6)
  "target" : "OSS-Inventory-devonfw.xlsx",    (3) (6)
  "description" : "The XLS OSS-Inventory document",    (4)
  "dataTables" : {    (5)
    "ENGAGEMENT" : "classpath:com/devonfw/tools/solicitor/sql/allden_engagements.sql",
    "LICENSE" : "classpath:com/devonfw/tools/solicitor/sql/allden_normalizedlicenses.sql"
  }
} ]
-
type of writer to be selected; possible values: xls, velo
-
path to the template to be used
-
location of the output file
-
some textual description
-
reference to SQL statements used to transform the internal data model to data tables used for reporting
-
placeholder patterns might be used here
For details on the writer configuration see Reporting / Creating output documents.
To simplify setting up a new project Solicitor provides an option to create a project starter configuration in a given directory.
java -jar solicitor.jar -wiz some/directory/path
Besides the necessary configuration file this also includes empty XLS files for defining project-specific rules which amend the builtin rules. Furthermore, a sample license.xml file is provided so that Solicitor can be executed directly to check its functionality.
This configuration then serves as starting point for project specific configuration.
When working with Solicitor it might be necessary to get access to the builtin base configuration, e.g. for reviewing the builtin sample rules or using builtin reporting templates as starting point for the creation of own templates.
The command
java -jar solicitor.jar -ec some/directory/path
will export all internal configuration to the given directory. This includes:
-
The base configuration file, which defines standard settings inherited by the Project Configuration File
-
The Drools Rule Templates
-
The builtin decision tables which are referenced in the base configuration, see Standard Business Rules
-
The SQL statements which are used for SQL transformation and filtering
-
The referenced templates for the Velocity Writer and Excel Writer
Besides the project configuration done via the above described file there are a set of technical settings in Solicitor which are done via properties. Solicitor is implemented as a Spring Boot Application and makes use of the standard configuration mechanism provided by the Spring Boot Platform which provides several ways to define/override properties.
The default property values are given in Built in Default Properties.
If a property needs to be overridden, this is most easily done via the command line when executing Solicitor:
java -Dsome.property.name1=value -Dsome.property.name2=another_value -jar solicitor.jar <any other arguments>
Different Readers are available to import raw component / license information for different technologies. This chapter describes how to setup the different build / dependency management systems to create the required input and how to configure the corresponding reader.
For the export of the licenses from a maven based project the license-maven-plugin is used, which can directly be called without the need to change anything in the pom.xml.
To generate the input file required for Solicitor the License Plugin needs to be executed with the following command:
mvn org.codehaus.mojo:license-maven-plugin:1.14:aggregate-download-licenses -Dlicense.excludedScopes=test,provided
The generated output file named licenses.xml (in the directory specified in the plugin config) should look like the following:
link:licenses.xml[role=include]
In Solicitor the data is read with the following reader config:
"readers" : [ { "type" : "maven", "source" : "file:target/generated-resources/licenses.xml", "usagePattern" : "DYNAMIC_LINKING" } ]
(the above assumes that Solicitor is executed in the Maven project's main directory)
The CSV input is normally generated manually and should look like this (the CSV file is ";" separated):
link:csvlicenses.csv[role=include]
In Solicitor the data is read with the following part of the config
"readers" : [ { "type" : "csv", "source" : "file:path/to/the/file.csv", "usagePattern" : "DYNAMIC_LINKING" } ]
The following 5 columns need to be contained:
-
groupId
-
artifactId
-
version
-
license name
-
license URL
In case that a component has multiple licenses attached, there needs to be a separate line in the file for each license.
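Since the CSV file is usually created by hand, a small script can help keep it consistent. The Python sketch below writes the five columns in the order listed above, separated by ";", and emits one line per license for components with several licenses. The component data is made up, and whether Solicitor expects a header line or a particular quoting style is not stated here, so the sketch writes neither a header nor quotes.

```python
import csv
import io

# Hypothetical components; each may carry several (license name, license URL) pairs
COMPONENTS = [
    {
        "groupId": "org.example",
        "artifactId": "demo-lib",
        "version": "1.0.0",
        "licenses": [
            ("Apache License 2.0", "https://www.apache.org/licenses/LICENSE-2.0"),
        ],
    },
    {
        "groupId": "org.example",
        "artifactId": "dual-lib",
        "version": "2.3",
        "licenses": [
            ("MIT License", "https://example.org/dual-lib/MIT.txt"),
            ("Apache License 2.0", "https://www.apache.org/licenses/LICENSE-2.0"),
        ],
    },
]


def to_csv(components) -> str:
    """Return ';'-separated CSV text with one line per component/license pair."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=";", lineterminator="\n")
    for component in components:
        for license_name, license_url in component["licenses"]:
            writer.writerow([
                component["groupId"],
                component["artifactId"],
                component["version"],
                license_name,
                license_url,
            ])
    return buffer.getvalue()


print(to_csv(COMPONENTS), end="")
```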
For NPM based projects either the NPM License Crawler (https://www.npmjs.com/package/npm-license-crawler) or the NPM License Checker (https://www.npmjs.com/package/license-checker) might be used. The NPM License Crawler can process several node packages in one run.
To install the NPM License Crawler the following command needs to be executed.
npm i npm-license-crawler -g
To get the licenses, the crawler needs to be executed like the following example
npm-license-crawler --dependencies --csv licenses.csv
The export should look like the following (The csv file is "," separated)
link:licenses.csv[role=include]
In Solicitor the data is read with the following part of the config
"readers" : [ { "type" : "npm-license-crawler-csv", "source" : "file:path/to/licenses.csv", "usagePattern" : "DYNAMIC_LINKING", "repoType" : "npm" } ]
To install the NPM License Checker the following command needs to be executed.
npm i license-checker -g
To get the licenses, the checker needs to be executed like the following example (we require JSON output here)
license-checker --json > /path/to/licenses.json
The export should look like the following
link:licensesNpmLicenseChecker.json[role=include]
In Solicitor the data is read with the following part of the config
"readers" : [ { "type" : "npm-license-checker", "source" : "file:path/to/licenses.json", "usagePattern" : "DYNAMIC_LINKING", "repoType" : "npm" } ]
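Before importing the license-checker export it can help to take a quick look at what it contains. The following Python sketch lists the packages and their licenses from such a JSON export. The embedded sample is hand-written and assumes the usual shape of the export (keys of the form `name@version`, a `licenses` entry that is either a string or a list); it is not taken from Solicitor's sample files, and it does not show how the Solicitor reader itself processes the data.

```python
import json

# Hand-written sample in the assumed license-checker JSON shape
SAMPLE_EXPORT = """
{
  "@angular/core@8.0.0": {
    "licenses": "MIT",
    "repository": "https://github.com/angular/angular"
  },
  "example-lib@1.2.3": {
    "licenses": ["MIT", "Apache-2.0"],
    "repository": "https://example.org/example-lib"
  }
}
"""


def list_components(export_text: str) -> list:
    """Return (name, version, licenses) tuples for every package in the export."""
    components = []
    for key, info in json.loads(export_text).items():
        # split at the LAST "@" so that scoped names like "@angular/core" stay intact
        name, _, version = key.rpartition("@")
        licenses = info.get("licenses", [])
        if isinstance(licenses, str):
            licenses = [licenses]
        components.append((name, version, licenses))
    return components


for name, version, licenses in list_components(SAMPLE_EXPORT):
    print(name, version, ", ".join(licenses))
```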
To generate the input file required for Solicitor yarn needs to be executed with the following command within the directory that contains the project’s package.json (we require JSON output here):
yarn licenses list --json > /path/to/yarnlicenses.json
The export should look like the following
link:yarnlicenses.json[role=include]
In Solicitor the data is read with the following part of the config
"readers" : [ { "type" : "yarn", "source" : "file:path/to/yarnlicenses.json", "usagePattern" : "DYNAMIC_LINKING", "repoType" : "yarn" } ]
To generate the input file required for Solicitor one has to follow two steps:
-
Encapsulate the software with all relevant dependencies/requirements in a virtual environment (venv)
-
Install the pip-licenses plugin within this virtual environment
After that, execute the following command within the virtual environment to extract the input file (we require JSON output here):
pip-licenses --from=all --format=json --with-urls --with-license-file > piplicenses.json
The export should look like the following
link:piplicenses.json[role=include]
In Solicitor the data is read with the following part of the config
"readers" : [ { "type" : "pip", "source" : "file:path/to/piplicenses.json", "usagePattern" : "DYNAMIC_LINKING", "repoType" : "pip" } ]
For the export of the licenses from a Gradle based project the Gradle License Plugin is used.
To install the plugin some changes need to be done in build.gradle, like the following example
buildscript {
  repositories {
    maven { url 'https://oss.jfrog.org/artifactory/oss-snapshot-local/' }
  }
  dependencies {
    classpath 'com.jaredsburrows:gradle-license-plugin:0.8.5-SNAPSHOT'
  }
}

apply plugin: 'java-library'
apply plugin: 'com.jaredsburrows.license'
Afterwards execute the following command in the console:
For Windows (Java Application)
gradlew licenseReport
The Export should look like this:
link:licenses.json[role=include]
In Solicitor the data is read with the following part of the config
"readers" : [ { "type" : "gradle2", "source" : "file:path/to/licenses.json", "usagePattern" : "DYNAMIC_LINKING" } ]
Note
|
The former reader of type gradle is deprecated and should no longer be used. See List of Deprecated Features.
|
For the export of the licenses from a Gradle based Android project the Gradle License Plugin is used.
To install the plugin some changes need to be done in the build.gradle of the project, like the following example
buildscript {
  repositories {
    jcenter()
  }
  dependencies {
    classpath 'com.jaredsburrows:gradle-license-plugin:0.8.5'
  }
}
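The build.gradle of the app also needs a change: apply the license plugin (the same apply plugin: 'com.jaredsburrows.license' line as in the Gradle example above) as the second line, directly after the following line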
apply plugin: 'com.android.application'
Afterwards execute the following command in the terminal of Android Studio (for Windows, Android application):
gradlew licenseDebugReport
The Export is in the following folder
$Projectfolder\app\build\reports\licenses
It should look like this:
link:licenseDebugReport.json[role=include]
In Solicitor the Data is read with the following part of the config
"readers" : [ { "type" : "gradle2", "source" : "file:$/input/licenses.json", "usagePattern" : "DYNAMIC_LINKING" } ]
Note
|
The former reader of type gradle is deprecated and should no longer be used. See List of Deprecated Features.
|
Solicitor uses the Drools rule engine to execute business rules. Business rules are defined as "extended" decision tables. Each such decision table consists of two artifacts:
-
A rule template file in specific drools template format
-
An Excel (XLSX) table which defines the decision table data
When processing, Solicitor will internally use the rule template to create one or multiple rules for every record found in the Excel sheet. The following points are important here:
-
Rule templates:
-
Rule templates should be regarded as part of the Solicitor implementation and should not be changed on an engagement level.
-
-
Excel decision table data
-
The Excel tables might be extended or changed on a per project level.
-
The rules defined by the tabular data will have decreasing "salience" (priority) from top to bottom
-
In general multiple rules defined within a table might fire for the same data to be processed; the definition of the rules within the rule template will normally ensure that once a rule from the decision table was processed no other rule from that table will be processed for the same data
-
The excel tables contain header information in the first row which is only there for documentation purposes; the first row is completely ignored when creating rules from the xls
-
The rows starting from the second row contain decision table data
-
The first "empty" row (which does not contain data in any of the defined columns) ends the decision table
-
Decision tables might use multiple condition columns which define the data that a rule matches. Often such conditions are optional: if a cell is left empty in the Excel table, the condition will be omitted from the rule conditions. This makes it possible to define very specific rules (which only fire on exact data patterns) or quite general rules which get activated on large groups of data. Defining general rules further down in the table (with lower salience/priority) ensures that more specific rules get fired earlier. It is even possible to define a default rule at the end of the table which gets fired if no other rule could be applied.
-
-
rule groups: Business rules are executed within groups. All rules resulting from a single decision table are assigned to the same rule group. The order of execution of the rule groups is defined by the sequence of declaration in the config file. Processing of the current group will be finished when there are no more rules to fire in that group. Processing of the next group will then start. Rule groups which have been finished processing will not be resumed even if rules within that group might have been activated again due to changes of the facts.
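The interplay of row order, optional conditions and a default rule can be illustrated with a small Python simulation. This is not how Drools works internally and not Solicitor code: it simply models "decreasing salience from top to bottom" as evaluating rows in order and "no other rule from that table will be processed for the same data" as stopping at the first match. The table content and the `evaluate` function are made up for the example.

```python
from typing import Optional

# Rows from top (highest salience) to bottom (lowest salience).
# An empty string means the condition was left empty in the table.
TABLE = [
    {"when": {"groupId": "org.example", "artifactId": "special-lib"}, "then": "Apache-2.0"},
    {"when": {"groupId": "org.example", "artifactId": ""}, "then": "MIT"},
    {"when": {"groupId": "", "artifactId": ""}, "then": "UNKNOWN"},  # default rule
]


def row_matches(conditions: dict, facts: dict) -> bool:
    """Empty conditions are skipped; all others are verbatim comparisons."""
    return all(
        expected in (None, "") or facts.get(name) == expected
        for name, expected in conditions.items()
    )


def evaluate(table: list, facts: dict) -> Optional[str]:
    """Return the result of the first (most specific) matching row."""
    for row in table:
        if row_matches(row["when"], facts):
            return row["then"]
    return None


print(evaluate(TABLE, {"groupId": "org.example", "artifactId": "special-lib"}))
print(evaluate(TABLE, {"groupId": "org.example", "artifactId": "other-lib"}))
print(evaluate(TABLE, {"groupId": "com.other", "artifactId": "anything"}))
```

Moving the default row to the top of the table would make it shadow every other row, which is why general rules belong further down.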
By default any conditions given in the fields of decision tables are simple textual comparisons: The condition is true if the property of the model is identical to the given value in the XLS sheet.
Depending on the configuration of the rule templates for some fields, an extended syntax might be available. For those fields the following syntax applies:
-
If the given value of the XLS field starts with the prefix NOT: then the outcome of the remaining condition is logically negated, i.e. this field condition is true if the rest of the condition is NOT fulfilled.
-
A prefix of REGEX: indicates that the remainder of the field defines a Java Regular Expression. For the condition to become true the whole property needs to match the given regular expression.
-
The prefix RANGE: indicates that the remainder of the field defines a Maven Version Range. Using this only makes sense on the artifact version property.
-
If no such prefix is detected, then the behavior is identical to the normal (verbatim) comparison logic
Fields which are subject to this extended syntax are marked explicitly in the following section.
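A minimal sketch of how the prefixes could be interpreted is given below. It is an illustration in Python, not Solicitor's implementation: the function name `matches` is hypothetical, Python's `re` module stands in for Java regular expressions (the two dialects differ in details), and the sketch assumes that NOT: may be combined with a following prefix. Maven version ranges are deliberately left out.

```python
import re


def matches(cell, value):
    """Evaluate one decision table cell against a property value."""
    if cell.startswith("NOT:"):
        # Negate whatever the remainder of the cell evaluates to.
        return not matches(cell[len("NOT:"):], value)
    if cell.startswith("REGEX:"):
        # The whole property has to match, not just a part of it.
        return re.fullmatch(cell[len("REGEX:"):], value) is not None
    if cell.startswith("RANGE:"):
        raise NotImplementedError("Maven version ranges are not covered by this sketch")
    return cell == value                         # verbatim comparison
```

For example, `matches("REGEX:Apache.*", "Apache-2.0")` is true, while `matches("REGEX:Apache", "Apache-2.0")` is false because only a part of the property matches.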
The processing of business rules is organized in different phases. Each phase might consist of multiple decision tables to be processed in order.
In this phase the license data imported via the readers is cleaned and normalized. At the end of this phase the internal data model should clearly represent all components and their assigned licenses in normalized form.
The phase itself consists of two decision tables / rule groups:
With this decision table it is possible to explicitly assign NormalizedLicenses to components. This will be used if the imported RawLicense data is either incomplete or incorrect. Items which have been processed by rules of this group will not be reprocessed by the next rule group.
-
LHS conditions:
-
Engagement.clientName
-
Engagement.engagementName
-
Application.applicationName
-
ApplicationComponent.groupId [magic]
-
ApplicationComponent.artifactId [magic]
-
ApplicationComponent.version [magic]
-
RawLicense.declaredLicense [magic]
-
RawLicense.url [magic]
-
-
RHS result:
-
NormalizedLicense.normalizedLicenseType
-
NormalizedLicense.normalizedLicense
-
NormalizedLicense.normalizedLicenseUrl
-
NormalizedLicense.comment
-
[magic]: On these fields the Extended comparison syntax might be used
All RawLicenses which are in scope of fired rules will be marked so that they do not get reprocessed by the following decision table.
With this decision table the license info from the RawLicense is mapped to the NormalizedLicense. This is based on the name and/or URL of the license as imported via the readers.
-
LHS conditions:
-
RawLicense.declaredLicense [magic]
-
RawLicense.url [magic]
-
-
RHS result:
-
NormalizedLicense.normalizedLicenseType
-
NormalizedLicense.normalizedLicense
-
[magic]: On these fields the Extended comparison syntax might be used
Within this phase the actually applicable licenses will be selected for each component.
This phase consists of two decision tables.
This group of rules is special in that it may match a group of NormalizedLicenses associated with an ApplicationComponent. If multiple licenses are associated with an ApplicationComponent, one of them might be selected as the "effective" license and the others might be marked as Ignored.
-
LHS conditions:
-
ApplicationComponent.groupId [magic]
-
ApplicationComponent.artifactId [magic]
-
ApplicationComponent.version [magic]
-
NormalizedLicense.normalizedLicense (licenseToTake; mandatory)
-
NormalizedLicense.normalizedLicense (licenseToIgnore1; mandatory)
-
NormalizedLicense.normalizedLicense (licenseToIgnore2; optional)
-
NormalizedLicense.normalizedLicense (licenseToIgnore3; optional)
-
-
RHS result
-
license matching "licenseToTake" will get this value assigned to effectiveNormalizedLicense
-
licenses matching "licenseToIgnoreN" will get IGNORED assigned to effectiveNormalizedLicenseType and Ignored assigned to effectiveNormalizedLicense
-
[magic]: On these fields the Extended comparison syntax might be used
It is important to note that the rules only match if all licenses given in the conditions actually exist and are assigned to the same ApplicationComponent.
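The selection logic for multi-licensed components can be sketched as follows. This is an illustrative Python model only; the function name `select_license` and the dictionary-based representation of licenses are assumptions, and the component-level conditions (groupId, artifactId, version) are left out for brevity.

```python
def select_license(licenses, to_take, to_ignore):
    """Apply one selection rule to the licenses of a single component.

    licenses: list of dicts, each with a "normalizedLicense" entry.
    Returns True if the rule matched (and the licenses were updated).
    """
    present = {entry["normalizedLicense"] for entry in licenses}
    if not {to_take, *to_ignore} <= present:
        return False                             # all named licenses must exist
    for entry in licenses:
        name = entry["normalizedLicense"]
        if name == to_take:
            entry["effectiveNormalizedLicense"] = to_take
        elif name in to_ignore:
            entry["effectiveNormalizedLicenseType"] = "IGNORED"
            entry["effectiveNormalizedLicense"] = "Ignored"
    return True


DUAL = [{"normalizedLicense": "EPL-1.0"}, {"normalizedLicense": "LGPL-2.1"}]
SINGLE = [{"normalizedLicense": "EPL-1.0"}]
DUAL_MATCHED = select_license(DUAL, "EPL-1.0", ["LGPL-2.1"])
SINGLE_MATCHED = select_license(SINGLE, "EPL-1.0", ["LGPL-2.1"])
```

The rule leaves the component with only one license untouched, because the license to be ignored is not assigned to it.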
The second decision table in this group is used to define the effectiveNormalizedLicense (if not already handled by the decision table before).
-
LHS conditions:
-
ApplicationComponent.groupId [magic]
-
ApplicationComponent.artifactId [magic]
-
ApplicationComponent.version [magic]
-
NormalizedLicense.normalizedLicenseType
-
NormalizedLicense.normalizedLicense
-
-
RHS result:
-
NormalizedLicense.effectiveNormalizedLicenseType (if empty in the decision table then the value of normalizedLicenseType will be taken)
-
NormalizedLicense.effectiveNormalizedLicense (if empty in the decision table then the value of normalizedLicense will be taken)
-
NormalizedLicense.effectiveNormalizedLicenseUrl (if empty in the decision table then the value of normalizedLicenseUrl will be taken)
-
[magic]: On these fields the Extended comparison syntax might be used
The third phase is the legal evaluation of the licenses and the check whether OSS usage is in accordance with the defined legal policies. Again this phase comprises two decision tables.
Within the pre evaluation the license info is checked against standard OSS usage policies. This roughly qualifies the usage and might already determine licenses which are OK in any case or which need to be further evaluated. Furthermore the rules qualify whether the license text or source code needs to be included in the distribution. The rules in this decision table are only based on the effectiveNormalizedLicense and do not consider any project, application or component information.
-
LHS condition:
-
NormalizedLicense.effectiveNormalizedLicenseType
-
NormalizedLicense.effectiveNormalizedLicense
-
-
RHS result:
-
NormalizedLicense.legalPreApproved
-
NormalizedLicense.copyLeft
-
NormalizedLicense.licenseCompliance
-
NormalizedLicense.licenseRefUrl
-
NormalizedLicense.includeLicense
-
NormalizedLicense.includeSource
-
The decision table for final legal evaluation defines all rules which are needed to create the result of the legal evaluation. Rules here might be general for all projects or even very specific to a project if the rule can not be applied to other projects.
-
LHS condition:
-
Engagement.clientName
-
Engagement.engagementName
-
Engagement.customerProvidesOss
-
Application.applicationName
-
ApplicationComponent.groupId [magic]
-
ApplicationComponent.artifactId [magic]
-
ApplicationComponent.version [magic]
-
ApplicationComponent.usagePattern
-
ApplicationComponent.ossModified
-
NormalizedLicense.effectiveNormalizedLicenseType
-
NormalizedLicense.effectiveNormalizedLicense
-
-
RHS result:
-
NormalizedLicense.legalApproved
-
NormalizedLicense.legalComments
-
[magic]: On these fields the Extended comparison syntax might be used
The standard process as described before consists of 6 decision tables / rule groups to be processed in sequence. When using the builtin default base configuration all those decision tables use the internal sample data / rules as contained in Solicitor.
To use your own rule data there are three approaches:
-
Include your own rules section in the project configuration file (so not inheriting from the builtin base configuration file) and reference your own decision tables there.
-
Create your own "Solicitor Extension" which might completely redefine/replace the builtin Solicitor setup including all decision tables and the base configuration file. See Extending Solicitor for details.
-
Make use of the optional project specific decision tables which are defined in the default base configuration: For every builtin decision table there is an optional external decision table (expected in the filesystem) which will be checked for existence. If such an external decision table exists it will be processed first - before processing the builtin decision table. Thus it is possible to amend / override the builtin rules with project specific rules. When you create the starter configuration of your project as described in Starting a new project, those project specific decision tables are automatically created.
After applying the business rules the resulting data can be used to create reports and other output documents.
Creating such reports consists of three steps:
-
Transform and filter the model data by using an embedded SQL database
-
Determine the difference to a previously stored model (optional)
-
Template based reporting via
-
Velocity templates (for textual output like e.g. HTML)
-
Excel templates
-
After the business rules have been processed (or a Solicitor data model has been loaded via command line option -l) the model data is stored in a dynamically created internal SQL database.
-
For each type of model object a separate table is created. The table name is the name of the model object type written in uppercase characters. (E.g. type NormalizedLicense is stored in table NORMALIZEDLICENSE.)
-
All properties of the model objects are stored as strings in fields named like the properties within the database table. Field names are case sensitive (see note below for handling this in SQL statements).
-
An additional primary key is defined for each table, named ID_<TABLENAME>.
-
For all model elements that belong to some parent in the object hierarchy (i.e. all objects except ModelRoot) a foreign key field is added named PARENT_<TABLENAME> which contains the unique key of the corresponding parent
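The naming scheme above can be made concrete with a short sketch that derives a table definition from a model type. It is only an illustration: Python's built-in `sqlite3` module is used as a stand-in for the embedded database, the helper `ddl_for` is hypothetical, the INTEGER type of the key columns is an assumption, and the property list is shortened.

```python
import sqlite3


def ddl_for(type_name, properties, has_parent=True):
    """Build a CREATE TABLE statement following the naming scheme above."""
    table = type_name.upper()
    columns = [f'"ID_{table}" INTEGER PRIMARY KEY']        # key type is an assumption
    if has_parent:                                         # every type except ModelRoot
        columns.append(f'"PARENT_{table}" INTEGER')
    columns += [f'"{name}" TEXT' for name in properties]   # properties stored as strings
    return f'CREATE TABLE "{table}" ({", ".join(columns)})'


connection = sqlite3.connect(":memory:")
connection.execute(ddl_for("NormalizedLicense", ["normalizedLicense", "legalApproved"]))
COLUMN_NAMES = [row[1] for row in connection.execute('PRAGMA table_info("NORMALIZEDLICENSE")')]
```

Quoting the identifiers keeps the mixed-case property names intact in the generated statement, which is the reason why such names need special handling in SQL.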
Each Writer configuration (see Writers / Reporting) includes a section which references SQL select statements that are applied on the database data. The result of the SQL select statements is made accessible for the subsequent processing of the Writer via the dataTable name given in the configuration.
Before the result of the SQL select statement is handed over to the Writer the following postprocessing is done:
-
A rowCount column is added to the result which gives the position of the entry in the result set (starting with 1).
-
Columns named ID_<TABLENAME> are replaced with columns named OBJ_<TABLENAME>. The fields of those columns are filled with the corresponding original model objects (Java objects).
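The two postprocessing steps can be pictured with the following sketch. It models a result set as a list of Python dictionaries, which is an assumption made for illustration; the function `postprocess` and the lookup structure `model_objects` are hypothetical and do not reflect Solicitor's internal API.

```python
def postprocess(rows, model_objects):
    """rows: list of dicts from a select; model_objects: {(column, key): object}."""
    result = []
    for position, row in enumerate(rows, start=1):       # positions start with 1
        processed = {}
        for column, value in row.items():
            if column.startswith("ID_"):
                # Replace the key column by a column holding the model object itself.
                processed["OBJ_" + column[len("ID_"):]] = model_objects[(column, value)]
            else:
                processed[column] = value
        processed["rowCount"] = position
        result.append(processed)
    return result


LICENSE_OBJECT = object()                                # stands in for a model object
ROWS = postprocess(
    [{"ID_NORMALIZEDLICENSE": 7, "normalizedLicense": "MIT"}],
    {("ID_NORMALIZEDLICENSE", 7): LICENSE_OBJECT},
)
```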
Warning
|
The result table column OBJ_<TABLENAME> gives access to the native Solicitor data model (Java objects), e.g. in the Velocity writer. As this breaks the decoupling done via the SQL database, using this feature is explicitly discouraged. It should only be used with great caution and in exceptional situations. The feature might be discontinued in future versions without prior notice.
|
When using the command line option -d Solicitor can determine difference information between two different data models (e.g. the difference between the licenses of the current release and a former release). The difference is calculated on the result of the above described SQL statements:
-
First the internal reporting database is created for the current data model and all defined SQL statements are executed
-
Then the internal database is recreated for the "old" data model and all defined SQL statements are executed again
-
Finally for each defined result table the difference between the current result and the "old" result is calculated
To correctly correlate corresponding rows of the two different versions of table data it is necessary to define explicit correlation keys for each table in the SQL select statement.
It is possible to define up to 10 correlation keys named CORR_KEY_X with X in the range from 0 to 9. CORR_KEY_0 has highest priority, CORR_KEY_9 has lowest priority.
The correlation algorithm will first try to match rows using CORR_KEY_0. It will then attempt to correlate unmatched rows using CORR_KEY_1, and so on. Correlation will stop when
-
all correlation keys CORR_KEY_0 to CORR_KEY_9 have been processed OR
-
the required correlation key column does not exist in the SQL select result OR
-
there are no unmatched "new" rows OR
-
there are no unmatched "old" rows
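The correlation procedure described above might be sketched like this. The sketch models result rows as Python dictionaries and assumes a one-to-one matching in which the first old row with a given key value is used; how duplicates are treated in Solicitor is not stated here, so that part is an assumption, as is the function name `correlate`.

```python
def correlate(new_rows, old_rows):
    """Pair new and old rows using CORR_KEY_0 .. CORR_KEY_9 in priority order."""
    pairs = []
    open_new, open_old = list(new_rows), list(old_rows)
    for index in range(10):
        key = f"CORR_KEY_{index}"
        if not open_new or not open_old:
            break                                # nothing left to correlate
        if key not in open_new[0] or key not in open_old[0]:
            break                                # key column missing in the result
        candidates = {}
        for position, old in enumerate(open_old):
            candidates.setdefault(old[key], position)    # first old row per key value
        used, still_new = set(), []
        for new in open_new:
            position = candidates.pop(new[key], None)
            if position is None:
                still_new.append(new)
            else:
                pairs.append((new, open_old[position]))
                used.add(position)
        open_new = still_new
        open_old = [old for position, old in enumerate(open_old) if position not in used]
    return pairs, open_new, open_old


NEW = [
    {"CORR_KEY_0": "a-1.0", "CORR_KEY_1": "a", "version": "1.0"},
    {"CORR_KEY_0": "b-2.0", "CORR_KEY_1": "b", "version": "2.0"},
    {"CORR_KEY_0": "c-1.0", "CORR_KEY_1": "c", "version": "1.0"},
]
OLD = [
    {"CORR_KEY_0": "a-1.0", "CORR_KEY_1": "a", "version": "1.0"},
    {"CORR_KEY_0": "b-1.0", "CORR_KEY_1": "b", "version": "1.0"},
]
PAIRS, ONLY_NEW, ONLY_OLD = correlate(NEW, OLD)
```

In this example the more specific key (name plus version) matches component a, the less specific key (name only) matches component b across its version change, and component c remains unmatched.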
The result of the correlation / difference calculation is stored in the reporting table data structure. For each row a status is available, which is one of the following:
-
The row is "new" (did not exist in the old data)
-
The row is unchanged (no changes in the field values representing the properties of the Solicitor data model)
-
The row is changed (at least one field corresponding to the Solicitor data model changed)
For each field of "changed" or "unchanged" rows the following status is available:
-
Field is "changed"
-
Field is "unchanged"
For each field of such rows it is furthermore possible to access the new and the old field value.
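How row and field status could be derived from a correlated pair of rows is sketched below. The representation (dictionaries, status strings, the function `diff_row`) is assumed for illustration; the actual access to this information is described in the Solicitor JavaDoc.

```python
def diff_row(new_row, old_row, fields):
    """Return (row status, {field: (field status, new value, old value)})."""
    if old_row is None:
        return "new", {}                         # no counterpart in the old data
    field_info = {}
    for name in fields:
        state = "unchanged" if new_row[name] == old_row[name] else "changed"
        field_info[name] = (state, new_row[name], old_row[name])
    changed = any(state == "changed" for state, _, _ in field_info.values())
    return ("changed" if changed else "unchanged"), field_info


STATUS, FIELDS = diff_row(
    {"artifactId": "b", "version": "2.0"},
    {"artifactId": "b", "version": "1.0"},
    ["artifactId", "version"],
)
```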
The following shows a sample SQL statement with a join over multiple tables and the use of correlation keys.
link:../src/main/resources/com/devonfw/tools/solicitor/sql/allden_normalizedlicenses.sql[role=include]
Note
|
Above example also shows how the case sensitive column names have to be handled within the SQL |
The above described SQL processing is identical for all Writers. Writers only differ in the way the output document is created based on a template and the reporting table data obtained by the SQL transformation.
The Velocity Writer uses the Apache Velocity Templating Engine to create text based reports. The reporting data tables created by the SQL transformation are put directly into the Velocity Context.
For further information see the
-
Velocity Documentation
-
The Solicitor JavaDoc (which also includes details on how to access the diff information for rows and fields of reporting data tables)
-
The samples included in Solicitor
Within Excel spreadsheet templates there are two kinds of placeholders / markers possible, which control the processing:
The templating logic searches within the XLSX workbook for fields containing the names of the reporting data tables as defined in the Writer configuration like e.g.:
-
#ENGAGEMENT#
-
#LICENSE#
Whenever such a string is found in a cell this indicates that this row is a template row. For each entry in the respective reporting data table a copy of this row is created and the attribute replacement will be done with the data from that reporting table. (The pattern #…# will be removed when copying.)
Within each row which was copied in the previous step the templating logic searches for the string pattern $someAttributeName$ where someAttributeName corresponds to the column names of the reporting table. Any such occurrence is replaced with the corresponding data value.
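The two kinds of placeholders can be illustrated with a simplified model in which a worksheet is just a list of rows of text cells. This is a sketch under that assumption and does not operate on real XLSX files; the function name `expand` and the sample data are hypothetical.

```python
import re


def expand(sheet, tables):
    """Copy each template row once per table entry and fill in $attribute$ values."""
    output = []
    for row in sheet:
        marker = next((name for name in tables
                       for cell in row if f"#{name}#" in cell), None)
        if marker is None:
            output.append(list(row))             # ordinary row: kept unchanged
            continue
        for record in tables[marker]:            # one copy per reporting table entry
            copied = []
            for cell in row:
                cell = cell.replace(f"#{marker}#", "")   # the marker itself is removed
                cell = re.sub(r"\$(\w+)\$",
                              lambda m: str(record.get(m.group(1), m.group(0))),
                              cell)
                copied.append(cell)
            output.append(copied)
    return output


SHEET = [["Component", "License"], ["#LICENSE#$artifactId$", "$license$"]]
TABLES = {"LICENSE": [{"artifactId": "a", "license": "MIT"},
                      {"artifactId": "b", "license": "Apache-2.0"}]}
RESULT = expand(SHEET, TABLES)
```

In this sketch, placeholders which do not correspond to a column of the reporting table are left untouched; this is a choice of the example and not a statement about Solicitor's behavior.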
In case that a difference processing (new vs. old model data) was done this will be represented as follows when using the XLS templating:
-
For rows that are "new" (so no corresponding old row available) an Excel note indicating that this row is new will be attached to the field that contained the #…# placeholder.
-
Fields in non-new rows that have changed their value will be marked with an Excel note indicating the old value.
Resolving of the content of license texts which are referenced by the URLs given in NormalizedLicense.effectiveNormalizedLicenseUrl and NormalizedLicense.licenseRefUrl is done in the following way:
-
If the content is found as a resource in the classpath under licenses this will be taken. (The Solicitor application might include a set of often used license texts and thus it is not necessary to fetch those via the net.) If the classpath does not contain the content of the URL the next step is taken.
-
If the content is found as a file in subdirectory licenses of the current working directory this is taken. If no such file exists the content is fetched via the net. The result will be written to that directory, so any content will only be fetched once. (The user might alter the files in that directory to change/correct its content.) A file of length zero indicates that no content could be fetched.
The determined content is available as NormalizedLicense.effectiveNormalizedLicenseContent and NormalizedLicense.licenseRefContent.
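The lookup order described above (bundled resource, then local file cache, then download) is sketched below. This is an illustration only: the mapping `bundled` stands in for classpath resources, the download is passed in as a function so the example runs without network access, and the way a URL is turned into a file name (`cache_key`) is a hypothetical choice, not Solicitor's actual naming.

```python
import re
import tempfile
from pathlib import Path


def cache_key(url):
    """Hypothetical mapping from a URL to a file name."""
    return re.sub(r"[^A-Za-z0-9]", "_", url)


def resolve(url, bundled, cache_dir, fetch):
    """Return the license text for url, or None if it could not be obtained."""
    key = cache_key(url)
    if key in bundled:                           # step 1: bundled resource
        return bundled[key]
    cache_dir.mkdir(parents=True, exist_ok=True)
    cached = cache_dir / key
    if cached.exists():                          # step 2: local file cache
        return cached.read_text(encoding="utf-8") or None
    content = fetch(url)                         # step 3: download (None on failure)
    cached.write_text(content or "", encoding="utf-8")   # empty file marks a failure
    return content


CALLS = []


def fake_fetch(url):
    """Stand-in for a download; only URLs ending in /ok succeed."""
    CALLS.append(url)
    return "license text" if url.endswith("/ok") else None


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "licenses"
    FIRST = resolve("https://example.org/ok", {}, cache, fake_fetch)
    SECOND = resolve("https://example.org/ok", {}, cache, fake_fetch)
    MISSING = resolve("https://example.org/missing", {}, cache, fake_fetch)
    MISSING_AGAIN = resolve("https://example.org/missing", {}, cache, fake_fetch)
```

Because a failed download leaves a file of length zero behind, the second request for the missing content does not trigger another download attempt.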
Fetching the license content NormalizedLicense.effectiveNormalizedLicenseContent based on the URL in NormalizedLicense.effectiveNormalizedLicenseUrl will often result in content which is in HTML format instead of plain text and is not properly rendered when included in reports.
Sometimes the URL does not even point to the license text itself but just to the homepage of the project.
In general it is possible to manually correct this by editing the downloaded and cached content as described in the previous section. This approach might require a lot of manual work. Solicitor therefore includes a mechanism named license url guessing which tries to guess an alternative license URL which should point to a representation of the content better suited for rendering.
Currently license URL guessing is based solely on the URL given in NormalizedLicense.effectiveNormalizedLicenseUrl. It will try the following approaches:
-
If the original URL is a Github-URL and matches patterns which are known to return HTML-formatted content then the URL is rewritten to point to a raw version of the content.
-
If the original URL points to a Github project page (not to a file), then the algorithm will try different typical locations (like e.g. looking for file LICENSE). If found it will return this URL as result.
-
If no "better" URL could be guessed it will return the original URL.
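The first of these approaches, rewriting a Github URL so that it points to the raw content, can be sketched as follows. The regular expression and the function name `guess_license_url` are assumptions chosen for this illustration and cover only one URL pattern; probing typical file locations of a project page is omitted because it requires network access. The text for the unchanged case is hypothetical.

```python
import re

# Assumed pattern for Github URLs which render a file as an HTML page.
BLOB_URL = re.compile(r"^https://github\.com/([^/]+)/([^/]+)/blob/(.+)$")


def guess_license_url(url):
    """Return (guessed URL, audit info) for the given license URL."""
    match = BLOB_URL.match(url)
    if match:
        owner, project, rest = match.groups()
        guessed = f"https://raw.githubusercontent.com/{owner}/{project}/{rest}"
        return guessed, f"URL changed from {url} to {guessed}"
    return url, "URL unchanged"                  # no better URL could be guessed


GUESSED, AUDIT = guess_license_url("https://github.com/some/project/blob/master/LICENSE")
```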
The result of the license URL guessing is available via three attributes:
-
NormalizedLicense.guessedLicenseUrl: The (possibly) improved URL pointing to the license text.
-
NormalizedLicense.guessedLicenseUrlAuditInfo: A text which gives info how the guessed URL was determined (available for auditing purposes).
-
NormalizedLicense.guessedLicenseContent: The content downloaded from the guessed URL
Note
|
Downloading the license content (also including the checking if a certain resource is available when trying different possible filenames) is done using the same (caching) mechanisms as downloading the content for other URLs, see the previous section. |
The information about guessed URLs for given original URLs (also including the audit info on the guessing process) uses a caching mechanism which is mainly identical to the caching of downloaded content. The files containing the cached data are stored in directory licenseurls (instead of licenses for the content itself).
The file content looks as follows:
https://raw.githubusercontent.com/some/project/master/LICENSE (1)
------------------------- (2)
URL changed from https://github.com/some/project/blob/master/LICENSE to https://raw.githubusercontent.com/some/project/master/LICENSE (3)
-
the guessed URL
-
a line of dashes as separator
-
the audit info (might be multiple lines)
It is possible to manually change this cached information and thus correct it - similar to manually correcting the license text as described above.
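When editing such a cache file by hand it helps to keep the three-part layout in mind. The following sketch reads that layout; it is an illustration with a hypothetical function name, and it assumes that the callout markers (1) to (3) shown above are documentation aids and not part of the actual file content.

```python
def parse_guess_cache(text):
    """Split cached data into the guessed URL and the (multi-line) audit info."""
    lines = text.splitlines()
    if len(lines) < 2 or set(lines[1].strip()) != {"-"}:
        raise ValueError("expected a URL line followed by a line of dashes")
    return lines[0].strip(), "\n".join(lines[2:])


SAMPLE = (
    "https://raw.githubusercontent.com/some/project/master/LICENSE\n"
    "-------------------------\n"
    "URL changed from https://github.com/some/project/blob/master/LICENSE "
    "to https://raw.githubusercontent.com/some/project/master/LICENSE\n"
)
GUESSED_URL, AUDIT_INFO = parse_guess_cache(SAMPLE)
```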
Warning
|
License guessing is a new feature as of Solicitor 1.3.0.
The guessing algorithm might be modified in future versions without further notice which might result in different outcomes for the guessed URLs.
|
Within the lifecycle of the Solicitor development features might be discontinued due to various reasons. In case that such discontinuation is expected to break existing projects a two stage deprecation mechanism is used:
-
Stage 1: Usage of a deprecated feature will only produce a warning, giving details on what needs to be changed.
-
Stage 2: When a deprecated feature is used Solicitor by default will terminate with an error message giving information about the deprecation.
By setting the property solicitor.deprecated-features-allowed to true (e.g. via the command line, see Configuration of Technical Properties), the feature will still be available even in the second stage and only a warning will be logged. In any case the project setup should be changed as soon as possible to no longer use the feature, as it might soon be removed without further notice.
Important
|
Enabling the use of deprecated features via the above property should only be a temporary workaround and not a standard setting. |
Note
|
If usage of a feature should be discontinued immediately (e.g. because it might lead to wrong/misleading output) the first stage of deprecation will be skipped. |
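The interplay of the two stages and the property can be summarized in a few lines. This sketch only models the decision; the function name and the message texts are hypothetical, and `allowed` stands for the value of solicitor.deprecated-features-allowed.

```python
def use_deprecated_feature(stage, allowed, details):
    """Return a warning text, or raise if the feature may no longer be used."""
    if stage == 1 or allowed:
        return "WARN: deprecated feature used: " + details
    raise RuntimeError("deprecated feature no longer available: " + details)
```

With `stage=2` and `allowed=False` the call raises, which corresponds to Solicitor terminating with an error message.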
The following features are deprecated via the above mechanism:
-
Reader of type "gradle" (use Reader of type "gradle2" instead); Stage 2 from Version 1.0.5 on; see devonfw#58
-
Reader of type "npm" (use type "npm-license-crawler-csv" instead); Stage 1 from Version 1.0.8 on; see devonfw#62
The builtin default base configuration contains settings for the rules and writers section of the Solicitor configuration file which will be used if the project specific config file omits those sections.
link:../src/main/resources/com/devonfw/tools/solicitor/config/solicitor_base.cfg[role=include]
The following lists the default settings of technical properties as given by the built in application.properties file.
If required these values might be overridden on the command line when starting Solicitor:
java -Dpropertyname1=value1 -Dpropertyname2=value2 -jar solicitor.jar <any other arguments>
link:../src/main/resources/application.properties[role=include]
There are different templates that can be used for reporting. For usage, the templates have to be specified in the “writers” section of the Solicitor configuration file (see Writers / Reporting). In the default Solicitor configuration all templates are specified (see [Appendix A: Default Base Configuration]).
With this template a report in Excel format can be created. The spreadsheet contains data from the internal database (see [Database Structure]) which can be fetched by specifying the path to the SQL statements files in the solicitor configuration file.
This template creates a HTML document which has a table containing the relevant data from the internal database. Cells that have been changed, compared to a previous solicitor run, are marked in a different color. For usage, the option -d <filename> needs to be appended with filename being saved_latest_model.json.
This template creates an HTML document which has an overview of OSS components used in the project. The data is displayed in a table with the columns: Name, GroupId, Version, Application, License, LicenseUrl.
Similar to the above but uses guessed license URLs and content, see Guessing of license URLs.
This template creates an HTML document which contains OSS components that have been mapped to multiple licenses. The data is displayed in a table with the columns: Application, OSS Name/Product, OSS ArtifactId, OSS Version, Effective Normalized Licenses, License Count.
This template creates a script file for using ScanCode to create a JSON document which contains copyright information (statements, holders, authors) about each artifact within a project. Copyright information is reported on file level. Because the output is valid JSON (an array of artifacts containing all the information), it can be processed further by other tools. The copyright scanner used for this report is scancode-toolkit-21.8.4. Usage information can be found in the created scancode_project.sh script after running Solicitor.
Note
|
This report is an experimental feature and might be changed or removed in future versions without any notice. |
Solicitor comes with a sample rule data set and sample reporting templates. In general it will be required to correct, supplement and extend these data sets and templates. This can be done in a straightforward way by creating copies of the appropriate resources (rule data XLS and template files), adapting them and subsequently referencing those copies instead of the original resources from the project configuration file.
Even though this approach is possible it will result in hard to maintain configurations, especially in the case of multiple projects using Solicitor in parallel.
To support such scenarios Solicitor provides an easy extension mechanism which allows packaging all those customized configurations into a single archive and referencing it from the command line when starting Solicitor.
This facilitates configuration management, distribution and deployment of such extensions.
The extensions might be provided as a JAR file or even as a simple ZIP file. There is only one mandatory file, which contains (at least) metadata about the extension and which needs to be included in the root folder of this archive.
link:../src/main/resources/samples/application-extension.properties[role=include]
This file is included via the standard Spring Boot profile mechanism. Besides containing naming and version info on the extension this file might override any property values defined within Solicitor.
Any other resources (like rule data or templates) which need to be part of the Extension can be included in the archive as well - either in the root directory or any subdirectories. If the extension is active those resources will be available on the classpath like any resources included in the Solicitor jar.
Overriding / redefining the default base configuration within the Extension makes it possible to update all rule data and templates without the need to touch the project's configuration file.
- Changes in 1.3.0
-
-
New report ScancodeDownloadScript.vm to compile copyright information using ScanCode.
-
devonfw#75: Added license URL guessing, see Guessing of license URLs.
-
devonfw#86: In case that downloading content for a given URL fails no WARN message with stacktrace will be shown any more. Instead there will be an info message (SOLI-047 or SOLI-048) indicating that the content could not be downloaded. This change is due to the fact that failed downloads are expected - especially with the new feature license URL guessing.
-
Readers for PIP and YARN added.
-
- Changes in 1.2.3
-
-
devonfw#97: Fixed the bug which made the GradleReader and GradleReader2 skip the first entry in the file.
-
devonfw#87: GradleReader and GradleReader2 no longer fail when reading files that contain no entry. Actually this was due to bug devonfw#97.
-
- Changes in 1.2.2
-
-
Fixed bug which resulted in corrupt XLS report due to cell comment exceeding maximum allowed size.
-
- Changes in 1.2.1
-
-
devonfw#94: Fixed by making sure that formulas get evaluated when opening the workbook with Excel.
-
Fixed bug when reading saved data model for delta calculation. (repoType was not read correctly and resulted in always reporting a difference.)
-
- Changes in 1.2.0
-
-
Added some license name mapping rules in LicenseNameMappingSample.xls.
-
devonfw#71: New "Quality Report" which might be helpful in validating the outcome of the Solicitor run. Currently this report contains a list of all application components which have more than one effective license attached. This might be helpful for spotting cases where appropriate rules for selecting the applicable license in case of dual-/multilicensing are missing.
-
- Changes in 1.1.1
-
-
Corrected order of license name mapping which prevented Unlicense, The W3C License, WTFPL, Zlib and Zope Public License 2.1 to be mapped.
-
- Changes in 1.1.0
-
-
devonfw#67: Inclusion of detailed license information for the dependencies included in the executable JAR. Use the '-eug' command line option to store this file (together with a copy of the user guide) in the current work directory.
-
Additional rules for license name mappings in decision table LicenseNameMappingSample.xls.
-
devonfw#61: Solicitor can now run with Java 8 or Java 11.
-
- Changes in 1.0.8
-
-
devonfw#62: New Reader of type npm-license-checker for reading component/license data collected by NPM License Checker (https://www.npmjs.com/package/license-checker). The type of the existing Reader for reading CSV data from the NPM License Crawler has been changed from npm to npm-license-crawler-csv. (npm is still available but deprecated.) Projects should adapt their Reader configuration and replace type npm by npm-license-crawler-csv.
-
- Changes in 1.0.7
-
-
devonfw#56: Enable continuing analysis in multiapplication projects even if some license files are unavailable.
-
Described simplified usage of license-maven-plugin without need to change pom.xml. (Documentation only)
-
Ensure consistent sorting even in case that multiple "Ignored" licenses exist for a component
-