The schema-enforcer schema
command is used to manage schemas. It can:
- List defined schemas, along with their type, location, and filename.
- Check that defined schemas are valid as schema definitions, and that schema unit tests pass.
- Generate invalid test results given a set of data which is intended to fail schema validation.
The schema enforcer schema --list
command can be used to print out a table of defined schemas. These schemas are loaded based on the directory sctructure elucidated in the README.md file at the root of the repository in the overview section.
bash$ cd examples/example3
bash$ schema-enforcer schema --list
The schema-enforcer schema --check
command does two things.
- Validates that schemas are valid according to spec (e.g. JSONSchema type schemas are validated as defined correctly according to the JSONSchema spec)
- Runs defined unit tests
bash$ cd examples/example3
bash$ schema-enforcer schema --check
ALL SCHEMAS ARE VALID
Unit tests for schemas expect a certain directory hierarchy. Tests are placed inside of a schema-id specific directory which is nested in the test_directory
(defaults to tests
). The test_directory must be nested inside of the main_directory
, which defaults to schema
.
bash$ tree schema -L 2
schema
├── definitions
│ ├── arrays
│ ├── objects
│ └── properties
├── schemas
│ ├── dns.yml
│ ├── ntp.yml
│ └── syslog.yml
└── tests
├── dns_servers
├── ntp
└── syslog_servers
9 directories, 3 files
Note: The names of the main_directory and test_directory can be configured in a pyproject.toml file if you want to override the defaults. See configuration.md for more information on how to do so.
When putting tests into a directory for a given schema ID, the short form of the schema ID is used as the directory name. The short form of the schema ID is generated by removing /
from the schema ID and anything proceeding it. For example, the name of the test directory for the schema ID schemas/ntp
is named ntp
in the example above.
In example 3, tests have been written for three different schema definitions and placed in the directories schema/tests/dns_servers
, schema/tests/ntp
, and schema/tests/syslog_servers
. We'll look at the tests written for the schemas/ntp
schema ID (in the schema/tests/ntp directory) in the following example.
bash$ cd examples/example3
bash$ tree schema/tests/ntp
├── invalid
│ ├── invalid_format
│ │ ├── data.yml
│ │ └── results.yml
│ ├── invalid_ip
│ │ ├── data.yml
│ │ └── results.yml
│ └── missing_required
│ ├── data.yml
│ └── results.yml
└── valid
├── full_implementation.json
└── partial_implementation.json
Nested inside of the tests folder for each schema ID, there is an invalid
directory and a valid
directory. Tests which purposely generate invalid results (e.g. assert that a given set of data fails to adhere to the schema definition) should be placed inside of the invalid
directory. Tests which purposely generate valid results (e.g. assert that a a given set of data adheres to the schema definition) should be placed in the valid
directory.
In the invalid directory, three different test cases exist -- invalid_format
, invalid_ip
, and missing_required
. Each of these test cases contains a data
file and a results
file. The data file must be named one of data.yml
, data.yaml
, or data.json
. Similarly, the results file must be named one of results.yml
, results.yaml
, or results.json
. When we assert that data fails to adhere to a given schema, we want not just to assert that it failed, but that it failed in the specific way we were expecting. This is why the invalid tests are each directories with data
and results
files defined.
In the valid directory, two different test cases exist -- full_implementation
and partial_implementation
. When we assert that data adheres to schema, there is no need to define the specific way in which data is schema valid, just that it is. This is why these test cases are simply files of data which should be schema valid rather than a directory with two separate files indicating data which is schema invalid and the specific anticipated way in which it should be invalid.
Note: both valid and invalid test case data and results files can be named with extensions of .yml, .yaml, or .json.
In the following example, the schemas/ntp
schema has an invalid_format
test case defined. The data for the test case is as follows.
bash$ cat schema/tests/ntp/invalid/invalid_format/data.yml
---
ntp_servers:
- "10.1.1.1"
The expected results for the test case is as follows.
bash$ cat schema/tests/ntp/invalid/invalid_format/results.yml
---
results:
- result: "FAIL"
schema_id: "schemas/ntp"
absolute_path:
- "ntp_servers"
- "0"
message: "'10.1.1.1' is not of type 'object'"
The schemas/ntp
schema ID is used to validate this data as the folder it is nested in is tests/ntp
and ntp
is the short schema name of the schema ID schemas/ntp
bash$ cat schema/schemas/ntp.yml
---
$schema: "http://json-schema.org/draft-07/schema#"
$id: "schemas/ntp"
description: "NTP Configuration schema."
type: "object"
properties:
ntp_servers:
$ref: "../definitions/arrays/ip.yml#ipv4_hosts"
ntp_authentication:
type: "boolean"
ntp_logging:
type: "boolean"
additionalProperties: false
required:
- "ntp_servers"
something: "extra"
This schema definition includes a reference to the ipv4_hosts
property inside of schemas/definitions/arrays/ip.yml
which in turn references an object defiend in the object folder, and a property defined in the properties folder.
The data is not schema valid because it should be a list (array) of dictionary (hash) type objects, of which the key address
must be defined, and the value must be an IP address. If we were to change the data defined in data.yml
to the following, data would be schema valid. Because it is in the "invalid" folder, it is expected to fail, and thus we will see an error when schema validation checks are performed and the data is schema valid (unexpected result) instead of schema invalid (expected result)
bash$ cat schema/tests/ntp/invalid/invalid_format/data.yml
---
ntp_servers:
- address: "10.1.1.1"
bash$ schema-enforcer schema --check
FAIL | [ERROR] Invalid test results do not match expected test results from /Users/ntc/schema_enforcer/examples/example3/schema/tests/ntp/invalid/invalid_format/results.yml [PROPERTY]
The schema-enforcer schema --generate-invalid
command allows a user to generate invalid unit tests results which are expected when data which does not correctly adhere to a schema definition is checked for adherence against that definition. These results are stored as JSON or YAML structured data.
In the following example we delete the results.yaml
file from the invalid_format
unit test.
bash$ cd examples/example3
bash$ rm schema/tests/ntp/invalid/invalid_format/results.yml
bash$ tree schema/tests/ntp/invalid
schema/tests/ntp/invalid
├── invalid_format
│ └── data.yml
├── invalid_ip
│ ├── data.yml
│ └── results.yml
└── missing_required
├── data.yml
└── results.yml
When we run schema-enforcer in check mode now, schema enforcer generates a warning indicating that no results are defined for the ntp invalid_format test, and thus it will skip the test.
bash$ schema-enforcer schema --check
WARNING | Could not find expected_results_file /Users/ntc/schema_enforcer/examples/example3/schema/tests/ntp/invalid/invalid_format/results. Skipping...
ALL SCHEMAS ARE VALID
By using the schema-enforcer schema --generate-invalid
command, we can generate the expected results when data is checked to see if it is schema valid. Note, we have to pass in a schema-id
for which schema-enforcer
should generate results files. When this command is run, schema-enforcer
generates results for every unit test for the given schema ID, so long as it has a data file defined.
bash$ schema-enforcer schema --generate-invalid --schema-id schemas/ntp
Generated/Updated results file: /Users/ntc/schema_enforcer/examples/example3/schema/tests/ntp/invalid/invalid_format/results.yml
Generated/Updated results file: /Users/ntc/schema_enforcer/examples/example3/schema/tests/ntp/invalid/invalid_ip/results.yml
Generated/Updated results file: /Users/ntc/schema_enforcer/examples/example3/schema/tests/ntp/invalid/missing_required/results.yml
We can see that a results.yml file has been placed inside of the ntp invalid_format unit test by the --generate-invalid command we ran above.
bash$ tree schema/tests/ntp/invalid
schema/tests/ntp/invalid
├── invalid_format
│ ├── data.yml
│ └── results.yml
├── invalid_ip
│ ├── data.yml
│ └── results.yml
└── missing_required
├── data.yml
└── results.yml
When we inspect the invalid_format results.yml file, We see that the result is "FAIL"
with a specific message. This is the result we expect in the event that the data defined in data.yml
is checked for adherence against schema.
---
results:
- result: "FAIL"
schema_id: "schemas/ntp"
absolute_path:
- "ntp_servers"
- "0"
message: "'10.1.1.1' is not of type 'object'"
Note: The results generaged by schema enforcer should be checked to ensure they are the results expected per the data defined and that they don't fail for a reason different than the one you expected.