vault-quota: README and design doc #533

Merged
merged 1 commit into from
Sep 8, 2023
66 changes: 66 additions & 0 deletions vault-quota/Design.md
@@ -0,0 +1,66 @@
# vault quota design/algorithms

The definitive source of content-length (file size) of a DataNode comes from the
`inventory.Artifact` table and is not known until a PUT to storage is completed.
In the case of a `vault` service co-located with a single storage site (`minoc`),
the new Artifact is visible in the database as soon as the PUT to `minoc` is
completed. In the case of a `vault` service co-located with a global SI, the new
Artifact is visible in the database once `fenwick` (or, in the worst case, `ratik`)
syncs it from the site where the PUT to `minoc` occurred to the global database.

## TODO
The design below only takes into account incremental propagation of space used
by stored files. It is not complete/verified until we also come up with a validation
algorithm that can detect and fix discrepancies in a live `vault`.

## Event watcher algorithm:
```
track progress using HarvestState (name: `Artifact`, source: `db:{bucket range}`)
incremental query for new artifacts in lastModified order
for each new Artifact:
    query for DataNode (storageID = artifact.uri)
    if Artifact.contentLength != Node.size:
        start txn
        lock datanode
        compute delta
        lock parent
        apply delta to parent.delta
        set dataNode.size
        update HarvestState
        commit txn
```
The above sequence performs the first step of propagation, from a DataNode to its parent
ContainerNode. It can run in parallel by giving each worker its own bucket range (a subrange of 0-f).
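
As an illustration only, the per-artifact step can be sketched in Python with in-memory objects standing in for database rows. The class names, field names, and the `apply_artifact` helper are hypothetical (the design says all field and column names are TBD), and the transaction and locking steps are reduced to comments.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContainerNode:
    name: str
    parent: Optional["ContainerNode"] = None
    size: int = 0   # bytes already accounted for in this container
    delta: int = 0  # bytes received from children, not yet propagated


@dataclass
class DataNode:
    storage_id: str
    parent: ContainerNode
    size: int = 0


def apply_artifact(node: DataNode, content_length: int) -> bool:
    """Apply one Artifact event to its DataNode; return True if it changed."""
    if content_length == node.size:
        return False  # already in sync, nothing to write
    # A real implementation would run the next three steps in one
    # transaction, locking the data node first and then its parent.
    delta = content_length - node.size
    node.parent.delta += delta
    node.size = content_length
    return True


home = ContainerNode("home")
data = DataNode("cadc:TEST/file.fits", parent=home)
apply_artifact(data, 1000)  # new file
apply_artifact(data, 1500)  # file replaced by larger content
apply_artifact(data, 1500)  # repeated event: no change
print(home.delta)  # 1500
```

Because only the difference between the new content length and the recorded size is applied, processing the same Artifact twice does not double-count it.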

## Container size propagation algorithm:
```
query for ContainerNode with non-zero delta
for each ContainerNode:
    start txn
    lock containernode
    re-check delta
    lock parent
    apply delta to parent.delta
    apply delta to containernode.size, set containernode.delta=0
    commit txn
```
The above sequence finds candidate propagations, locks (order: child-then-parent as above),
and applies the propagation. This moves the outstanding delta up the tree one level. If the
sequence acts on multiple child containers before the parent, the delta(s) naturally
_merge_ and there are fewer, larger delta propagations in the upper part of the tree. It would
be optimal to do propagations depth-first but it doesn't seem practical to forcibly accomplish
that ordering.
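
The merging behaviour can be seen in a small in-memory sketch. This is not the planned implementation: the `ContainerNode` class and `propagate_once` function are hypothetical names, the list of nodes stands in for the database query, and locking is only noted in comments.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ContainerNode:
    name: str
    parent: Optional["ContainerNode"] = None
    size: int = 0
    delta: int = 0


def propagate_once(nodes: List[ContainerNode]) -> int:
    """Move each outstanding delta up one level; return how many were applied."""
    applied = 0
    candidates = [n for n in nodes if n.delta != 0]  # stands in for the query
    for node in candidates:
        # Real code: start txn, lock this node, then lock its parent.
        delta = node.delta  # re-check the value after taking the lock
        if delta == 0:
            continue
        if node.parent is not None:
            node.parent.delta += delta
        node.size += delta
        node.delta = 0
        applied += 1
    return applied


root = ContainerNode("root")
a = ContainerNode("a", parent=root)
b = ContainerNode("b", parent=a, delta=100)
c = ContainerNode("c", parent=a, delta=50)
nodes = [root, a, b, c]

passes = 0
while propagate_once(nodes):
    passes += 1
print(passes, a.size, root.size)  # 3 150 150
```

In the first pass both `b` and `c` push their deltas into `a`, so `a` propagates a single merged delta of 150 to `root` in the second pass instead of two separate updates.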

Container size propagation will be implemented as a single sequence (thread). We could add
something to the vospace.Node table to support subdividing work and enable multiple threads,
but there is nothing there right now.

## database changes required
note: all field and column names TBD
* add `size` and `delta` fields to ContainerNode (transient)
* add `size` field to DataNode (transient)
* add `size` to the `vospace.Node` table
* add `delta` to the `vospace.Node` table
* incremental sync query/iterator (ArtifactDAO?)
* lookup DataNode by storageID (ArtifactDAO?)

59 changes: 59 additions & 0 deletions vault-quota/README.md
@@ -0,0 +1,59 @@
# Storage Inventory VOSpace quota support process (vault-quota)

Process to maintain container node sizes so that quota limits can be enforced by the
main `vault` service. This process runs in incremental mode (single process running
continuously) to update a local vospace database.

`vault-quota` is an optional process that is only needed if `vault` is configured to
enforce quotas, although it could be used to maintain container node sizes without
quota enforcement.

## configuration
See the [cadc-java](https://github.com/opencadc/docker-base/tree/master/cadc-java) image
docs for general config requirements.

Runtime configuration must be made available via the `/config` directory.

### vault-quota.properties
```
org.opencadc.vault.quota.logging = {info|debug}

# inventory database settings
org.opencadc.inventory.db.SQLGenerator=org.opencadc.inventory.db.SQLGenerator
org.opencadc.vault.quota.nodes.schema={schema for inventory database objects}
org.opencadc.vault.quota.nodes.username={username for inventory admin}
org.opencadc.vault.quota.nodes.password={password for inventory admin}
org.opencadc.vault.quota.nodes.url=jdbc:postgresql://{server}/{database}

org.opencadc.vault.quota.threads={number of threads to watch for artifact events}

# storage namespace
org.opencadc.vault.storage.namespace = {a storage inventory namespace to use}
```
The _nodes_ account owns and manages (create, alter, drop) vospace database objects and updates
content in the vospace schema. The database is specified in the JDBC URL. Failure to connect or
initialize the database will show up in logs.

The _threads_ key configures the number of threads that watch for new Artifact events and initiate
the propagation of sizes to parent containers. These threads each monitor a subset of artifacts using
`Artifact.uriBucket` filtering; for simplicity, the following values are allowed: 1, 2, 4, 8, 16.
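
To show why the allowed values are powers of two up to 16, the following sketch splits the hex prefix range 0-f evenly across threads. The `bucket_ranges` function and the `first-last` range format are assumptions for illustration, not the actual `vault-quota` code.

```python
from typing import List

HEX_DIGITS = "0123456789abcdef"
ALLOWED_THREADS = (1, 2, 4, 8, 16)


def bucket_ranges(threads: int) -> List[str]:
    """Split the uriBucket prefix range 0-f into equal contiguous ranges."""
    if threads not in ALLOWED_THREADS:
        raise ValueError(f"threads must be one of {ALLOWED_THREADS}")
    width = len(HEX_DIGITS) // threads  # hex digits handled by each thread
    ranges = []
    for i in range(threads):
        first = HEX_DIGITS[i * width]
        last = HEX_DIGITS[(i + 1) * width - 1]
        ranges.append(first if first == last else f"{first}-{last}")
    return ranges


print(bucket_ranges(4))  # ['0-3', '4-7', '8-b', 'c-f']
```

Each of the allowed values divides 16 exactly, so every thread gets the same number of leading hex digits and no bucket is left unassigned.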

In addition to the above threads, one more thread propagates size changes up
the tree of container nodes to the container node(s) where quotas are specified.

## building it
```
gradle clean build
docker build -t vault-quota -f Dockerfile .
```

## checking it
```
docker run -it vault-quota:latest /bin/bash
```

## running it
```
docker run --user opencadc:opencadc -v /path/to/external/config:/config:ro --name vault-quota vault-quota:latest
```
