description |
---|
RecallGraph - A versioning data store for time-variant graph data. |
{% embed url="https://github.com/RecallGraph/RecallGraph" caption="Project Source @ Github" %}
RecallGraph is a versioned-graph data store - it retains all changes that its data (vertices and edges) have gone through to reach their current state. It supports point-in-time graph traversals, letting the user query any past state of the graph just as easily as the present.
It is a Foxx Microservice for ArangoDB that features VCS-like semantics in many parts of its interface, and is backed by a transactional event tracker. It is currently being developed and tested on ArangoDB v3.5 and v3.6, with support for v3.7 in the pipeline.
Example: A Time-Travelling Mind Map Tool.
To get an idea of where such a data store might be used, see:
Also check out the recordings/slides below:
{% embed url="https://www.youtube.com/watch?v=UP2KDQ\_kL4I" caption="RecallGraph presented @ ArangoDB Online Meetup" %}
{% embed url="https://docs.google.com/presentation/d/1FHNfMxNnBiR4dXdqVJqInTiXdmX-9dSEtUrHMsw-O0E/edit\#slide=id.p" caption="The associated slide deck" %}
{% embed url="https://www.youtube.com/watch?v=A953O3hT1Os" caption="A discussion on RecallGraph's development roadmap" %}
{% embed url="https://docs.google.com/presentation/d/12YkSXqh4eTiA6I3mXxK3d\_RS2vX5dpayjvYlQhs9gzE/edit?usp=sharing" caption="The associated slide deck" %}
TL;DR: RecallGraph is a potential fit for scenarios where data is best represented as a network of vertices and edges (i.e., a graph) having the following characteristics:
- Both vertices and edges can hold properties in the form of attribute/value pairs (equivalent to JSON objects).
- Documents (vertices/edges) mutate within their lifespan (both in their individual attributes/values and in their relations with each other).
- Past states of documents are as important as their present, necessitating retention and queryability of their change history.
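To make the last characteristic concrete, here is a minimal, in-memory sketch of "retain every change, query any past state". It is only an illustration: the names `Event`, `VersionedStore`, `write` and `show` are hypothetical, and storing a full snapshot per event is a simplifying assumption, not a description of RecallGraph's actual storage layout or API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Event:
    ctime: float                    # transaction time of the write
    kind: str                       # "created", "updated" or "deleted"
    doc_id: str
    data: Optional[Dict[str, Any]]  # full document after the write (None if deleted)


@dataclass
class VersionedStore:
    events: List[Event] = field(default_factory=list)

    def write(self, ctime: float, kind: str, doc_id: str,
              data: Optional[Dict[str, Any]] = None) -> None:
        # History is append-only; earlier events are never modified.
        self.events.append(Event(ctime, kind, doc_id, data))

    def show(self, doc_id: str, at: float) -> Optional[Dict[str, Any]]:
        """Return the document as it stood at time `at`, or None if absent."""
        state = None
        for ev in sorted(self.events, key=lambda e: e.ctime):
            if ev.doc_id != doc_id or ev.ctime > at:
                continue
            state = None if ev.kind == "deleted" else dict(ev.data)
        return state


store = VersionedStore()
store.write(1.0, "created", "people/ann", {"name": "Ann", "city": "Pune"})
store.write(2.0, "updated", "people/ann", {"name": "Ann", "city": "Oslo"})
store.write(3.0, "deleted", "people/ann")

print(store.show("people/ann", at=1.5))  # state after creation
print(store.show("people/ann", at=2.5))  # state after the update
print(store.show("people/ann", at=3.5))  # None: deleted by then
```

Because nothing is overwritten, the same lookup answers "what is the document now?" and "what was it last week?"; only the `at` argument changes.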
RecallGraph's API is split into 3 top-level categories:
- Create - Create single/multiple documents (vertices/edges).
- Replace - Replace entire single/multiple documents with new content.
- Delete - Delete single/multiple documents.
- Update - Add/Update specific fields in single/multiple documents.
- Restore - Restore deleted nodes back to their last known undeleted state.
- (Planned) Materialization - Point-in-time checkouts.
- (Planned) CQRS/ES Operation Mode - Async implicit commits.
- Log - Fetch a log of events (commits) for a given path pattern (path determines scope of documents to pick). The log can be optionally grouped/sorted/sliced within a specified time interval.
- Diff - Fetch a list of forward or reverse commands (diffs) between commits for specified documents.
- Explicit Commits - Commit a document's changes separately, after it has been written to DB via other means (AQL / Core REST API / Client).
- (Planned) Branch/Tag - Create parallel versions of history, branching off from a specific event point of the main timeline. Also, tag specific points in branch+time for convenient future reference.
- Show - Fetch a set of documents, optionally grouped/sorted/sliced, that match a given path pattern, at a given point in time.
- Filter - In addition to a path pattern like in 'Show', apply an expression-based, simple/compound post-filter on the retrieved documents.
- Traverse - A point-in-time traversal (walk) of a past version of the graph, with the option to apply additional post-filters to the result.
- k Shortest Paths - Point-in-time, weighted, shortest paths between two endpoints.
- Purge - Delete all history for specified nodes.
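The idea behind forward and reverse diffs (see 'Diff' above) can be sketched in a few lines. This is a simplified illustration, not RecallGraph's implementation: the functions `diff` and `apply` are hypothetical, the command shape (`op`/`path`/`value`) is an assumed, JSON-Patch-like format, and only top-level fields are compared.

```python
from typing import Any, Dict, List

Doc = Dict[str, Any]
Command = Dict[str, Any]


def diff(old: Doc, new: Doc) -> List[Command]:
    """Commands that turn `old` into `new` (top-level fields only)."""
    commands: List[Command] = []
    for key in sorted(old.keys() - new.keys()):
        commands.append({"op": "remove", "path": f"/{key}"})
    for key in sorted(new.keys() - old.keys()):
        commands.append({"op": "add", "path": f"/{key}", "value": new[key]})
    for key in sorted(old.keys() & new.keys()):
        if old[key] != new[key]:
            commands.append({"op": "replace", "path": f"/{key}", "value": new[key]})
    return commands


def apply(doc: Doc, commands: List[Command]) -> Doc:
    """Apply commands to a copy of `doc` and return the result."""
    result = dict(doc)
    for cmd in commands:
        key = cmd["path"][1:]  # strip the leading "/"
        if cmd["op"] == "remove":
            del result[key]
        else:
            result[key] = cmd["value"]
    return result


v1 = {"name": "Ann", "city": "Pune"}
v2 = {"name": "Ann", "city": "Oslo", "role": "admin"}

forward = diff(v1, v2)  # moves a document forward in time
reverse = diff(v2, v1)  # moves it back again

assert apply(v1, forward) == v2
assert apply(v2, reverse) == v1
```

Keeping both directions available means a past state can be rebuilt either by replaying forward commands from an earlier version or by rolling reverse commands back from the current one.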
- Although the test cases are quite extensive and have good coverage, this service has only been tested on single-instance DB deployments, and not on clusters.
- As of version 3.6, ArangoDB does not support ACID transactions for multi-document/collection writes in cluster mode, so transactional guarantees do not hold for such deployments.
- Support for absolute/relative revision-based queries on individual documents (in addition to the timestamp-based queries supported currently),
- Branching/tag support,
- Support for the valid time dimension in addition to the currently implemented transaction time dimension (https://www.researchgate.net/publication/221212735_A_Taxonomy_of_Time_in_Databases),
- Support for ArangoDB v3.7,
- Multiple, simultaneous materialized checkouts (a la `git`) of selectable sections of the database (entire DB, named graph, named collection, document list, document pattern), with eventual branch-level specificity,
- CQRS/ES operation mode (async implicit commits),
- Support for ArangoDB clusters (limited at present by lack of support for multi-document ACID transactions in clusters).
- Multiple authentication and authorization mechanisms.
- Raise an issue or PR on the project repository, or
- Mail me () or
- Join the Gitter channel.
{% hint style="warning" %}
The authors and maintainers of RecallGraph are not liable for damages or indemnity (express or implied) for loss of any kind incurred directly or indirectly as a result of using this software.
{% endhint %}