Databases take one of two approaches to imposing integrity constraints on structured data collections: schema-on-read or schema-on-write. The schema-on-read approach assumes an implicit structure that is only interpreted when the data is read. Document databases like MongoDB and key-value stores like Redis are examples of this kind of schema flexibility. In contrast, the schema-on-write approach assumes an explicit data model, and the database ensures that all written data conforms to it. Traditional relational databases like PostgreSQL as well as modern column-based databases like Cassandra fall into this category.
Datahike supports both approaches. The approach is chosen when the database is created and cannot be changed afterwards.
Have a look at the `core` namespace in the `examples/store-intro` folder for example configuration and transactions.
Because Datahike inherited most of its code from Datascript, the original default was schema-on-read: you could add arbitrary Clojure data structures to the database, with a small set of helper definitions that added information about references and cardinality. Even though Datahike's API has moved to a schema-on-write approach, the schema-less behavior is still supported. On database creation you may opt out of schema enforcement by setting the `:schema-flexibility` parameter to `:read`:
```clojure
(require '[datahike.api :as d])
(def cfg {:store {:backend :mem :id "schemaless"} :schema-flexibility :read})
(d/create-database cfg)
(def conn (d/connect cfg))
;; now you can add any arbitrary data
(d/transact conn {:tx-data [{:any "Data"}]})
```
Since version 0.2.0, Datahike enforces an explicit schema by default, so you have to define your expected data shapes in advance. The schema itself is stored in the database index, so you can simply transact it like any other datom:
```clojure
(require '[datahike.api :as d])
;; since the :write approach is the default value we may also skip the setting
(def cfg {:store {:backend :mem :id "schema-on-write"} :schema-flexibility :write})
(d/create-database cfg)
(def conn (d/connect cfg))
;; define a simple schema
(def schema [{:db/ident :name :db/valueType :db.type/string :db/cardinality :db.cardinality/one}])
;; transact it
(d/transact conn {:tx-data schema})
;; now we can transact data based on the provided schema
(d/transact conn {:tx-data [{:name "Alice"}]})
```
The schema definition is for the most part compliant with Datomic's approach. Three attributes are required:
- `:db/ident`: the name of the attribute, defined as a keyword with optional namespace, e.g. `:user/name`
- `:db/valueType`: the type of the value associated with an attribute, e.g. `:db.type/string`; see below for supported types
- `:db/cardinality`: the cardinality of the value, i.e. whether it is a single value or a set of values; can be either `:db.cardinality/one` or `:db.cardinality/many`
Additionally, the following optional attributes are supported:
- `:db/doc`: the documentation for the attribute as a string
- `:db/unique`: a uniqueness constraint on the attribute for a given value; can be either `:db.unique/value` (only one entity with this attribute can have the same value) or `:db.unique/identity` (only one entity can have the value for this attribute, with upsert enabled)
- `:db/index`: a boolean indicating whether an index for the attribute's value should be created
- `:db/isComponent`: indicates that an attribute of type `:db.type/ref` references a subcomponent of the entity that has the attribute (for cascading retractions)
- if `:db/valueType` is `:db.type/tuple`, one of:
  - `:db/tupleAttrs`: a collection of attributes that make up the tuple (for composite tuples)
  - `:db/tupleTypes`: a collection of 2-8 types that make up the tuple (for heterogeneous fixed length tuples)
  - `:db/tupleType`: the type of the tuple elements (for homogeneous variable length tuples)
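
To make the required and optional attributes more concrete, here is a minimal sketch of a schema that combines several of them. The attribute names (`:user/email`, `:user/nicknames`, `:user/address`, `:user/coordinates`) and the var name `example-schema` are hypothetical and chosen only for illustration; the schema keys themselves are the ones listed above.

```clojure
;; Illustrative schema only: all :user/... attribute names are hypothetical.
(def example-schema
  [;; unique identity attribute with a docstring
   {:db/ident       :user/email
    :db/valueType   :db.type/string
    :db/cardinality :db.cardinality/one
    :db/unique      :db.unique/identity
    :db/doc         "Email address that identifies a user"}
   ;; multi-valued attribute with an index on its values
   {:db/ident       :user/nicknames
    :db/valueType   :db.type/string
    :db/cardinality :db.cardinality/many
    :db/index       true}
   ;; reference to a subcomponent entity
   {:db/ident       :user/address
    :db/valueType   :db.type/ref
    :db/cardinality :db.cardinality/one
    :db/isComponent true}
   ;; heterogeneous fixed length tuple of two doubles
   {:db/ident       :user/coordinates
    :db/valueType   :db.type/tuple
    :db/tupleTypes  [:db.type/double :db.type/double]
    :db/cardinality :db.cardinality/one}])

;; With a schema-on-write connection `conn`, the schema would be transacted
;; like any other data:
;; (d/transact conn {:tx-data example-schema})
```

Since the schema is plain Clojure data, it can be built, inspected, and kept under version control like any other value before it is transacted.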
The following types are currently supported in Datahike:
| Value Type | Corresponding Type |
|---|---|
| db.type/bigdec | java.math.BigDecimal |
| db.type/bigint | java.math.BigInteger |
| db.type/boolean | Boolean |
| db.type/double | Double |
| db.type/float | Double or Float |
| db.type/instant | java.util.Date |
| db.type/keyword | clojure.lang.Keyword |
| db.type/long | java.lang.Long |
| db.type/ref | java.lang.Long |
| db.type/string | String |
| db.type/symbol | clojure.lang.Symbol |
| db.type/uuid | java.util.UUID |
| db.type/tuple | clojure.lang.Vector |
The schema is validated using clojure.spec. See `src/datahike/schema.cljc` for the implementation details.
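
As a rough illustration of what spec-based validation of a schema attribute can look like, the following sketch checks the three required keys. It is not Datahike's implementation: the spec names `::cardinality` and `::attribute` are hypothetical, and the real specs in `src/datahike/schema.cljc` may be structured differently.

```clojure
(require '[clojure.spec.alpha :as s])

;; Hypothetical specs for illustration; not taken from Datahike's source.
(s/def ::cardinality #{:db.cardinality/one :db.cardinality/many})

(s/def ::attribute
  (s/and map?
         #(keyword? (:db/ident %))
         #(keyword? (:db/valueType %))
         #(s/valid? ::cardinality (:db/cardinality %))))

;; An attribute map with all three required keys
(def complete-attr
  {:db/ident       :name
   :db/valueType   :db.type/string
   :db/cardinality :db.cardinality/one})

;; The same map without :db/cardinality
(def incomplete-attr (dissoc complete-attr :db/cardinality))

(s/valid? ::attribute complete-attr)   ; conforms to the sketch spec
(s/valid? ::attribute incomplete-attr) ; does not conform
```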
Updating an existing schema is discouraged, as it may lead to inconsistencies in your data. Therefore, only schema updates for `:db/cardinality` and `:db/unique` are supported. Rather than updating an existing attribute, it is recommended to create a new attribute and migrate the data accordingly. Alternatively, if you want to keep your old attribute names, export your data without the schema, transform it to the new schema, create a new database with the new schema, and import the transformed data.
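
The "create a new attribute and migrate" route can be sketched as follows. The helper `copy-attr-tx` and the attribute `:full-name` are hypothetical names introduced for this example; the commented calls assume the `d` alias and a schema-on-write connection `conn` as in the examples above.

```clojure
;; Hypothetical helper: turns [entity-id value] pairs into assertions
;; that store each value under a new attribute.
(defn copy-attr-tx
  [new-attr eid-value-pairs]
  (mapv (fn [[eid v]] [:db/add eid new-attr v]) eid-value-pairs))

;; Assumed usage against a schema-on-write connection `conn`:
;;
;; 1. Define the new attribute (:full-name is a hypothetical name).
;; (d/transact conn {:tx-data [{:db/ident       :full-name
;;                              :db/valueType   :db.type/string
;;                              :db/cardinality :db.cardinality/one}]})
;;
;; 2. Read the existing values and write them under the new attribute.
;; (let [pairs (d/q '[:find ?e ?v :where [?e :name ?v]] @conn)]
;;   (d/transact conn {:tx-data (copy-attr-tx :full-name pairs)}))
```

Keeping the transformation in a pure function makes it possible to inspect the generated transaction data before it touches the database.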