Skip to content

Persistence Sharding

regunathb edited this page Nov 7, 2012 · 8 revisions

Sharding is a useful design approach for accommodating large data sets in persistence stores. Support for sharding is not a commonly supported feature, especially with RDBMSs. The Trooper persistence framework supports implicit and application influenced sharding that extends equally across all Persistence Providers.

Hibernate Shards library was considered as an implementation choice but was soon discarded for two reasons - Heavy, Support limited to only RDBMSs.

Design Approach

Sharding support in Trooper is built on the following assumptions:

  • Data distribution is influenced by data contained in Persistent Entities i.e. domain objects. Each Sharded Persistent Entity is capable of returning a "shard hint" from the data it holds.
  • Persistence Providers have the logic to validate shard hints, demarcate transactions(if supported) and perform the persistence operation. The providers also manage results aggregation from shards, where required.
  • Data Sources (or) wrappers interpret the mapping between logical shards (identified by the shard hints) to physical data-store/nodes and provide relevant handle(say connection) for executing the persistence operation.