Pagination
Pagination in Cassandra works a bit differently than in relational databases, and any API that wants to provide pagination must account for this.
In a relational database, a paginated query would use `LIMIT` and `OFFSET` to control which page we're fetching, something like this:
-- Get first 10 element page
SELECT * FROM USERS
LIMIT 10
-- Get second 10 element page
SELECT * FROM USERS
LIMIT 10 OFFSET 10
In Cassandra we can specify a limit, that is, how big the page is. Unfortunately, we cannot specify an offset from which to continue our pagination.
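To make the difference concrete, here is a minimal in-memory sketch of cursor-style paging, where each call returns the page plus a cursor for the following call. Everything in it (`CursorPagingSketch`, `Cursor`, `fetch`, the sample rows) is hypothetical and only illustrates the idea. It is not how Cassandra or Helenus implement paging, and a real paging state is an opaque byte sequence rather than an index.

```scala
object CursorPagingSketch {
  // Hypothetical in-memory "table"; rows are kept in a fixed order.
  val users: Vector[String] = Vector("ana", "bob", "cyd", "dan", "eve")

  // Cursor handed back to the caller. Here it is just the index of the next
  // row to read; callers are expected to treat it as opaque.
  final case class Cursor(next: Int)

  /** Fetches one page starting at `cursor` (or at the beginning when `None`).
    * Returns the page and, if rows remain, the cursor for the following page.
    */
  def fetch(pageSize: Int, cursor: Option[Cursor]): (Vector[String], Option[Cursor]) = {
    val start = cursor.fold(0)(_.next)
    val page  = users.slice(start, start + pageSize)
    val end   = start + page.size
    (page, if (end < users.size) Some(Cursor(end)) else None)
  }
}
```

The caller never asks for "page 3" directly. It can only continue from a cursor that a previous call handed back, which is the constraint the rest of this page works around.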
Whether we execute a `PreparedStatement` synchronously or asynchronously, Cassandra always paginates results through different network requests. This is more evident in its asynchronous API.
Please see [Iterating Results](Iterating Results)
If we want to paginate over the results of a query we can use a `Pager`, which can be constructed from a `ScalaPreparedStatement`:
val query = "SELECT * FROM hotels_by_country WHERE country = ?".toCQL.prepare[String].as[Hotel]
// query: ScalaPreparedStatement1[String, Hotel] = net.nmoncho.helenus.internal.cql.ScalaPreparedStatement1@cd5a5eb
val pager = query.pager("NL")
// pager: Pager[Hotel] = net.nmoncho.helenus.internal.cql.Pager@d6fc560
val (nextPager, firstPage) = pager.execute(pageSize = 2)
// nextPager: Pager[Hotel] = net.nmoncho.helenus.internal.cql.Pager@fd44505
// firstPage: Iterator[Hotel] = non-empty iterator
val (finalPager, secondPage) = nextPager.execute(pageSize = 10)
// finalPager: Pager[Hotel] = net.nmoncho.helenus.internal.cql.Pager@378d712d
// secondPage: Iterator[Hotel] = non-empty iterator
- A `Pager` is created from a `ScalaPreparedStatement` by providing the query parameters we want that query to run with. The statement won't be executed yet, but we must provide the parameters up front (more on this below).
- To obtain a page, we can call one of the execution methods on the `Pager`. These methods take how many results we want for that page.
- The result we get from this execution is the query results as an `Iterator` and the `Pager` we can use to get the next page.
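The pattern described above, where each execution returns a page together with the pager for the next one, can be sketched with a small stand-in. `ListPager` and `allPages` are hypothetical names backed by an in-memory list. They only mimic the shape of the `(Pager, Iterator)` result and are not part of Helenus.

```scala
object PagerLoopSketch {
  // Hypothetical stand-in for a pager: executing it yields the pager for the
  // next page together with the current page.
  final case class ListPager[A](remaining: List[A]) {
    def execute(pageSize: Int): (ListPager[A], Iterator[A]) = {
      val (page, rest) = remaining.splitAt(pageSize)
      (ListPager(rest), page.iterator)
    }
  }

  /** Collects every page by threading the returned pager into the next call. */
  def allPages[A](pager: ListPager[A], pageSize: Int): List[List[A]] = {
    @annotation.tailrec
    def loop(current: ListPager[A], acc: List[List[A]]): List[List[A]] = {
      val (next, page) = current.execute(pageSize)
      val rows         = page.toList
      // An empty page means there is nothing left to read.
      if (rows.isEmpty) acc.reverse else loop(next, rows :: acc)
    }
    loop(pager, Nil)
  }
}
```

The important detail is that the loop always continues with the pager returned by the previous call, never with the one it started from.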
Cassandra relies on a `PagingState` to resume execution at a later time. We can save a `PagingState` and later create a `Pager` from it to resume execution:
val Some(pagingState) = nextPager.pagingState
// pagingState: PagingState = 001e001000120010526f7474657264616d2048696c746f6ef07ffffffdf07ffffffd0bacc121391f44d2e77b5d0e6d99d4d60004
val Success(continuedPager) = query.pager(pagingState, "NL")
// continuedPager: Pager[Hotel] = net.nmoncho.helenus.internal.cql.Pager@6cf5564f
val (_, secondPageAgain) = continuedPager.execute(pageSize = 10)
// secondPageAgain: Iterator[Hotel] = non-empty iterator
Imagine a user is paging over the hotels available in Rotterdam. A stateless web app would require a way to send and receive a `PagingState` so paging can be resumed.
We can do this with a `PagerSerializer`, which takes care of serializing and deserializing a `PagingState`:
import net.nmoncho.helenus.api.cql.PagerSerializer
implicit val serializer: PagerSerializer[String] = PagerSerializer.DefaultPagingStateSerializer
// serializer: PagerSerializer[String] = net.nmoncho.helenus.api.cql.PagerSerializer$DefaultPagingStateSerializer$@16957e69
val Some(encodedState) = nextPager.encodePagingState
// encodedState: String = "001e001000120010526f7474657264616d2048696c746f6ef07ffffffdf07ffffffd0bacc121391f44d2e77b5d0e6d99d4d60004"
val Success(anotherContinuedPager) = query.pager(encodedState, "NL")
// anotherContinuedPager: Pager[Hotel] = net.nmoncho.helenus.internal.cql.Pager@350a15e4
val (_, secondPageAnotherTime) = anotherContinuedPager.execute(pageSize = 10)
// secondPageAnotherTime: Iterator[Hotel] = non-empty iterator
We can plug in our own implementation if we want to perform a more sophisticated serialization.
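As one illustration of what a different serialization might look like, the sketch below re-encodes a hexadecimal token (the form the default serializer's output takes in the example above) as unpadded URL-safe Base64, which is shorter and safe to place in a query string. `UrlSafeTokenSketch`, `hexToToken` and `tokenToHex` are hypothetical helpers. They do not implement `PagerSerializer`, whose exact interface is not shown here, and they assume well-formed, even-length hex input.

```scala
import java.util.Base64

object UrlSafeTokenSketch {
  /** Re-encodes a hex string (two characters per byte) as unpadded URL-safe Base64. */
  def hexToToken(hex: String): String = {
    val bytes = hex.grouped(2).map(pair => Integer.parseInt(pair, 16).toByte).toArray
    Base64.getUrlEncoder.withoutPadding.encodeToString(bytes)
  }

  /** Reverses `hexToToken`, yielding lowercase hex again. */
  def tokenToHex(token: String): String =
    Base64.getUrlDecoder.decode(token).map(b => f"${b & 0xff}%02x").mkString
}
```

A web handler could apply `hexToToken` before sending the state to the client and `tokenToHex` on the way back in, leaving the paging state itself untouched.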
The Helenus API requires users to specify which query parameters will be used before the query is actually executed. While this seems like a limitation, it's actually a way to prevent errors from happening.
By Cassandra's design:
The paging state can only be reused with the exact same statement (same query string, same parameters). It is an opaque value that is only meant to be collected, stored and re-used. If you try to modify its contents or reuse it with a different statement, the results are unpredictable.
This means that all three (query, parameters, and `PagingState`) are strongly tied together.
Ideally we would be able to provide a single `Pager` that allows querying different pages using different query parameters from a single query. Instead we must define an API that doesn't let users alter the query parameters that were initially provided.
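For applications that hand paging states to untrusted clients, the coupling between query, parameters, and paging state can also be checked explicitly. The following is a minimal sketch under stated assumptions: `BoundTokenSketch`, `BoundToken`, `fingerprint`, `issue` and `resume` are hypothetical names, parameters are assumed to be already rendered as strings, and none of this is part of Helenus or the Cassandra driver.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.security.MessageDigest

object BoundTokenSketch {
  // Hypothetical token: the encoded paging state plus a fingerprint of the
  // query and parameters that produced it.
  final case class BoundToken(fingerprint: String, encodedState: String)

  /** SHA-256 over the query string and the rendered parameters, as lowercase hex. */
  def fingerprint(query: String, params: Seq[String]): String = {
    val digest = MessageDigest.getInstance("SHA-256")
    // A separator byte keeps ("ab", "c") and ("a", "bc") from hashing identically.
    (query +: params).foreach { part =>
      digest.update(part.getBytes(UTF_8))
      digest.update(0.toByte)
    }
    digest.digest().map(b => f"${b & 0xff}%02x").mkString
  }

  def issue(query: String, params: Seq[String], encodedState: String): BoundToken =
    BoundToken(fingerprint(query, params), encodedState)

  /** Returns the encoded state only if the token was issued for this query and parameters. */
  def resume(token: BoundToken, query: String, params: Seq[String]): Either[String, String] =
    if (token.fingerprint == fingerprint(query, params)) Right(token.encodedState)
    else Left("paging state was issued for a different query or parameters")
}
```

A token issued for country `"NL"` would then be rejected if a client tried to resume it with `"BE"`, instead of producing unpredictable results.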