Getting started with Graph

Pierre Laporte edited this page Jul 12, 2018 · 2 revisions

Simple DSE Graph simulation

As usual, your simulation will need to contain three different sections:

  1. Open a connection to your DSE Graph cluster

  2. Define your workload like you would in any other Gatling scenario

  3. Instruct Gatling to close the connection once the scenario is done

Here is a Gatling simulation that sends one Gremlin query per second for 10 seconds.

import com.datastax.driver.dse.graph.GraphOptions
import com.datastax.driver.dse.{DseCluster, DseSession}
import com.datastax.gatling.plugin.DsePredef._ //(1)
import io.gatling.core.Predef._
import io.gatling.core.scenario.Simulation

import scala.concurrent.duration._

class SimpleDseGraphSimulation extends Simulation {
  private val cluster: DseCluster = DseCluster
    .builder()
    .addContactPoint("127.0.0.1")
    .withGraphOptions(new GraphOptions() //(2)
      .setGraphSource("g")
      .setGraphName("my_graph"))
    .build()
  private val session: DseSession = cluster.connect()

  val scn = scenario("Graph")
    .exec(graph("Select person")
      .executeGraph("g.V().has('person', 'identifier', 42)")) //(3)

  setUp(scn.inject(constantUsersPerSec(1).during(10 seconds)) //(4)
    .protocols(dseProtocolBuilder.session(session)))

  after(cluster.close()) //(5)
}
  1. Import the predefined functions that extend the Gatling DSL for Graph queries

  2. Connect to the DSE Graph cluster as in any application

  3. Define the user workload as a single execution of a Gremlin statement

  4. Configure throughput as in any other Gatling test

  5. Instruct Gatling to close the connection when it completes

Graph simulation with parametrized Gremlin statements

A common practice with Graph is to use parametrized statements. Unlike the CQL protocol, Gremlin has no prepared statements. Instead, one uses Groovy strings that contain variables, which are bound when the query is executed.
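To make the idea of binding concrete, here is a minimal sketch in plain Scala. It does not use the DSE driver: `BoundQuery` and `bind` are hypothetical names introduced only for illustration, and the Gremlin text is an example. The point it shows is that the query string stays identical across executions while only the variable bindings change.

```scala
// Hypothetical stand-in for a parametrized graph statement: a fixed Gremlin
// string plus the variable bindings that travel with it. Not a driver class.
final case class BoundQuery(gremlin: String, bindings: Map[String, Any])

object ParametrizedQuerySketch {
  // The Gremlin text never changes; `identifier` is a free Groovy variable.
  val template: String = "g.V().has('person', 'identifier', identifier)"

  // Binding attaches a value to the variable name without touching the text.
  def bind(identifier: Int): BoundQuery =
    BoundQuery(template, Map("identifier" -> identifier))

  def main(args: Array[String]): Unit =
    Seq(42, 43, 44).map(bind).foreach { q =>
      println(s"${q.gremlin} with ${q.bindings}")
    }
}
```

Keeping the text constant and passing values separately avoids concatenating values into the Gremlin string for every request.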

The previous example did not use any variables. Let’s introduce one by first creating a SimpleGraphStatement for the query. We also add a feeder that supplies the values of the variable identifier. Then, we instruct the plugin to bind this variable.

Note
The plugin cannot detect which variables a Gremlin statement uses, so they must be listed explicitly with withParams. This differs from CQL workloads.
import com.datastax.driver.dse.graph.{GraphOptions, SimpleGraphStatement}
import com.datastax.driver.dse.{DseCluster, DseSession}
import com.datastax.gatling.plugin.DsePredef._
import io.gatling.core.Predef._
import io.gatling.core.scenario.Simulation

import scala.concurrent.duration._

class SimpleDseGraphSimulation extends Simulation {
  private val cluster: DseCluster = DseCluster
    .builder()
    .addContactPoint("127.0.0.1")
    .withGraphOptions(new GraphOptions()
      .setGraphSource("g")
      .setGraphName("my_graph"))
    .build()
  private val session: DseSession = cluster.connect()

  val query =
    new SimpleGraphStatement("g.V().has('person', 'identifier', identifier)")
  val feeder = Array(
    Map("identifier" -> 42),
    Map("identifier" -> 43),
    Map("identifier" -> 44)
  ).circular

  val scn = scenario("Graph")
    .feed(feeder)
    .exec(graph("Select person")
      .executeGraph(query).withParams("identifier"))

  setUp(scn.inject(constantUsersPerSec(1).during(10 seconds))
    .protocols(dseProtocolBuilder.session(session)))

  after(cluster.close())
}
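
The feeder in the simulation above is marked circular, so it starts over once its records are exhausted. The following sketch models that behavior with the Scala standard library only. It is an illustration, not Gatling's implementation: `CircularFeederSketch` and its `circular` helper are hypothetical names.

```scala
object CircularFeederSketch {
  type Record = Map[String, Any]

  // Endless iterator that replays the records in order and restarts
  // from the first record once the last one has been handed out.
  def circular(records: IndexedSeq[Record]): Iterator[Record] = {
    require(records.nonEmpty, "a circular feeder needs at least one record")
    Iterator.continually(records).flatMap(_.iterator)
  }

  def main(args: Array[String]): Unit = {
    val feeder = circular(Vector(
      Map("identifier" -> 42),
      Map("identifier" -> 43),
      Map("identifier" -> 44)
    ))
    // Ten virtual users each pull one record; values repeat after the third.
    feeder.take(10).foreach(record => println(record("identifier")))
  }
}
```

With three records and ten injected users, every identifier is queried at least three times.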

Graph simulations with fluent statements

Gremlin language variants (GLV) can also be used with the Gatling DSE plugin. In DSE, they are also referred to as "fluent statements". They are built differently, though: there is no preparation step at all, and the entire query is built at the moment it is to be sent.

Warning
Gatling relies on precisely predicting the time at which each query is expected to be sent. This is what keeps it from suffering from Coordinated Omission. Fluent statements add work between that expected time and the actual send time. In other words, the time spent building a fluent statement is part of the reported latency.
import com.datastax.driver.dse.graph.GraphOptions
import com.datastax.driver.dse.{DseCluster, DseSession}
import com.datastax.dse.graph.api.DseGraph
import com.datastax.gatling.plugin.DsePredef._
import io.gatling.core.Predef._
import io.gatling.core.scenario.Simulation

import scala.concurrent.duration._

class SimpleDseGraphSimulation extends Simulation {
  private val cluster: DseCluster = DseCluster
    .builder()
    .addContactPoint("127.0.0.1")
    .withGraphOptions(new GraphOptions()
      .setGraphSource("g")
      .setGraphName("my_graph"))
    .build()
  private val session: DseSession = cluster.connect()

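  // Invoked for every request with the virtual user's Gatling Session
  // (named gatlingSession so it does not shadow the DseSession declared above)
  val queryFactory = (gatlingSession: Session) =>
    DseGraph.statementFromTraversal(
      DseGraph.traversal()
        .V()
        .has("person", "identifier", gatlingSession("identifier").as[Int]))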

  val feeder = Array(
    Map("identifier" -> 42),
    Map("identifier" -> 43),
    Map("identifier" -> 44)
  ).circular

  val scn = scenario("Graph")
    .feed(feeder)
    .exec(graph("Select person")
      .executeGraphFluent(queryFactory))

  setUp(scn.inject(constantUsersPerSec(1).during(10 seconds))
    .protocols(dseProtocolBuilder.session(session)))

  after(cluster.close())

}
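
To illustrate the warning about fluent statements, here is a minimal timing sketch in plain Scala. It does not touch Gatling or the DSE driver: `buildStatement` and `send` are hypothetical stand-ins that merely sleep, and the durations are arbitrary. It only shows why work done after the expected send time ends up in the measured latency.

```scala
object FluentLatencySketch {
  // Hypothetical stand-ins: building simulates traversal construction,
  // sending simulates the round trip to the cluster.
  def buildStatement(): String = { Thread.sleep(5); "statement" }
  def send(statement: String): Unit = Thread.sleep(20)

  final case class Timing(buildNanos: Long, sendNanos: Long) {
    // The clock starts at the expected send time, so both parts are counted.
    def reportedNanos: Long = buildNanos + sendNanos
  }

  def timeOnce(): Timing = {
    val expectedSendTime = System.nanoTime() // moment the query was due
    val statement = buildStatement()         // extra work before sending
    val actualSendTime = System.nanoTime()
    send(statement)
    val done = System.nanoTime()
    Timing(actualSendTime - expectedSendTime, done - actualSendTime)
  }

  def main(args: Array[String]): Unit = {
    val t = timeOnce()
    println(s"build: ${t.buildNanos / 1000000} ms, send: ${t.sendNanos / 1000000} ms")
    println(s"reported: ${t.reportedNanos / 1000000} ms")
  }
}
```

If build time matters for your measurements, keeping the traversal inside the factory as small as possible limits how much it inflates the reported figures.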