Storm Performance Experiment Tools

Introduction

This suite contains tools to run performance experiments on WASB and ADLS using Apache Storm. The Storm spout emits a randomly generated record of fixed size. The bolt writes to storage and ACKs the tuple. Once all executor threads complete work, the topology is killed. Results for each worker process are stored on the respective nodes.
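
The spout/bolt pair below is a minimal sketch of this pattern, not the repository's actual implementation: the class names are hypothetical, and the package names assume Storm 1.x (org.apache.storm), whereas older 0.x releases keep the same types under backtype.storm.

import java.util.Map;
import java.util.Random;
import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

// Emits one fixed-size, randomly generated record per nextTuple() call.
public class RandomRecordSpout extends BaseRichSpout {
    private final int recordSize;          // bytes, mirrors -recordSize
    private final Random random = new Random();
    private SpoutOutputCollector collector;

    public RandomRecordSpout(int recordSize) { this.recordSize = recordSize; }

    @Override
    public void open(Map conf, TopologyContext ctx, SpoutOutputCollector collector) {
        this.collector = collector;
    }

    @Override
    public void nextTuple() {
        byte[] record = new byte[recordSize];
        random.nextBytes(record);
        // Emit with a message id so the downstream ACK is tracked by the ackers.
        collector.emit(new Values(record), random.nextLong());
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("record"));
    }
}

// Writes each record to storage, then ACKs the tuple.
class StorageWriterBolt extends BaseRichBolt {
    private OutputCollector collector;

    @Override
    public void prepare(Map conf, TopologyContext ctx, OutputCollector collector) {
        this.collector = collector;
    }

    @Override
    public void execute(Tuple tuple) {
        byte[] record = (byte[]) tuple.getValueByField("record");
        // writeToStorage(record);   // hypothetical: buffer and flush to WASB/ADLS here
        collector.ack(tuple);        // frees a maxSpoutPending slot
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // Terminal bolt: no output stream.
    }
}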

Prerequisites/Setup

  1. Create an HDInsight cluster of the desired size.

  2. Fork and clone this repository so you have a local copy.

  3. Install the following JAR from the /lib directory of this repo into your local Maven repository:

    mvn -q install:install-file -Dfile=lib/eventhubs-storm-spout-0.9.4-jar-with-dependencies.jar -DgroupId=com.microsoft.eventhubs -DartifactId=eventhubs-storm-spout -Dversion=0.9.4 -Dpackaging=jar

  4. Build from the root folder with mvn clean package. This produces target/org.apache.storm.hdfs.writebuffertest-0.1.jar, the JAR used in the commands below.

Usage

Submit the topology as follows:

storm jar target/org.apache.storm.hdfs.writebuffertest-0.1.jar org.apache.storm.hdfs.WriteTopology

Required parameters:
  -workers,-w                    Number of worker processes on the cluster
  -recordSize,-x                 Size of the record generated by the spout (bytes)
  -spoutParallelism,-s           Number of spout executors (threads) across all workers
  -numTasksSpout,-e              Number of spout tasks across all workers
  -numAckers,-k                  Number of ackers
  -boltParallelism,-b            Number of bolt executors (threads) across all workers
  -numTasksBolt,-t               Number of bolt tasks across all workers
  -fileRotationSize,-f           File size at which the current file is rolled over and a new file is started
  -fileBufferSize,-z             Client-side buffer size. Records are buffered up to this size before being flushed to storage.
  -numRecords,-n                 Number of records written by each bolt instance
  -maxSpoutPending,-p            Maximum number of tuples pending ACK that may be in flight in the topology at once
  -topologyName,-y               Name of the topology
  -storageUrl,-u                 URL of the WASB/ADLS storage endpoint
  -storageFileDirPath,-r         Relative path within the storage account, e.g. "/pathToDir/"
  
Optional parameters:
   -sizeSyncPolicyEnabled,-v     Enable the size sync policy. When enabled, data is flushed only when fileBufferSize is reached.
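
These options map onto standard Storm topology settings. The sketch below shows the typical wiring, reusing the hypothetical spout and bolt classes from the Introduction together with the example values from the commands in the next section; it illustrates the mapping only, and is not the repository's actual WriteTopology:

import org.apache.storm.Config;
import org.apache.storm.StormSubmitter;
import org.apache.storm.topology.TopologyBuilder;

public class WriteTopologySketch {
    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();

        // -spoutParallelism (-s) sets executors; -numTasksSpout (-e) sets tasks.
        builder.setSpout("spout", new RandomRecordSpout(100), 8)
               .setNumTasks(8);

        // -boltParallelism (-b) and -numTasksBolt (-t), shuffle-grouped from the spout.
        builder.setBolt("writer", new StorageWriterBolt(), 64)
               .setNumTasks(64)
               .shuffleGrouping("spout");

        Config conf = new Config();
        conf.setNumWorkers(2);          // -workers (-w)
        conf.setNumAckers(8);           // -numAckers (-k)
        conf.setMaxSpoutPending(1000);  // -maxSpoutPending (-p)

        StormSubmitter.submitTopology("write-test", conf, builder.createTopology());
    }
}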

Running the experiment

On WASB (here and in the ADLS example, $topologyName, $clusterContainer, $storageAccountName, and $storageDirectory are shell variables; set them to your cluster's values first):

storm jar target/org.apache.storm.hdfs.writebuffertest-0.1.jar org.apache.storm.hdfs.WriteTopology -workers 2 \
  -recordSize 100 -spoutParallelism 8 -numTasksSpout 8 -numAckers 8 -boltParallelism 64 -numTasksBolt 64 \
  -fileRotationSize 100 -numRecords 10000000 -maxSpoutPending 1000 \
  -topologyName $topologyName -storageUrl "wasb://$clusterContainer@$storageAccountName.blob.core.windows.net" \
  -storageFileDirPath $storageDirectory

On ADLS:

storm jar target/org.apache.storm.hdfs.writebuffertest-0.1.jar org.apache.storm.hdfs.WriteTopology -workers 2 \
  -recordSize 100 -spoutParallelism 8 -numTasksSpout 8 -numAckers 8 -boltParallelism 64 -numTasksBolt 64 \
  -fileRotationSize 100 -numRecords 10000000 -maxSpoutPending 1000 \
  -topologyName $topologyName -storageUrl "adl://$storageAccountName.azuredatalakestore.net" \
  -storageFileDirPath $storageDirectory

Analyzing Results

Results for a run are stored under the /tmp folder on each worker node. The file name is the topology name specified in the input arguments. To aggregate results, copy the per-node files off the workers (for example with scp) and combine them.
