Event aggregation with Spark Streaming. The example demonstrates event aggregation over Kafka or TCP event streams, with the results written to Cassandra. The instructions are DSE (DataStax Enterprise) specific, but the example should also work on a standalone cluster.
To run the Kafka example:

- Build the assembly: `./sbt/sbt package`
- Make sure you've got a running Spark server and a Cassandra node listening on localhost.
- Make sure you've got a running Kafka server on localhost with the topic `events` pre-provisioned.
- Start the Kafka producer: `./sbt/sbt "run-main KafkaProducer"`
- Submit the assembly to the Spark server (a sketch of what the consumer does follows this list): `dse spark-submit --class KafkaConsumer ./target/scala-2.10/sparkstreamingaggregation_2.10-0.2.jar`
- Data will be posted to the C* column families `demo.event_log` and `demo.event_counters`.
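For orientation, here is a minimal sketch of what a consumer like `KafkaConsumer` might look like, assuming Spark 1.x with the `spark-streaming-kafka` and `spark-cassandra-connector` libraries. The one-minute bucketing, event format (one event name per message), and column layout are assumptions inferred from the sample output at the end of this document, not taken from the actual source.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils
import com.datastax.spark.connector.SomeColumns
import com.datastax.spark.connector.streaming._

object KafkaConsumerSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("KafkaConsumerSketch")
      .set("spark.cassandra.connection.host", "127.0.0.1")
    val ssc = new StreamingContext(conf, Seconds(5))

    // Receiver-based stream over the pre-provisioned "events" topic;
    // createStream yields (key, message) pairs, we keep only the message.
    val events = KafkaUtils
      .createStream(ssc, "localhost:2181", "demo-group", Map("events" -> 1))
      .map(_._2)

    // Count events per (name, one-minute bucket) within each batch and
    // write the partial counts to Cassandra.
    events
      .map(name => ((name, System.currentTimeMillis / 60000), 1L))
      .reduceByKey(_ + _)
      .map { case ((name, bucket), count) => (name, bucket, count) }
      .saveToCassandra("demo", "event_counters", SomeColumns("name", "bucket", "count"))

    ssc.start()
    ssc.awaitTermination()
  }
}
```

If `demo.event_counters` uses a CQL `counter` column, the connector writes these per-batch counts as increments, so successive batches accumulate into a running total per bucket.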
To run the TCP example:

- Build the assembly: `./sbt/sbt package`
- Make sure you've got a running Spark server and a Cassandra node listening on localhost.
- Start the TCP producer: `./sbt/sbt "run-main TcpProducer"`
- Submit the assembly to the Spark server: `dse spark-submit --class TcpConsumer ./target/scala-2.10/sparkstreamingaggregation_2.10-0.2.jar`
- Data will be posted to the C* column families `demo.event_log` and `demo.event_counters` (see the note after this list for how the TCP source differs from the Kafka one).
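The TCP variant presumably differs from the Kafka one only in how events enter the stream: instead of a Kafka receiver, the consumer can read newline-delimited events from a socket. A minimal sketch; the host and port here are assumptions, so check `TcpProducer` for the actual values:

```scala
// Drop-in replacement for the Kafka source in the sketch above:
// read one event name per line from a local TCP socket.
val events = ssc.socketTextStream("localhost", 9999)
```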
If you get the exception `java.lang.NoSuchMethodException`, follow this guide to alleviate the problem.
If you installed Kafka via Homebrew and see serialization errors on start, it's probably because you're running Java 7 while Kafka was compiled with Java 8. Two workarounds exist: reinstall Kafka without using a keg, or install the Java 8 JDK and supply the `JAVA_HOME` environment variable, as in the example below.
JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_40.jdk/Contents/Home kafka-server-start.sh /usr/local/etc/kafka/server.properties

(Note that `JAVA_HOME` should point at `Contents/Home`, not its `bin` subdirectory; Kafka's start script appends `bin/java` itself.)
Once a consumer is running, you can verify the aggregates in cqlsh (the bucket values appear to be minutes since the Unix epoch, e.g. 23544741 minutes ≈ 2014-10-07 12:21 UTC):

cqlsh> select * from demo.event_log ;

 name | bucket   | time                     | count
------+----------+--------------------------+-------
  foo | 23544741 | 2014-10-07 05:21:09-0700 |    29
  foo | 23544740 | 2014-10-07 05:20:09-0700 |    29

cqlsh> select * from demo.event_counters ;

 name | bucket   | count
------+----------+-------
  foo | 23544741 |    29