Advanced Database System II

Project II of the Advanced Database System course (CS-422) at EPFL.

Implementation of the following data processing frameworks over Spark:

  1. Cube Operator.
  2. Theta Join Operator (M-Bucket-I Algorithm).
  3. Data Streaming Pipeline.

Usage

Prerequisites

  1. Install Scala.
  2. Install sbt, the Scala build tool.
  3. Ensure you have GNU Make.
  4. Ensure that scala and sbt are on your PATH:
# Example
$ which scala
/usr/local/bin/scala
$ which sbt
/usr/local/bin/sbt
  5. Read up on Apache Spark and, optionally, Hadoop.
  6. Clone this repository.
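
The tool checks above can be bundled into one script. This is a minimal sketch, not part of the repository: `check_tools` is a hypothetical helper name, and the tool list (scala, sbt, make) is taken from the prerequisites. It uses the POSIX `command -v` lookup in place of `which`.

```shell
#!/bin/sh
# Sketch only: check_tools is a hypothetical helper, not shipped with this repository.
# Prints where each tool was found and returns the number of missing tools.
check_tools() {
    missing=0
    for tool in "$@"; do
        if location=$(command -v "$tool" 2>/dev/null); then
            printf 'found    %s -> %s\n' "$tool" "$location"
        else
            printf 'missing  %s\n' "$tool"
            missing=$((missing + 1))
        fi
    done
    return "$missing"
}

# Tool names come from the prerequisites list.
check_tools scala sbt make || echo "Install the missing tools before building."
```

A zero exit status from `check_tools` means every listed tool was found on the PATH.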

Building the Java JAR package

Go to the CS422-Project2 directory and follow the README there:

cd CS422-Project2

Testing the applications in a Spark environment

Go to the docker directory and follow the README there:

cd docker

License

MIT.