Dominance-based queries on Apache Spark.

Skyline queries are a popular and powerful paradigm for extracting interesting objects from a multi-dimensional dataset. A point p dominates a point q if p is at least as good as q in every dimension and strictly better in at least one. Given a set D of d-dimensional objects (or points), the skyline of D is the set of Pareto-optimal, or undominated, points in D.
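The definition above can be illustrated with a minimal sketch in plain Python. It assumes that smaller values are better in every dimension (the repository may use a different convention), and the names `dominates` and `naive_skyline` are hypothetical helpers, not functions from this project. The quadratic scan is only meant to make the definition concrete.

```python
from typing import List, Sequence

Point = Sequence[float]


def dominates(p: Point, q: Point) -> bool:
    """Return True if p dominates q, assuming smaller is better.

    p must be no worse than q in every dimension and strictly
    better in at least one.
    """
    no_worse = all(a <= b for a, b in zip(p, q))
    strictly_better = any(a < b for a, b in zip(p, q))
    return no_worse and strictly_better


def naive_skyline(points: List[Point]) -> List[Point]:
    """Return the points that no other point dominates (O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    data = [(1, 4), (2, 2), (4, 1), (3, 3), (5, 5)]
    print(naive_skyline(data))
```

In the sample data, (3, 3) and (5, 5) are both dominated by (2, 2), so they are left out of the skyline.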

Algorithms

Skyline query, based on the Sort Filter Skyline (SFS) algorithm.
Top-k dominating query, based on the Skyline-based Top-k Dominating (STD) algorithm.
Top-k dominating query on the skyline.
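The three queries listed above can be sketched on a single machine as follows. This is a simplified, non-distributed illustration and not the repository's Spark code. It assumes that smaller values are better, uses the coordinate sum as the SFS sort key, and reads the third query as "rank the skyline points by how many points they dominate"; all function names are hypothetical. The top-k dominating loop follows the skyline-based idea that the best remaining point always lies on the skyline of the remaining points, without the optimizations of the full STD algorithm.

```python
from typing import List, Tuple

Point = Tuple[float, ...]


def dominates(p: Point, q: Point) -> bool:
    """Assumption: smaller is better in every dimension."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def sfs_skyline(points: List[Point]) -> List[Point]:
    """Sort Filter Skyline: presort by a monotone score, then filter.

    If p dominates q then sum(p) < sum(q), so after sorting a point can
    only be dominated by points already placed in the window.
    """
    window: List[Point] = []
    for p in sorted(points, key=sum):
        if not any(dominates(s, p) for s in window):
            window.append(p)
    return window


def dominance_score(p: Point, points: List[Point]) -> int:
    """Number of points in `points` that p dominates."""
    return sum(1 for q in points if dominates(p, q))


def top_k_dominating(points: List[Point], k: int) -> List[Tuple[Point, int]]:
    """Repeatedly take the highest-scoring skyline point and remove it."""
    remaining = list(points)
    result: List[Tuple[Point, int]] = []
    while remaining and len(result) < k:
        candidates = sfs_skyline(remaining)
        best = max(candidates, key=lambda p: dominance_score(p, remaining))
        result.append((best, dominance_score(best, remaining)))
        remaining.remove(best)
    return result


def top_k_skyline(points: List[Point], k: int) -> List[Tuple[Point, int]]:
    """Rank only the skyline points by their dominance score."""
    scored = [(p, dominance_score(p, points)) for p in sfs_skyline(points)]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]


if __name__ == "__main__":
    data = [(1, 4), (2, 2), (4, 1), (3, 3), (5, 5)]
    print(sfs_skyline(data))
    print(top_k_dominating(data, 2))
    print(top_k_skyline(data, 1))
```

Removing a selected point does not change the scores of the others, because a skyline point is dominated by nothing. In a Spark setting, a common approach is to compute a local skyline per partition and then merge the local results, but how this repository distributes the work is not stated here.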

Datasets

The algorithms can be run on synthetic datasets drawn from four distributions, with dimensionality ranging from 2 to 10:

Correlated
Uniform
Normal
Anti-correlated
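For readers who want to reproduce data of this kind, the following is a rough sketch of how points from the four distributions might be generated. It is not the generator used to build this repository's datasets: the value range [0, 1], the spread parameters, and every function name are assumptions made for illustration.

```python
import random
from typing import Callable, Dict, List


def clamp(x: float) -> float:
    """Keep a coordinate inside the assumed [0, 1] range."""
    return min(1.0, max(0.0, x))


def uniform_point(d: int, rng: random.Random) -> List[float]:
    # Every coordinate is independent and uniform.
    return [rng.random() for _ in range(d)]


def normal_point(d: int, rng: random.Random) -> List[float]:
    # Independent coordinates clustered around the centre of the range.
    return [clamp(rng.gauss(0.5, 0.15)) for _ in range(d)]


def correlated_point(d: int, rng: random.Random) -> List[float]:
    # All coordinates stay close to one shared base value,
    # so a point that is good in one dimension is good in the others.
    base = rng.random()
    return [clamp(base + rng.gauss(0.0, 0.05)) for _ in range(d)]


def anti_correlated_point(d: int, rng: random.Random) -> List[float]:
    # Coordinates are rescaled so their sum stays near d / 2,
    # so a point that is good in one dimension is bad in others.
    raw = [rng.random() + 1e-9 for _ in range(d)]
    target = d * rng.gauss(0.5, 0.05)
    scale = target / sum(raw)
    return [clamp(x * scale) for x in raw]


GENERATORS: Dict[str, Callable[[int, random.Random], List[float]]] = {
    "correlated": correlated_point,
    "uniform": uniform_point,
    "normal": normal_point,
    "anti-correlated": anti_correlated_point,
}


def generate(distribution: str, n: int, d: int, seed: int = 0) -> List[List[float]]:
    """Generate n points of dimension d from the named distribution."""
    rng = random.Random(seed)
    make_point = GENERATORS[distribution]
    return [make_point(d, rng) for _ in range(n)]


if __name__ == "__main__":
    for name in GENERATORS:
        print(name, generate(name, 2, 3, seed=42))
```

Anti-correlated data typically yields the largest skylines and correlated data the smallest, which is why both are commonly included alongside uniform and normal data when benchmarking skyline algorithms.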
