Elasticsearch benchmarks for date_histogram aggregation

csoulios/date_histogram-benchmark

Elasticsearch date_histogram aggregations benchmark

Intro

This project is an ES Rally benchmark that measures performance of the date_histogram aggregation for various workloads.

It was created to reproduce a performance issue reported on the Elastic forums and in GitHub issues.

Test datasets with different workloads have been uploaded here.
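
For orientation, the kind of request this benchmark exercises can be sketched as below. This is only an illustration, not the query used by the track: the field name `@timestamp`, the aggregation name `docs_over_time`, and the one-hour interval are assumptions. `fixed_interval` is the parameter name in recent Elasticsearch versions; older versions used `interval`.

```python
import json


def date_histogram_query(field="@timestamp", interval="1h"):
    """Build a search body that buckets documents by a fixed time interval.

    The field name and interval defaults are illustrative assumptions,
    not values taken from this benchmark's track definition.
    """
    return {
        "size": 0,  # return only the aggregation buckets, no search hits
        "aggs": {
            "docs_over_time": {  # hypothetical aggregation name
                "date_histogram": {
                    "field": field,
                    "fixed_interval": interval,
                }
            }
        },
    }


if __name__ == "__main__":
    # Print the JSON body that would be sent to the _search endpoint.
    print(json.dumps(date_histogram_query(), indent=2))
```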

Run benchmarks

To run the benchmarks:

  1. Install the latest version of Rally, as described in the official Rally documentation.

  2. Configure Rally using esrally configure.

  3. Edit ~/.rally/rally.ini and add the date_histogram-benchmark track repository to the [tracks] section, as shown below (more details in the Rally docs):

    [tracks]
    default.url = https://github.com/elastic/rally-tracks
    date_histogram-benchmark.url = https://github.com/csoulios/date_histogram-benchmark
    
  4. Run the Rally track with any of the supported challenges:

esrally --on-error=abort --track-repository=date_histogram-benchmark --distribution-version=[elasticsearch_version] --track date_histogram --challenge=[challenge_name]

A separate challenge has been created for loading each dataset. The datasets differ in how their documents are distributed over time:

  • timestamps-gaussian-sameday: This dataset represents the actual distribution of log data during a production day. It is a Gaussian distribution centered around lunchtime (more documents during the day than at night). All documents fit within the same day.
  • timestamps-uniform-sameday: All documents fit within the same day but are evenly distributed (the same number of documents every hour).
  • timestamps-uniform-1s: Documents are spaced a second apart (the first starts at 2000-01-01T00:00:00.000Z, next is 1 second later).
  • timestamps-uniform-10s: Documents are spaced 10 seconds apart.
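
The four distributions above can be approximated with a short script. This is a minimal sketch and not the code that produced the uploaded datasets: the function names, the noon mean, the 3-hour standard deviation, and the use of 2000-01-01 as the day for the same-day datasets are all assumptions. Only the start timestamp of the fixed-gap datasets comes from the descriptions above.

```python
import random
from datetime import datetime, timedelta, timezone

# First timestamp named for timestamps-uniform-1s; reusing it for the
# same-day datasets is an assumption made for this sketch.
START = datetime(2000, 1, 1, tzinfo=timezone.utc)
DAY_SECONDS = 24 * 60 * 60


def uniform_spaced(count, gap_seconds):
    """Documents a fixed gap apart (gap 1 for the 1s dataset, 10 for 10s)."""
    return [START + timedelta(seconds=i * gap_seconds) for i in range(count)]


def uniform_sameday(count):
    """Documents spread evenly across a single day."""
    step = DAY_SECONDS / count
    return [START + timedelta(seconds=i * step) for i in range(count)]


def gaussian_sameday(count, mean_hour=12.0, stddev_hours=3.0, seed=0):
    """Documents clustered around midday, all within a single day.

    mean_hour and stddev_hours are illustrative guesses, not measured values.
    """
    rng = random.Random(seed)
    timestamps = []
    while len(timestamps) < count:
        offset = rng.gauss(mean_hour * 3600, stddev_hours * 3600)
        if 0 <= offset < DAY_SECONDS:  # discard samples that fall outside the day
            timestamps.append(START + timedelta(seconds=offset))
    return sorted(timestamps)


if __name__ == "__main__":
    for ts in uniform_spaced(3, 10):
        print(ts.isoformat())
```

With a one-hour histogram, the fixed-gap datasets spread documents over many buckets as the document count grows, while the same-day datasets never produce more than 24 buckets.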

Acknowledgements

Special thanks to Bertrand Renuart for reporting this issue and creating the benchmark dataset.
