A backend for statsd that aggregates and ages out set data based on arbitrary time limits
The syntax for the expiration of a set value is
key:value|s|#age
OR
key:value|s|@sample-rate|#age
e.g.
active_users:joe-bob|s|#1m
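As an illustration of the syntax above, a client could build the packet string with a small helper. This is only a sketch: `formatSetPacket` is a hypothetical name and is not part of statsdedupe or statsd.

```javascript
// Hypothetical helper: builds a set packet in the syntax described above.
// `age` is passed through as written, e.g. "1m", "30s" or "0".
function formatSetPacket(key, value, age, sampleRate) {
  const parts = [`${key}:${value}`, 's'];
  // The sample rate is optional and, when present, comes before the age.
  if (sampleRate !== undefined) {
    parts.push(`@${sampleRate}`);
  }
  parts.push(`#${age}`);
  return parts.join('|');
}

console.log(formatSetPacket('active_users', 'joe-bob', '1m'));
// prints "active_users:joe-bob|s|#1m"
console.log(formatSetPacket('active_users', 'joe-bob', '1m', 0.5));
// prints "active_users:joe-bob|s|@0.5|#1m"
```

The resulting string can then be sent to statsd like any other metric packet.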
This backend collects all set data and deduplicates it just like a normal set metric. It then keeps each value in the set for the specified amount of time (see notation above). At every flush interval it reports the size of each set as a gauge to a given list of backends (its parent statsd server by default).
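The behaviour described above can be pictured with the following simplified sketch. It is not the backend's actual code: the class name `ExpiringSets` and its methods are made up for illustration, and it uses a plain `Map` where the real backend relies on a priority queue.

```javascript
// Simplified sketch: one Map per key, mapping each value to its expiry time (ms).
// Assumption: a Map is used here for brevity instead of a priority queue.
class ExpiringSets {
  constructor() {
    this.sets = new Map(); // key -> Map(value -> expiresAt)
  }

  // Record a value. Sending the same value again only refreshes its expiry,
  // so duplicates never increase the set size.
  add(key, value, ttlMs, now = Date.now()) {
    if (!this.sets.has(key)) this.sets.set(key, new Map());
    this.sets.get(key).set(value, now + ttlMs);
  }

  // Drop expired values, then return one gauge line per key.
  // Note: this sketch keeps reporting empty sets as 0; the real backend may differ.
  flush(now = Date.now()) {
    const lines = [];
    for (const [key, values] of this.sets) {
      for (const [value, expiresAt] of values) {
        if (expiresAt <= now) values.delete(value);
      }
      lines.push(`${key}:${values.size}|g`);
    }
    return lines;
  }
}
```

Because the expiry is overwritten on every `add`, the most recent packet for a value decides how long it is kept, and a time-to-live of 0 removes the value at the next flush.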
The set data is kept for the amount of time specified by the most recent packet, so
if you need to expire a key immediately you can just send
key:value|s|#0
Valid units on the end of the expiration are
- Milliseconds - ms (default)
- Seconds - s
- Minutes - m
- Hours - h
- Days - d
Warning: Sending the same key with different expirations is probably not what you want. It leads to metrics that are harder to interpret.
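To make the unit suffixes concrete, here is a minimal sketch of how an age string could be converted to milliseconds. `parseAge` and `UNIT_MS` are hypothetical names, and the sketch assumes whole-number ages only; the backend's own parser may accept other forms.

```javascript
// Multipliers for the unit suffixes listed above; no suffix means milliseconds.
const UNIT_MS = { ms: 1, s: 1000, m: 60000, h: 3600000, d: 86400000 };

// Hypothetical parser: "1m" -> 60000, "250" -> 250.
// Returns null when the string does not match the expected shape.
function parseAge(age) {
  const match = /^(\d+)(ms|s|m|h|d)?$/.exec(age);
  if (!match) return null;
  const unit = match[2] || 'ms';
  return Number(match[1]) * UNIT_MS[unit];
}

console.log(parseAge('1m'));  // prints 60000
console.log(parseAge('250')); // prints 250
console.log(parseAge('0'));   // prints 0
```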
The statsdedupe config key accepts the following options
realtime: <bool>
(default false)
This specifies whether a packet should be sent out immediately when a new key is added to a set, or whether it is all right to wait until the flush interval. If this is false, set-size timestamps will be skewed right by about one flush interval.
hosts: [{host: <address>, port: <port>},...]
(defaults to parent statsd)
This specifies the statsd hosts that statsdedupe should send data to.
Also see the example config
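For orientation, a statsd config using the options above might look like the sketch below. The backend path, host and port values are assumptions chosen for illustration; the example config that ships with the project is the authoritative reference.

```javascript
{
  port: 8125,
  flushInterval: 10000,
  // Assumed path: adjust to wherever statsdedupe.js was placed
  backends: ["./backends/statsdedupe"],
  statsdedupe: {
    // Wait for the flush interval instead of sending on every new key
    realtime: false,
    // Hypothetical target; omit `hosts` to send to the parent statsd
    hosts: [{host: "localhost", port: 8125}]
  }
}
```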
Each set size is output as a simple gauge with the same key name as the set
key:set_size|g
- Put statsdedupe.js into your backend folder for statsd and priority_queue.js in your statsd library directory
- Configure the statsdedupe backend
- Start the statsd daemon:
node stats.js /path/to/config
This project was largely inspired by the Gossip Girl backend and uses priority queue code from Daniel Moore
Thanks to the folks at Etsy for statsd