S-ARIMA based alert detection for IODA data
InetIntel/chocolatine
# chocolatine: S-ARIMA based alert detection for IODA time series data

Authors: The original chocolatine was written by Andréas Guillot. This adaptation was developed by Shane Alcock.

For more information on the chocolatine methodology, see the paper "Chocolatine: Outage Detection for Internet Background Radiation" published at TMA 2019 (https://arxiv.org/abs/1906.04426).

Chocolatine consists of two components: the modeller and the libchocolatine module.

The modeller is used to determine the ARMA model that is the best fit for a particular IODA time series.

libchocolatine is used to forecast future time series values and compare the subsequent observations against those forecasts, highlighting instances where the observation and forecast are significantly different.

## Installation

Simply enter the chocolatine source code directory and run `pip3 install .`. All python dependencies should be automatically installed at the same time. Both the modeller and the libchocolatine module will be installed.

## Additional setup

The ARMA models that are derived for each IODA time series are written into a postgresql database. Both the modeller and any software using libchocolatine will need access to that database.

The database requires the following table to exist:

```
CREATE TABLE IF NOT EXISTS public.arma_models (
    ar_param INT NOT NULL,
    ma_param INT NOT NULL,
    param_limit INT NOT NULL,
    datasource VARCHAR(255) NOT NULL,
    entitytype VARCHAR(255) NOT NULL,
    entitycode VARCHAR(255) NOT NULL,
    fqid VARCHAR(255) NULL,
    generated_at INT NOT NULL,
    training_start INT NOT NULL,
    time_to_generate REAL NOT NULL,
    pred_intervals REAL[] NOT NULL,
    model_type VARCHAR(128) NULL
);
```

## Running the modeller

## Using libchocolatine

This is a very rough guide for using libchocolatine for alert detection. Libchocolatine is a python module, so you'll have to write Python code to make use of it.
In practice, it probably makes more sense to use the SArimaChocolatine module within the watchtower-sentry system, especially if you are working with IODA data directly, but I've included these notes here in case anyone wants to use this module in other contexts.

Imports:

```
from libchocolatine.libchocolatine import ChocolatineDetector
from libchocolatine.asyncfetcher import AsyncHistoryFetcher
```

Now, declare a ChocolatineDetector instance:

```
det = ChocolatineDetector(name, apiurl, kafkaconf, dbconf, maxarma)
```

`name` is simply a label that you want to use to distinguish this detector instance from other detector instances.

`apiurl` is the URL that chocolatine must use to query the IODA API for raw signal data, but not including the entityType or entityCode portions of the path (e.g. "https://api.ioda.inetintel.cc.gatech.edu/v2/signals/raw").

`kafkaconf` is a dictionary containing configuration for the kafka topics that are used to pass model requests and answers between this software and the modeller. There are three dictionary keys that should be provided:

* `modellerTopic`: the name of the topic where model requests should be sent
* `bootstrapModel`: the list of bootstrap-servers for the kafka cluster that is hosting the topic
* `group`: the consumer group to use when fetching model answers from the topic

`dbconf` is a dictionary containing configuration for the postgres database where previously derived models are stored. The keys for this dictionary are:

* `name`: the name of the database to connect to. Default is "models".
* `host`: the host where the database is located. Default is "localhost".
* `port`: the port to connect to on the database host. Default is 5432.

`maxarma` is the maximum number of AR and MA parameters in total that are permitted in a derived model. The default `maxarma` is 3.
Larger values mean that model derivation takes longer (because there are more possible combinations of AR and MA values to evaluate) and increase the risk of the resulting model being over-fitted to the training data.

Next, declare an instance of an AsyncHistoryFetcher:

```
fetcher = AsyncHistoryFetcher(apiurl, det.histRequest, det.histReply)
```

The `apiurl` is the same IODA API URL that you provided when you created the detector. The `histRequest` and `histReply` parameters are members of the ChocolatineDetector that you created earlier, and are used to pass historical data fetch requests and the resulting fetched data back and forth between the detector and your asynchronous fetcher.

Start both instances using their `start()` method:

```
det.start()
fetcher.start()
```

Now, you can pass your latest observed data points to the detector using the `queueLiveData()` method:

```
det.queueLiveData(key, timestamp, value)
```

The results of the comparison between S-ARIMA forecasts and the observed values that you have previously queued can be accessed by calling the `getLiveDataResult()` method:

```
res = det.getLiveDataResult(blocking)
```

where the `blocking` parameter indicates whether you want the query to block until the detector has a result available. Usually you want to set this to False and simply try again later if you get `None` back as a result.

If the result is not `None`, then it should be a tuple that looks like `( key, timestamp, details )`. `key` and `timestamp` tell you which time series and which timestamp the result refers to.
`details` is a dictionary containing the actual result itself, with the following keys:

* `timestamp`: the timestamp of the observed data point
* `observed`: the observed value at the timestamp
* `predicted`: the forecasted value at the timestamp
* `threshold`: the minimum allowable observed value for this observation to NOT be considered anomalous by S-ARIMA
* `norm_threshold`: the minimum allowable observed value for this observation to be considered "normal" IF the series is currently considered to be in an "alert state".
* `alertable`: set to True if `observed` is below `threshold`, False otherwise.
* `baseline`: an approximate baseline minimum value for the time series, which can be useful for calculating the magnitude of an alert.

When you are done using the detector, you can halt it and the fetcher using their `halt()` methods:

```
det.halt()
fetcher.halt()
```
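
The non-blocking polling pattern and the `details` dictionary described above can be tied together roughly as follows. This is only a sketch under stated assumptions: `poll_results`, `relative_drop`, `max_polls`, `wait` and `StubDetector` are hypothetical names that are not part of libchocolatine, the `"series-a"` key and all numbers are made up, and the "fraction below baseline" calculation is just one possible way to use `baseline` to express the magnitude of an alert. The stub only stands in for a real `ChocolatineDetector` so that the example runs without Kafka, PostgreSQL or the IODA API.

```python
import time
from collections import deque


def poll_results(det, max_polls=10, wait=1.0):
    """Call det.getLiveDataResult(False) up to max_polls times.

    Returns the (key, timestamp, details) tuples that were available.
    max_polls and wait are hypothetical knobs, not libchocolatine options.
    """
    results = []
    for _ in range(max_polls):
        res = det.getLiveDataResult(False)
        if res is None:
            # Nothing ready yet: pause, then try again.
            time.sleep(wait)
            continue
        results.append(res)
    return results


def relative_drop(details):
    """Assumed magnitude measure: how far `observed` sits below `baseline`,
    as a fraction of `baseline` (0.0 when observed is at or above it)."""
    baseline = details["baseline"]
    if baseline <= 0:
        return 0.0
    return max(0.0, (baseline - details["observed"]) / baseline)


class StubDetector:
    """Stand-in for ChocolatineDetector; it only mimics getLiveDataResult()."""

    def __init__(self, canned):
        self._canned = deque(canned)

    def getLiveDataResult(self, blocking):
        # Hand back the next canned result, or None when none are left.
        return self._canned.popleft() if self._canned else None


# Made-up results shaped like the tuples described above.
stub = StubDetector([
    None,
    ("series-a", 1700000000, {
        "timestamp": 1700000000, "observed": 20.0, "predicted": 95.0,
        "threshold": 60.0, "norm_threshold": 70.0,
        "alertable": True, "baseline": 80.0,
    }),
    ("series-a", 1700000300, {
        "timestamp": 1700000300, "observed": 90.0, "predicted": 94.0,
        "threshold": 60.0, "norm_threshold": 70.0,
        "alertable": False, "baseline": 80.0,
    }),
])

for key, ts, details in poll_results(stub, max_polls=5, wait=0.0):
    if details["alertable"]:
        print(key, ts, f"{relative_drop(details):.0%} below baseline")
```

With a real detector you would pass `det` instead of the stub and keep calling `det.queueLiveData(key, timestamp, value)` as new observations arrive. A non-zero `wait` avoids busy-looping while the detector is still working.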