@@ -6,124 +6,210 @@ Luminaire *WindowDensityModel* implements the idea of monitoring data over compa
6
6
.. image:: windows.png
7
7
   :scale: 40%
8
8
9
-
Although *WindowDensityModel* is designed to track anomalies over streaming data, it can be used to track any sustained fluctuations over a window for any frequency. This detection type is suggested for up to hourly data frequency.
9
+
Although *WindowDensityModel* is designed to track anomalies over streaming data, it can also be used to track anomalies in low-frequency time series. This detection type is suggested for up to hourly data frequency.
10
10
11
-
Anomaly Detection: Pre-Configured Settings
12
-
------------------------------------------
11
+
This window-based anomaly detection feature in Luminaire operates fully automatically: the underlying model detects the frequency at which the data has been observed, the optimal size of the window (using the periodic signals in the data), and the optimal detection method given the characteristics identified in the input time series. The user can also override the configuration for custom use cases.
13
12
14
-
Luminaire provides the capability to configure model parameters based on the frequency that the data has been observed and the methods that can be applied (please refer to the Window density Model user guide for detailed configuration options). Luminaire settings for the window density model are already pre-configured for some typical pandas frequency types and settings for any other frequency types should be configured manually (see the API reference for `Streaming Anomaly Detection Models <https://zillow.github.io/luminaire/api_reference/streaming.html>`_).
13
+
Fully Automated Anomaly Detection using Time-windows
Luminaire provides a fully automated anomaly detection method that tracks time series abnormalities over time-windows. Luminaire can select the best settings by studying different characteristics of the input time series. Unlike the Luminaire outlier detection module, the window-based anomaly detection does not require running a separate configuration optimization to obtain the best hyperparameters. Rather, the automation is embedded within the data exploration and training process.
17
+
18
+
Similar to the outlier detection module, the Luminaire Window Density Model comes with a streaming data profiling module that extracts different characteristics of the high-frequency time series.
15
19
16
20
>>> from luminaire.model.window_density import WindowDensityHyperParams, WindowDensityModel
21
+
>>> from luminaire.exploration.data_exploration import DataExploration
(True, <luminaire_models.model.window_density.WindowDensityModel object at 0x7f8cda42dcc0>)
37
-
38
-
The model object contains the data density structure over a pre-specified window, given the frequency. Luminaire sets the following defaults for some typical pandas frequencies (any custom requirements can be updated in the hyperparameter object instance):
39
-
40
-
- 'S': Hourly windows
41
-
- 'T': 24 hours windows
42
-
- '15T': 24 hours windows
43
-
- 'H': 24 hours windows
44
-
- 'D': 4 weeks windows
45
-
- 'custom': User specified windows
46
-
47
-
In order to score a new window innovation given the trained model object, we have to provide a equal sized window that represents a similar time interval. For example, if each of the windows in the training data represents a 24 hour window between 9 AM to 8:59:59 AM (next day) for last few days, the scoring data should represent the same interval of a different day and should have the same window size.
Luminaire *stream_profile* performs missing data imputation if necessary, extracts the frequency information and obtains the optimal size of the window to be monitored (if not specified by the user). All the information obtained by the profiler can be used to update the configuration for the actual training process.
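
The two profiling steps described above (extracting the frequency and imputing missing data) can be illustrated with a minimal standard-library sketch. This is not Luminaire's implementation: the helper names ``infer_step`` and ``fill_missing`` are hypothetical, and carrying the previous value forward is only an assumed, simplified imputation rule.

```python
from collections import Counter
from datetime import datetime, timedelta


def infer_step(timestamps):
    """Return the most common gap between consecutive sorted timestamps."""
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    return Counter(gaps).most_common(1)[0][0]


def fill_missing(series, step):
    """Add missing timestamps, carrying the previous value forward (toy rule)."""
    timestamps = sorted(series)
    filled = {}
    current = timestamps[0]
    last_value = series[current]
    while current <= timestamps[-1]:
        last_value = series.get(current, last_value)
        filled[current] = last_value
        current += step
    return filled


# Six 10-minute observations with the fourth one (index 3) missing.
start = datetime(2020, 6, 4)
raw = {start + timedelta(minutes=10 * i): 100 + i for i in range(6) if i != 3}

step = infer_step(sorted(raw))   # inferred observation frequency
clean = fill_missing(raw, step)  # regular index with the gap filled
```

In this toy series the dominant gap is ten minutes, so the missing 00:30 observation is restored before any windowing takes place.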
>>> success, training_end, model = wdm_obj.train(data=data)
61
+
>>> print(success, training_end, model)
62
+
True 2020-07-03 00:00:00 <luminaire.model.window_density.WindowDensityModel object at 0x7fb6fab80b00>
63
+
64
+
The training process returns a success flag, the model timestamp, and the trained model. The trained model here is a collection of several sub-models that can score any equal-length time segment of the day and does not depend on the specific patterns of the selected time window.
65
+
To score a new window innovation given the trained model object, we have to provide an equal-sized time window. Luminaire also lets the user perform basic preprocessing of the scoring window (imputing missing indices, etc.) to get the data ready for scoring.
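
The equal-size requirement can be made concrete with a small standard-library sketch. It is illustrative only and does not use the Luminaire API: ``split_into_windows`` and ``check_scoring_window`` are hypothetical helpers, and the 24-hour window over 10-minute data (144 observations) is an assumed setting.

```python
def split_into_windows(values, window_length):
    """Split values into consecutive equal-length windows, dropping any incomplete tail."""
    usable = len(values) - len(values) % window_length
    return [values[i:i + window_length] for i in range(0, usable, window_length)]


def check_scoring_window(scoring_window, window_length):
    """Raise ValueError unless the scoring window matches the training window length."""
    if len(scoring_window) != window_length:
        raise ValueError(
            f"expected {window_length} observations, got {len(scoring_window)}"
        )
    return scoring_window


# Assumed setting: 24 hours of 10-minute observations per window.
window_length = 144

# Three full days plus five stray observations that do not fill a window.
training_values = list(range(window_length * 3 + 5))
training_windows = split_into_windows(training_values, window_length)

# A scoring window of the same length passes the check.
scoring_window = check_scoring_window(list(range(window_length)), window_length)
```

A scoring window with fewer or more than 144 observations would raise a ``ValueError`` here, which mirrors why the scoring data has to be brought to the same length as the training windows.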
There are several options in the *WindowDensityHyperParams* class that can be manually configured. The configuration should be selected mostly based on the frequency that the data has been observed.
The Luminaire Window Density model can also ingest a previously trained model in future trainings. This can be part of a sequential process that always passes the last trained model to the next training. Doing so accumulates richer data for more reliable scores, especially when the training history is limited to a fixed-length rolling window. This way, the model can keep a longer history as metadata even though the actual training history is limited.
121
+
122
+
>>> past_model =<luminaire.model.window_density.WindowDensityModel object at 0x7fb6fab80b00>
123
+
>>> print(new_training_data)
124
+
raw
125
+
index
126
+
2020-06-04 00:00:00 227798
127
+
2020-06-04 00:10:00 224593
128
+
2020-06-04 00:20:00 229400
129
+
2020-06-04 00:30:00 217813
130
+
2020-06-04 00:40:00 217862
131
+
... ...
132
+
2020-07-03 23:10:00 287773
133
+
2020-07-03 23:20:00 255438
134
+
2020-07-03 23:30:00 277127
135
+
2020-07-03 23:40:00 266263
136
+
2020-07-03 23:50:00 275432
137
+
>>> success, training_end, model = wdm_obj.train(data=new_training_data, past_model=past_model)
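
The sequential pattern of always passing the last trained model into the next training can be sketched with a toy stand-in. The function ``toy_train`` below is hypothetical and is not Luminaire's algorithm: it only keeps a running count and sum as "metadata", to show how a model can summarize more history than the batch it is currently trained on.

```python
def toy_train(window_values, past_model=None):
    """Toy stand-in for a training call: merge the new batch with past summary metadata."""
    count = len(window_values)
    total = sum(window_values)
    if past_model is not None:
        # Fold in the history summarized by the previous model.
        count += past_model["count"]
        total += past_model["total"]
    return {"count": count, "total": total, "mean": total / count}


# Each batch plays the role of one fixed-length rolling training window.
batches = [[10, 12, 11], [20, 22, 21], [30, 32, 31]]

model = None
for batch in batches:
    # Always hand the most recent model to the next training round.
    model = toy_train(batch, past_model=model)
```

After the loop, the final toy model reflects all nine observations even though each training call only saw three of them.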
138
+
139
+
Anomaly Detection using Time-windows: Manual Configuration
There are several options in the *WindowDensityHyperParams* class that can be manually configured. The user can select options such as the desired window size, whether all previous windows or only the last window should be used to identify anomalies, the detection method, and how to handle nonstationarity and periodicity present in the data. Please refer to the API reference for `Streaming Anomaly Detection Models <https://zillow.github.io/luminaire/api_reference/streaming.html>`_.
75
143
76
144
>>> from luminaire.model.window_density import WindowDensityHyperParams, WindowDensityModel