class staticanalytics. TimeWindowGk
Source: analyticsdoc.
Greenwald - Khanna algorithm for quantile estimation on sliding windows. Given a cumulative probability p, the algorithm returns the approximate value of the p-th quantile of all the values in a sliding window.
The algorithm works by keeping a summary of buckets. Each bucket summarizes a range of values. Through the run of the algorithm, new buckets are created and old ones merged. To allow for the computation on a sliding window, each bucket uses an Exponential Histogram structure to remember how many values it summarizes.
It is summarized in: "Online Algorithm for Approximate Quantile Queries on Sliding Windows" http://dl.acm.org/citation.cfm?id=2954329
The error is not bounded by the absolute value of the output, but by the error in rank of the output element. For instance, if we have 100 elements and query the median, we could get the 48-th (p=0.48), 50-th (p=0.5), 51-th (p=0.51), etc. element.
The algorithms error is bounded by two factors. The first is the quantile estimation factor quantileEps
occurs because of the summary structure and defines the maximum size of the buckets. The second type of error countEps
occurs because of the Exponential Histograms inside the buckets.
The worst-case error is bounded by (quantileEps + 2*countEps + O(countEps^2)), although in practice the error is lower.
This version of the algorithm uses a time-based fixed duration sliding window.
new TimeWindowGk([arg])
Example
// import modules
var qm = require('qminer');
var fs = qm.fs;
var analytics = qm.analytics;
// create the default object
var gk = new analytics.TimeWindowGk({
window: 5,
quantileEps: 0.001,
countEps: 0.0005
});
// create the data used for calculating quantiles
var inputs = [10, 1, 2, 8, 9, 5, 6, 4, 7, 3];
var times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
// fit the model
for (var i = 0; i < inputs.length; i++) {
gk.partialFit(times[i], inputs[i]);
}
// make the prediction for the 0.1 quantile
var prediction = gk.predict(0.1);
// save the model
gk.save(fs.openWrite('gk.bin')).close();
// open the gk model under a new variable
var gk2 = new analytics.TimeWindowGk(fs.openRead('gk.bin'));
Parameter
Name | Type | Optional | Description |
---|---|---|---|
arg |
Yes |
Construction arguments. There are two ways of constructing:
|
Methods
staticpartialFit(timestamp[, value]) → module:analytics.TimeWindowGk
Adds a new observation to the window. The window is updated with the provided timestamp, all records which fall outside the new window are forgotten.
Example
var qm = require('qminer');
var gk = new qm.analytics.TimeWindowGk({
window: 1000*60*60*24*2 // window is 2 days
});
gk.partialFit(new Date('2017-06-06'), 1.0);
gk.partialFit(new Date('2017-06-07').getTime(), 1.0);
gk.partialFit(new Date('2017-06-08')); // only move the time window, the first value is forgotten
Parameters
Name | Type | Optional | Description |
---|---|---|---|
timestamp |
(number or Date) |
|
time of the observation |
value |
number |
Yes |
the observation |
- Returns
-
module:analytics.TimeWindowGk
reference to self
staticsave(fout) → module:fs.FOut
Saves the objects state into a binary file.
Example
var qm = require('qminer');
var fs = qm.fs;
var gk = new qm.analytics.TimeWindowGk({
window: 100 // window is 2 days
});
gk.partialFit(0, 1.0);
gk.partialFit(1, 1.0);
gk.partialFit(2, 1.0);
gk.partialFit(3, 1.0);
gk.partialFit(4, 1.0);
// save the model
gk.save(fs.openWrite('gk.bin')).close();
// open the model under a new variable
var gk = new analytics.TimeWindowGk(fs.openRead('gk.bin'));
Parameter
Name | Type | Optional | Description |
---|---|---|---|
fout |
|
the output stream |
- Returns
-
module:fs.FOut
the output streamfout
getParams() → module:analytics~TimeWindowGkParam
Returns the models' parameters as a JavaScript object (JSON). These parameters are the same as are set through the constructor.
- Returns
-
module:analytics~TimeWindowGkParam
The construction parameters.var qm = require('qminer'); var fs = qm.fs; var analytics = qm.analytics;
// create the default object var gk = new analytics.TimeWindowGk({ window: 5, quantileEps: 0.001, countEps: 0.0005 });
// create the data used for calculating quantiles var inputs = [10, 1, 2, 8, 9, 5, 6, 4, 7, 3]; var times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
// fit the model for (var i = 0; i < inputs.length; i++) { gk.partialFit(times[i], inputs[i]); } var params = gk.getParams();
console.log(params.window); console.log(params.quantileEps); console.log(params.countEps);
predict(p) → number
Given an input cumulative probability, returns a quantile associated with that probability (e.g. for input 0.5 it will return the median).
Example
var qm = require('qminer');
var gk = new qm.analytics.TimeWindowGk({
window: 100 // window is 2 days
});
gk.partialFit(0, 1.0);
gk.partialFit(1, 1.0);
gk.partialFit(2, 1.0);
gk.partialFit(3, 1.0);
gk.partialFit(4, 1.0);
console.log(gk.predict(0.01)); // prints the first percentile
console.log(gk.predict(0.25)); // prints the first quartile
console.log(gk.predict(0.5)); // prints the median
Parameter
Name | Type | Optional | Description |
---|---|---|---|
p |
number |
|
cumulative probability between 0 and 1 (both inclusive) |
- Returns
-
number
quantile associated with p