analytics
Source: analyticsdoc.
Analytics module.
Example
// import modules
var qm = require('qminer');
var analytics = qm.analytics;
// load dataset, create model, evaluate model
Classes
BiasedGk
BufferedTDigest
DpMeans
Gk
KMeans
LogReg
MDS
NearestNeighborAD
NNet
OneVsAll
PCA
PropHazards
RecLinReg
RecommenderSys
RidgeReg
Sigmoid
SVC
SVR
TDigest
ThresholdModel
Tokenizer
ActiveLearner
Namespaces
metrics
preprocessing
Method
nmf(mat, k[, json]) → Object
Calculates the non-negative matrix factorization, see: https://en.wikipedia.org/wiki/Non-negative_matrix_factorization.
Examples
Asynchronous function
// import modules
var analytics = require('qminer').analytics;
var la = require('qminer').la;
// create a matrix
var mat = new la.Matrix({ rows: 10, cols: 5, random: true });
// compute the non-negative matrix factorization
analytics.nmfAsync(mat, 3, { iter: 100, tol: 1e-4 }, function (err, result) {
    // stop here on failure, result is not usable in that case
    if (err) { console.log(err); return; }
    // calculation successful
    var U = result.U;
    var V = result.V;
});
Synchronous function
// import modules
var analytics = require('qminer').analytics;
var la = require('qminer').la;
// create a matrix
var mat = new la.Matrix({ rows: 10, cols: 5, random: true });
// compute the non-negative matrix factorization
var result = analytics.nmf(mat, 3, { iter: 100, tol: 1e-4 });
var U = result.U;
var V = result.V;
Parameters
Name | Type | Optional | Description |
---|---|---|---|
mat | | | The non-negative matrix. |
k | number | | The reduced rank, i.e. the number of columns in matrix U and the number of rows in matrix V. Must be between 0 and |
json | Object | Yes | Algorithm options. Values in |
Returns
Object
The JSON object nmfRes containing the non-negative matrices U and V:
nmfRes.U - The module:la.Matrix representation of the matrix U,
nmfRes.V - The module:la.Matrix representation of the matrix V.
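To make the shapes of U and V concrete, here is a minimal plain-JavaScript sketch of non-negative matrix factorization using multiplicative updates. It does not use qminer and is not necessarily the algorithm qminer runs; the helper names (`matMul`, `transpose`, `reconstructionError`, `nmfSketch`) and the deterministic initialization are assumptions made for illustration. Only the option names `iter` and `tol` are taken from the examples above.

```javascript
// Plain-JavaScript sketch of NMF with multiplicative updates.
// Illustrative only: it does not use qminer and is not qminer's implementation.

// Multiply an (n x m) matrix by an (m x p) matrix, both stored as arrays of rows.
function matMul(A, B) {
    return A.map(function (row) {
        return B[0].map(function (_, j) {
            return row.reduce(function (sum, a, t) { return sum + a * B[t][j]; }, 0);
        });
    });
}

function transpose(A) {
    return A[0].map(function (_, j) { return A.map(function (row) { return row[j]; }); });
}

// Squared Frobenius norm of A - U*V.
function reconstructionError(A, U, V) {
    var UV = matMul(U, V);
    var err = 0;
    for (var i = 0; i < A.length; i++) {
        for (var j = 0; j < A[0].length; j++) {
            err += Math.pow(A[i][j] - UV[i][j], 2);
        }
    }
    return err;
}

// Factorize A (rows x cols) into U (rows x k) and V (k x cols).
function nmfSketch(A, k, opts) {
    var iter = opts.iter, tol = opts.tol, tiny = 1e-12;
    // deterministic positive starting values (hypothetical choice)
    var U = A.map(function (_, i) {
        return Array.from({ length: k }, function (_, j) { return 0.5 + ((i + 2 * j) % 3) / 3; });
    });
    var V = Array.from({ length: k }, function (_, i) {
        return A[0].map(function (_, j) { return 0.5 + ((2 * i + j) % 4) / 4; });
    });
    var prev = reconstructionError(A, U, V);
    for (var it = 0; it < iter; it++) {
        // V <- V .* (U^T A) ./ (U^T U V)
        var Ut = transpose(U);
        var numV = matMul(Ut, A), denV = matMul(matMul(Ut, U), V);
        V = V.map(function (row, i) {
            return row.map(function (v, j) { return v * numV[i][j] / (denV[i][j] + tiny); });
        });
        // U <- U .* (A V^T) ./ (U V V^T)
        var Vt = transpose(V);
        var numU = matMul(A, Vt), denU = matMul(U, matMul(V, Vt));
        U = U.map(function (row, i) {
            return row.map(function (u, j) { return u * numU[i][j] / (denU[i][j] + tiny); });
        });
        var err = reconstructionError(A, U, V);
        if (Math.abs(prev - err) < tol) { break; }
        prev = err;
    }
    return { U: U, V: V };
}

var A = [[1, 2, 3], [2, 4, 6], [1, 0, 1], [3, 2, 5]];
var result = nmfSketch(A, 2, { iter: 100, tol: 1e-4 });
console.log(result.U.length, result.U[0].length); // rows x k
console.log(result.V.length, result.V[0].length); // k x cols
```

With k = 2 the sketch returns a 4 x 2 matrix U and a 2 x 3 matrix V, which matches the role of k described in the parameter table.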
Abstract types
ActiveLearnerParam Object
An object used for the construction of module:analytics.ActiveLearner.
Properties
Name | Type | Optional | Description |
---|---|---|---|
learner | Object | Yes | Learner parameters. Values in |
SVC | | Yes | Support vector classifier parameters. |
BiasedGkParam Object
An object used for the construction of module:analytics.quantiles.BiasedGk.
Properties
Name | Type | Optional | Description |
---|---|---|---|
targetProb | number | Yes | The probability where the algorithm is most accurate. Its accuracy is determined as eps*max(p, targetProb) when targetProb < 0.5 and eps*max(1-p, 1-targetProb) when targetProb >= 0.5. Higher values of Defaults to |
eps | number | Yes | Parameter which determines the accuracy. Defaults to |
compression | string | Yes | Determines when the algorithm compresses its summary. Options are: "periodic", "aggressive" and "manual". Defaults to |
useBands | boolean | Yes | Whether the algorithm should use the 'band' subprocedure. Using this subprocedure should result in a smaller summary. Defaults to |
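The accuracy rule for targetProb can be written out as a small function. This is only a sketch: it assumes the expression in the description is the product of eps and the max term, and the function name `allowedError` and the parameter values are hypothetical.

```javascript
// Allowed error at quantile p, assuming the accuracy expression is the
// product eps * max(...). This reading is an assumption, not confirmed by the library.
function allowedError(p, eps, targetProb) {
    if (targetProb < 0.5) {
        return eps * Math.max(p, targetProb);
    }
    return eps * Math.max(1 - p, 1 - targetProb);
}

// hypothetical parameter values, chosen only for illustration
var eps = 0.1;
console.log(allowedError(0.001, eps, 0.01)); // low quantile, bounded by eps * targetProb
console.log(allowedError(0.5, eps, 0.01));   // the allowed error grows with p
console.log(allowedError(0.999, eps, 0.99)); // high-quantile case
```

Under this reading, a small targetProb keeps the error tight for low quantiles, while a targetProb close to 1 does the same for high quantiles.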
BufferedDigestParam Object
An object used for the construction of module:analytics.quantiles.BufferedTDigest.
Properties
Name | Type | Optional | Description |
---|---|---|---|
delta | number | Yes | The number of clusters in the summary is bounded by floor(minClusters) <= clusters < 2*ceil(minClusters). Defaults to |
bufferLen | number | Yes | The size of the buffer is minClusters*bufferLenFactor; when the buffer fills, it is merged with the summary. The algorithm also initializes after seeing minClusters*bufferLenFactor examples. Defaults to |
seed | number | Yes | Random seed (values above 1 are deterministic). Defaults to |
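The bounds in the table can be evaluated for concrete numbers. The sketch below assumes the buffer size is the product of minClusters and bufferLenFactor; the function name `digestBounds` and the input values are hypothetical, and the table does not say how minClusters and bufferLenFactor relate to the delta and bufferLen properties.

```javascript
// Bounds implied by the description, for hypothetical input values.
// Assumption: the buffer size is the product minClusters * bufferLenFactor.
function digestBounds(minClusters, bufferLenFactor) {
    return {
        minSummaryClusters: Math.floor(minClusters),    // inclusive lower bound
        maxSummaryClusters: 2 * Math.ceil(minClusters), // exclusive upper bound
        bufferSize: minClusters * bufferLenFactor       // examples buffered before a merge
    };
}

console.log(digestBounds(100, 10));
```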
detectorParam Object
An object used for the construction of module:analytics.NearestNeighborAD.
Parameters
Name | Type | Optional | Description |
---|---|---|---|
rate | number | Yes | The expected fraction of emitted anomalies (0.05 -> 5% of cases will be classified as anomalies). Defaults to |
windowSize | number | Yes | Number of most recent instances kept in the model. Defaults to |
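One way to picture how rate and windowSize interact is a windowed nearest-neighbor detector written in plain JavaScript. This is a sketch under stated assumptions, not qminer's implementation: it assumes the anomaly threshold is the (1 - rate) quantile of nearest-neighbor distances inside the window, and the names `NearestNeighborSketch`, `nearestDist` and `euclid` are hypothetical.

```javascript
// Sketch of a windowed nearest-neighbor anomaly detector in plain JavaScript.
// Assumption: the threshold is the (1 - rate) quantile of nearest-neighbor
// distances inside the window; the library's internals may differ.
function euclid(a, b) {
    var s = 0;
    for (var i = 0; i < a.length; i++) { s += (a[i] - b[i]) * (a[i] - b[i]); }
    return Math.sqrt(s);
}

function NearestNeighborSketch(param) {
    this.rate = param.rate;
    this.windowSize = param.windowSize;
    this.window = [];
    this.threshold = Infinity;
}

// Distance from x to its nearest neighbor in the window, skipping index `skip`.
NearestNeighborSketch.prototype.nearestDist = function (x, skip) {
    var best = Infinity;
    this.window.forEach(function (v, i) {
        if (i !== skip) { best = Math.min(best, euclid(x, v)); }
    });
    return best;
};

NearestNeighborSketch.prototype.fit = function (vectors) {
    var self = this;
    // keep only the most recent windowSize instances
    this.window = vectors.slice(-this.windowSize);
    var dists = this.window.map(function (v, i) { return self.nearestDist(v, i); });
    dists.sort(function (a, b) { return a - b; });
    // position of the (1 - rate) quantile, clamped to the last element
    var idx = Math.min(dists.length - 1, Math.floor((1 - this.rate) * dists.length));
    this.threshold = dists[idx];
};

// Returns 1 for an anomaly and 0 otherwise.
NearestNeighborSketch.prototype.predict = function (x) {
    return this.nearestDist(x, -1) > this.threshold ? 1 : 0;
};

var detector = new NearestNeighborSketch({ rate: 0.05, windowSize: 100 });
var train = [];
for (var i = 0; i < 20; i++) { train.push([i % 5, Math.floor(i / 5)]); }
detector.fit(train);
console.log(detector.predict([2, 2]));   // a point on the training grid
console.log(detector.predict([50, 50])); // a point far from the grid
```

A smaller windowSize makes the sketch forget older instances sooner, and a larger rate lowers the threshold so that more points are flagged.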
DpMeansExplain Object
The explanation returned by module:analytics.KMeans#explain.
Properties
Name | Type | Optional | Description |
---|---|---|---|
medoidID | number | | The ID of the nearest medoid. |
featureIDs | | | The IDs of features, sorted by contribution. |
featureContributions | | | Weights of each feature contribution (sum to 1.0). |
DpMeansParam Object
An object used for the construction of module:analytics.DpMeans.
Properties
Name | Type | Optional | Description |
---|---|---|---|
iter | number | Yes | The maximum number of iterations. Defaults to |
lambda | number | Yes | Maximum radius of the clusters. Defaults to |
minClusters | number | Yes | Minimum number of clusters. Defaults to |
maxClusters | number | Yes | Maximum number of clusters. Defaults to |
allowEmpty | boolean | Yes | Whether to allow empty clusters to be generated. Defaults to |
calcDistQual | boolean | Yes | Whether to calculate the quality measure based on distance; if false, relMeanCentroidDist will return 'undefined'. Defaults to |
centroidType | string | Yes | The type of centroids. Possible options are Defaults to |
distanceType | string | Yes | The distance type used in the calculations. Possible options are Defaults to |
verbose | boolean | Yes | If Defaults to |
fitIdx | Array of number | Yes | The index array used for the construction of the initial centroids. |
fitStart | Object | Yes | The KMeans model returned by module:analytics.KMeans.prototype.getModel used for centroid initialization. Values in |
GkParam Object
An object used for the construction of module:analytics.quantiles.Gk.
Properties
Name | Type | Optional | Description |
---|---|---|---|
eps | number | Yes | Determines the relative error of the algorithm. Defaults to |
autoCompress | boolean | Yes | Whether the summary should be compressed automatically or manually. Defaults to |
useBands | boolean | Yes | Whether the algorithm should use the 'band' subprocedure. Using this subprocedure should result in a smaller summary. Defaults to |
hazardModelParam Object
An object used for the construction of module:analytics.PropHazards.
Property
Name | Type | Optional | Description |
---|---|---|---|
lambda | number | Yes | The regularization parameter. Defaults to |
KMeansExplain Object
The explanation returned by module:analytics.KMeans#explain.
Properties
Name | Type | Optional | Description |
---|---|---|---|
medoidID | number | | The ID of the nearest medoid. |
featureIDs | | | The IDs of features, sorted by contribution. |
featureContributions | | | Weights of each feature contribution (sum to 1.0). |
KMeansParam Object
An object used for the construction of module:analytics.KMeans.
Properties
Name | Type | Optional | Description |
---|---|---|---|
iter | number | Yes | The maximum number of iterations. Defaults to |
k | number | Yes | The number of centroids. Defaults to |
allowEmpty | boolean | Yes | Whether to allow empty clusters to be generated. Defaults to |
calcDistQual | boolean | Yes | Whether to calculate the quality measure based on distance; if false, relMeanCentroidDist will return 'undefined'. Defaults to |
centroidType | string | Yes | The type of centroids. Possible options are Defaults to |
distanceType | string | Yes | The distance type used in the calculations. Possible options are Defaults to |
verbose | boolean | Yes | If Defaults to |
fitIdx | Array of number | Yes | The index array used for the construction of the initial centroids. |
fitStart | Object | Yes | The KMeans model returned by module:analytics.KMeans.prototype.getModel used for centroid initialization. Values in |
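The roles of k, iter and fitIdx can be illustrated with a short plain-JavaScript version of Lloyd's k-means. This is a sketch, not qminer's implementation: it assumes fitIdx holds the indices of the points used as initial centroids, uses squared Euclidean distance, and the names `kmeansSketch` and `sqDist` are hypothetical.

```javascript
// Sketch of Lloyd's k-means in plain JavaScript, with initial centroids picked by index.
// Assumption: fitIdx lists the indices of the points used as initial centroids.
function sqDist(a, b) {
    return a.reduce(function (s, ai, i) { return s + (ai - b[i]) * (ai - b[i]); }, 0);
}

function kmeansSketch(points, param) {
    var centroids = param.fitIdx.slice(0, param.k).map(function (idx) { return points[idx].slice(); });
    var assign = [];
    for (var it = 0; it < param.iter; it++) {
        // assignment step: nearest centroid for each point
        var next = points.map(function (p) {
            var best = 0;
            centroids.forEach(function (c, j) {
                if (sqDist(p, c) < sqDist(p, centroids[best])) { best = j; }
            });
            return best;
        });
        var changed = next.some(function (a, i) { return a !== assign[i]; });
        assign = next;
        if (!changed) { break; } // converged before reaching iter
        // update step: move each centroid to the mean of its points
        centroids = centroids.map(function (c, j) {
            var members = points.filter(function (_, i) { return assign[i] === j; });
            if (members.length === 0) { return c; } // keep the centroid of an empty cluster
            return c.map(function (_, d) {
                return members.reduce(function (s, m) { return s + m[d]; }, 0) / members.length;
            });
        });
    }
    return { centroids: centroids, assignments: assign };
}

var points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]];
var model = kmeansSketch(points, { k: 2, iter: 100, fitIdx: [0, 3] });
console.log(model.assignments);
console.log(model.centroids);
```

Here iter only caps the number of passes; the loop stops earlier once the assignments no longer change.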
logisticRegParam Object
An object used for the construction of module:analytics.LogReg.
Properties
Name | Type | Optional | Description |
---|---|---|---|
lambda | number | Yes | The regularization parameter. Defaults to |
intercept | boolean | Yes | Indicates whether to automatically include the intercept. Defaults to |
MDSParam Object
An object used for the construction of module:analytics.MDS.
Properties
Name | Type | Optional | Description |
---|---|---|---|
maxSecs | number | Yes | The maximum time period to compute Multidimensional Scaling of a matrix. Defaults to |
maxStep | number | Yes | The maximum number of iterations. Defaults to |
minDiff | number | Yes | The minimum difference criteria in MDS. Defaults to |
distType | string | Yes | The type of distance used. Available types: "Euclid", "Cos", "SqrtCos". Defaults to |
NearestNeighborADExplain Object
An object used for interpreting the predictions of module:analytics.NearestNeighborAD#explain.
Properties
Name | Type | Optional | Description |
---|---|---|---|
nearestID | number | | The ID of the nearest neighbor. |
distance | number | | The distance to the nearest neighbor. |
features | Array of module:analytics~NearestNeighborADFeatureContribution | | An array with feature contributions. |
oldestID | number | | The ID of the oldest record in the internal buffer (the record that was added first). |
newestID | number | | The ID of the newest record in the internal buffer (the record that was added last). |
NearestNeighborADFeatureContribution Object
An object explaining the prediction of module:analytics.NearestNeighborAD#explain in terms of a single feature. Contained in the object module:analytics~NearestNeighborADExplain.
Properties
Name | Type | Optional | Description |
---|---|---|---|
id | number | | The ID of the feature. |
val | number | | The value of the feature for the vector we are explaining. |
nearVal | number | | The value of the feature for the nearest neighbor. |
contribution | number | | Fraction of the total distance. |
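The four properties can be illustrated by splitting a distance into per-feature fractions. The sketch below is an assumption about how such fractions could be computed: it uses squared differences so that the fractions sum to 1, and it sorts the result by contribution for readability. The function name `featureContributions` is hypothetical and the library may define the contribution differently.

```javascript
// Sketch: split a squared Euclidean distance into per-feature fractions.
// Assumption: "fraction of the total distance" is taken over squared differences.
function featureContributions(vec, nearest) {
    var sq = vec.map(function (v, i) { return (v - nearest[i]) * (v - nearest[i]); });
    var total = sq.reduce(function (s, d) { return s + d; }, 0);
    return vec.map(function (v, i) {
        return {
            id: i,               // feature index
            val: v,              // value in the explained vector
            nearVal: nearest[i], // value in the nearest neighbor
            contribution: total > 0 ? sq[i] / total : 0
        };
    }).sort(function (a, b) { return b.contribution - a.contribution; });
}

var explanation = featureContributions([1, 5, 2], [1, 2, 6]);
console.log(explanation);
```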
nnetParam Object
An object used for the construction of module:analytics.NNet.
Properties
Name | Type | Optional | Description |
---|---|---|---|
layout | Array of number | Yes | The array representing the network schema. Defaults to |
learnRate | number | Yes | The learning rate. Defaults to |
momentum | number | Yes | The momentum of optimization. Defaults to |
tFuncHidden | string | Yes | Type of activation function used on hidden neurons. Possible options are Defaults to |
tFuncOut | string | Yes | Type of activation function used on output neurons. Possible options are Defaults to |
oneVsAllParam Object
An object used for the construction of module:analytics.OneVsAll.
Properties
Name | Type | Optional | Description |
---|---|---|---|
model | function() | Yes | Constructor for the binary model to be used internally. The constructor should expect only one parameter. |
modelParam | Object | Yes | Parameter for |
categories | number | Yes | Number of categories. |
verbose | boolean | Yes | If false, the console output is suppressed. Defaults to |
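How model, modelParam and categories fit together can be sketched in plain JavaScript. The wrapper below is not qminer's OneVsAll; `OneVsAllSketch`, `CentroidModel`, the `scale` option and the `fit`/`decisionFunction` method names are hypothetical stand-ins. It only shows the pattern of building one binary model per category from a constructor that takes a single parameter.

```javascript
// Sketch of a one-vs-all wrapper in plain JavaScript (not qminer's implementation).
// CentroidModel is a hypothetical binary model whose constructor takes one parameter.
function CentroidModel(param) {
    this.scale = param.scale; // hypothetical option, only scales the score
    this.pos = null;
    this.neg = null;
}

function meanOf(rows) {
    return rows[0].map(function (_, d) {
        return rows.reduce(function (s, r) { return s + r[d]; }, 0) / rows.length;
    });
}

function sqDistance(a, b) {
    return a.reduce(function (s, ai, i) { return s + (ai - b[i]) * (ai - b[i]); }, 0);
}

// labels are +1 / -1
CentroidModel.prototype.fit = function (X, y) {
    this.pos = meanOf(X.filter(function (_, i) { return y[i] === 1; }));
    this.neg = meanOf(X.filter(function (_, i) { return y[i] === -1; }));
};

// a larger score means "more likely positive"
CentroidModel.prototype.decisionFunction = function (x) {
    return this.scale * (sqDistance(x, this.neg) - sqDistance(x, this.pos));
};

function OneVsAllSketch(param) {
    this.param = param;
    this.models = [];
}

OneVsAllSketch.prototype.fit = function (X, y) {
    this.models = [];
    for (var cat = 0; cat < this.param.categories; cat++) {
        // one binary model per category, each built from the same modelParam
        var model = new this.param.model(this.param.modelParam);
        model.fit(X, y.map(function (label) { return label === cat ? 1 : -1; }));
        this.models.push(model);
    }
};

// predict the category whose binary model gives the highest score
OneVsAllSketch.prototype.predict = function (x) {
    var scores = this.models.map(function (m) { return m.decisionFunction(x); });
    return scores.indexOf(Math.max.apply(null, scores));
};

var X = [[0, 0], [0, 1], [5, 5], [5, 6], [10, 0], [10, 1]];
var y = [0, 0, 1, 1, 2, 2];
var clf = new OneVsAllSketch({ model: CentroidModel, modelParam: { scale: 1 }, categories: 3 });
clf.fit(X, y);
console.log(clf.predict([0.2, 0.5]), clf.predict([5, 5.5]), clf.predict([9, 0]));
```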
PCAParam Object
An object used for the construction of module:analytics.PCA.
Properties
Name | Type | Optional | Description |
---|---|---|---|
k | number | Yes | Number of eigenvectors to be computed. Defaults to |
iter | number | Yes | Number of iterations. Defaults to |
recLinRegParam Object
An object used for the construction of module:analytics.RecLinReg.
Parameters
Name | Type | Optional | Description |
---|---|---|---|
dim | number | | The dimension of the model. |
regFact | number | Yes | The regularization factor. Defaults to |
forgetFact | number | Yes | The forgetting factor. Defaults to |
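The three parameters map naturally onto the textbook recursive least squares recursion, sketched below in plain JavaScript. This is an illustration under assumptions, not qminer's code: it assumes the inverse-covariance estimate starts as the identity divided by regFact and that forgetFact discounts older examples. The names `RecLinRegSketch`, `partialFit` and `predict` belong to the sketch.

```javascript
// Sketch of recursive least squares in plain JavaScript.
// Assumption: P starts as identity / regFact and forgetFact discounts older
// examples; this is the textbook recursion, not necessarily the library's.
function RecLinRegSketch(param) {
    var dim = param.dim;
    this.forgetFact = param.forgetFact;
    this.weights = new Array(dim).fill(0);
    // P approximates the inverse of the regularized covariance matrix
    this.P = [];
    for (var i = 0; i < dim; i++) {
        this.P.push(new Array(dim).fill(0));
        this.P[i][i] = 1 / param.regFact;
    }
}

RecLinRegSketch.prototype.predict = function (x) {
    return this.weights.reduce(function (s, w, i) { return s + w * x[i]; }, 0);
};

RecLinRegSketch.prototype.partialFit = function (x, y) {
    var P = this.P, lambda = this.forgetFact;
    // Px = P * x
    var Px = P.map(function (row) {
        return row.reduce(function (s, p, j) { return s + p * x[j]; }, 0);
    });
    var denom = lambda + x.reduce(function (s, xi, i) { return s + xi * Px[i]; }, 0);
    var gain = Px.map(function (v) { return v / denom; });
    var err = y - this.predict(x);
    this.weights = this.weights.map(function (w, i) { return w + gain[i] * err; });
    // P = (P - gain * (Px)^T) / lambda, using the symmetry of P
    this.P = P.map(function (row, i) {
        return row.map(function (p, j) { return (p - gain[i] * Px[j]) / lambda; });
    });
};

var rls = new RecLinRegSketch({ dim: 2, regFact: 1e-3, forgetFact: 1.0 });
var data = [[1, 0], [0, 1], [1, 1], [2, 1], [1, 3], [3, 2]];
// targets follow y = 2 * x0 + 3 * x1, so the weights should approach [2, 3]
data.forEach(function (x) { rls.partialFit(x, 2 * x[0] + 3 * x[1]); });
console.log(rls.weights);
```

In this sketch a forgetFact below 1 would let recent examples dominate, and a larger regFact would pull the weights more strongly toward zero.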
RecSysParam Object
An object used for the construction of module:analytics.RecommenderSys.
Properties
Name | Type | Optional | Description |
---|---|---|---|
iter | number | Yes | The maximum number of iterations. Defaults to |
k | number | Yes | The number of centroids. Defaults to |
tol | number | Yes | The tolerance. Defaults to |
verbose | boolean | Yes | If false, the console output is suppressed. Defaults to |
ridgeRegParam Object
An object used for the construction of module:analytics.RidgeReg.
Property
Name | Type | Optional | Description |
---|---|---|---|
gamma | number | Yes | The gamma value. Defaults to |
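The table does not say what gamma controls. Assuming it is the weight of the L2 penalty, as is usual in ridge regression, its effect can be seen in the closed-form solution for a single feature without an intercept. The function name `ridgeWeight` and the data are hypothetical.

```javascript
// Closed-form ridge regression for a single feature and no intercept (a sketch).
// Assumption: gamma plays the role of the L2 penalty weight.
function ridgeWeight(xs, ys, gamma) {
    var sxy = 0, sxx = 0;
    for (var i = 0; i < xs.length; i++) {
        sxy += xs[i] * ys[i];
        sxx += xs[i] * xs[i];
    }
    // minimizes sum((y - w * x)^2) + gamma * w^2
    return sxy / (sxx + gamma);
}

var xs = [1, 2, 3];
var ys = [2, 4, 6];
console.log(ridgeWeight(xs, ys, 0));  // no penalty
console.log(ridgeWeight(xs, ys, 14)); // the penalty shrinks the weight toward 0
```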
SVMParam Object
SVM constructor parameters. Used for the construction of module:analytics.SVC and module:analytics.SVR.
Properties
Name | Type | Optional | Description |
---|---|---|---|
algorithm | string | Yes | The algorithm procedure. Possible options are Defaults to |
c | number | Yes | Cost parameter. Increasing the parameter forces the model to fit the training data more accurately (setting it too large may lead to overfitting). Defaults to |
j | number | Yes | Unbalance parameter. Increasing it gives more weight to the positive examples (getting a better fit on the positive training examples gets a higher priority). Setting j=n is like adding n-1 copies of the positive training examples to the data set. Defaults to |
eps | number | Yes | Epsilon insensitive loss parameter. Larger values result in fewer support vectors (smaller model complexity). Defaults to |
batchSize | number | Yes | Number of examples used in the subgradient estimation. A higher number of samples slows down the algorithm, but makes the local steps more accurate. Defaults to |
maxIterations | number | Yes | Maximum number of iterations. Defaults to |
maxTime | number | Yes | Maximum runtime in seconds. Defaults to |
minDiff | number | Yes | Stopping criterion tolerance. Defaults to |
type | string | Yes | The subalgorithm procedure in LIBSVM. Possible options are Defaults to |
kernel | string | Yes | Kernel type in LIBSVM. Possible options are Defaults to |
gamma | number | Yes | Gamma parameter in LIBSVM. Set gamma in kernel function. Defaults to |
p | number | Yes | P parameter in LIBSVM. Set the epsilon in loss function of epsilon-SVR. Defaults to |
degree | number | Yes | Degree parameter in LIBSVM. Set degree in kernel function. Defaults to |
nu | number | Yes | Nu parameter in LIBSVM. Set the parameter nu of nu-SVC, one-class SVM, and nu-SVR. Defaults to |
coef0 | number | Yes | Coef0 parameter in LIBSVM. Set coef0 in kernel function. Defaults to |
cacheSize | number | Yes | Set cache memory size in MB (default 100) in LIBSVM. Defaults to |
verbose | boolean | Yes | Toggle verbose output in the console. Defaults to |
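The effect of eps, c and j is easiest to see on the loss functions themselves. The sketch below shows the standard epsilon-insensitive loss and a class-weighted hinge loss. The function names are hypothetical, and the way c and j enter the weighted hinge loss is an assumption based on the parameter descriptions, not taken from the library.

```javascript
// Epsilon-insensitive loss used in support vector regression (a sketch).
function epsInsensitiveLoss(target, prediction, eps) {
    return Math.max(0, Math.abs(target - prediction) - eps);
}

// Weighted hinge loss for classification; labels are +1 / -1.
// Assumption: j multiplies the loss of positive examples, c scales the whole term.
function weightedHingeLoss(label, score, c, j) {
    var weight = label === 1 ? j : 1;
    return c * weight * Math.max(0, 1 - label * score);
}

console.log(epsInsensitiveLoss(1.0, 1.05, 0.1)); // inside the eps tube, no loss
console.log(epsInsensitiveLoss(1.0, 1.5, 0.1));  // outside the eps tube
console.log(weightedHingeLoss(1, 0.5, 1, 3));    // positive example, weighted by j
console.log(weightedHingeLoss(-1, -0.5, 1, 3));  // negative example, weight 1
```

Errors smaller than eps cost nothing, which is why larger eps values tend to leave fewer examples as support vectors.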
TDigestParam Object
An object used for the construction of module:analytics.quantiles.TDigest.
Properties
Name | Type | Optional | Description |
---|---|---|---|
minCount | number | Yes | The minimal number of examples before the model is initialized. Defaults to |
clusters | number | Yes | The number of 1-d clusters (large values lead to higher memory usage). Defaults to |
tokenizerParam Object
An object used for the construction of module:analytics.Tokenizer.
Property
Name | Type | Optional | Description |
---|---|---|---|
type | string | Yes | The type of the tokenizer. The different types are: Defaults to |