Method

nmf(mat, k[, json]) → Object

Calculates the non-negative matrix factorization, see: https://en.wikipedia.org/wiki/Non-negative_matrix_factorization.

Examples

Asynchronous function

// import modules
var analytics = require('qminer').analytics;
var la = require('qminer').la;
// create a matrix
var mat = new la.Matrix({ rows: 10, cols: 5, random: true });
// compute the non-negative matrix factorization
analytics.nmfAsync(mat, 3, { iter: 100, tol: 1e-4 }, function (err, result) {
   if (err) { console.log(err); return; }
   // calculation successful
   var U = result.U;
   var V = result.V;
});

Synchronous function

// import modules
var analytics = require('qminer').analytics;
var la = require('qminer').la;
// create a matrix
var mat = new la.Matrix({ rows: 10, cols: 5, random: true });
// compute the non-negative matrix factorization
var result = analytics.nmf(mat, 3, { iter: 100, tol: 1e-4 });
var U = result.U;
var V = result.V;

Parameters

Name Type Optional Description

mat

(module:la.Matrix or module:la.SparseMatrix)

 

The non-negative matrix.

k

number

 

The reduced rank, i.e. the number of columns in matrix U and the number of rows in matrix V. Must be between 0 and min(mat.rows, mat.cols).

json

Object

Yes

Algorithm options.

Values in json have the following properties:

Name Type Optional Description

iter

number

Yes

The number of iterations used for the algorithm.

Defaults to 100.

tol

number

Yes

The tolerance.

Defaults to 1e-3.

verbose

boolean

Yes

If false, the console output is suppressed.

Defaults to false.

Returns

Object. The JSON object nmfRes containing the non-negative matrices U and V:
nmfRes.U - The module:la.Matrix representation of the matrix U,
nmfRes.V - The module:la.Matrix representation of the matrix V.
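
To make the roles of U and V concrete, the following is a minimal plain-JavaScript sketch of a non-negative matrix factorization based on the well-known multiplicative update rules. It does not call qminer and is not necessarily the algorithm qminer uses; nested arrays stand in for module:la.Matrix, and all helper names (matMul, transpose, frobError, makeRandom, randomMatrix, nmfSketch) are assumptions made for this illustration. It shows that for a 10 x 5 input and k = 3, U is 10 x 3, V is 3 x 5, both stay non-negative, and the product U*V moves closer to the input.

```javascript
// Plain arrays of arrays stand in for la.Matrix (illustration only).
function matMul(A, B) {
    var out = [];
    for (var i = 0; i < A.length; i++) {
        out.push([]);
        for (var j = 0; j < B[0].length; j++) {
            var s = 0;
            for (var t = 0; t < B.length; t++) { s += A[i][t] * B[t][j]; }
            out[i].push(s);
        }
    }
    return out;
}

function transpose(A) {
    return A[0].map(function (_, j) {
        return A.map(function (row) { return row[j]; });
    });
}

// Frobenius norm of (M - U*V)
function frobError(M, U, V) {
    var P = matMul(U, V), s = 0;
    for (var i = 0; i < M.length; i++) {
        for (var j = 0; j < M[0].length; j++) { s += Math.pow(M[i][j] - P[i][j], 2); }
    }
    return Math.sqrt(s);
}

// Small deterministic generator so that the sketch is repeatable.
function makeRandom(seed) {
    var state = seed;
    return function () {
        state = (state * 1664525 + 1013904223) % 4294967296;
        return state / 4294967296;
    };
}

// Matrix with strictly positive entries
function randomMatrix(rows, cols, rand) {
    var M = [];
    for (var i = 0; i < rows; i++) {
        M.push([]);
        for (var j = 0; j < cols; j++) { M[i].push(rand() + 0.01); }
    }
    return M;
}

function nmfSketch(M, k, iter) {
    var rand = makeRandom(42);
    var U = randomMatrix(M.length, k, rand);
    var V = randomMatrix(k, M[0].length, rand);
    var tiny = 1e-12; // guards against division by zero
    for (var it = 0; it < iter; it++) {
        // V <- V .* (U'M) ./ (U'UV)
        var Ut = transpose(U);
        var numV = matMul(Ut, M), denV = matMul(matMul(Ut, U), V);
        V = V.map(function (row, a) {
            return row.map(function (v, j) { return v * numV[a][j] / (denV[a][j] + tiny); });
        });
        // U <- U .* (MV') ./ (UVV')
        var Vt = transpose(V);
        var numU = matMul(M, Vt), denU = matMul(U, matMul(V, Vt));
        U = U.map(function (row, i) {
            return row.map(function (u, a) { return u * numU[i][a] / (denU[i][a] + tiny); });
        });
    }
    return { U: U, V: V };
}

var mat = randomMatrix(10, 5, makeRandom(7));
var init = nmfSketch(mat, 3, 0);            // starting point, no updates
var before = frobError(mat, init.U, init.V);
var result = nmfSketch(mat, 3, 100);
var after = frobError(mat, result.U, result.V);
console.log(result.U.length, result.U[0].length); // 10 3
console.log(result.V.length, result.V[0].length); // 3 5
console.log(before, after);                        // reconstruction error before and after
```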

Abstract types

inner

ActiveLearnerParam  Object

An object used for the construction of module:analytics.ActiveLearner.

Properties

Name Type Optional Description

learner

Object

Yes

Learner parameters

Values in learner have the following properties:

Name Type Optional Description

disableAsserts

boolean

Yes

Disable input asserting

Defaults to false.

SVC

module:analytics~SVMParam

Yes

Support vector classifier parameters.

inner

BiasedGkParam  Object

An object used for the construction of module:analytics.quantiles.BiasedGk.

Properties

Name Type Optional Description

targetProb

number

Yes

The probability where the algorithm is most accurate. Its accuracy is determined as eps*max(p, targetProb) when targetProb < 0.5 and eps*max(1-p, 1-targetProb) when targetProb >= 0.5. Higher values of targetProb allow for a smaller memory footprint.

Defaults to 0.01.

eps

number

Yes

Parameter which determines the accuracy.

Defaults to 0.1.

compression

string

Yes

Determines when the algorithm compresses its summary. Options are: "periodic", "aggressive" and "manual".

Defaults to "periodic".

useBands

boolean

Yes

Whether the algorithm should use the 'band' subprocedure. Using this subprocedure should result in a smaller summary.

Defaults to true.
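
The interplay of targetProb and eps can be illustrated with a small helper. This is a sketch only: it assumes the accuracy expression in the targetProb description is the product of eps and a max(...) term, and the function name biasedGkAccuracy is hypothetical, not part of qminer.

```javascript
// Hypothetical helper: accuracy at quantile p, reading the documented
// expression as eps * max(p, targetProb) for targetProb < 0.5 and
// eps * max(1 - p, 1 - targetProb) otherwise.
function biasedGkAccuracy(p, targetProb, eps) {
    if (targetProb < 0.5) {
        return eps * Math.max(p, targetProb);
    }
    return eps * Math.max(1 - p, 1 - targetProb);
}

// Documented defaults: targetProb = 0.01, eps = 0.1
var nearTarget = biasedGkAccuracy(0.001, 0.01, 0.1); // bound set by targetProb
var atMedian = biasedGkAccuracy(0.5, 0.01, 0.1);     // bound set by p
console.log(nearTarget, atMedian);
```

Under this reading, quantiles at or below targetProb get the tightest bound, and the bound loosens as p moves away from it.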

inner

BufferedDigestParam  Object

An object used for the construction of module:analytics.quantiles.BufferedTDigest.

Properties

Name Type Optional Description

delta

number

Yes

The number of clusters in the summary is bounded by floor(minClusters) <= clusters < 2*ceil(minClusters).

Defaults to 100.

bufferLen

number

Yes

The size of the buffer is minClusters*bufferLenFactor. When the buffer fills, it is merged with the summary. The algorithm also initializes after seeing minClusters*bufferLenFactor examples.

Defaults to 1000.

seed

number

Yes

Random seed (values above 1 are deterministic).

Defaults to 0.

inner

detectorParam  Object

An object used for the construction of module:analytics.NearestNeighborAD.

Parameters

Name Type Optional Description

rate

number

Yes

The expected fraction of emitted anomalies (0.05 -> 5% of cases will be classified as anomalies).

Defaults to 0.05.

windowSize

number

Yes

Number of most recent instances kept in the model.

Defaults to 100.

inner

DpMeansExplain  Object

The explanation returned by module:analytics.KMeans#explain.

Properties

Name Type Optional Description

medoidID

number

 

The ID of the nearest medoid.

featureIDs

module:la.IntVector

 

The IDs of features, sorted by contribution.

featureContributions

module:la.Vector

 

Weights of each feature contribution (sum to 1.0).

inner

DpMeansParam  Object

An object used for the construction of module:analytics.KMeans.

Properties

Name Type Optional Description

iter

number

Yes

The maximum number of iterations.

Defaults to 10000.

lambda

number

Yes

Maximum radius of the clusters

Defaults to 1.

minClusters

number

Yes

Minimum number of clusters

Defaults to 2.

maxClusters

number

Yes

Maximum number of clusters

Defaults to inf.

allowEmpty

boolean

Yes

Whether to allow empty clusters to be generated.

Defaults to true.

calcDistQual

boolean

Yes

Whether to calculate the quality measure based on distance. If false, relMeanCentroidDist will return 'undefined'.

Defaults to false.

centroidType

string

Yes

The type of centroids. Possible options are 'Dense' and 'Sparse'.

Defaults to "Dense".

distanceType

string

Yes

The distance type used in the calculations. Possible options are 'Euclid' and 'Cos'.

Defaults to "Euclid".

verbose

boolean

Yes

If false, the console output is suppressed.

Defaults to false.

fitIdx

Array of number

Yes

The index array used for the construction of the initial centroids.

fitStart

Object

Yes

The KMeans model returned by module:analytics.KMeans.prototype.getModel used for centroid initialization.

Values in fitStart have the following properties:

Name Type Optional Description

C

(module:la.Matrix or module:la.SparseMatrix)

 

The centroid matrix.

inner

GkParam  Object

An object used for the construction of module:analytics.quantiles.Gk.

Properties

Name Type Optional Description

eps

number

Yes

Determines the relative error of the algorithm.

Defaults to 0.01.

autoCompress

boolean

Yes

Whether the summary should be compressed automatically or manually.

Defaults to true.

useBands

boolean

Yes

Whether the algorithm should use the 'band' subprocedure. Using this subprocedure should result in a smaller summary.

Defaults to true.

inner

hazardModelParam  Object

An object used for the construction of module:analytics.PropHazards.

Property

Name Type Optional Description

lambda

number

Yes

The regularization parameter.

Defaults to 0.

inner

KMeansExplain  Object

The explanation returned by module:analytics.KMeans#explain.

Properties

Name Type Optional Description

medoidID

number

 

The ID of the nearest medoid.

featureIDs

module:la.IntVector

 

The IDs of features, sorted by contribution.

featureContributions

module:la.Vector

 

Weights of each feature contribution (sum to 1.0).

inner

KMeansParam  Object

An object used for the construction of module:analytics.KMeans.

Properties

Name Type Optional Description

iter

number

Yes

The maximum number of iterations.

Defaults to 10000.

k

number

Yes

The number of centroids.

Defaults to 2.

allowEmpty

boolean

Yes

Whether to allow empty clusters to be generated.

Defaults to true.

calcDistQual

boolean

Yes

Whether to calculate the quality measure based on distance. If false, relMeanCentroidDist will return 'undefined'.

Defaults to false.

centroidType

string

Yes

The type of centroids. Possible options are 'Dense' and 'Sparse'.

Defaults to "Dense".

distanceType

string

Yes

The distance type used in the calculations. Possible options are 'Euclid' and 'Cos'.

Defaults to "Euclid".

verbose

boolean

Yes

If false, the console output is suppressed.

Defaults to false.

fitIdx

Array of number

Yes

The index array used for the construction of the initial centroids.

fitStart

Object

Yes

The KMeans model returned by module:analytics.KMeans.prototype.getModel used for centroid initialization.

Values in fitStart have the following properties:

Name Type Optional Description

C

(module:la.Matrix or module:la.SparseMatrix)

 

The centroid matrix.
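
Because every KMeansParam property is optional, it can help to see which values apply when a property is left out. The sketch below collects the defaults documented above in a plain object and merges them with user-supplied options. KMEANS_DEFAULTS and withKMeansDefaults are hypothetical names used for illustration; qminer applies its defaults internally and does not expose such a helper.

```javascript
// Documented defaults for KMeansParam (fitIdx and fitStart have no default).
var KMEANS_DEFAULTS = {
    iter: 10000,
    k: 2,
    allowEmpty: true,
    calcDistQual: false,
    centroidType: 'Dense',
    distanceType: 'Euclid',
    verbose: false
};

// Hypothetical helper: fill in every option the caller left out.
function withKMeansDefaults(param) {
    var out = {};
    Object.keys(KMEANS_DEFAULTS).forEach(function (key) { out[key] = KMEANS_DEFAULTS[key]; });
    Object.keys(param || {}).forEach(function (key) { out[key] = param[key]; });
    return out;
}

var kmeansParam = withKMeansDefaults({ k: 3, distanceType: 'Cos' });
console.log(kmeansParam.k);            // 3
console.log(kmeansParam.iter);         // 10000
console.log(kmeansParam.distanceType); // Cos
```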

inner

logisticRegParam  Object

An object used for the construction of module:analytics.LogReg.

Properties

Name Type Optional Description

lambda

number

Yes

The regularization parameter.

Defaults to 1.

intercept

boolean

Yes

Indicates whether to automatically include the intercept.

Defaults to false.

inner

MDSParam  Object

An object used for the construction of module:analytics.MDS.

Properties

Name Type Optional Description

maxSecs

number

Yes

The maximum time period to compute Multidimensional Scaling of a matrix.

Defaults to 500.

maxStep

number

Yes

The maximum number of iterations.

Defaults to 5000.

minDiff

number

Yes

The minimum difference criteria in MDS.

Defaults to 1e-4.

distType

string

Yes

The type of distance used. Available types: "Euclid", "Cos", "SqrtCos".

Defaults to "Euclid".

inner

NearestNeighborADExplain  Object

An object used for interpreting the predictions of module:analytics.NearestNeighborAD#explain.

Properties

Name Type Optional Description

nearestID

number

 

The ID of the nearest neighbor.

distance

number

 

The distance to the nearest neighbor.

features

Array of module:analytics~NearestNeighborADFeatureContribution

 

An array with feature contributions.

oldestID

number

 

The ID of the oldest record in the internal buffer (the record that was added first).

newestID

number

 

The ID of the newest record in the internal buffer (the record that was added last).

inner

NearestNeighborADFeatureContribution  Object

An object explaining the prediction of module:analytics.NearestNeighborAD#explain in terms of a single feature. Contained in the object module:analytics~NearestNeighborADExplain.

Properties

Name Type Optional Description

id

number

 

The ID of the feature.

val

number

 

The value of the feature for the vector we are explaining.

nearVal

number

 

The value of the feature for the nearest neighbor.

contribution

number

 

Fraction of the total distance (v(i) - n(i))^2 / ||v - n||^2.
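
The contribution formula can be worked through by hand. The sketch below computes (v(i) - n(i))^2 / ||v - n||^2 for each feature using plain arrays; the function name featureContributions is hypothetical, and the returned objects merely mirror the property names listed above.

```javascript
// Hypothetical helper: per-feature share of the squared Euclidean distance
// between the explained vector v and its nearest neighbor n.
function featureContributions(v, n) {
    var total = 0;
    var squared = v.map(function (vi, i) {
        var d = vi - n[i];
        total += d * d;
        return d * d;
    });
    return squared.map(function (sq, i) {
        return {
            id: i,
            val: v[i],
            nearVal: n[i],
            // identical vectors have zero distance, so no feature contributes
            contribution: total > 0 ? sq / total : 0
        };
    });
}

var features = featureContributions([1, 2, 3], [1, 0, 4]);
features.forEach(function (f) { console.log(f.id, f.contribution); });
// prints "0 0", "1 0.8" and "2 0.2"
```

Here the squared differences are 0, 4 and 1, so the total squared distance is 5 and the contributions sum to 1.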

inner

nnetParam  Object

An object used for the construction of module:analytics.NNet.

Properties

Name Type Optional Description

layout

Array of number

Yes

The array representing the network schema.

Defaults to [1, 2, 1].

learnRate

number

Yes

The learning rate.

Defaults to 0.1.

momentum

number

Yes

The momentum of optimization.

Defaults to 0.5.

tFuncHidden

string

Yes

Type of activation function used on hidden neurons. Possible options are 'tanHyper', 'sigmoid', 'fastTanh', 'softPlus', 'fastSigmoid' and 'linear'.

Defaults to 'tanHyper'.

tFuncOut

string

Yes

Type of activation function used on output neurons. Possible options are 'tanHyper', 'sigmoid', 'fastTanh', 'softPlus', 'fastSigmoid' and 'linear'.

Defaults to 'tanHyper'.

inner

oneVsAllParam  Object

An object used for the construction of module:analytics.OneVsAll.

Properties

Name Type Optional Description

model

function()

Yes

Constructor for the binary model to be used internally. The constructor should expect only one parameter.

modelParam

Object

Yes

Parameter for oneVsAllParam.model constructor.

categories

number

Yes

Number of categories.

verbose

boolean

Yes

If false, the console output is suppressed.

Defaults to false.
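
The relationship between model, modelParam and categories can be sketched without qminer. In the example below, CentroidModel is a toy binary classifier and OneVsAllSketch is a simplified wrapper; both are hypothetical stand-ins written for this illustration, and the real module:analytics.OneVsAll may work differently in detail. The point is that the wrapper constructs one binary model per category, passing modelParam as the single constructor argument.

```javascript
function mean(rows) {
    return rows[0].map(function (_, j) {
        return rows.reduce(function (s, r) { return s + r[j]; }, 0) / rows.length;
    });
}
function sqDist(a, b) {
    return a.reduce(function (s, ai, i) { return s + (ai - b[i]) * (ai - b[i]); }, 0);
}

// Toy binary model (hypothetical): scores a point by how much closer it is
// to the mean of the positive examples than to the mean of the negatives.
function CentroidModel(param) {
    this.scale = (param && param.scale) || 1;
    this.pos = null;
    this.neg = null;
}
CentroidModel.prototype.fit = function (X, y) {
    this.pos = mean(X.filter(function (_, i) { return y[i] > 0; }));
    this.neg = mean(X.filter(function (_, i) { return y[i] <= 0; }));
};
CentroidModel.prototype.decisionFunction = function (x) {
    return this.scale * (sqDist(x, this.neg) - sqDist(x, this.pos));
};

// Simplified one-vs-all wrapper: one binary model per category.
function OneVsAllSketch(param) {
    this.param = param;
    this.models = [];
}
OneVsAllSketch.prototype.fit = function (X, targets) {
    for (var cat = 0; cat < this.param.categories; cat++) {
        // the constructor receives exactly one parameter: modelParam
        var m = new this.param.model(this.param.modelParam);
        m.fit(X, targets.map(function (t) { return t === cat ? 1 : -1; }));
        this.models.push(m);
    }
};
OneVsAllSketch.prototype.predict = function (x) {
    var scores = this.models.map(function (m) { return m.decisionFunction(x); });
    var best = 0;
    scores.forEach(function (s, i) { if (s > scores[best]) { best = i; } });
    return best;
};

var X = [[0, 0], [0, 1], [5, 5], [5, 6], [10, 0], [11, 0]];
var targets = [0, 0, 1, 1, 2, 2];
var ova = new OneVsAllSketch({ model: CentroidModel, modelParam: { scale: 1 }, categories: 3 });
ova.fit(X, targets);
console.log(ova.predict([0.2, 0.5])); // 0
console.log(ova.predict([5, 5.5]));   // 1
console.log(ova.predict([10.5, 0]));  // 2
```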

inner

PCAParam  Object

An object used for the construction of module:analytics.PCA.

Properties

Name Type Optional Description

k

number

Yes

Number of eigenvectors to be computed.

Defaults to null.

iter

number

Yes

Number of iterations.

Defaults to 100.

inner

recLinRegParam  Object

An object used for the construction of module:analytics.RecLinReg.

Parameters

Name Type Optional Description

dim

number

 

The dimension of the model.

regFact

number

Yes

The regularization factor.

Defaults to 1.0.

forgetFact

number

Yes

The forgetting factor.

Defaults to 1.0.
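
The effect of dim, regFact and forgetFact can be seen in a textbook recursive least squares update, sketched below in plain JavaScript. This is an illustration under assumptions, not qminer's implementation: RecLinRegSketch is a hypothetical name, and it is assumed that regFact scales the initial inverse covariance (P = I / regFact) and that forgetFact discounts older examples.

```javascript
function RecLinRegSketch(param) {
    this.dim = param.dim;
    this.regFact = param.regFact === undefined ? 1.0 : param.regFact;
    this.forgetFact = param.forgetFact === undefined ? 1.0 : param.forgetFact;
    this.weights = [];
    this.P = []; // inverse covariance estimate, starts at I / regFact
    for (var i = 0; i < this.dim; i++) {
        this.weights.push(0);
        this.P.push([]);
        for (var j = 0; j < this.dim; j++) {
            this.P[i].push(i === j ? 1 / this.regFact : 0);
        }
    }
}

RecLinRegSketch.prototype.predict = function (x) {
    var s = 0;
    for (var i = 0; i < this.dim; i++) { s += this.weights[i] * x[i]; }
    return s;
};

// Update the model with a single example (x, y).
RecLinRegSketch.prototype.partialFit = function (x, y) {
    var dim = this.dim, P = this.P, i, j;
    // Px = P * x (P is symmetric, so x' * P is the same vector)
    var Px = P.map(function (row) {
        return row.reduce(function (s, pij, c) { return s + pij * x[c]; }, 0);
    });
    var denom = this.forgetFact;
    for (i = 0; i < dim; i++) { denom += x[i] * Px[i]; }
    var err = y - this.predict(x);
    // weights move along the gain vector Px / denom
    for (i = 0; i < dim; i++) { this.weights[i] += (Px[i] / denom) * err; }
    // P <- (P - Px * Px' / denom) / forgetFact
    for (i = 0; i < dim; i++) {
        for (j = 0; j < dim; j++) {
            P[i][j] = (P[i][j] - Px[i] * Px[j] / denom) / this.forgetFact;
        }
    }
};

var model = new RecLinRegSketch({ dim: 2 });
for (var t = 1; t <= 200; t++) {
    var x = [Math.sin(t), Math.cos(2 * t)];
    model.partialFit(x, 2 * x[0] - x[1]); // true weights are [2, -1]
}
console.log(model.weights); // close to [2, -1]
```

With forgetFact equal to 1.0 all examples count equally; values below 1.0 weight recent examples more heavily in this sketch.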

inner

RecSysParam  Object

An object used for the construction of module:analytics.RecommenderSys.

Properties

Name Type Optional Description

iter

number

Yes

The maximum number of iterations.

Defaults to 10000.

k

number

Yes

The number of centroids.

Defaults to 2.

tol

number

Yes

The tolerance.

Defaults to 1e-3.

verbose

boolean

Yes

If false, the console output is suppressed.

Defaults to false.

inner

ridgeRegParam  Object

An object used for the construction of module:analytics.RidgeReg.

Property

Name Type Optional Description

gamma

number

Yes

The gamma value.

Defaults to 0.0.
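
The description of gamma is brief, so a small worked example may help. The sketch below assumes gamma plays the usual ridge regression role of an L2 penalty weight, which the text above does not state explicitly. For a single feature without intercept, the ridge solution is sum(x*y) / (sum(x*x) + gamma); ridgeWeight is a hypothetical helper, not a qminer function.

```javascript
// Hypothetical helper: closed-form ridge weight for one feature, no intercept.
function ridgeWeight(xs, ys, gamma) {
    var xy = 0, xx = 0;
    for (var i = 0; i < xs.length; i++) {
        xy += xs[i] * ys[i];
        xx += xs[i] * xs[i];
    }
    return xy / (xx + gamma);
}

var xs = [1, 2, 3, 4];
var ys = [2, 4, 6, 8]; // y = 2 * x
console.log(ridgeWeight(xs, ys, 0.0));  // 2 (plain least squares)
console.log(ridgeWeight(xs, ys, 30.0)); // 1 (weight shrunk towards zero)
```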

inner

SVMParam  Object

SVM constructor parameters. Used for the construction of module:analytics.SVC and module:analytics.SVR.

Properties

Name Type Optional Description

algorithm

string

Yes

The algorithm procedure. Possible options are 'SGD' and 'LIBSVM'. 'PR_LOQO' is no longer supported.

Defaults to 'SGD'.

c

number

Yes

Cost parameter. Increasing the parameter forces the model to fit the training data more accurately (setting it too large may lead to overfitting).

Defaults to 1.0.

j

number

Yes

Unbalance parameter. Increasing it gives more weight to the positive examples (getting a better fit on the positive training examples gets a higher priority). Setting j=n is like adding n-1 copies of the positive training examples to the data set.

Defaults to 1.0.

eps

number

Yes

Epsilon insensitive loss parameter. Larger values result in fewer support vectors (smaller model complexity).

Defaults to 1e-3.

batchSize

number

Yes

Number of examples used in the subgradient estimation. Higher number of samples slows down the algorithm, but makes the local steps more accurate.

Defaults to 1000.

maxIterations

number

Yes

Maximum number of iterations.

Defaults to 10000.

maxTime

number

Yes

Maximum runtime in seconds.

Defaults to 1.

minDiff

number

Yes

Stopping criterion tolerance.

Defaults to 1e-6.

type

string

Yes

The subalgorithm procedure in LIBSVM. Possible options are 'C_SVC', 'NU_SVC' and 'ONE_CLASS' for classification and 'EPSILON_SVR', 'NU_SVR' and 'ONE_CLASS' for regression.

Defaults to 'C_SVC'.

kernel

string

Yes

Kernel type in LIBSVM. Possible options are 'LINEAR', 'POLY', 'RBF', 'SIGMOID' and 'PRECOMPUTED'.

Defaults to 'LINEAR'.

gamma

number

Yes

Gamma parameter in LIBSVM. Set gamma in kernel function.

Defaults to 1.0.

p

number

Yes

P parameter in LIBSVM. Set the epsilon in loss function of epsilon-SVR.

Defaults to 1e-1.

degree

number

Yes

Degree parameter in LIBSVM. Set degree in kernel function.

Defaults to 1.

nu

number

Yes

Nu parameter in LIBSVM. Set the parameter nu of nu-SVC, one-class SVM, and nu-SVR.

Defaults to 1e-2.

coef0

number

Yes

Coef0 parameter in LIBSVM. Set coef0 in kernel function.

Defaults to 1.0.

cacheSize

number

Yes

Cache memory size in MB used by LIBSVM.

Defaults to 100.

verbose

boolean

Yes

Toggle verbose output in the console.

Defaults to false.
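
The statement that setting j=n is like adding n-1 copies of the positive training examples can be checked on a small example. The sketch below assumes the unbalance parameter multiplies the hinge loss of positive examples; hingeLoss is a hypothetical helper written for this illustration and is not how qminer exposes its objective.

```javascript
// Hypothetical helper: total hinge loss of a linear model w, where the loss
// of each positive example is multiplied by the unbalance weight j.
function hingeLoss(w, X, y, j) {
    var total = 0;
    for (var i = 0; i < X.length; i++) {
        var score = 0;
        for (var d = 0; d < w.length; d++) { score += w[d] * X[i][d]; }
        var loss = Math.max(0, 1 - y[i] * score);
        total += (y[i] > 0 ? j : 1) * loss;
    }
    return total;
}

var w = [0.5, -0.25];
var X = [[1, 0], [0, 2], [2, 2]];
var y = [1, -1, 1];

// j = 3 on the original data ...
var weighted = hingeLoss(w, X, y, 3);

// ... matches j = 1 on data where each positive example appears 3 times
var Xdup = [[1, 0], [1, 0], [1, 0], [0, 2], [2, 2], [2, 2], [2, 2]];
var ydup = [1, 1, 1, -1, 1, 1, 1];
var duplicated = hingeLoss(w, Xdup, ydup, 1);

console.log(weighted, duplicated); // 3.5 3.5
```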

inner

TDigestParam  Object

An object used for the construction of module:analytics.quantiles.TDigest.

Properties

Name Type Optional Description

minCount

number

Yes

The minimal number of examples before the model is initialized.

Defaults to 0.

clusters

number

Yes

The number of 1-d clusters (large values lead to higher memory usage).

Defaults to 100.

inner

tokenizerParam  Object

An object used for the construction of module:analytics.Tokenizer.

Property

Name Type Optional Description

type

string

Yes

The type of the tokenizer. The different types are:
1. 'simple' - Creates break on white spaces.
2. 'html' - Creates break on white spaces and ignores html tags.
3. 'unicode' - Creates break on white spaces and normalizes unicode letters, e.g. čšž changes to csz.

Defaults to 'unicode'.
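
The difference between the three tokenizer types can be approximated in a few lines of plain JavaScript. The functions below (tokenizeSimple, tokenizeHtml, tokenizeUnicode) are hypothetical sketches of the documented behavior, not qminer's implementation, which may treat punctuation, casing and malformed markup differently.

```javascript
// 'simple': break on white space
function tokenizeSimple(text) {
    return text.split(/\s+/).filter(function (t) { return t.length > 0; });
}

// 'html': drop anything that looks like a tag, then break on white space
function tokenizeHtml(text) {
    return tokenizeSimple(text.replace(/<[^>]*>/g, ' '));
}

// 'unicode': decompose letters, drop combining marks, then break on white space
function tokenizeUnicode(text) {
    var stripped = text.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
    return tokenizeSimple(stripped);
}

console.log(tokenizeSimple('one  two\tthree'));              // [ 'one', 'two', 'three' ]
console.log(tokenizeHtml('<b>bold</b> text'));               // [ 'bold', 'text' ]
console.log(tokenizeUnicode('\u010d\u0161\u017e test'));     // [ 'csz', 'test' ]
```

The last call passes the letters c, s and z with carons, written as escape sequences so the example does not depend on the file encoding.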