## Method

### nmf(mat, k[, json]) → Object

Calculates the non-negative matrix factorization of `mat`, i.e. two non-negative matrices U and V whose product approximates `mat`. See https://en.wikipedia.org/wiki/Non-negative_matrix_factorization.

#### Examples

Asynchronous function

```javascript
// import modules
var analytics = require('qminer').analytics;
var la = require('qminer').la;
// create a matrix
var mat = new la.Matrix({ rows: 10, cols: 5, random: true });
// compute the non-negative matrix factorization
analytics.nmfAsync(mat, 3, { iter: 100, tol: 1e-4 }, function (err, result) {
    // stop here if the calculation failed
    if (err) { console.log(err); return; }
    // calculation successful
    var U = result.U;
    var V = result.V;
});
```

Synchronous function

```javascript
// import modules
var analytics = require('qminer').analytics;
var la = require('qminer').la;
// create a matrix
var mat = new la.Matrix({ rows: 10, cols: 5, random: true });
// compute the non-negative matrix factorization
var result = analytics.nmf(mat, 3, { iter: 100, tol: 1e-4 });
var U = result.U;
var V = result.V;
```

#### Parameters

Name Type Optional Description

mat

The non-negative matrix.

k

number

The reduced rank, i.e. the number of columns in matrix U and the number of rows in matrix V. Must be between 0 and `min(mat.rows, mat.cols)`.

json

Object

Yes

Algorithm options.

Values in `json` have the following properties:

Name Type Optional Description

iter

number

Yes

The number of iterations used for the algorithm.

Defaults to `100`.

tol

number

Yes

The tolerance.

Defaults to `1e-3`.

verbose

boolean

Yes

If `false`, the console output is suppressed.

Defaults to `false`.

Returns

`Object` The JSON object `nmfRes` containing the non-negative matrices U and V:
`nmfRes.U` - The module:la.Matrix representation of the matrix U,
`nmfRes.V` - The module:la.Matrix representation of the matrix V.
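
To make the roles of `U` and `V` concrete, the following is a minimal plain-JavaScript sketch of a non-negative factorization based on the well-known Lee-Seung multiplicative updates. It is not qminer's implementation (this page does not state which algorithm `nmf` uses), it works on arrays of rows rather than module:la.Matrix, and the names `nmfSketch`, `matMul`, `transpose` and `frobDiff`, as well as the stopping rule applied to `tol`, are assumptions made for illustration.

```javascript
// Multiply two dense matrices stored as arrays of rows.
function matMul(A, B) {
    var out = [];
    for (var i = 0; i < A.length; i++) {
        out.push([]);
        for (var j = 0; j < B[0].length; j++) {
            var sum = 0;
            for (var t = 0; t < B.length; t++) { sum += A[i][t] * B[t][j]; }
            out[i].push(sum);
        }
    }
    return out;
}

function transpose(A) {
    return A[0].map(function (_, j) {
        return A.map(function (row) { return row[j]; });
    });
}

// Frobenius norm of A - B.
function frobDiff(A, B) {
    var s = 0;
    for (var i = 0; i < A.length; i++) {
        for (var j = 0; j < A[0].length; j++) { s += Math.pow(A[i][j] - B[i][j], 2); }
    }
    return Math.sqrt(s);
}

// Hypothetical helper: factorizes mat (m x n) into U (m x k) and V (k x n).
function nmfSketch(mat, k, opts) {
    opts = opts || {};
    var iter = opts.iter !== undefined ? opts.iter : 100;
    var tol = opts.tol !== undefined ? opts.tol : 1e-3;
    var m = mat.length, n = mat[0].length, tiny = 1e-12;
    var i, j, a;
    // Deterministic, strictly positive starting point (assumption).
    var U = [], V = [];
    for (i = 0; i < m; i++) {
        U.push([]);
        for (a = 0; a < k; a++) { U[i].push(0.5 + ((7 * i + 3 * a) % 5) / 10); }
    }
    for (a = 0; a < k; a++) {
        V.push([]);
        for (j = 0; j < n; j++) { V[a].push(0.5 + ((7 * a + 3 * j) % 5) / 10); }
    }
    var prevErr = Infinity;
    for (var it = 0; it < iter; it++) {
        // V <- V .* (U^T mat) ./ (U^T U V)
        var Ut = transpose(U);
        var num = matMul(Ut, mat);
        var den = matMul(matMul(Ut, U), V);
        for (a = 0; a < k; a++) {
            for (j = 0; j < n; j++) { V[a][j] *= num[a][j] / (den[a][j] + tiny); }
        }
        // U <- U .* (mat V^T) ./ (U V V^T)
        var Vt = transpose(V);
        num = matMul(mat, Vt);
        den = matMul(U, matMul(V, Vt));
        for (i = 0; i < m; i++) {
            for (a = 0; a < k; a++) { U[i][a] *= num[i][a] / (den[i][a] + tiny); }
        }
        // Assumed stopping rule: stop once the error improves by less than tol.
        var err = frobDiff(mat, matMul(U, V));
        if (prevErr - err < tol) { break; }
        prevErr = err;
    }
    return { U: U, V: V };
}
```

Because the updates only multiply non-negative numbers, `U` and `V` stay non-negative throughout, and `matMul(U, V)` is the rank-`k` approximation of the input.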

## Abstract types

inner

### ActiveLearnerParam  Object

An object used for the construction of module:analytics.ActiveLearner.

#### Properties

Name Type Optional Description

learner

Object

Yes

Learner parameters

Values in `learner` have the following properties:

Name Type Optional Description

disableAsserts

boolean

Yes

Disable input asserting

Defaults to `false`.

SVC

module:analytics~SVMParam

Yes

Support vector classifier parameters.

inner

### BiasedGkParam  Object

An object used for the construction of module:analytics.quantiles.BiasedGk.

#### Properties

Name Type Optional Description

targetProb

number

Yes

The probability where the algorithm is most accurate. Its accuracy is determined as `eps * max(p, targetProb)` when `targetProb < 0.5` and `eps * max(1 - p, 1 - targetProb)` when `targetProb >= 0.5`. Higher values of `targetProb` allow for a smaller memory footprint.

Defaults to `0.01`.

eps

number

Yes

Parameter which determines the accuracy.

Defaults to `0.1`.

compression

string

Yes

Determines when the algorithm compresses its summary. Options are: "periodic", "aggressive" and "manual".

Defaults to `"periodic"`.

useBands

boolean

Yes

Whether the algorithm should use the 'band' subprocedure. Using this subprocedure should result in a smaller summary.

Defaults to `true`.
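
The accuracy rule given for `targetProb` can be written as a small function. This is only a sketch of the formula as described above: it assumes the bound is the product of `eps` and the `max(...)` term, that `p` is the queried quantile in `[0, 1]`, and the name `biasedGkErrorBound` is hypothetical.

```javascript
// Hypothetical helper: error bound at quantile p for given eps and targetProb.
function biasedGkErrorBound(p, eps, targetProb) {
    if (targetProb < 0.5) {
        // most accurate near the low quantiles
        return eps * Math.max(p, targetProb);
    }
    // most accurate near the high quantiles
    return eps * Math.max(1 - p, 1 - targetProb);
}
```

With the documented defaults (`eps = 0.1`, `targetProb = 0.01`), the bound is tightest for quantiles at or below `0.01` and grows linearly with `p` above that.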

inner

### BufferedDigestParam  Object

An object used for the construction of module:analytics.quantiles.BufferedTDigest.

#### Properties

Name Type Optional Description

delta

number

Yes

The number of clusters in the summary is bounded by floor(minClusters) <= clusters < 2*ceil(minClusters)

Defaults to `100`.

bufferLen

number

Yes

The size of the buffer is `minClusters * bufferLenFactor`. When the buffer fills, it is merged with the summary. The algorithm also initializes after seeing `minClusters * bufferLenFactor` examples.

Defaults to `1000`.

seed

number

Yes

Random seed (values above 1 are deterministic).

Defaults to `0`.

inner

### detectorParam  Object

An object used for the construction of module:analytics.NearestNeighborAD.

#### Parameters

Name Type Optional Description

rate

number

Yes

The expected fraction of emitted anomalies (0.05 -> 5% of cases will be classified as anomalies).

Defaults to `0.05`.

windowSize

number

Yes

Number of most recent instances kept in the model.

Defaults to `100`.
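
One way to picture how `rate` and `windowSize` might interact is sketched below. This is an assumption-laden illustration, not the library's code: it keeps the `windowSize` most recent vectors, computes each stored vector's distance to its nearest neighbor, and flags a new vector as an anomaly when its own nearest-neighbor distance exceeds the `(1 - rate)` quantile of those distances. The class name `NearestNeighborADSketch` and its methods are hypothetical.

```javascript
function euclid(a, b) {
    var s = 0;
    for (var i = 0; i < a.length; i++) { s += Math.pow(a[i] - b[i], 2); }
    return Math.sqrt(s);
}

function NearestNeighborADSketch(param) {
    param = param || {};
    this.rate = param.rate !== undefined ? param.rate : 0.05;
    this.windowSize = param.windowSize !== undefined ? param.windowSize : 100;
    this.window = [];
}

// Adds a vector and forgets the oldest one once the window is full.
NearestNeighborADSketch.prototype.partialFit = function (x) {
    this.window.push(x);
    if (this.window.length > this.windowSize) { this.window.shift(); }
};

// Distance from x to its nearest stored vector (optionally skipping one index).
NearestNeighborADSketch.prototype.nearestDistance = function (x, skipIdx) {
    var best = Infinity;
    for (var i = 0; i < this.window.length; i++) {
        if (i === skipIdx) { continue; }
        best = Math.min(best, euclid(x, this.window[i]));
    }
    return best;
};

// Assumed rule: the (1 - rate) quantile of nearest-neighbor distances.
NearestNeighborADSketch.prototype.threshold = function () {
    var self = this;
    var dists = this.window.map(function (x, i) { return self.nearestDistance(x, i); });
    dists.sort(function (a, b) { return a - b; });
    var idx = Math.min(dists.length - 1, Math.floor((1 - this.rate) * dists.length));
    return dists[idx];
};

// Returns 1 for an anomaly and 0 otherwise.
NearestNeighborADSketch.prototype.predict = function (x) {
    return this.nearestDistance(x) > this.threshold() ? 1 : 0;
};
```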

inner

### DpMeansExplain  Object

The explanation returned by module:analytics.KMeans#explain.

#### Properties

Name Type Optional Description

medoidID

number

The ID of the nearest medoid.

featureIDs

module:la.IntVector

The IDs of features, sorted by contribution.

featureContributions

module:la.Vector

Weights of each feature contribution (sum to 1.0).

inner

### DpMeansParam  Object

An object used for the construction of module:analytics.KMeans.

#### Properties

Name Type Optional Description

iter

number

Yes

The maximum number of iterations.

Defaults to `10000`.

lambda

number

Yes

Defaults to `1`.

minClusters

number

Yes

Minimum number of clusters

Defaults to `2`.

maxClusters

number

Yes

Maximum number of clusters

Defaults to `inf`.

allowEmpty

boolean

Yes

Whether to allow empty clusters to be generated.

Defaults to `true`.

calcDistQual

boolean

Yes

Whether to calculate the quality measure based on distance. If `false`, `relMeanCentroidDist` will return `undefined`.

Defaults to `false`.

centroidType

string

Yes

The type of centroids. Possible options are `'Dense'` and `'Sparse'`.

Defaults to `"Dense"`.

distanceType

string

Yes

The distance type used in the calculations. Possible options are `'Euclid'` and `'Cos'`.

Defaults to `"Euclid"`.

verbose

boolean

Yes

If `false`, the console output is suppressed.

Defaults to `false`.

fitIdx

Array of number

Yes

The index array used for the construction of the initial centroids.

fitStart

Object

Yes

The KMeans model returned by module:analytics.KMeans.prototype.getModel used for centroid initialization.

Values in `fitStart` have the following properties:

Name Type Optional Description

C

The centroid matrix.

inner

### GkParam  Object

An object used for the construction of module:analytics.quantiles.Gk.

#### Properties

Name Type Optional Description

eps

number

Yes

Determines the relative error of the algorithm.

Defaults to `0.01`.

autoCompress

boolean

Yes

Whether the summary should be compressed automatically or manually.

Defaults to `true`.

useBands

boolean

Yes

Whether the algorithm should use the 'band' subprocedure. Using this subprocedure should result in a smaller summary.

Defaults to `true`.

inner

### hazardModelParam  Object

An object used for the construction of module:analytics.PropHazards.

#### Property

Name Type Optional Description

lambda

number

Yes

The regularization parameter.

Defaults to `0`.

inner

### KMeansExplain  Object

The explanation returned by module:analytics.KMeans#explain.

#### Properties

Name Type Optional Description

medoidID

number

The ID of the nearest medoid.

featureIDs

module:la.IntVector

The IDs of features, sorted by contribution.

featureContributions

module:la.Vector

Weights of each feature contribution (sum to 1.0).

inner

### KMeansParam  Object

An object used for the construction of module:analytics.KMeans.

#### Properties

Name Type Optional Description

iter

number

Yes

The maximum number of iterations.

Defaults to `10000`.

k

number

Yes

The number of centroids.

Defaults to `2`.

allowEmpty

boolean

Yes

Whether to allow empty clusters to be generated.

Defaults to `true`.

calcDistQual

boolean

Yes

Whether to calculate the quality measure based on distance. If `false`, `relMeanCentroidDist` will return `undefined`.

Defaults to `false`.

centroidType

string

Yes

The type of centroids. Possible options are `'Dense'` and `'Sparse'`.

Defaults to `"Dense"`.

distanceType

string

Yes

The distance type used in the calculations. Possible options are `'Euclid'` and `'Cos'`.

Defaults to `"Euclid"`.

verbose

boolean

Yes

If `false`, the console output is suppressed.

Defaults to `false`.

fitIdx

Array of number

Yes

The index array used for the construction of the initial centroids.

fitStart

Object

Yes

The KMeans model returned by module:analytics.KMeans.prototype.getModel used for centroid initialization.

Values in `fitStart` have the following properties:

Name Type Optional Description

C

The centroid matrix.
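
The `distanceType` option changes which centroid counts as "nearest". The sketch below illustrates this with plain arrays. It assumes that `'Cos'` means one minus the cosine similarity, which this page does not spell out, and the helper names `dot`, `distance` and `nearestCentroid` are hypothetical.

```javascript
function dot(a, b) {
    var s = 0;
    for (var i = 0; i < a.length; i++) { s += a[i] * b[i]; }
    return s;
}

// 'Euclid': straight-line distance. 'Cos' (assumed): 1 - cosine similarity.
function distance(a, b, distanceType) {
    if (distanceType === 'Cos') {
        return 1 - dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
    }
    var diff = a.map(function (ai, i) { return ai - b[i]; });
    return Math.sqrt(dot(diff, diff));
}

// Index of the centroid closest to x under the chosen distance.
function nearestCentroid(x, centroids, distanceType) {
    var best = 0;
    for (var c = 1; c < centroids.length; c++) {
        if (distance(x, centroids[c], distanceType) < distance(x, centroids[best], distanceType)) {
            best = c;
        }
    }
    return best;
}
```

For the vector `[1, 1]` and centroids `[[10, 10], [1, 0]]`, the Euclidean distance picks the second centroid (it is closer in space), while the cosine distance picks the first (it points in the same direction).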

inner

### logisticRegParam  Object

An object used for the construction of module:analytics.LogReg.

#### Properties

Name Type Optional Description

lambda

number

Yes

The regularization parameter.

Defaults to `1`.

intercept

boolean

Yes

Indicates whether to automatically include the intercept.

Defaults to `false`.

inner

### MDSParam  Object

An object used for the construction of module:analytics.MDS.

#### Properties

Name Type Optional Description

maxSecs

number

Yes

The maximum time period to compute Multidimensional Scaling of a matrix.

Defaults to `500`.

maxStep

number

Yes

The maximum number of iterations.

Defaults to `5000`.

minDiff

number

Yes

The minimum difference criteria in MDS.

Defaults to `1e-4`.

distType

string

Yes

The type of distance used. Available types: "Euclid", "Cos", "SqrtCos".

Defaults to `"Euclid"`.

inner

### NearestNeighborADExplain  Object

An object used for interpreting the predictions of module:analytics.NearestNeighborAD#explain.

#### Properties

Name Type Optional Description

nearestID

number

The ID of the nearest neighbor.

distance

number

The distance to the nearest neighbor.

features

An array with feature contributions.

oldestID

number

The ID of the oldest record in the internal buffer (the record that was added first).

newestID

number

The ID of the newest record in the internal buffer (the record that was added last).

inner

An object explaining the prediction of module:analytics.NearestNeighborAD#explain in terms of a single feature. Contained in the object module:analytics~NearestNeighborADExplain.

#### Properties

Name Type Optional Description

id

number

The ID of the feature.

val

number

The value of the feature for the vector we are explaining.

nearVal

number

The value of the feature for the nearest neighbor.

contribution

number

Fraction of the total distance `(v(i) - n(i))^2 / ||v - n||^2`.
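
The `contribution` formula can be checked with a few lines of plain JavaScript. The sketch below builds one record per feature with the property names listed above. The function name `featureContributions` is hypothetical, and returning `0` for identical vectors is an assumption made to avoid dividing by zero.

```javascript
// v: the vector being explained, n: its nearest neighbor (plain arrays).
function featureContributions(v, n) {
    var sq = v.map(function (vi, i) { return Math.pow(vi - n[i], 2); });
    var total = sq.reduce(function (a, b) { return a + b; }, 0);
    return sq.map(function (s, i) {
        return {
            id: i,
            val: v[i],
            nearVal: n[i],
            // (v(i) - n(i))^2 / ||v - n||^2
            contribution: total > 0 ? s / total : 0
        };
    });
}
```

For `v = [1, 2, 3]` and `n = [1, 0, 4]` the squared differences are `0`, `4` and `1`, so the contributions are `0`, `0.8` and `0.2`.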

inner

### nnetParam  Object

An object used for the construction of module:analytics.NNet.

#### Properties

Name Type Optional Description

layout

Array of number

Yes

The array representing the network schema.

Defaults to `[1, 2, 1]`.

learnRate

number

Yes

The learning rate.

Defaults to `0.1`.

momentum

number

Yes

The momentum of optimization.

Defaults to `0.5`.

tFuncHidden

string

Yes

Type of activation function used on hidden neurons. Possible options are `'tanHyper'`, `'sigmoid'`, `'fastTanh'`, `'softPlus'`, `'fastSigmoid'` and `'linear'`.

Defaults to `'tanHyper'`.

tFuncOut

string

Yes

Type of activation function used on output neurons. Possible options are `'tanHyper'`, `'sigmoid'`, `'fastTanh'`, `'softPlus'`, `'fastSigmoid'` and `'linear'`.

Defaults to `'tanHyper'`.

inner

### oneVsAllParam  Object

An object used for the construction of module:analytics.OneVsAll.

#### Properties

Name Type Optional Description

model

function()

Yes

Constructor for the binary model to be used internally. The constructor should expect only one parameter.

modelParam

Object

Yes

Parameter for `oneVsAllParam.model` constructor.

categories

number

Yes

Number of categories.

verbose

boolean

Yes

If `false`, the console output is suppressed.

Defaults to `false`.
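
The relationship between `model`, `modelParam` and `categories` can be illustrated with a small one-vs-all wrapper in plain JavaScript. This is a sketch under assumptions, not the library's code: `OneVsAllSketch` and `CentroidModel` are hypothetical, and the binary model is assumed to expose `fit(X, y)` with labels `+1`/`-1` and a `decisionFunction(x)` score.

```javascript
function dot(a, b) {
    var s = 0;
    for (var i = 0; i < a.length; i++) { s += a[i] * b[i]; }
    return s;
}

function mean(rows) {
    var m = rows[0].map(function () { return 0; });
    rows.forEach(function (r) { r.forEach(function (v, i) { m[i] += v / rows.length; }); });
    return m;
}

// Hypothetical binary model whose constructor takes exactly one parameter.
function CentroidModel(param) { this.param = param || {}; this.w = null; this.b = 0; }
CentroidModel.prototype.fit = function (X, y) {
    var mp = mean(X.filter(function (_, i) { return y[i] > 0; }));
    var mn = mean(X.filter(function (_, i) { return y[i] <= 0; }));
    // separating direction and offset halfway between the two class means
    this.w = mp.map(function (v, i) { return v - mn[i]; });
    this.b = -dot(this.w, mp.map(function (v, i) { return (v + mn[i]) / 2; }));
};
CentroidModel.prototype.decisionFunction = function (x) { return dot(this.w, x) + this.b; };

function OneVsAllSketch(param) {
    this.Model = param.model;
    this.modelParam = param.modelParam;
    this.categories = param.categories;
    this.models = [];
}

// Trains one binary model per category: that category against all others.
OneVsAllSketch.prototype.fit = function (X, labels) {
    this.models = [];
    for (var c = 0; c < this.categories; c++) {
        var y = labels.map(function (l) { return l === c ? 1 : -1; });
        var m = new this.Model(this.modelParam);
        m.fit(X, y);
        this.models.push(m);
    }
};

// Predicts the category whose binary model gives the highest score.
OneVsAllSketch.prototype.predict = function (x) {
    var scores = this.models.map(function (m) { return m.decisionFunction(x); });
    return scores.indexOf(Math.max.apply(null, scores));
};
```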

inner

### PCAParam  Object

An object used for the construction of module:analytics.PCA.

#### Properties

Name Type Optional Description

k

number

Yes

Number of eigenvectors to be computed.

Defaults to `null`.

iter

number

Yes

Number of iterations.

Defaults to `100`.

inner

### recLinRegParam  Object

An object used for the construction of module:analytics.RecLinReg.

#### Parameters

Name Type Optional Description

dim

number

The dimension of the model.

regFact

number

Yes

The regularization factor.

Defaults to `1.0`.

forgetFact

number

Yes

The forgetting factor.

Defaults to `1.0`.
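
The three parameters map naturally onto textbook recursive least squares, sketched below in plain JavaScript. This is an illustration under assumptions rather than the library's code: the inverse-covariance matrix is assumed to start at the identity divided by `regFact`, `forgetFact` is used as the usual exponential forgetting factor, and the names `RecLinRegSketch`, `partialFit` and `predict` are chosen for illustration.

```javascript
function RecLinRegSketch(param) {
    this.dim = param.dim;
    this.forgetFact = param.forgetFact !== undefined ? param.forgetFact : 1.0;
    var regFact = param.regFact !== undefined ? param.regFact : 1.0;
    this.weights = [];
    this.P = [];
    for (var i = 0; i < this.dim; i++) {
        this.weights.push(0);
        this.P.push([]);
        // assumed initialization: P = I / regFact
        for (var j = 0; j < this.dim; j++) { this.P[i].push(i === j ? 1 / regFact : 0); }
    }
}

RecLinRegSketch.prototype.predict = function (x) {
    var s = 0;
    for (var i = 0; i < this.dim; i++) { s += this.weights[i] * x[i]; }
    return s;
};

// One recursive least squares step for the example (x, y).
RecLinRegSketch.prototype.partialFit = function (x, y) {
    var d = this.dim, P = this.P, ff = this.forgetFact;
    var i, j;
    var Px = [], xTP = [];
    for (i = 0; i < d; i++) {
        Px.push(0);
        xTP.push(0);
        for (j = 0; j < d; j++) {
            Px[i] += P[i][j] * x[j];   // (P x)_i
            xTP[i] += x[j] * P[j][i];  // (x^T P)_i
        }
    }
    var denom = ff;
    for (i = 0; i < d; i++) { denom += x[i] * Px[i]; }
    var gain = Px.map(function (v) { return v / denom; });
    var err = y - this.predict(x);
    for (i = 0; i < d; i++) { this.weights[i] += gain[i] * err; }
    // P <- (P - gain * x^T P) / forgetFact
    for (i = 0; i < d; i++) {
        for (j = 0; j < d; j++) { P[i][j] = (P[i][j] - gain[i] * xTP[j]) / ff; }
    }
};
```

In this formulation a smaller `regFact` lets the first examples move the weights more freely, and a `forgetFact` below `1.0` gradually discounts older examples.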

inner

### RecSysParam  Object

An object used for the construction of module:analytics.RecommenderSys.

#### Properties

Name Type Optional Description

iter

number

Yes

The maximum number of iterations.

Defaults to `10000`.

k

number

Yes

The number of centroids.

Defaults to `2`.

tol

number

Yes

The tolerance.

Defaults to `1e-3`.

verbose

boolean

Yes

If `false`, the console output is suppressed.

Defaults to `false`.

inner

### ridgeRegParam  Object

An object used for the construction of module:analytics.RidgeReg.

#### Property

Name Type Optional Description

gamma

number

Yes

The gamma value.

Defaults to `0.0`.

inner

### SVMParam  Object

SVM constructor parameters. Used for the construction of module:analytics.SVC and module:analytics.SVR.

#### Properties

Name Type Optional Description

algorithm

string

Yes

The algorithm procedure. Possible options are `'SGD'` and `'LIBSVM'`. `'PR_LOQO'` is no longer supported.

Defaults to `'SGD'`.

c

number

Yes

Cost parameter. Increasing the parameter forces the model to fit the training data more accurately (setting it too large may lead to overfitting).

Defaults to `1.0`.

j

number

Yes

Unbalance parameter. Increasing it gives more weight to the positive examples (getting a better fit on the positive training examples gets a higher priority). Setting j=n is like adding n-1 copies of the positive training examples to the data set.

Defaults to `1.0`.

eps

number

Yes

Epsilon insensitive loss parameter. Larger values result in fewer support vectors (smaller model complexity).

Defaults to `1e-3`.

batchSize

number

Yes

Number of examples used in the subgradient estimation. Higher number of samples slows down the algorithm, but makes the local steps more accurate.

Defaults to `1000`.

maxIterations

number

Yes

Maximum number of iterations.

Defaults to `10000`.

maxTime

number

Yes

Maximum runtime in seconds.

Defaults to `1`.

minDiff

number

Yes

Stopping criterion tolerance.

Defaults to `1e-6`.

type

string

Yes

The subalgorithm procedure in LIBSVM. Possible options are `'C_SVC'`, `'NU_SVC'` and `'ONE_CLASS'` for classification and `'EPSILON_SVR'`, `'NU_SVR'` and `'ONE_CLASS'` for regression.

Defaults to `'C_SVC'`.

kernel

string

Yes

Kernel type in LIBSVM. Possible options are `'LINEAR'`, `'POLY'`, `'RBF'`, `'SIGMOID'` and `'PRECOMPUTED'`.

Defaults to `'LINEAR'`.

gamma

number

Yes

Gamma parameter in LIBSVM. Set gamma in kernel function.

Defaults to `1.0`.

p

number

Yes

P parameter in LIBSVM. Set the epsilon in loss function of epsilon-SVR.

Defaults to `1e-1`.

degree

number

Yes

Degree parameter in LIBSVM. Set degree in kernel function.

Defaults to `1`.

nu

number

Yes

Nu parameter in LIBSVM. Set the parameter nu of nu-SVC, one-class SVM, and nu-SVR.

Defaults to `1e-2`.

coef0

number

Yes

Coef0 parameter in LIBSVM. Set coef0 in kernel function.

Defaults to `1.0`.

cacheSize

number

Yes

Set cache memory size in MB (default 100) in LIBSVM.

Defaults to `100`.

verbose

boolean

Yes

Toggle verbose output in the console.

Defaults to `false`.
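
The claim made for the `j` parameter above, that setting `j = n` behaves like adding `n - 1` copies of the positive examples, can be checked on a weighted hinge loss. The sketch below assumes a linear model scored by a dot product and a plain hinge loss. This page does not state which loss the library uses, and the helper `weightedHingeLoss` is hypothetical.

```javascript
function dot(a, b) {
    var s = 0;
    for (var i = 0; i < a.length; i++) { s += a[i] * b[i]; }
    return s;
}

// Sum of hinge losses, with positive examples (y = +1) weighted by j.
function weightedHingeLoss(w, X, y, j) {
    var total = 0;
    for (var i = 0; i < X.length; i++) {
        var loss = Math.max(0, 1 - y[i] * dot(w, X[i]));
        total += (y[i] > 0 ? j : 1) * loss;
    }
    return total;
}
```

Evaluating the loss with `j = 3` on one positive and one negative example gives the same value as `j = 1` on a data set in which the positive example appears three times.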

inner

### TDigestParam  Object

An object used for the construction of module:analytics.quantiles.TDigest.

#### Properties

Name Type Optional Description

minCount

number

Yes

The minimal number of examples before the model is initialized.

Defaults to `0`.

clusters

number

Yes

The number of 1-d clusters (large values lead to higher memory usage).

Defaults to `100`.

inner

### tokenizerParam  Object

An object used for the construction of module:analytics.Tokenizer.

#### Property

Name Type Optional Description

type

string

Yes

The type of the tokenizer. The different types are:
1. 'simple' - Breaks the text on white space.
2. 'html' - Breaks the text on white space and ignores HTML tags.
3. 'unicode' - Breaks the text on white space and normalizes unicode letters, e.g. čšž changes to csz.

Defaults to `'unicode'`.
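
The three tokenizer types can be approximated in a few lines of plain JavaScript. The sketch below is not the library's tokenizer: `tokenizeSketch` is a hypothetical function, HTML tags are dropped with a simple regular expression, and unicode normalization is assumed to mean stripping diacritics via `String.prototype.normalize('NFD')`.

```javascript
// Hypothetical helper: splits text into tokens according to the given type.
function tokenizeSketch(text, type) {
    type = type || 'unicode';
    if (type === 'html') {
        // replace anything that looks like a tag with a space
        text = text.replace(/<[^>]*>/g, ' ');
    }
    if (type === 'unicode') {
        // decompose accented letters, then drop the combining marks
        text = text.normalize('NFD').replace(/[\u0300-\u036f]/g, '');
    }
    return text.split(/\s+/).filter(function (t) { return t.length > 0; });
}
```

For example, `tokenizeSketch('<b>bold</b> text', 'html')` yields the tokens `bold` and `text`.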