


Collinearity

In regression, the phenomenon by which one variable, X1, can be expressed as a linear function of another variable, X2.

Computer cluster

A computer cluster is a (large) set of computers connected by a network that behaves as if it were a single computer.

Confidence of an AR

Represents the ratio of instances in which the antecedent and consequent of the rule appear together with respect to the set of instances in which the antecedent appears.
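As a minimal sketch of this ratio, assuming each instance is a set of items (the function name, parameters, and sample transactions below are illustrative and not part of the glossary):

```python
def confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent over a list of item sets."""
    antecedent, consequent = set(antecedent), set(consequent)
    # Instances in which the antecedent appears.
    with_antecedent = [t for t in transactions if antecedent <= set(t)]
    if not with_antecedent:
        raise ValueError("the antecedent never appears in the data")
    # Of those, instances in which the consequent also appears.
    with_both = [t for t in with_antecedent if consequent <= set(t)]
    return len(with_both) / len(with_antecedent)


# Hypothetical toy data: four shopping baskets.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
rule_confidence = confidence(transactions, {"bread"}, {"milk"})
```

Here bread appears in three baskets and milk accompanies it in two of them, so the confidence of bread → milk is 2/3.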

Confounding or Spurious variable

Input attributes that correlate with the output variable but do not really represent information that is useful for prediction.

Confusion matrix

A square table that summarizes classification results, usually with rows indicating the actual class and columns indicating the predicted class. The diagonal holds the number of hits, that is, correctly classified examples, for each class.
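The layout described above can be sketched in a few lines of plain Python. The function name, the label order, and the sample labels are assumptions made for illustration only:

```python
def confusion_matrix(actual, predicted, labels):
    """Rows index the actual class, columns index the predicted class."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


# Hypothetical two-class example.
actual = ["cat", "cat", "dog", "dog", "dog"]
predicted = ["cat", "dog", "dog", "dog", "cat"]
matrix = confusion_matrix(actual, predicted, ["cat", "dog"])

# The hits are the entries on the diagonal.
hits = sum(matrix[i][i] for i in range(len(matrix)))
```

With this data the matrix is [[1, 1], [1, 2]], so three of the five examples lie on the diagonal.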

Consequent of the AR

In a rule A → C, C is the consequent of the rule. In other words, C is expected, with high probability, to appear in an instance whenever A appears in that instance.


Conviction of an AR

This metric relates to the expected error of the rule, that is, how often the antecedent of the rule appears in a transaction in which the consequent does not appear. It compares how often that would happen if antecedent and consequent were independent with how often it actually happens. Its domain is [0,∞], where values less than 1 represent negative dependence, 1 represents independence, and values greater than 1 represent positive dependence.
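The entry above does not give a formula. Assuming the metric is the conviction of a rule as it is commonly defined, (1 - support(C)) / (1 - confidence(A → C)), a minimal sketch could look as follows. The function name and the toy data are hypothetical:

```python
import math


def conviction(transactions, antecedent, consequent):
    """Conviction of antecedent -> consequent, assuming the usual definition."""
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(transactions)
    n_a = sum(1 for t in transactions if antecedent <= set(t))
    n_c = sum(1 for t in transactions if consequent <= set(t))
    n_ac = sum(1 for t in transactions if (antecedent | consequent) <= set(t))
    if n == 0 or n_a == 0:
        raise ValueError("no transactions contain the antecedent")
    support_c = n_c / n
    rule_confidence = n_ac / n_a
    if n_ac == n_a:
        # The rule never fails, so the ratio is unbounded.
        return math.inf
    return (1 - support_c) / (1 - rule_confidence)


# Hypothetical toy data: four shopping baskets.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
value = conviction(transactions, {"bread"}, {"milk"})
```

For bread → milk the result is about 0.75, which is below 1 and therefore indicates negative dependence under the interpretation given above.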

Cost (SVM parameter)

Controls how permissive the training is towards training samples that fall on the wrong side of the separating hyperplane for their class. The cost is directly related to SVM overfitting: a high cost tolerates few such errors and makes overfitting more likely, while a low cost is more permissive.

CSV (comma-separated values)

A file containing a table of data in text format. Each row of the table corresponds to a line in the file, and the field values within each row are separated from one another by commas or other delimiters.
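As an illustration, such a file can be read with Python's standard `csv` module. The sample contents below are made up, and `io.StringIO` stands in for a real file:

```python
import csv
import io

# Hypothetical file contents: one header line and two data rows.
text = "name,age\nAda,36\nAlan,41\n"

# Each line of the file becomes one row; commas separate the field values.
rows = list(csv.reader(io.StringIO(text)))
header, records = rows[0], rows[1:]

# Other delimiters, such as semicolons, are handled with the delimiter argument.
semicolon_rows = list(csv.reader(io.StringIO("name;age\nAda;36\n"), delimiter=";"))
```

Note that `csv.reader` returns every field as a string, so numeric columns need an explicit conversion.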


Data bias

An unwanted situation in which the collected data have properties that make it difficult to learn from them correctly. For example, an uneven distribution of examples across classes.
