Information gain or Entropy

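A metric used when constructing a decision tree to determine the variable-value pair (the split) that best separates the classes of the problem. Entropy measures how mixed the classes are within a node; information gain is the reduction in entropy achieved by a split.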
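As a rough illustration, the sketch below computes the entropy of a set of class labels and the information gain of one candidate split on a numeric variable. The function names and the `value <= threshold` split rule are assumptions made for this example, not part of any particular library.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels, threshold):
    """Entropy reduction obtained by splitting on `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # the split separates nothing
    n = len(labels)
    # Entropy of the children, weighted by the share of samples in each.
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - weighted
```

A tree-building procedure would evaluate this quantity for every candidate pair and keep the one with the highest gain.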

Insulin resistance

A condition in which the tissue response to the action of insulin upon circulating glucose is impaired, especially in the liver, skeletal muscle, adipose tissue, and brain. After some time, this alteration, in conjunction with deficient insulin production by the pancreas, can lead to the development of type-2 diabetes mellitus.


Interpretability

A desirable property of any machine learning model: the discriminant function it has learned can be understood by a human user. An interpretable model is the opposite of a black box model.


Item

In reference to association rules, an item is one of the elements in a row of a data set. In regression or classification techniques, the equivalent is the value of a single variable in an instance.


Itemset

A set of items from a data set.


K-fold cross-validation or Cross-validation of k-partitions

A validation or partitioning technique in which the data are divided into k disjoint sets. Each set is used once for testing, and for each of them the training set is formed by joining the remaining k−1 sets.
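The partitioning can be sketched in a few lines. This is a minimal version that works on sample indices and does not shuffle them; the function name `k_fold_splits` is hypothetical, and in practice the data are usually shuffled before splitting.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # Spread samples as evenly as possible: the first
    # n_samples % k folds receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        # Training set: every index outside the current test fold.
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size
```

Each index appears in exactly one test fold, so every sample is used for testing once and for training k−1 times.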


K-itemset

In reference to association rules, a k-itemset is a set of k items.

K-nearest neighbors

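A supervised learning model that predicts the output value for a new sample from the outputs of the K most similar training samples, typically by majority vote for classification or by averaging for regression.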
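A minimal sketch of the classification case follows, assuming Euclidean distance as the similarity measure and a simple majority vote. Both choices, and the name `knn_predict`, are assumptions for illustration; other distances and weighting schemes are common.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training samples."""
    # Sort training samples by Euclidean distance to the query point.
    neighbors = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], query))
    nearest_labels = [label for _, label in neighbors[:k]]
    # Return the most frequent label among the k nearest neighbors.
    return Counter(nearest_labels).most_common(1)[0][0]
```

For example, with training points clustered around (0, 0) labeled "a" and around (5, 5) labeled "b", a query at (0.5, 0.5) would be assigned "a".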

KEGG (Kyoto Encyclopedia of Genes and Genomes)

A database used to help understand phenotypes and biological systems based on molecular information, especially sets of metabolic pathways and the signaling networks operating in different organisms.

Kernel function

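A mathematical function that measures the similarity between two samples as if they had been mapped, through a non-linear transformation, into a space of higher dimensionality where the problem may be easier to separate. The most common kernel functions are the polynomial function and the radial basis function (RBF).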
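The two kernels named above can be written directly from their usual textbook forms. The sketch below is illustrative only; the parameter names `degree`, `coef0`, and `gamma` and their default values are assumptions chosen for the example.

```python
import math

def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """Polynomial kernel: (x . y + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree

def rbf_kernel(x, y, gamma=1.0):
    """Radial basis function kernel: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)
```

Neither function builds the transformed vectors explicitly; each returns the value the inner product would take in the higher-dimensional space.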

