Glossary of machine learning terms
A
accuracy
The fraction of correct predictions by a classification model.
It is defined as $$ \frac{TP + TN}{TP + FP + FN + TN}\,.$$
TP…true positive, TN…true negative, FP…false positive, FN…false negative
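The formula above can be sketched in plain Python; the function and variable names are illustrative, not part of any library:

```python
# Minimal sketch: accuracy as the fraction of predictions that match the
# true labels, i.e. (TP + TN) / (TP + TN + FP + FN) for binary labels.

def accuracy(y_true, y_pred):
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

acc = accuracy([1, 0, 1, 1, 0], [1, 0, 0, 1, 0])  # 4 of 5 correct -> 0.8
```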
activation function
A function that defines the output of a layer in a neural network given an input from the previous layer (e.g. ReLU).
active learning
An ML approach in which the algorithm chooses the data to learn from. An active learning approach is particularly useful when there is a lot of unlabeled data and manual labeling is very expensive. Often, the number of examples to learn from is lower than when blindly seeking a diverse range of labeled examples in normal supervised learning.
B
batch normalisation
A method that makes the training of a deep neural network faster and more stable. It consists of normalising the input or output of the activation functions in a hidden layer.
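The normalisation step at the heart of this method can be sketched in plain Python; the learned scale and shift parameters (gamma, beta) are omitted, and the function name is illustrative:

```python
import math

# Minimal sketch: rescale the activations of one mini-batch to zero mean
# and unit variance (eps avoids division by zero for constant batches).

def batch_normalise(batch, eps=1e-5):
    mean = sum(batch) / len(batch)
    var = sum((x - mean) ** 2 for x in batch) / len(batch)
    return [(x - mean) / math.sqrt(var + eps) for x in batch]

normalised = batch_normalise([1.0, 2.0, 3.0, 4.0])
```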
C
class
One of a set of target values for a label.
classification
The prediction of a model is a category, i.e. a discrete class.
clustering
Grouping of related examples, particularly during unsupervised learning. There exist many clustering algorithms.
convolutional layer
A layer in a deep neural network in which a convolutional filter passes over the input matrix.
convolutional neural network (CNN)
A neural network in which at least one layer is a convolutional layer.
cross-validation
A method to estimate how well a model will generalise to new data. In cross-validation, the model is trained on a subset of the data and then validated on the remaining non-overlapping subsets, e.g. k-fold cross-validation.
D
data imbalance
When the labels of the classes have significantly different statistical distributions in the data set. It is also termed class-imbalanced data set.
deep learning
A family of machine learning methods based on deep neural networks, i.e. neural networks with many hidden layers.
deep neural network
A type of neural network containing multiple hidden layers.
E
early stopping
A regularisation method in which training is stopped as soon as the performance on a validation set stops improving, in order to prevent overfitting.
epoch
One complete pass of the training algorithm over the entire data set.
F
F1
The harmonic mean of precision and recall: $$ F_1 = 2\cdot\frac{\text{precision}\cdot\text{recall}}{\text{precision}+\text{recall}}\,.$$
false negative (FN)
An example for which the model incorrectly predicts the negative class, although the true class is positive.
false positive (FP)
An example for which the model incorrectly predicts the positive class, although the true class is negative.
false positive rate (FPR)
The proportion of actual negatives that are incorrectly classified as positive: $$ \frac{FP}{FP + TN}\,.$$
feature
An input variable for making predictions.
feature engineering
The process of converting data into useful features for training a model.
feature selection
The process of selecting relevant features from a data set.
feature vector
A list of features passed into a model.
G
H
hidden layer
Artificial layer in a neural network between input and output layer. Typically, hidden layers contain activation functions.
hierarchical agglomerative clustering
A clustering approach that creates a tree of clusters, specifically well-suited for hierarchically organised data. In a first step, the algorithm assigns a cluster to each example. In a second step, it merges the closest clusters to create a hierarchical tree.
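The two steps above can be sketched in plain Python with single linkage (cluster distance = distance of the closest members); all names are illustrative, not a library API:

```python
# Minimal sketch of hierarchical agglomerative clustering on 1-D points.

def single_linkage_distance(a, b):
    # Distance between two clusters: distance of their closest members.
    return min(abs(x - y) for x in a for y in b)

def agglomerative_clustering(points, n_clusters):
    # Step 1: every example starts in its own cluster.
    clusters = [[p] for p in points]
    # Step 2: repeatedly merge the two closest clusters.
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = single_linkage_distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

clusters = agglomerative_clustering([1.0, 1.1, 5.0, 5.2, 9.9], 3)
```

Recording the order of merges instead of stopping at a fixed cluster count would yield the full hierarchical tree (dendrogram).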
hyperparameters
Higher-level properties of a model, such as the learning rate (how fast it can learn) or the number of hidden layers.
I
J
K
k-fold cross-validation
The training set is split into k smaller subsets (folds). The model is trained on (k-1) of the folds and validated on the remaining fold. This is repeated k times, so that each fold serves as the validation set exactly once. The performance measure calculated by k-fold cross-validation is the average of the results over all k folds.
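The splitting scheme can be sketched in plain Python; the function name is illustrative (libraries such as scikit-learn provide ready-made versions):

```python
# Minimal sketch: produce k (train, validation) index splits so that each
# example appears in the validation set of exactly one fold.

def k_fold_splits(n_examples, k):
    indices = list(range(n_examples))
    fold_size = n_examples // k
    splits = []
    for i in range(k):
        val = indices[i * fold_size:(i + 1) * fold_size]
        train = [idx for idx in indices if idx not in val]
        splits.append((train, val))
    return splits

splits = k_fold_splits(6, 3)  # 3 folds with 2 validation examples each
```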
L
label
The target value of an example, i.e. the answer the model should learn to predict.
long short-term memory (LSTM)
A type of cell in a recurrent neural network that can retain information over long sequences, used e.g. for processing text and time series.
loss
A measure of how far a model's prediction is from its label. Training a model aims to minimise the loss.
M
model
The representation learnt by an ML algorithm from the training data, used to make predictions on new data.
multi-class classification
A classification task that distinguishes between more than two classes.
N
neural network
A model, loosely inspired by the brain, composed of layers of connected units (neurons) with trainable weights.
P
precision
The proportion of positive predictions that are actually correct: $$ \frac{TP}{TP + FP}\,.$$
prediction
Output of a model.
predictor
An input variable used for making predictions; often used synonymously with feature.
R
recall
The proportion of actual positives that are correctly identified as such: $$ \frac{TP}{TP + FN}\,.$$
rectified linear unit (ReLU)
An activation function defined as follows:
- If the input is negative or zero, the output is zero.
- If the input is positive, the output is equal to the input.
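The two cases above reduce to ReLU(x) = max(0, x), sketched here in plain Python:

```python
# Minimal sketch of the rectified linear unit activation function.

def relu(x):
    return max(0.0, x)
```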
S
supervised learning
A labeled data set is used to train a model.