
Glossary of machine learning terminology

A

accuracy

Proportion of correct predictions made by a classification model, often reported as a percentage.

It is defined as $$ \frac{TP + TN}{TP + FP + FN + TN}\,.$$

Here TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively.
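As a minimal sketch, the accuracy formula can be computed directly from the four confusion-matrix counts. The function name and the example counts are made up for illustration.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of correct predictions, (TP + TN) / (TP + FP + FN + TN)."""
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("at least one prediction is required")
    return (tp + tn) / total


# Hypothetical counts: 40 TP, 45 TN, 5 FP, 10 FN (100 predictions in total).
print(accuracy(tp=40, tn=45, fp=5, fn=10))  # 0.85
```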

active learning

A machine learning approach in which the algorithm itself chooses the data it learns from, typically by requesting labels for selected examples. Active learning is particularly useful when unlabeled data is plentiful and manual labeling is expensive. It often needs fewer labeled examples than conventional supervised learning, which collects a diverse range of labeled examples without regard to what the model already knows.
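How the algorithm chooses its data is not fixed by the definition above. One common strategy, assumed here purely for illustration, is uncertainty sampling: the examples whose predicted probability is closest to 0.5 are sent for labeling first. The function name and the probabilities below are hypothetical.

```python
def most_uncertain(probabilities: list[float], k: int = 1) -> list[int]:
    """Return the indices of the k unlabeled examples whose predicted
    probability of the positive class is closest to 0.5."""
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: abs(probabilities[i] - 0.5),
    )
    return ranked[:k]


# Hypothetical model outputs for a pool of five unlabeled examples.
pool_probabilities = [0.95, 0.52, 0.10, 0.45, 0.80]
print(most_uncertain(pool_probabilities, k=2))  # [1, 3]
```

In a full active learning loop, the selected examples would be labeled by a human, added to the training set, and the model retrained before the next selection round.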

B

binary classification

C

classification

A task in which the prediction of the model is a category (class) rather than a continuous value.

clustering

Grouping of data, particularly during unsupervised learning. Many clustering algorithms exist.

convolutional neural network (CNN)

cross-validation

D

deep learning

deep neural network

A type of neural network containing multiple hidden layers.

E

early stopping

epoch

One complete pass of the training algorithm over the whole data set. The number of epochs is the number of times the algorithm sees the whole data set during training.
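The relationship between epochs and examples can be sketched with a bare training loop. The function name is hypothetical and the model update is only a placeholder.

```python
def count_examples_seen(data: list, num_epochs: int) -> int:
    """Iterate over the whole data set num_epochs times and count
    how many examples the (placeholder) training step processes."""
    examples_seen = 0
    for _epoch in range(num_epochs):   # one iteration = one epoch
        for _example in data:          # one full pass over the data set
            examples_seen += 1         # placeholder for a model update
    return examples_seen


# Four examples seen over three epochs give twelve training steps.
print(count_examples_seen([1, 2, 3, 4], num_epochs=3))  # 12
```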

F

F1

false negative (FN)

false positive (FP)

false positive rate (FPR)

feature

An input variable for making predictions.

feature engineering

The process of converting data into useful features for training a model. Feature selection is a part of feature engineering.

feature selection

The process of selecting relevant features from a data set.

feature vector

G

H

hidden layer

hierarchical agglomerative clustering

A clustering approach that builds a tree of clusters and is particularly well suited to hierarchically organised data. First, the algorithm assigns each example to its own cluster. It then repeatedly merges the closest pair of clusters until a single hierarchical tree remains.
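The two steps can be sketched in plain Python for one-dimensional points. The distance between clusters is not specified above; this sketch assumes single linkage (the distance between the two closest members), and the function name and sample points are hypothetical.

```python
def agglomerative_clustering(points: list[float]) -> list[tuple]:
    """Single-linkage agglomerative clustering of 1-D points.

    Returns the merges in order as (cluster_a, cluster_b, distance);
    read from first to last, they describe the tree from leaves to root.
    """
    # Step 1: every example starts in its own cluster.
    clusters = [(p,) for p in points]
    merges = []

    # Step 2: repeatedly merge the closest pair of clusters.
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance of the two closest members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


for cluster_a, cluster_b, distance in agglomerative_clustering([1.0, 2.0, 6.0, 7.0, 20.0]):
    print(cluster_a, "+", cluster_b, "at distance", distance)
```

This brute-force version recomputes all pairwise distances at every merge, which is fine for a handful of points but too slow for large data sets.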
