Model Evaluation
Model evaluation is an integral part of the model development process. It helps us find the model that best represents our data and estimate how well the chosen model will perform in the future. Evaluating model performance with the data used for training is not acceptable in data science because it easily produces overoptimistic, overfitted models. There are two common methods of evaluating models in data science: Hold-Out and Cross-Validation. To avoid overfitting, both methods use a test set (not seen by the model) to evaluate model performance.
Hold-Out
In this method, the dataset (usually a large one) is randomly divided into three subsets (see the sketch after this list):
- Training set is a subset of the dataset used to build predictive models.
- Validation set is a subset of the dataset used to assess the performance of the model built in the training phase. It provides a test platform for fine-tuning the model's parameters and selecting the best-performing model. Not all modeling algorithms need a validation set.
- Test set, or unseen examples, is a subset of the dataset used to assess the likely future performance of a model. If a model fits the training set much better than it fits the test set, overfitting is probably the cause.
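A minimal hold-out sketch in Python, assuming scikit-learn and an illustrative 60/20/20 split; the toy data and the choice of logistic regression are assumptions for demonstration, not part of the method itself:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Toy data: 1,000 examples, 10 features, binary target (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# First carve out the test set (unseen examples), then split the remainder
# into training and validation sets (roughly 60/20/20 overall).
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=42)

# Fit on the training set, compare/tune on the validation set,
# and report likely future performance on the test set.
model = LogisticRegression().fit(X_train, y_train)
print("validation accuracy:", accuracy_score(y_val, model.predict(X_val)))
print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))
```

The test set is split off first so that it never influences model fitting or parameter tuning.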
Cross-Validation
When only a limited amount of data is available, we use k-fold cross-validation to achieve an unbiased estimate of model performance. In k-fold cross-validation, we divide the data into k subsets of equal size. We build the model k times, each time leaving out one of the subsets from training and using it as the test set. If k equals the sample size, this is called "leave-one-out" cross-validation.
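A minimal k-fold sketch, again assuming scikit-learn; the choice of k = 5, the toy data, and the logistic regression model are illustrative assumptions:

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Toy data: 200 examples, 5 features, binary target (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] > 0).astype(int)

kf = KFold(n_splits=5, shuffle=True, random_state=42)
scores = []
for train_idx, test_idx in kf.split(X):
    # Each fold: train on k-1 subsets, evaluate on the held-out subset.
    model = LogisticRegression().fit(X[train_idx], y[train_idx])
    scores.append(accuracy_score(y[test_idx], model.predict(X[test_idx])))

# The average over the k folds is the cross-validated performance estimate.
print("mean accuracy over 5 folds:", np.mean(scores))
```

Setting n_splits equal to the number of samples (or using sklearn.model_selection.LeaveOneOut) gives the leave-one-out case mentioned above.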
Model evaluation can be divided into two sections: