Model evaluation

After training a model, there is a process called “Model evaluation”, which allows users to understand the model’s performance in the form of metrics. The metrics show how well the new model performs.

Beta version: The platform is still under development and may be less stable than the final version. Platform access and usage may be limited; for example, the platform might crash, some features might not work properly, or some data might be lost.

Evaluation metrics

The evaluation metrics are designed for specific topics as follows.

1. Classification metrics

2. Regression metrics

3. Clustering metrics

Classification metrics

These metrics measure classification model performance. The classification metrics are listed below, with a short computation sketch after the list:

  1. Accuracy – the ratio of correctly predicted observations to the total observations. It indicates the percentage of predictions that are correct.

  2. Precision – is the ratio of correctly predicted observations of a specific class to the total predicted observations of a specific class.

  3. Recall – is the ratio of correctly predicted observations of a specific class to the total observations of a specific class.

  4. F1 score – the harmonic mean of precision and recall, combining the two into a single score.

  5. Confusion matrix – summarizes the model’s results by counting how the predictions match the actual results for every class. It is usually presented as a table or a matrix.
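As a rough illustration of how these metrics can be computed for a set of predictions, here is a minimal sketch using scikit-learn; the labels `y_true` and `y_pred` are toy examples, not ACP data.

```python
# Illustrative only: toy labels, not ACP data.
from sklearn.metrics import (
    accuracy_score,
    precision_score,
    recall_score,
    f1_score,
    confusion_matrix,
)

y_true = ["cat", "dog", "cat", "bird", "dog", "cat"]   # actual classes
y_pred = ["cat", "dog", "bird", "bird", "cat", "cat"]  # predicted classes

# Accuracy: correct predictions / total predictions.
print("Accuracy :", accuracy_score(y_true, y_pred))

# Precision and recall per class, averaged with equal weight ("macro").
print("Precision:", precision_score(y_true, y_pred, average="macro"))
print("Recall   :", recall_score(y_true, y_pred, average="macro"))

# F1 score: harmonic mean of precision and recall.
print("F1 score :", f1_score(y_true, y_pred, average="macro"))

# Confusion matrix: rows are actual classes, columns are predicted classes.
print(confusion_matrix(y_true, y_pred, labels=["bird", "cat", "dog"]))
```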

Regression metrics

Coming soon

Clustering metrics

Coming soon

Model evaluation on ACP

The model evaluation is part of ACP. There are two model evaluation pages:

  • Model evaluation for “Base model”: to observe our pre-trained model performance

  • Model evaluation for “Your model”: to monitor your model performance after it has been trained

Model evaluation for classification model

The model evaluation page for a classification model is divided into three parts:

1. Metric part

The F1 score, precision, and recall are presented as the macro average*. The values in the “Train” row are calculated from the train dataset; similarly, the values in the “Test” row are calculated from the test dataset.

A well-performing model should have overall high values in both “Train” and “Test”. If “Train” is overall high but “Test” is overall quite low (i.e. there is a big gap between “Train” and “Test”), the model is likely overfitting**. If the overall scores are low, the model is not performing well or did not learn properly.

*The macro average is computed by taking the unweighted mean of the per-class metrics (F1 score, precision, and recall).

**Overfitting is a modeling error that occurs when a model relies too strongly on the train dataset, so it cannot generalize and will not perform well on other data.
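For intuition, the sketch below computes the macro average as the plain (unweighted) mean of the per-class F1 scores and compares train and test scores; the gap threshold used to flag a possible problem is an arbitrary illustration, not an ACP rule.

```python
# Illustrative only: the gap threshold below is an arbitrary example, not an ACP rule.
import numpy as np
from sklearn.metrics import f1_score

def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (the macro average)."""
    per_class = f1_score(y_true, y_pred, average=None)  # one F1 per class
    return float(np.mean(per_class))                    # equal weight per class

# Toy labels standing in for train and test predictions.
train_true, train_pred = ["a", "a", "b", "b"], ["a", "a", "b", "b"]
test_true,  test_pred  = ["a", "b", "a", "b"], ["a", "a", "a", "b"]

train_score = macro_f1(train_true, train_pred)
test_score  = macro_f1(test_true, test_pred)

print(f"Train macro F1: {train_score:.2f}")
print(f"Test  macro F1: {test_score:.2f}")

# A large train/test gap with a high train score suggests overfitting.
if train_score - test_score > 0.2:
    print("Warning: large train/test gap - the model may be overfitting.")
```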

2. Confusion Matrix (CM) part

A confusion matrix is a table whose rows and columns correspond to the classes: the rows are the actual classes and the columns are the predicted classes. The color shades indicate the value range; the shade reference is shown on the color bar to the right of the confusion matrix.
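Outside ACP, a similar layout can be reproduced with scikit-learn’s ConfusionMatrixDisplay. The sketch below only illustrates the rows-as-actual / columns-as-predicted convention and the color bar; it is not the ACP widget itself.

```python
# Illustrative only: reproduces the layout with scikit-learn, not the ACP widget.
import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay

y_true = ["cat", "dog", "cat", "bird", "dog", "cat"]   # actual classes (rows)
y_pred = ["cat", "dog", "bird", "bird", "cat", "cat"]  # predicted classes (columns)

# Rows show the actual classes, columns show the predicted classes,
# and the color bar on the right maps shades to counts.
ConfusionMatrixDisplay.from_predictions(y_true, y_pred, colorbar=True)
plt.show()
```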

There are two functions on Confusion Matrix (CM):

  1. Select train or test dataset to present on the confusion matrix. The confusion matrix of a train dataset or test dataset can be selected at the top right of the matrix.

  2. Select normalization. The confusion matrix can also report values normalized by the actual class (choose “True” from the dropdown box) or by the predicted class (choose “Prediction”). The default is “None”. “Normalized” means the total of elements in each “Actual class” or “Predicted class” is normalized to 1.00. The three options are described below, followed by a short code sketch.

  • None (without normalization): the confusion matrix shows the number of elements in each pair of a specific “Actual class” and “Predicted class”

  • True (the percentages on each actual class sum to 1.00): the confusion matrix shows the percentage of elements in each pair, calculated as the number of elements in the pair divided by the sum of elements along its “Actual class” axis

  • Prediction (the percentages on each predicted class sum to 1.00): the confusion matrix shows the percentage of elements in each pair, calculated as the number of elements in the pair divided by the sum of elements along its “Predicted class” axis
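These options behave like the `normalize` argument of scikit-learn’s `confusion_matrix` (None, "true", "pred"). The sketch below uses that argument as a rough analogue of the ACP dropdown; it is an assumption for illustration, not ACP’s actual implementation.

```python
# Illustrative only: scikit-learn's normalize argument as a rough analogue of the ACP dropdown.
from sklearn.metrics import confusion_matrix

y_true = ["cat", "dog", "cat", "bird", "dog", "cat"]
y_pred = ["cat", "dog", "bird", "bird", "cat", "cat"]
labels = ["bird", "cat", "dog"]

# None: raw counts per (actual, predicted) pair.
print(confusion_matrix(y_true, y_pred, labels=labels, normalize=None))

# "true": each row (actual class) sums to 1.00.
print(confusion_matrix(y_true, y_pred, labels=labels, normalize="true"))

# "pred": each column (predicted class) sums to 1.00.
print(confusion_matrix(y_true, y_pred, labels=labels, normalize="pred"))
```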

3. Prediction result part

The prediction result is presented in tabular form. The table consists of 4 columns: Row number, Sentence, Prediction, and Ground truth (Actual class). The result is separated into two tables, True prediction and False prediction, so that users can monitor prediction performance easily (a short filtering sketch follows the list below).

  1. True prediction: displays all true predictions. The table can be filtered by class, based on either the prediction or the actual class (ground truth), and users can choose between sorting the data by row and viewing a random sample.

  2. False prediction: similarly, all false predictions can be viewed. The same filter by class, based on either the prediction or the actual class (ground truth), is available.
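As a rough analogue of this split, the sketch below builds the two tables with pandas; the column names mirror the guide, but the data and filters are hypothetical examples.

```python
# Illustrative only: hypothetical data mirroring the guide's table columns.
import pandas as pd

results = pd.DataFrame({
    "Row number":   [1, 2, 3, 4],
    "Sentence":     ["good service", "slow delivery", "great app", "broken link"],
    "Prediction":   ["positive", "negative", "positive", "positive"],
    "Ground truth": ["positive", "negative", "positive", "negative"],
})

# Split into the two tables shown on the page.
true_prediction  = results[results["Prediction"] == results["Ground truth"]]
false_prediction = results[results["Prediction"] != results["Ground truth"]]

# Filter a table by predicted or actual class, e.g. false predictions
# whose ground truth is "negative".
print(false_prediction[false_prediction["Ground truth"] == "negative"])

# Sort by row number, or take a random sample instead.
print(true_prediction.sort_values("Row number"))
print(true_prediction.sample(n=2, random_state=0))
```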
