Evaluation Metrics - Regression

4 minute read


Performance Evaluation Metrics in Machine Learning

This article is part of the Machine Learning Evaluation Metrics series, where we cover the most prominent metrics used to evaluate and compare machine learning models. Choosing an appropriate metric is crucial when evaluating machine learning (ML) models. Each metric has its own applications and caveats, and the metrics are grouped by the ML problem they address. In this article, we discuss the metrics used to evaluate regression models.

The following list is an overview of how the model evaluation metrics/techniques are classified:

1. Regression Metrics
2. Classification Metrics
3. Ranking Metrics
4. Clustering Metrics
5. Statistical Metrics
6. Other Model Evaluation Techniques

RMSE

Root mean squared error (RMSE) is a quantifiable measure to check how the model’s predictions stack up against the actual outcome for regression tasks. It is defined as the square root of the average squared distance between the actual outcome and the predictions:

\[RMSE = \sqrt{\frac {1}{N} \sum_{i=1}^{N}{(\hat{y_i} - y_i)^2}}\]

where $N$ is the number of observations, $\hat{y_i}$ is the predicted value and $y_i$ is the actual value.

Advantages

  • Easy to optimize

Limitations

  • Less intuitive to understand;
  • Sensitive to outliers;

When to use RMSE?
Use RMSE to evaluate your regression model when the error is evenly distributed across the dataset, or when the error terms follow a normal distribution. Since it is computationally efficient and easy to differentiate, it is a loss metric of choice when tuning hyperparameters or batch-training a deep neural network.
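To make the formula concrete, here is a minimal RMSE sketch in plain Python; the data values below are made up for illustration:

```python
import math

def rmse(y_true, y_pred):
    """Square root of the mean squared difference between actuals and predictions."""
    return math.sqrt(
        sum((p - a) ** 2 for a, p in zip(y_true, y_pred)) / len(y_true)
    )

# Made-up example values
y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]
print(rmse(y_true, y_pred))  # squared errors 0.25, 0, 2.25, 1.0 -> sqrt(3.5 / 4)
```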

MAE

Mean Absolute Error (MAE) is a loss function used for regression tasks and is defined as the average of the absolute differences between the target and predicted values.

\[MAE = \frac {1}{N} \sum_{i=1}^{N}{|\hat{y_i} - y_i|}\]

where $N$ is the number of observations, $\hat{y_i}$ is the predicted value and $y_i$ is the actual value.

Advantages

  • Easy to understand;
  • Robust to outliers;

Limitations

  • The absolute value makes MAE non-differentiable at zero, and hence harder to optimize.

When to use MAE?
If your dataset contains large outliers, RMSE will be much larger than MAE, since squaring amplifies large errors. MAE is the more appropriate metric in such cases.
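The outlier effect above can be seen in a small sketch that computes both metrics on made-up values with one large outlier:

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error: average of the absolute residuals."""
    return sum(abs(p - a) for a, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error, for comparison."""
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(y_true, y_pred)) / len(y_true))

# Made-up values with one large outlier in the last prediction
y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 30.0]
print(mae(y_true, y_pred))   # (0.5 + 0 + 1.5 + 23.0) / 4 = 6.25
print(rmse(y_true, y_pred))  # dominated by the squared 23.0 outlier
```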

Quantile of Errors

Quantile of errors, also known as residual quantiles, plays a vital role in inspecting statistical models. It is useful for examining models like normal linear regression, because we expect the residuals to be normally distributed with equal variance. In non-normal regression settings, such as logistic regression or log-linear analysis, residual quantiles are of little practical use.

Advantages

  • Robust to outliers;
  • Useful for examining normal linear regression models;

Limitations

  • Not applicable in case of non-normal regression situations;

When to use Residual Quantiles?
Use residual quantiles to evaluate your statistical models when the residuals are expected to be normally distributed.
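As a sketch, the quartiles of the residuals can be inspected with the standard library; the residuals below are made-up values standing in for those of a fitted linear model:

```python
import statistics

# Made-up residuals (actual - predicted) from a hypothetical fitted linear model
residuals = [-1.2, -0.5, -0.3, -0.1, 0.0, 0.2, 0.4, 0.6, 1.1, 1.4]

# Quartiles of the residuals; for a well-specified normal linear model
# we expect them to be roughly symmetric around zero
q1, median, q3 = statistics.quantiles(residuals, n=4)
print(q1, median, q3)
```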

MAPE

The mean absolute percentage error (MAPE), also known as the mean absolute percentage deviation (MAPD), is one of the most widely used measures of forecast accuracy and is expressed as a ratio defined by the formula:

\[MAPE = \frac {1}{n} \sum_{t=1}^{n}\left |{\frac{A_t - F_t}{A_t}}\right |\]

where $A_t$ is the Actual value, $F_t$ is the forecast value. The MAPE is also sometimes reported as a percentage, which is the above equation multiplied by 100.

Advantages

  • It is dimensionless and easy to interpret;
  • It is scale-independent and applies easily to both high- and low-volume products;

Limitations

  • MAPE produces infinite or undefined values for zero or close-to-zero actual values
  • MAPE is asymmetric, it imposes a larger penalty for negative errors than for positive errors.
  • MAPE assumes that the unit of measurement of the variable has a meaningful zero value. For instance, it should not be used to calculate the accuracy of a temperature forecast since it can take arbitrary zero value.
  • MAPE is not differentiable everywhere, and it can cause issues when used as an optimization criterion.

When to use MAPE?
MAPE is a good measure for assessing demand volatility and for comparing overall process outcomes. Although MAPE is used extensively for forecasting accuracy, it is advisable to use it in conjunction with other metrics. Additionally, various alternative measures address the shortcomings of MAPE, including the symmetric mean absolute percentage error (sMAPE), the mean absolute scaled error (MASE), mean directional accuracy (MDA), and the weighted MAPE (wMAPE).
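A minimal MAPE sketch, guarding against the zero-actual limitation noted above; the demand values are made up:

```python
def mape(actuals, forecasts):
    """Mean absolute percentage error; undefined for zero actual values."""
    if any(a == 0 for a in actuals):
        raise ValueError("MAPE is undefined for zero actual values")
    return sum(abs((a - f) / a) for a, f in zip(actuals, forecasts)) / len(actuals)

# Made-up demand values
actuals = [100.0, 80.0, 120.0]
forecasts = [110.0, 72.0, 120.0]
print(mape(actuals, forecasts))        # (0.1 + 0.1 + 0.0) / 3
print(mape(actuals, forecasts) * 100)  # reported as a percentage
```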

RMSLE

The root mean squared log error (RMSLE) is the RMSE computed on log-transformed values, so it measures the relative (ratio) error between predictions and actuals rather than the absolute error:

\[RMSLE = \sqrt{\frac {1}{N} \sum_{i=1}^{N}{(\log(\hat{y_i} + 1) - \log(y_i + 1))^2}}\]

where $N$ is the number of observations, $\hat{y_i}$ is the predicted value and $y_i$ is the actual value.

Advantages

  • Robust to outliers;

Limitations

  • Penalizes under-prediction more heavily than over-prediction of the same absolute size;
  • Undefined for target or predicted values of $-1$ or less (logarithm of a non-positive number);

Applications

Use RMSLE when the target spans several orders of magnitude and relative errors matter more than absolute errors, for example in sales or population forecasting.
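A minimal RMSLE sketch on made-up values; the two single-point examples show that errors with a similar actual-to-predicted ratio produce a similar RMSLE regardless of scale:

```python
import math

def rmsle(y_true, y_pred):
    """RMSE on log1p-transformed values; inputs must be greater than -1."""
    return math.sqrt(
        sum((math.log1p(p) - math.log1p(a)) ** 2 for a, p in zip(y_true, y_pred))
        / len(y_true)
    )

# Made-up values: a 2x over-prediction at two different scales
print(rmsle([10.0], [20.0]))      # ratio-based error, small scale
print(rmsle([1000.0], [2000.0]))  # nearly the same error at a much larger scale
```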

RRSE

The root relative squared error (RRSE) normalizes the total squared error of the model by the total squared error of a naive baseline that always predicts the mean of the actual values:

\[RRSE = \sqrt{\frac{\sum_{i=1}^{N}{(\hat{y_i} - y_i)^2}}{\sum_{i=1}^{N}{(y_i - \bar{y})^2}}}\]

where $\bar{y}$ is the mean of the actual values.

Advantages

  • Scale-independent, since the error is expressed relative to a simple baseline;

Limitations

  • Sensitive to outliers, like RMSE;
  • Undefined when all actual values are equal;

Applications

Use RRSE to compare models across datasets with different scales, or to check whether a model outperforms the mean baseline (RRSE below 1).

RAE

The relative absolute error (RAE) is the MAE analogue of RRSE: the total absolute error of the model divided by the total absolute error of the mean baseline:

\[RAE = \frac{\sum_{i=1}^{N}{|\hat{y_i} - y_i|}}{\sum_{i=1}^{N}{|y_i - \bar{y}|}}\]

Advantages

  • Scale-independent;
  • Robust to outliers, like MAE;

Limitations

  • Undefined when all actual values are equal;

Applications

Use RAE to compare models across datasets with different scales when robustness to outliers is desired; values below 1 indicate the model beats the mean baseline.

R-Squared

R-squared ($R^2$), also known as the coefficient of determination, measures the proportion of the variance in the target variable that is explained by the model:

\[R^2 = 1 - \frac{\sum_{i=1}^{N}{(\hat{y_i} - y_i)^2}}{\sum_{i=1}^{N}{(y_i - \bar{y})^2}}\]

Advantages

  • Intuitive: values close to 1 indicate a good fit, while 0 corresponds to the mean baseline;
  • Scale-independent;

Limitations

  • Never decreases when more predictors are added, even irrelevant ones;
  • Can be negative for models that fit worse than the mean baseline;

Applications

Use $R^2$ to report how much of the variation in the target your regression model explains.
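A minimal $R^2$ sketch on made-up values, including the mean-baseline check:

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((a - p) ** 2 for a, p in zip(y_true, y_pred))
    ss_tot = sum((a - mean_y) ** 2 for a in y_true)
    return 1 - ss_res / ss_tot

# Made-up example values
y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]
print(r_squared(y_true, y_pred))

# Predicting the mean everywhere gives R^2 = 0 by construction
mean_pred = [sum(y_true) / len(y_true)] * len(y_true)
print(r_squared(y_true, mean_pred))  # 0.0
```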

Adjusted R-Squared

Adjusted R-squared corrects $R^2$ for the number of predictors $p$ in a model fitted on $N$ observations:

\[\bar{R}^2 = 1 - (1 - R^2)\frac{N - 1}{N - p - 1}\]

Advantages

  • Increases only when a new predictor improves the model more than would be expected by chance;

Limitations

  • Less intuitive to interpret than $R^2$;
  • Requires more observations than predictors ($N > p + 1$);

Applications

Use adjusted $R^2$ to compare regression models with different numbers of predictors on the same dataset.
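A minimal sketch of the adjustment, with made-up numbers showing how the penalty grows with the number of predictors:

```python
def adjusted_r_squared(r2, n, p):
    """Adjust R^2 for n observations and p predictors."""
    if n <= p + 1:
        raise ValueError("Need more observations than predictors + 1")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Made-up scenario: R^2 = 0.80 from 50 observations
print(adjusted_r_squared(0.80, n=50, p=2))   # few predictors: small penalty
print(adjusted_r_squared(0.80, n=50, p=20))  # many predictors: larger penalty
```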

Thanks for reading! If you want to get in touch with me or leave me any feedback, feel free to reach me on my email.

References

Gneiting, T. Making and Evaluating Point Forecasts. Journal of the American Statistical Association, 2011, 106, 746-762

Goodwin, P. & Lawton, R. On the asymmetry of the symmetric MAPE. International Journal of Forecasting, 1999, 15, 405-408

Hoover, J. Measuring Forecast Accuracy: Omissions in Today’s Forecasting Engines and Demand-Planning Software. Foresight: The International Journal of Applied Forecasting, 2006, 4, 32-35

Kolassa, S. Why the “best” point forecast depends on the error or accuracy measure (Invited commentary on the M4 forecasting competition). International Journal of Forecasting, 2020, 36(1), 208-211

Kolassa, S. & Martin, R. Percentage Errors Can Ruin Your Day (and Rolling the Dice Shows How). Foresight: The International Journal of Applied Forecasting, 2011, 23, 21-29

Kolassa, S. & Schütz, W. Advantages of the MAD/Mean ratio over the MAPE. Foresight: The International Journal of Applied Forecasting, 2007, 6, 40-43

McKenzie, J. Mean absolute percentage error and bias in economic forecasting. Economics Letters, 2011, 113, 259-262

Zheng, S. Gradient descent algorithms for quantile regression with smooth approximation. International Journal of Machine Learning and Cybernetics, 2011, 2, 191-207

Moon, M. A. Demand and Supply Integration: The Key to World-Class Demand Forecasting, 2013

Chase, C. W. Next Generation Demand Management. Wiley, 2016

Why MAPE Is Not Always the Best Metric
