4th European International Conference on Industrial Engineering and Operations Management

Comparing the Accuracy of Prediction Models based on Ensemble Machine Learning Schemes

Carlos Hernández
Publisher: IEOM Society International
Track: Machine Learning
Abstract

This research analyzes how the configuration of ensemble learning algorithms influences their accuracy when predicting the annual production of honey for export in the south of Chile. The research follows a classic four-stage methodology (analysis, design, development, and validation). During the analysis, data are gathered and preprocessed. During the design, the independent variables, ensemble algorithms, and performance metrics (correlation coefficient, MAE, and RMSE) are defined. Model construction and validation are carried out in the WEKA software. Nine variables are considered to build the models. The dataset is split into one subset for training and testing (80%) and another for validation (20%). The predictions are obtained by configuring a stacking scheme as the ensemble and interchanging a support vector machine, a linear regression, a decision tree, and a Gaussian process as meta or base learners. According to the results, the correlation coefficient between predictions and actual values fluctuates significantly, ranging from 18% to 46%, while MAE ranges from 32% to 37%. In conclusion, although the predictions are inaccurate, the results suggest that the arrangement of the meta and base algorithms within the ensemble does affect the prediction accuracy.
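The experimental design described in the abstract can be illustrated with a minimal Python sketch. It is not the WEKA workflow used in the study. It assumes that "interchanging" means each of the four learners takes the meta role once while the other three act as base learners, which the abstract does not state explicitly, and it uses the textbook definitions of the three metrics; how the study scales MAE to a percentage is not specified and is not reproduced here. The function names and the sample numbers are hypothetical.

```python
import math

# The four learners named in the abstract.
LEARNERS = [
    "support vector machine",
    "linear regression",
    "decision tree",
    "Gaussian process",
]


def stacking_arrangements(learners):
    """Yield (meta, base_learners) pairs.

    Assumption: each learner is the meta learner once and the
    remaining learners are the base learners.
    """
    for i, meta in enumerate(learners):
        yield meta, learners[:i] + learners[i + 1:]


def correlation(actual, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_a == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_a * var_p)


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


if __name__ == "__main__":
    # Made-up values for illustration only; not data from the study.
    actual = [1.0, 2.0, 3.0, 4.0]
    predicted = [2.0, 2.0, 4.0, 4.0]

    for meta, base in stacking_arrangements(LEARNERS):
        print(f"meta: {meta}; base: {', '.join(base)}")

    print("correlation:", correlation(actual, predicted))
    print("MAE:", mae(actual, predicted))
    print("RMSE:", rmse(actual, predicted))
```

In a setup of this kind, each arrangement would be trained on the 80% subset and the three metrics computed on the held-out 20%, so that arrangements can be compared on the same validation data.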

Published in: 4th European International Conference on Industrial Engineering and Operations Management, Rome, Italy

Date of Conference: August 2-5, 2021

ISBN: 978-1-7923-6127-2
ISSN/E-ISSN: 2169-8767