TY - JOUR
T1 - Improving the predictive accuracy of production frontier models for efficiency measurement using machine learning
T2 - The LSB-MAFS method
AU - Guillen, María D.
AU - Aparicio, Juan
AU - Zofío, José L.
AU - España, Victor J.
N1 - Publisher Copyright: © 2024 The Authors
PY - 2024/11
Y1 - 2024/11
N2 - Making accurate predictions of the true production frontier is critical for reliable efficiency analysis. However, canonical deterministic methods like Data Envelopment Analysis (DEA) provide approximations of the production frontier that cannot accommodate noise satisfactorily and suffer from overfitting. This study combines two machine learning techniques, Least Squares Boosting (LSB) and Multivariate Adaptive Regression Splines (MARS), to introduce a new methodology that improves the accuracy of production frontier predictions and overcomes previous limitations. The new method fits pairwise regression splines to the data while ensuring that the predicted production frontiers satisfy the required regularity conditions: envelopmentness, monotonicity, and concavity. The method, termed LSB-MAFS, is implemented through computational algorithms, and we illustrate its applicability by performing simulations with several data generating processes. We also compare its performance against the most popular alternatives, considering both deterministic and stochastic scenarios: DEA, bootstrapped DEA, Corrected Concave Non-Parametric Least Squares (C2NLS) and Stochastic Frontier Analysis (SFA). The new method outperforms these alternatives in the most complex scenarios, including stochastic settings where parametric methods like SFA should perform better in principle. We conclude that our approach to production frontier prediction is a valid and competitive alternative for dependable efficiency analysis.
AB - Making accurate predictions of the true production frontier is critical for reliable efficiency analysis. However, canonical deterministic methods like Data Envelopment Analysis (DEA) provide approximations of the production frontier that cannot accommodate noise satisfactorily and suffer from overfitting. This study combines two machine learning techniques, Least Squares Boosting (LSB) and Multivariate Adaptive Regression Splines (MARS), to introduce a new methodology that improves the accuracy of production frontier predictions and overcomes previous limitations. The new method fits pairwise regression splines to the data while ensuring that the predicted production frontiers satisfy the required regularity conditions: envelopmentness, monotonicity, and concavity. The method, termed LSB-MAFS, is implemented through computational algorithms, and we illustrate its applicability by performing simulations with several data generating processes. We also compare its performance against the most popular alternatives, considering both deterministic and stochastic scenarios: DEA, bootstrapped DEA, Corrected Concave Non-Parametric Least Squares (C2NLS) and Stochastic Frontier Analysis (SFA). The new method outperforms these alternatives in the most complex scenarios, including stochastic settings where parametric methods like SFA should perform better in principle. We conclude that our approach to production frontier prediction is a valid and competitive alternative for dependable efficiency analysis.
UR - http://www.scopus.com/inward/record.url?scp=85200608931&partnerID=8YFLogxK
U2 - 10.1016/j.cor.2024.106793
DO - 10.1016/j.cor.2024.106793
M3 - Article
AN - SCOPUS:85200608931
SN - 0305-0548
VL - 171
JO - Computers and Operations Research
JF - Computers and Operations Research
M1 - 106793
ER -