TY - UNPB
T1 - Improved risk predictions of vessels using machine learning: how effective is the status quo?
AU - Knapp, Sabine
AU - van de Velden, Michel
PY - 2024
Y1 - 2024
N2 - This study develops risk prediction models for 22 endpoints of interest (9 incident types, detention and 12 deficiency types) and quantifies how much the selection of ships for inspection by Port State Control (PSC) can be improved. A unique global dataset of 1.2 million observations, combining fleet data with global inspection and incident data for the period 2018 to 2023, is used. Over 600 variables were considered, and results were validated using out-of-sample data. Practical application of the proposed models is appraised by considering various combinations of the prediction models, both data-driven and expert-based. Our results show that there is ample room for improvement over the status quo. The most influential variables for prediction are the beneficial owner, the safety management company, and the age and size of the vessel. The targeting models based on balanced random forests (RF) clearly outperform those based on logit model predictions. Using prediction models to target high-risk vessels can reduce false negative events (that is, missing a risky vessel by not selecting it for inspection) by up to 46%. Targeting using prediction models can improve prediction power by a factor of 6 to 8 compared to random selection. Finally, combining the individual models into one main targeting metric does not improve prediction power compared to using the model for predicting incidents (VSS) alone, when evaluated on the 2023 out-of-sample data. When models are combined, the RF models lead to very similar results and appear able to incorporate information from the detention and deficiency models, which makes the transition towards predicting risk easier from a change management perspective.
M3 - Working paper
VL - 2024
T3 - Econometric Institute Research Report
BT - Improved risk predictions of vessels using machine learning: how effective is the status quo?
PB - Econometric Institute, EUR
ER -