Exploration of machine learning methods for maritime risk predictions

Sabine Knapp*, Michel van de Velden

*Corresponding author for this work

Research output: Contribution to journal > Article > Academic > peer-review

1 Citation (Scopus)


Maritime applications such as targeting ships for inspections, improved domain awareness, and dynamic risk exposure assessments for strategic planning all benefit from ship-specific incident probabilities. Using a unique and comprehensive global data set of 1.2 million observations covering the period from 2014 to 2020, this study explores the effectiveness and suitability of 144 model variants from the field of machine learning for eight incident endpoints of interest and evaluates over 580 covariates. Furthermore, the importance of covariates is examined and visualized. The results differ for each endpoint of interest but confirm that random forest methods can improve prediction capabilities. Based on out-of-sample evaluations for the year 2020 and measured by top decile lift, targeting the 10% of vessels with the highest predicted risk would improve predictions by a factor of 2.7 to 4.9 compared to random selection. Balanced random forests and random forests with balanced training variants outperform regular random forests, and the selected variants also depend on aggregation types. The most important covariate groups for predicting incident probabilities relate to beneficial ownership, the safety management company, and the size and age of the vessel, and the relevance of these factors remains similar across the different endpoints of interest.
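
The top decile lift mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common definition: the incident rate among the 10% of vessels with the highest predicted probability, divided by the incident rate across all vessels. The function name `top_decile_lift` and the toy data are hypothetical and are not taken from the study.

```python
def top_decile_lift(y_true, scores, fraction=0.10):
    """Incident rate in the top-scored fraction divided by the overall incident rate.

    y_true: list of 0/1 incident labels (1 = incident occurred).
    scores: list of predicted incident probabilities, same length as y_true.
    Assumption: this is the usual lift definition, not necessarily the
    exact computation used in the paper.
    """
    if len(y_true) != len(scores) or len(y_true) == 0:
        raise ValueError("y_true and scores must be non-empty and of equal length")

    # Number of vessels in the targeted group (at least one).
    n_top = max(1, round(len(scores) * fraction))

    # Rank vessels from highest to lowest predicted risk.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)

    top_rate = sum(label for _, label in ranked[:n_top]) / n_top
    base_rate = sum(y_true) / len(y_true)
    if base_rate == 0:
        raise ValueError("lift is undefined when there are no incidents")
    return top_rate / base_rate


# Made-up toy data: 20 vessels, 4 incidents, scores strictly decreasing.
toy_labels = [1, 0, 1, 1, 1] + [0] * 15
toy_scores = [1.0 - 0.05 * i for i in range(20)]
print(top_decile_lift(toy_labels, toy_scores))
```

Under this definition, a lift of 1 corresponds to random selection, so the reported range of 2.7 to 4.9 means the targeted group contains roughly three to five times as many incidents as a random group of the same size.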

Original language: English
Journal: Maritime Policy and Management
Publication status: Accepted/In press - 2023

Bibliographical note

Publisher Copyright:
© 2023 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group.

