Background: Information on long-term alcohol consumption is relevant for medical and public health research, disease therapy, and other areas. Recently, DNA methylation-based inference of alcohol consumption from blood was reported with high accuracy, but these results were obtained by using the same dataset for model training and testing, which can lead to accuracy overestimation. Moreover, only subsets of alcohol consumption categories were used, which makes it impossible to extrapolate such models to the general population. By using data from eight population-based European cohorts (N = 4677), we internally and externally validated the previously reported biomarkers and models for epigenetic inference of alcohol consumption from blood and developed new models comprising data from all categories.
Results: By employing data from six European cohorts (N = 2883), we empirically tested the reproducibility of the previously suggested biomarkers and prediction models via ten-fold internal cross-validation. In contrast to previous findings, all seven models based on 144 CpGs yielded lower mean AUCs than the models with fewer CpGs. For instance, the 144-CpG heavy versus non-drinkers model gave an AUC of 0.78 ± 0.06, while the 5-CpG and 23-CpG models each achieved 0.83 ± 0.05. The transportability of the models was empirically tested via external validation in three independent European cohorts (N = 1794), revealing high AUC variance between datasets within models. For instance, the 144-CpG heavy versus non-drinkers model yielded AUCs ranging from 0.60 to 0.84 between datasets. The newly developed models that considered data from all categories showed low AUCs but also low AUC variation in the external validation. For instance, the 144-CpG heavy and at-risk versus light and non-drinkers model achieved AUCs of 0.67 ± 0.02 in the internal cross-validation and 0.61–0.66 in the external validation datasets.
Conclusions: The outcomes of our internal and external validation demonstrate that the previously reported prediction models suffer from both overfitting and accuracy overestimation. Our results show that the previously proposed biomarkers are not yet sufficient for accurate and robust inference of alcohol consumption from blood. Overall, our findings imply that DNA methylation prediction biomarkers and models need to be improved considerably before epigenetic inference of alcohol consumption from blood can be considered for practical applications.
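The contrast between testing a model on its own training data and testing it by ten-fold cross-validation can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the data are synthetic (144 simulated features for 60 samples, of which only 5 carry a weak signal), the toy linear scorer in `fit` is a stand-in rather than the models evaluated in this study, and all function names are hypothetical.

```python
import random
import statistics


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit(X, y):
    """Toy stand-in model: one weight per feature = difference of class means."""
    pos = [x for x, label in zip(X, y) if label == 1]
    neg = [x for x, label in zip(X, y) if label == 0]
    return [
        statistics.fmean(r[j] for r in pos) - statistics.fmean(r[j] for r in neg)
        for j in range(len(X[0]))
    ]


def predict(weights, X):
    """Linear score for each sample."""
    return [sum(w * v for w, v in zip(weights, x)) for x in X]


def stratified_folds(y, k, rng):
    """Split sample indices into k folds, keeping both classes in every fold."""
    folds = [[] for _ in range(k)]
    for label in (1, 0):
        idx = [i for i, v in enumerate(y) if v == label]
        rng.shuffle(idx)
        for rank, i in enumerate(idx):
            folds[rank % k].append(i)
    return folds


def cross_validated_auc(X, y, k=10, seed=0):
    """Mean and standard deviation of the AUC over k held-out folds."""
    rng = random.Random(seed)
    fold_aucs = []
    for test_idx in stratified_folds(y, k, rng):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        weights = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        scores = predict(weights, [X[i] for i in test_idx])
        fold_aucs.append(auc(scores, [y[i] for i in test_idx]))
    return statistics.fmean(fold_aucs), statistics.stdev(fold_aucs)


# Synthetic data (assumption): 30 "cases" and 30 "controls", 144 features,
# only the first 5 features differ slightly between the two classes.
rng = random.Random(42)
y = [1] * 30 + [0] * 30
X = [
    [rng.gauss(0.5 + (0.05 if label == 1 and j < 5 else 0.0), 0.1) for j in range(144)]
    for label in y
]

# Apparent AUC: the model is scored on the same samples it was fitted to.
apparent = auc(predict(fit(X, y), X), y)
cv_mean, cv_sd = cross_validated_auc(X, y, k=10)
print(f"apparent AUC:        {apparent:.2f}")
print(f"cross-validated AUC: {cv_mean:.2f} ± {cv_sd:.2f}")
```

In this setup the apparent AUC is expected to be clearly higher than the cross-validated AUC, because the weights also fit noise in the many uninformative features. External validation would follow the same pattern, with the model fitted on one dataset and scored on a separate one.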
Bibliographical note
Funding Information:
This work was performed within the framework of the BBMRI Metabolomics Consortium funded by BBMRI-NL, a research infrastructure financed by the Dutch government (NWO 184.021.007 and 184.033.111). A full list of the BIOS consortium investigators is available in Additional file . SCEM, AV, MG, and MK were supported by the Erasmus MC University Medical Center Rotterdam. AV was additionally supported with an EUR Fellowship by Erasmus University Rotterdam. Detailed cohort-specific funding information is included in the Supplementary Methods (Additional file ). The researchers are independent from the funders. The study sponsors had no role in the study design, data collection, data analysis, interpretation of data, or the preparation, review, or approval of the manuscript.
HJG has received travel grants and speakers honoraria from Fresenius Medical Care, Neuraxpharm, Servier and Janssen Cilag as well as research funding from Fresenius Medical Care.
The authors are grateful to the participants of the included cohorts: the Rotterdam Study (http://www.erasmus-epidemiology.nl/research/ergo.htm), the CODAM study (http://www.carimmaastricht.nl/), the Netherlands Twin Registry (http://www.tweelingenregister.org), the Leiden Longevity Study (http://www.leidenlangleven.nl), the PAN study (http://www.alsonderzoek.nl/), the KORA Study (https://www.helmholtzmuenchen.de/en/kora/index.html), SHIP-Trend (https://ship.community-medicine.de/) and TwinsUK (https://twinsuk.ac.uk/). Detailed cohort-specific acknowledgments are included in Additional file 1: Supplementary Methods.
© 2021, The Author(s).