Targeted DNA methylation analysis and prediction of smoking habits in blood based on massively parallel sequencing

Athina Vidaki*, Benjamin Planterose Jiménez, the Biobank-based Integrative Omics Study (BIOS) Consortium, Brando Poggiali, Vivian Kalamara, Kristiaan J. van der Gaag, Silvana C.E. Maas, Mohsen Ghanbari, Titia Sijen, Manfred Kayser

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review



Tobacco smoking is a frequent habit sustained by more than 1.3 billion people in 2020 and the leading preventable factor for health risk and premature mortality worldwide. In the forensic context, predicting smoking habits from biological samples may allow broadening DNA phenotyping. In this study, we aimed to implement previously published smoking habit classification models based on blood DNA methylation at 13 CpGs. First, we developed a matching lab tool based on bisulfite conversion and multiplex PCR followed by amplification-free library preparation and targeted paired-end massively parallel sequencing (MPS). Analysis of six technical duplicates revealed high reproducibility of methylation measurements (Pearson correlation of 0.983). Artificially methylated standards uncovered marker-specific amplification bias, which we corrected via bi-exponential models. We then applied our MPS tool to 232 blood samples from Europeans of a wide age range, of which 90 were current, 71 former and 71 never smokers. On average, we obtained 189,000 reads/sample and 15,000 reads/CpG, without marker drop-out. Methylation distributions per smoking category roughly corresponded to previous microarray analysis, showcasing large inter-individual variation but with technology-driven bias. Methylation at 11 out of 13 smoking-CpGs correlated with daily cigarettes in current smokers, while only one was weakly correlated with time since cessation in former smokers. Interestingly, eight smoking-CpGs correlated with age, and one displayed weak but significant sex-associated methylation differences. Using bias-uncorrected MPS data, smoking habits were predicted relatively accurately with both the two- (current/non-current) and three- (never/former/current) category models, but bias correction resulted in worse prediction performance for both models.
Finally, to account for technology-driven variation, we built new, joint models with inter-technology corrections, which resulted in improved prediction results for both models, with or without PCR bias correction (e.g. MPS cross-validation F1-score > 0.8; 2-categories). Overall, our novel assay takes us one step closer towards the forensic application of viable smoking habit prediction from blood traces. However, future research is needed to forensically validate the assay, especially in terms of sensitivity. We also need to shed further light on the employed biomarkers, particularly on the mechanisms, tissue specificity and putative confounders of smoking epigenetic signatures.
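
The two summary statistics quoted in the abstract, the Pearson correlation between technical duplicates and the F1-score of a two-category (current/non-current) prediction, can be illustrated with a minimal sketch. This is not the authors' code: the function names and all input values are hypothetical, and the study may have computed or averaged F1 differently (for instance across cross-validation folds).

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of values."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def f1_score(y_true, y_pred, positive="current"):
    """F1-score (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical methylation fractions for one sample measured twice.
replicate_1 = [0.12, 0.48, 0.75, 0.91]
replicate_2 = [0.10, 0.50, 0.73, 0.93]
print(round(pearson_r(replicate_1, replicate_2), 3))

# Hypothetical true and predicted smoking categories.
truth = ["current", "current", "non-current", "non-current"]
predicted = ["current", "non-current", "non-current", "current"]
print(f1_score(truth, predicted))
```

In this toy example one of the two current smokers is recovered and one non-current smoker is misclassified as current, so precision and recall are both 0.5.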

Original language: English
Article number: 102878
Journal: Forensic Science International: Genetics
Publication status: Published - Jul 2023

Bibliographical note

We would like to thank the donors from the Erasmus Rucphen Family (ERF) study who donated the whole blood samples included in this study. We are also grateful to Ivana Prokic (Dept. Epidemiology, Erasmus MC) and Arwin Ralf (Dept. Genetic Identification, Erasmus MC) for their technical assistance with metadata and sample curation, respectively. Methylation microarray data for the 13 smoking-CpGs were kindly provided by six Dutch cohorts embedded within the Biobank-based Integrative Omics Study (BIOS) Consortium: LifeLines, the Leiden Longevity Study, the Netherlands Twin Registry (NTR), the Rotterdam Study, the Cohort on Diabetes and Atherosclerosis Maastricht (CODAM) study, and the Prospective ALS study Netherlands (PAN). We would like to thank the participants of all aforementioned biobanks and their investigators. This research was financially supported by Erasmus MC and the Netherlands Forensic Institute. AV was additionally supported by an Erasmus MC fellowship 2020.

Publisher Copyright: © 2023 The Authors

