Whole genome sequencing (WGS) is a key tool for identifying and characterising disease-associated bacteria across clinical, agricultural, and environmental contexts. One increasingly common use of genomic and metagenomic sequencing is to identify the type and range of antimicrobial resistance (AMR) genes present in bacterial isolates in order to predict their AMR phenotype. However, a large number of alternative bioinformatics software tools and pipelines are available, and they can produce dissimilar results. It is therefore vital that researchers carefully evaluate their genomic and metagenomic AMR analysis methods using a common dataset. To this end, as part of the Microbial Bioinformatics Hackathon and Workshop 2021, a ‘gold standard’ reference genomic and simulated metagenomic dataset was generated, containing raw sequence reads mapped against their corresponding reference genomes from a range of 174 potentially pathogenic bacteria. These datasets and their accompanying metadata are freely available for use in benchmarking studies of bacteria and their antimicrobial resistance genes, and will help improve tool development for the identification of AMR genes in complex samples.
This work was made possible and supported by a collaboration between the Public Health Alliance for Genomic Epidemiology (PHA4GE - https://pha4ge.org), the Joint Programming Initiative on Antimicrobial Resistance (JPIAMR - https://www.jpiamr.eu/) and the MRC Cloud Infrastructure for Microbial Bioinformatics (MRC CLIMB-BD - https://tinyurl.com/climb-movie). We would also like to thank Boas van der Putten (University of Amsterdam) for initial contributions to the work performed in this publication.
Publisher Copyright: © 2022, The Author(s).