E-Poster Presentation Australian Society for Microbiology Annual Scientific Meeting 2021

Cluster-specific gene markers enhance Shigella and Enteroinvasive Escherichia coli in silico serotyping (#336)

Xiaomei Zhang 1, Michael Payne 1, Thanh Nguyen 1, Sandeep Kaur 1, Ruiting Lan 1
  1. University of New South Wales, Sydney, NSW, Australia

Shigella and enteroinvasive Escherichia coli (EIEC) cause human bacillary dysentery through similar invasion mechanisms. They share ancestry within E. coli and have similar physiological, biochemical and genetic characteristics. These similarities make differentiation between Shigella and EIEC difficult, yet distinguishing them is important for clinical diagnosis and public health epidemiological investigations. Current genetic markers may not discriminate between Shigella and EIEC in all cases. Importantly, Shigella and EIEC are separated into multiple phylogenetic clusters. In this study, we investigated the use of genomic markers for accurate identification and separation of Shigella and EIEC clusters and serotypes.

We identified 10 Shigella clusters, 7 EIEC clusters and 53 sporadic types of EIEC by examining over 17,000 publicly available Shigella/EIEC genomes. We then compared Shigella and EIEC accessory genomes to identify a single cluster-specific gene marker, or a set of such markers, for each of the 17 clusters and 53 sporadic types. The gene markers showed 99.63% accuracy and more than 97.02% specificity for cluster identification.

We additionally developed a freely available in silico serotyping pipeline, Shigella EIEC Cluster Enhanced Serotype Finder (ShigEiFinder), which incorporates the cluster-specific gene markers together with established Shigella/EIEC serotype-specific O antigen genes and modification genes. ShigEiFinder (https://github.com/LanLab/ShigEiFinder) can process either paired-end Illumina sequencing reads or assembled genomes. It almost perfectly differentiated Shigella from EIEC, with cluster assignment accuracy of 99.70% for assembled genomes and 99.81% for mapped reads. ShigEiFinder was able to identify over 59 Shigella serotypes and 22 EIEC serotypes, with high typing specificity: 99.40% for assembled genomes and 99.38% for mapped reads.

The cluster-specific gene markers that we identified enhanced in silico serotyping using genomic data and could be adapted for metagenomics or culture-independent typing. Our new serotyping tool, ShigEiFinder, can differentiate Shigella from EIEC and accurately assign phylogenetic clusters and serotypes from genome data. It will be useful for clinical, epidemiological and diagnostic investigations.