E-Poster Presentation Australian Society for Microbiology Annual Scientific Meeting 2021

New Curated Databases in BioCyc.org for the Pathogens Acinetobacter baumannii and Chlamydia trachomatis (#323)

Lisa R Moore 1, Amanda Mackie 1, Ron Caspi 2, Anamika Kothari 2, Suzanne Paley 2, Ian Paulsen 1, Peter Karp 2
  1. Macquarie University, North Ryde, NSW, Australia
  2. Bioinformatics Research Group, SRI International, Menlo Park, California, United States

BioCyc.org is a web portal to >18,000 Pathway/Genome Databases (PGDBs) for sequenced microbes. The PGDBs couple high-quality genomic and metabolic pathway information with a variety of sophisticated bioinformatics tools. They vary in the amount of curation they have received; some have undergone significant manual curation of genes, proteins and reactions to improve on the metabolic pathway and transporter predictions made by the initial computational inferences. The curation process starts with BioCyc's PathoLogic program, which carries out a series of computational inferences on an annotated RefSeq genome. This is followed by the import of information from related databases, such as protein features from UniProt and gene essentiality data from OGEE. Finally, manual curation integrates relevant information from the experimental literature, including updating gene and protein names and functions, writing summaries for selected proteins, and adding citations.

We introduce new BioCyc PGDBs for three human pathogenic bacteria: the multi-drug resistant Acinetobacter baumannii strains ATCC 17978 and AB5075-UW, which are implicated in a variety of community- and hospital-acquired infections, and Chlamydia trachomatis strain D/UW-3/CX, a common sexually transmitted pathogen. More than 80 publications were used to curate the two A. baumannii PGDBs, updating gene functions with a focus on virulence factors such as metal acquisition systems and outer membrane proteins (OMPs). Curation of the C. trachomatis serovar D PGDB is based on >30 publications, with a primary focus on the reduced but still functional central metabolic pathways that reflect the smaller genome and fewer metabolic pathways of this obligate intracellular pathogen. Several important curated proteins, reactions and pathways will be highlighted using some of the extensive bioinformatics tools available through BioCyc for searching, visualizing and analyzing the PGDBs. For instance, painting a proteomics dataset onto the cellular pathway overview and a comparative analysis of TCA cycles will be used to show how these newly curated PGDBs add to the growing BioCyc collection, which serves as a valuable resource for researchers and educators.