ABRomics

ABRomics is an online community-driven platform to scale up and improve surveillance and research on antibiotic resistance from a One Health perspective.

Overview

Genomic Workflow

The ABRomics genomic workflow, powered by Galaxy France and launched from the ABRomics platform, is designed to process and analyze bacterial genomic data through a systematic approach. It is divided into four main steps, each ensuring robust and reliable results.

Quality and Contamination Control

This initial step checks that the raw paired-end Illumina reads are of high quality and screens them for contamination.

Key Steps:

  1. Quality control and trimming
    • fastp (Chen et al., 2018) for quality control and trimming
  2. Taxonomic assignment on trimmed data
    • Kraken2 (Wood et al., 2019) for taxonomic assignment of reads
    • Bracken (Lu et al., 2017) to re-estimate abundance at the species level
    • Recentrifuge (Martí, 2019) to produce a Krona chart
  3. Aggregating outputs into a single JSON file
    • ToolDistillator (ABRomics consortium, 2023) to extract and aggregate information from the different tool outputs into parsable JSON files

Outputs:

  1. Quality control:
    • Quality report
    • Trimmed reads
  2. Taxonomic assignment:
    • Tabular report of identified species
    • Tabular file assigning each read to a taxonomic level
    • Krona chart to illustrate species diversity of the sample
  3. Aggregating outputs:
    • JSON file with information about the outputs of fastp, Kraken2, Bracken, Recentrifuge
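
As a rough illustration of how the tabular species report could be screened for contamination, the sketch below reads a Bracken-style table and reports the dominant species. It assumes the table carries Bracken's usual header (`name`, `fraction_total_reads`, and so on). The function names, the 0.9 cut-off and the example rows are hypothetical and are not ABRomics criteria or real data.

```python
import csv
import io


def dominant_species(bracken_tsv):
    """Return (species name, read fraction) for the most abundant species."""
    rows = list(csv.DictReader(io.StringIO(bracken_tsv), delimiter="\t"))
    if not rows:
        raise ValueError("empty Bracken table")
    top = max(rows, key=lambda row: float(row["fraction_total_reads"]))
    return top["name"], float(top["fraction_total_reads"])


def looks_contaminated(bracken_tsv, min_fraction=0.9):
    """Flag a sample whose dominant species holds less than min_fraction of reads.

    min_fraction is a hypothetical cut-off chosen for this example only.
    """
    _, fraction = dominant_species(bracken_tsv)
    return fraction < min_fraction


# Illustrative table (made-up counts), using Bracken's standard column names.
EXAMPLE = (
    "name\ttaxonomy_id\ttaxonomy_lvl\tkraken_assigned_reads\t"
    "added_reads\tnew_est_reads\tfraction_total_reads\n"
    "Escherichia coli\t562\tS\t90000\t5000\t95000\t0.95\n"
    "Klebsiella pneumoniae\t573\tS\t4000\t1000\t5000\t0.05\n"
)
```

A single fixed cut-off is a simplification. In practice the acceptable fraction depends on the organism and on how closely related the other detected species are.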

Specifications:

| Tool | Version | Parameter | Database |
|---|---|---|---|
| fastp | 0.23.4 | Default. | / |
| Kraken2 | 2.1.3 | Default. | PlusPF-16 (version 2022-06-07) |
| Bracken | 3.0 | Default. Taxonomic level: Species | PlusPF-16 (version 2022-06-07) |
| Recentrifuge | 1.15.0 | Default. | NCBI-2015-10-05 |
| ToolDistillator | 0.9.1 | Per-tool files, listed below | / |

ToolDistillator parameters:

  • fastp
    – Default: report.json
    – Optional: trimmed_R1.fastq, trimmed_R2.fastq, report.html
  • Kraken2
    – Default: taxonomy_assignation.tsv
    – Optional: reads_assignation.txt
  • Bracken
    – Default: output.tsv
    – Optional: kraken_reestimated_report.tsv, prior of read for estimation (default 0), read length, taxonomic level
  • Recentrifuge
    – Default: data.tsv
    – Optional: report.html, stat.tsv
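
The fastp `report.json` listed in the specifications can be inspected programmatically. The following minimal sketch computes the fraction of reads that survive trimming, assuming the report follows fastp's usual `summary` layout with `before_filtering` and `after_filtering` blocks. The function name and the example numbers are illustrative only.

```python
import json


def read_retention(fastp_report):
    """Return the fraction of reads kept after fastp filtering."""
    summary = fastp_report["summary"]
    before = summary["before_filtering"]["total_reads"]
    after = summary["after_filtering"]["total_reads"]
    if before == 0:
        raise ValueError("report lists zero input reads")
    return after / before


# Made-up excerpt of a fastp JSON report, reduced to the keys used here.
EXAMPLE_REPORT = json.loads("""
{
  "summary": {
    "before_filtering": {"total_reads": 1000000, "q30_rate": 0.91},
    "after_filtering":  {"total_reads": 950000,  "q30_rate": 0.94}
  }
}
""")
```

With a real report, the dictionary would be loaded with `json.load` from the file produced by the workflow rather than from an inline string.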

Genome Assembly

Once the data is cleaned, it is assembled into contigs to form a coherent genomic sequence.

Key Steps:

  1. Assembly of reads into a final contig FASTA file
    • Shovill (Seemann, 2016)
  2. Quality control of the assembly
    • Quast (Gurevich et al., 2013)
    • Bandage (Wick et al., 2015) to plot the assembly graph
    • Refseqmasher (Ondov et al., 2016) to identify the closest reference genome
  3. Aggregating outputs into a single JSON file
    • ToolDistillator (ABRomics consortium, 2023) to extract and aggregate information from the different tool outputs into parsable JSON files

Outputs:

  1. Assembly:
    • Assembled contigs in FASTA format
    • Reads mapped to the assembly in BAM format
    • Assembly graph in GFA format
  2. Quality of Assembly:
    • Assembly report
    • Assembly graph
    • Tabular result listing the closest reference genomes
  3. Aggregating outputs:
    • JSON file with information about the outputs of Shovill, Quast, Bandage, Refseqmasher
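
The aggregation idea can be pictured with a small sketch that gathers per-tool results into one JSON document. This is only an illustration of the concept: the `aggregate` function and the record layout (`tool`, `results`) are hypothetical and do not reproduce ToolDistillator's actual schema or interface.

```python
import json


def aggregate(tool_results):
    """Merge per-tool result dictionaries into a single JSON string.

    tool_results maps a tool name to a dictionary of already-parsed values.
    The output layout is invented for this example.
    """
    records = [
        {"tool": name, "results": results}
        for name, results in sorted(tool_results.items())
    ]
    return json.dumps(records, indent=2)
```

Keeping one record per tool means downstream code can read a single file instead of handling each tool's native output format.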

Specifications:

| Tool | Version | Parameter | Database |
|---|---|---|---|
| Shovill | 1.1.0 | Default. | / |
| Quast | 5.3.0 | Default. | / |
| Bandage | 2022.09 | Default. | / |
| Refseqmasher | 0.1.2 | Default. Top N matches to report: 3 | / |
| ToolDistillator | 0.9.1 | Per-tool files, listed below | / |

ToolDistillator parameters:

  • Shovill
    – Default: contigs.fasta
    – Optional: alignment.bam, contigs_graph.gfa
  • Quast
    – Default: output.tsv
    – Optional: report.html
  • Bandage
    – Default: report_info.txt
    – Optional: plot.svg
  • Refseqmasher
    – Default: results.txt
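
N50 is one of the contiguity metrics commonly found in assembly reports such as Quast's. The sketch below shows how it can be derived from a contig FASTA file: N50 is the length of the contig at which the running total of contig lengths, taken from longest to shortest, first reaches half of the assembly size. The function names are illustrative, and this is not Quast's own code.

```python
def contig_lengths(fasta_text):
    """Return the length of each sequence in FASTA-formatted text."""
    lengths = []
    current = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Return the N50 of a list of contig lengths."""
    if not lengths:
        raise ValueError("no contigs")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
```

For a real assembly, the text of `contigs.fasta` would be passed to `contig_lengths` and the resulting list to `n50`.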

Genome Annotation

This step annotates assembled genomes and identifies key genetic elements.

Key Steps:

  1. Genomic annotation
    • Bakta (Schwengers et al., 2021) to predict CDS and small proteins (sORF)
  2. Integron identification
    • IntegronFinder2 (Néron et al., 2022) to identify CALIN elements, In0 elements, and complete integrons
  3. Plasmid gene identification
    • PlasmidFinder (Carattoli and Hasman, 2020) to identify and type plasmid sequences
  4. Insertion sequence (IS) detection
    • ISEScan (Xie and Tang, 2017) to detect IS elements
  5. Aggregating outputs into a single JSON file
    • ToolDistillator (ABRomics consortium, 2023) to extract and aggregate information from the different tool outputs into parsable JSON files

Outputs:

  1. Genomic annotation:
    • Genome annotation in tabular, GFF and several other formats
    • Annotation plot
    • Nucleotide and protein sequences identified
    • Summary of identified genomic elements
  2. Integron identification:
    • Integron identification in tabular format and a summary
  3. Plasmid gene identification:
    • Identified plasmid genes and associated BLAST hits
  4. Insertion sequence (IS) detection:
    • IS element list in tabular format
    • IS hits in FASTA format
    • ORF hits in protein and nucleotide FASTA format
    • IS annotation in GFF format
  5. Aggregating outputs:
    • JSON file with information about the outputs of Bakta, IntegronFinder2, PlasmidFinder, ISEScan

Specifications:

| Tool | Version | Parameter | Database |
|---|---|---|---|
| Bakta | 1.9.4 | Default (“Full” annotation). | / |
| IntegronFinder2 | 2.0.5 | Default. Thorough local detection: Yes. Search also for promoter and attI sites? Yes | / |
| PlasmidFinder | 2.1.6 | Default. | commit 81c11f4 – 2023-12-04 |
| ISEScan | 1.7.2.3 | Default. | / |
| ToolDistillator | 0.9.1 | Per-tool files, listed below | / |

ToolDistillator parameters:

  • Bakta
    – Default: output.json
    – Optional: protein.faa, nucleotide.fna, annotation.gff3, annotation.tsv, summary.txt, Genbank file, Embl file, contigs.fasta, hypothetical_protein.fasta, hypothetical_annotation.tsv, plot.svg
  • IntegronFinder2
    – Default: output.integrons
    – Optional: output.summary
  • PlasmidFinder
    – Default: output.json
    – Optional: genome_hits.fasta, plasmid_hits.fasta
  • ISEScan
    – Default: output.tsv
    – Optional: is.fna, orf.faa, orf.fna, annotation.gff3
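
Annotation files in GFF3 format, such as the `annotation.gff3` outputs listed in the specifications, can be summarised with a few lines of code. The sketch below counts features by type using the third GFF3 column. The function name and the example records are hypothetical, and the example is not real Bakta or ISEScan output.

```python
from collections import Counter


def count_feature_types(gff3_text):
    """Count GFF3 features by type (third tab-separated column)."""
    counts = Counter()
    for line in gff3_text.splitlines():
        # An embedded sequence section, if present, ends the feature table.
        if line.startswith("##FASTA"):
            break
        if not line or line.startswith("#"):
            continue
        columns = line.split("\t")
        if len(columns) == 9:
            counts[columns[2]] += 1
    return counts


# Made-up GFF3 excerpt used only to exercise the function.
EXAMPLE_GFF3 = (
    "##gff-version 3\n"
    "contig_1\tExample\tCDS\t1\t300\t.\t+\t0\tID=gene_1\n"
    "contig_1\tExample\tCDS\t400\t900\t.\t-\t0\tID=gene_2\n"
    "contig_1\tExample\ttRNA\t1000\t1075\t.\t+\t.\tID=trna_1\n"
)
```

The same counter can be compared across samples to spot assemblies with an unusually low number of predicted coding sequences.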

AMR Gene Detection

Performed in parallel with annotation, this step focuses on detecting antimicrobial resistance (AMR) genes.

Key Steps:

  1. Genomic detection
    • Antimicrobial resistance gene identification:
      • StarAMR (Bharat et al., 2022) to run BLAST searches against the ResFinder (Zankari et al., 2012) and PlasmidFinder (Carattoli et al., 2014) databases
      • AMRFinderPlus (Feldgarden et al., 2021) to find antimicrobial resistance genes and point mutations
    • Virulence gene identification:
      • ABRicate (Seemann, 2016) with the VFDB_A database
  2. Aggregating outputs into a single JSON file
    • ToolDistillator (ABRomics consortium, 2023) to extract and aggregate information from the different tool outputs into parsable JSON files

Outputs:

  1. Genomic detection:
    • Antimicrobial resistance gene identification:
      • AMR gene list
      • MLST typing
      • Plasmid gene identification
      • BLAST hits
      • AMR gene fasta (assembled nucleotide sequences)
      • Point mutation list
    • Virulence gene identification:
      • Gene identification in tabular format
  2. Aggregating outputs:
    • JSON file with information about the outputs of StarAMR, AMRFinderPlus, ABRicate

Specifications:

| Tool | Version | Parameter | Database |
|---|---|---|---|
| StarAMR | 0.10.0 | Default. Percent identity threshold for BLAST: 90.0 | ResFinder: 2.4.0 – commit e0525f2 – 2024-09-23; PointFinder: 4.1.1 – commit 694919f – 2024-08-08; PlasmidFinder: commit 4add282 – 2024-11-14; MLST version: 2.23.0 |
| AMRFinderPlus | 3.12.8 | Default. | V3.12 – 2024-05-02 |
| ABRicate | 1.0.1 | Default (Minimum DNA %identity and %coverage: 80.0). | VFDB |
| ToolDistillator | 0.9.1 | Per-tool files, listed below | / |

ToolDistillator parameters:

  • StarAMR
    – Default: resfinder.tsv
    – Optional: mlst.tsv, pointfinder.tsv, plasmidfinder.tsv, settings.tsv
  • AMRFinderPlus
    – Default: report.tsv
    – Optional: point_mutation_report.tsv, nucleotide_sequence.fasta
  • ABRicate
    – Default: report.tsv
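
The identity and coverage thresholds in the specifications act as filters on candidate hits. As a sketch of that logic, the code below re-filters an ABRicate-style report, assuming the standard `%IDENTITY`, `%COVERAGE` and `GENE` columns. The 80.0 defaults mirror the values listed above. The function name, gene names, accessions and numbers in the example are invented for illustration.

```python
import csv
import io


def filter_hits(abricate_tsv, min_identity=80.0, min_coverage=80.0):
    """Return gene names whose identity and coverage both meet the thresholds."""
    reader = csv.DictReader(io.StringIO(abricate_tsv), delimiter="\t")
    return [
        row["GENE"]
        for row in reader
        if float(row["%IDENTITY"]) >= min_identity
        and float(row["%COVERAGE"]) >= min_coverage
    ]


# Made-up report with ABRicate's usual header; genes and values are not real.
EXAMPLE = (
    "#FILE\tSEQUENCE\tSTART\tEND\tSTRAND\tGENE\tCOVERAGE\tCOVERAGE_MAP\t"
    "GAPS\t%COVERAGE\t%IDENTITY\tDATABASE\tACCESSION\tPRODUCT\tRESISTANCE\n"
    "sample.fasta\tcontig_1\t100\t1000\t+\tgeneA\t1-901/901\t===============\t"
    "0/0\t100.00\t99.50\tvfdb\tACC_0001\thypothetical product A\t\n"
    "sample.fasta\tcontig_2\t50\t870\t-\tgeneB\t1-821/1000\t============...\t"
    "0/0\t82.10\t85.00\tvfdb\tACC_0002\thypothetical product B\t\n"
    "sample.fasta\tcontig_3\t10\t900\t+\tgeneC\t1-891/900\t===============\t"
    "0/0\t99.00\t79.00\tvfdb\tACC_0003\thypothetical product C\t\n"
)
```

Because ABRicate already applies its own minimum thresholds when it runs, a post-hoc filter like this is mainly useful for applying stricter cut-offs to an existing report.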

Conclusion

The ABRomics workflow provides a comprehensive and integrated approach for bacterial genomic data analysis. From ensuring data quality to identifying critical genes, each step is optimized to deliver actionable and well-organized results.


Useful Links

  • Galaxy France platform: A web-based platform providing access to powerful, open-source tools for large-scale genomic and metagenomic data analysis.
  • Learning Pathway “Detection of AMR genes in bacterial genomes”, part of the Galaxy Training Network: hands-on tutorials for researchers and students interested in detecting antimicrobial resistance (AMR) genes in bacterial genomes.

References

  • ABRomics consortium (2023). ToolDistillator: a tool to extract and aggregate information from different tool outputs to JSON parsable files. https://gitlab.com/ifb-elixirfr/abromics/tooldistillator
  • Bharat A, Petkau A, Avery BP, Chen JC, Folster JP, Carson CA, Kearney A, Nadon C, Mabon P, Thiessen J, Alexander DC, Allen V, El Bailey S, Bekal S, German GJ, Haldane D, Hoang L, Chui L, Minion J, Zahariadis G, Domselaar GV, Reid-Smith RJ, Mulvey MR. Correlation between Phenotypic and In Silico Detection of Antimicrobial Resistance in Salmonella enterica in Canada Using Staramr. Microorganisms. 2022; 10(2):292. 10.3390/microorganisms10020292
  • Carattoli, A., and H. Hasman, 2020 PlasmidFinder and in silico pMLST: identification and typing of plasmid replicons in whole-genome sequencing (WGS). Horizontal gene transfer: methods and protocols 285–294. 10.1007/978-1-4939-9877-7_20
  • Chen, S., Y. Zhou, Y. Chen, and J. Gu, 2018 fastp: an ultra-fast all-in-one FASTQ preprocessor. 10.1093/bioinformatics/bty560
  • Feldgarden, M., Brover, V., Gonzalez-Escalona, N. et al. AMRFinderPlus and the Reference Gene Catalog facilitate examination of the genomic links among antimicrobial resistance, stress response, and virulence. Sci Rep 11, 12728 (2021). 10.1038/s41598-021-91456-0
  • Gurevich, A., V. Saveliev, N. Vyahhi, and G. Tesler, 2013 QUAST: quality assessment tool for genome assemblies. Bioinformatics 29: 1072–1075. 10.1093/bioinformatics/btt086
  • Lu, J., F. P. Breitwieser, P. Thielen, and S. L. Salzberg, 2017 Bracken: estimating species abundance in metagenomics data. PeerJ Computer Science 3: e104. 10.7717/peerj-cs.104
  • Martí, J. M., 2019 Recentrifuge: Robust comparative analysis and contamination removal for metagenomics. PLoS computational biology 15: e1006967. 10.1371/journal.pcbi.100696
  • Néron, B., E. Littner, M. Haudiquet, A. Perrin, J. Cury et al., 2022 IntegronFinder 2.0: identification and analysis of integrons across bacteria, with a focus on antibiotic resistance in Klebsiella. Microorganisms 10: 700. 10.3390/microorganisms10040700
  • Ondov, B. D., Treangen, T. J., Melsted, P., Mallonee, A. B., Bergman, N. H., Koren, S., & Phillippy, A. M. (2016). Mash: fast genome and metagenome distance estimation using MinHash. Genome Biology, 17(1). 10.1186/s13059-016-0997-x
  • Schwengers, O., L. Jelonek, M. A. Dieckmann, S. Beyvers, J. Blom et al., 2021 Bakta: rapid and standardized annotation of bacterial genomes via alignment-free sequence identification. Microbial genomics 7: 000685. 10.1099/mgen.0.000685
  • Seemann, T. (2016). ABRicate: mass screening of contigs for antibiotic resistance genes. https://github.com/tseemann/abricate
  • Seemann, T. (2016). Shovill: Assemble bacterial isolate genomes from Illumina paired-end reads. https://github.com/tseemann/shovill
  • Wick, R. R., M. B. Schultz, J. Zobel, and K. E. Holt, 2015 Bandage: interactive visualization of de novo genome assemblies. Bioinformatics 31: 3350–3352. 10.1093/bioinformatics/btv383
  • Wood, D. E., and S. L. Salzberg, 2014 Kraken: ultrafast metagenomic sequence classification using exact alignments. Genome Biology 15: R46. 10.1186/gb-2014-15-3-r46
  • Wood, D. E., J. Lu, and B. Langmead, 2019 Improved metagenomic analysis with Kraken 2. Genome biology 20: 1–13. 10.1186/s13059-019-1891-0
  • Xie, Z., and H. Tang, 2017 ISEScan: automated identification of insertion sequence elements in prokaryotic genomes. Bioinformatics 33: 3340–3347. 10.1093/bioinformatics/btx433
  • Zankari, E., H. Hasman, S. Cosentino, M. Vestergaard, S. Rasmussen et al., 2012 Identification of acquired antimicrobial resistance genes. Journal of antimicrobial chemotherapy 67: 2640–2644. 10.1093/jac/dks261
