# BV-BRC Data Types and Sources

## Overview

BV-BRC is an integrated data and analysis resource designed to support genomic and related infectious disease research for viral and bacterial pathogens. Its primary data type is therefore genomic sequences, ingested mainly from public repositories such as NCBI GenBank. BV-BRC reannotates genomes with curated subsystem data for consistency, curates and standardizes metadata, and formats the data so that the integrated tools can perform comparative analysis computations. BV-BRC also assembles reads collected from SRA and annotates the resulting genomes for specialized phenotypic characteristics, such as antimicrobial resistance (AMR).

Annotations may be derived from computations such as gene and other feature prediction algorithms, subsystem and other functional groupings, and phenotype prediction such as AMR. Appropriate references to the computations performed are available in the user documentation. Data from expression studies have also been curated and structured for comparative analyses.

Additional data are incorporated from various resources to augment annotations and comparative analyses, such as protein structures from PDB and computed structures, e.g., from AlphaFold. Further, BV-BRC has integrated data sets from other NIAID programs, such as the Systems Biology Centers (SBCs). Additional metadata from serology and surveillance efforts may also be associated with genome sequences and other data types. Data from these resources carry appropriate provenance information to enable traceability to the source.

Below are the data types supported by BV-BRC.
Each has a "source" designation of either

* Primary - retrieved and imported from an external source, with origin information
* Secondary - curated, derived, or generated by BV-BRC

The supported data types are:

* Phylogeny
* Taxonomy
* Strains
* Features
* Genomes
* AMR Phenotypes
* Sequences
* Features
* Proteins
* Protein Structures
* Specialty Genes
* Domains and Motifs
* Epitopes
* Pathways
* Subsystems
* Experiments
* Interactions, Taxon/Genome-Level
* Surveillance Data
* Serology Data
* Sequence Feature Variant Types

## Phylogeny

**Description:**

**Source:**

**Related and Derived Data Types:**

**Origin Data Source(s):**

**Processing Protocol(s):**

**Quick Reference Guide:** []()

## Taxonomy

**Description:**

**Source:** Primary

**Related and Derived Data Types:** Reference organisms

**Origin Data Source(s):** NCBI Taxonomy, ICTV

**Processing Protocol(s):**

**Quick Reference Guide:** [Taxonomic Overview](https://www.bv-brc.org/docs/quick_references/organisms_taxon/overview.html)

## Genomes

**Description:** The central data type in BV-BRC is genomes. Most of the data and information within BV-BRC is linked back to sequenced, assembled, and annotated genomes stored in the BV-BRC database. Genomes are incorporated from RefSeq, GenBank, and other sources, and are annotated using a standard annotation protocol, RASTtk, to enable comparative analyses and linking of data across the website. In addition, the BV-BRC team searches the literature for large published AMR studies and assembles the corresponding genomes from the reads available in the SRA database.

**Source:** Primary

**Related and Derived Data Types:** Clinical and environmental metadata, AMR / AVR phenotypes, QC results

**Origin Data Source(s):** GenBank, SRA, User-published

**Processing Protocol(s):**

**Quick Reference Guide:** [Genomes](https://www.bv-brc.org/docs/quick_references/organisms_taxon/genomes.html)

##

**Description:**

**Source:**

**Related and Derived Data Types:**

**Origin Data Source(s):**

**Processing Protocol(s):**

**Quick Reference Guide:** []()

##

**Description:**

**Source:**

**Related and Derived Data Types:**

**Origin Data Source(s):**

**Processing Protocol(s):**

**Quick Reference Guide:** []()

##

**Description:**

**Source:**

**Related and Derived Data Types:**

**Origin Data Source(s):**

**Processing Protocol(s):**

**Quick Reference Guide:** []()

##

**Description:**

**Source:**

**Related and Derived Data Types:**

**Origin Data Source(s):**

**Processing Protocol(s):**

**Quick Reference Guide:** []()

##

**Description:**

**Source:**

**Related and Derived Data Types:**

**Origin Data Source(s):**

**Processing Protocol(s):**

**Quick Reference Guide:** []()

##

**Description:**

**Source:**

**Related and Derived Data Types:**

**Origin Data Source(s):**

**Processing Protocol(s):**

**Quick Reference Guide:** []()

##

**Description:**

**Source:**

**Related and Derived Data Types:**

**Origin Data Source(s):**

**Processing Protocol(s):**

**Quick Reference Guide:** []()

##

**Description:**

**Source:**

**Related and Derived Data Types:**

**Origin Data Source(s):**

**Processing Protocol(s):**

**Quick Reference Guide:** []()

##

**Description:**

**Source:**

**Related and Derived Data Types:**

**Origin Data Source(s):**

**Processing Protocol(s):**

**Quick Reference Guide:** []()

##

**Description:**

**Source:**

**Related and Derived Data Types:**

**Origin Data Source(s):**

**Processing Protocol(s):**

**Quick Reference Guide:** []()

##

**Description:**

**Source:**

**Related and Derived Data Types:**

**Origin Data Source(s):**

**Processing Protocol(s):**

**Quick Reference Guide:** []()
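
The per-type fields used in this document (Source, Related and Derived Data Types, Origin Data Source(s)) can be pictured as a small record. The Python sketch below is illustrative only: the names `Source`, `DataTypeEntry`, `CATALOG`, and `by_source` are hypothetical and not part of any BV-BRC API, and the two entries simply restate the Taxonomy and Genomes sections.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Source(Enum):
    """The two "source" designations described in the overview."""
    PRIMARY = "Primary"      # retrieved and imported from an external source
    SECONDARY = "Secondary"  # curated, derived, or generated by BV-BRC


@dataclass
class DataTypeEntry:
    """Hypothetical record mirroring the per-type fields in this document."""
    name: str
    source: Source
    related_types: List[str] = field(default_factory=list)
    origin_sources: List[str] = field(default_factory=list)


# Entries transcribed from the Taxonomy and Genomes sections.
CATALOG = [
    DataTypeEntry(
        name="Taxonomy",
        source=Source.PRIMARY,
        related_types=["Reference organisms"],
        origin_sources=["NCBI Taxonomy", "ICTV"],
    ),
    DataTypeEntry(
        name="Genomes",
        source=Source.PRIMARY,
        related_types=[
            "Clinical and environmental metadata",
            "AMR / AVR phenotypes",
            "QC results",
        ],
        origin_sources=["GenBank", "SRA", "User-published"],
    ),
]


def by_source(catalog: List[DataTypeEntry], source: Source) -> List[str]:
    """Return the names of the data types that carry the given designation."""
    return [entry.name for entry in catalog if entry.source is source]


if __name__ == "__main__":
    print(by_source(CATALOG, Source.PRIMARY))
```

Modeling the designation as an enumeration rather than a free-text string keeps the Primary/Secondary distinction checkable as further data types are filled in.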