
Genome Sequence Data

Contig, chromosome, and plasmid nucleotide sequences for a genome

Data Type: genome_sequence

Primary Key: sequence_id

Attributes

_version_ (number)
accession (string) - GenBank/RefSeq accession (with or without version) for this replicon; authoritative link to external databases. "NC_000913.3"
chromosome (case insensitive string) - Label for chromosomal replicons; may be empty for plasmids or viral segments. "Chromosome"
date_inserted (date) - ISO-8601 UTC datetime when the sequence row entered BV-BRC. "2023-05-14T12:33:21Z"
date_modified (date) - Updated whenever any field in the row changes; enables cache invalidation and incremental syncs. "2025-04-02T08:11:05Z"
description (case insensitive string) - Full DEFINITION from the GenBank record or submitter-supplied description. "Escherichia coli K-12 MG1655 complete genome"
gc_content (number) - 100 × (G + C) / length, stored as float with one decimal. 50.8
genome_id (string) - Stable BV-BRC genome_id that owns this sequence; foreign-key join target. "511145.183"
genome_name (case insensitive string) - Scientific name + strain for convenience display; denormalised from the genome record. "Escherichia coli K-12 MG1655"
gi (integer) - Historic GenInfo Identifier; null for accessions created after NCBI retired GIs (2016). 556503834
length (integer) - Total number of nucleotides in the replicon. 4641652
mol_type (case insensitive string) - Values such as genomic DNA, viral cRNA, plasmid DNA. Mirrors the GenBank MOL_TYPE qualifier. "genomic DNA"
owner (string) - BV-BRC user/org that controls the record; governs default ACLs. "patric_public"
p2_sequence_id (integer) - Numeric key from retired PATRIC2 schema; retained for cross-reference. 1234567
plasmid (case insensitive string) - Name/ID of plasmid when sequence_type = plasmid. "pO157"
public (boolean) - true → sequence is accessible to all users; false → restricted to workspace owners. true
release_date (date) - Date the sequence became publicly available in BV-BRC (often matches GenBank release). "1997-09-05T00:00:00Z"
segment (case insensitive string) - Segment label for segmented viruses (PB2, HA, S, L etc.); empty for non-segmented genomes. "HA"
sequence () - Complete FASTA string (A,C,G,T,N) stored in compressed form; served on demand for downloads and BLAST. "ATGAC...TAA"
sequence_id * (string) - BV-BRC unique identifier for the sequence row; usually equals accession but guaranteed unique even for private drafts. "NC_000913.3"
sequence_md5 (string) - MD5 hash of the raw sequence; enables rapid identity checks and deduplication. "b3b2a5b1d5fbb5e5e3d5e5b1d5fbb5e5"
sequence_status (string) - Controlled terms: complete, partial, draft, degapped; guides quality filters in UI. "complete"
sequence_type (case insensitive string) - chromosome, plasmid, contig, scaffold, segment, etc.; determines iconography and default ordering. "chromosome"
taxon_id (integer) - Numeric taxon identifier inherited from parent genome; used for taxon-scoped searches. 562
topology (case insensitive string) - circular or linear; affects downstream tools like GC-skew plots. "circular"
user_read (array of strings) - BV-BRC user/org IDs allowed to view the record (includes "public" for public data). [ "public" ]
user_write (array of strings) - User/org IDs with edit rights to the record. [ "maulik@bvbrc.org" ]
version (integer) - Numeric suffix from INSDC accessions (NC_000913.**3** → 3); enables tracking of updated sequences. 3
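
Several attributes above are derived from other fields. The following is a minimal Python sketch of how gc_content, sequence_md5, and version could be recomputed on the client side. The function names are hypothetical, and the normalisation applied before hashing (case, whitespace) is an assumption, since this page does not specify it.

```python
import hashlib


def gc_content(sequence: str) -> float:
    """Return 100 * (G + C) / length, rounded to one decimal."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("cannot compute GC content of an empty sequence")
    gc = seq.count("G") + seq.count("C")
    return round(100 * gc / len(seq), 1)


def sequence_md5(sequence: str) -> str:
    """MD5 hex digest of the sequence string.

    Assumption: the hash is taken over the string exactly as given;
    the normalisation BV-BRC applies before hashing is not documented here.
    """
    return hashlib.md5(sequence.encode("ascii")).hexdigest()


def accession_version(accession: str):
    """Split 'NC_000913.3' into ('NC_000913', 3); version is None if absent."""
    base, dot, suffix = accession.rpartition(".")
    if dot and suffix.isdigit():
        return base, int(suffix)
    return accession, None
```

Because the hashing convention is assumed, a digest computed this way may not match the stored sequence_md5 value; comparing against a record with a known sequence is the safest check.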

API

GET :sequence_id

Retrieve a genome_sequence data object by sequence_id

EXAMPLE

https://www.bv-brc.org/api/genome_sequence/170673.13.con.0100
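
As a rough illustration, the lookup by sequence_id could be wrapped in Python using only the standard library. The helper names and the timeout parameter are hypothetical, and the sketch assumes the endpoint answers with a JSON body when application/json is requested.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://www.bv-brc.org/api/genome_sequence"


def sequence_url(sequence_id: str) -> str:
    """Build the GET URL for a single genome_sequence object."""
    # Percent-encode the ID so unusual characters cannot break the path.
    return f"{BASE_URL}/{quote(sequence_id, safe='')}"


def fetch_sequence(sequence_id: str, timeout: float = 30.0):
    """Fetch one record and decode the JSON response (performs a network call)."""
    request = urllib.request.Request(
        sequence_url(sequence_id),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_sequence("170673.13.con.0100")` would request the URL shown in the example above.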


QUERY :query

Query for genome_sequence data objects with an RQL Query

Return Formats

Requests may include an HTTP Accept header from this list to have the results returned in the requested format.

application/json - Returns results as an array of JSON objects
application/solr+json - Returns results in Solr JSON response format
text/csv - Returns results in comma-separated values (CSV) format. Columns are separated by ','. Multi-value columns are separated by ';'. Rows are separated by newlines.
text/tsv - Returns results in tab-separated values (TSV) format. Columns are separated by a tab. Multi-value columns are separated by ';'. Rows are separated by newlines.
application/vnd.openxmlformats - Returns objects as an MS Excel document
application/dna+fasta - Returns DNA sequences for queries in FASTA format
application/dna+jsonh+fasta - Returns DNA sequences for queries in JSONH-FASTA format
application/sralign+dna+fasta - Returns DNA sequences aligned from SRA data in FASTA format
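
One way to select a return format from Python is to set the Accept header on a standard-library request object. This is a sketch only: `build_request` and `split_multi_value` are hypothetical helpers, and the tuple simply mirrors the media types listed above.

```python
import urllib.request

# Media types copied from the list above.
RETURN_FORMATS = (
    "application/json",
    "application/solr+json",
    "text/csv",
    "text/tsv",
    "application/vnd.openxmlformats",
    "application/dna+fasta",
    "application/dna+jsonh+fasta",
    "application/sralign+dna+fasta",
)


def build_request(url: str, accept: str = "application/json") -> urllib.request.Request:
    """Create a request whose Accept header asks for one of the listed formats."""
    if accept not in RETURN_FORMATS:
        raise ValueError(f"unsupported return format: {accept}")
    return urllib.request.Request(url, headers={"Accept": accept})


def split_multi_value(cell: str) -> list:
    """Split a multi-value CSV/TSV cell on ';' as described for text/csv and text/tsv."""
    return cell.split(";") if cell else []
```

Passing the returned object to `urllib.request.urlopen` sends the request; the body then needs to be decoded according to the format that was asked for.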

EXAMPLES

Query for genome_sequence data objects with a sequence_id equal to 170673.13.con.0100. Return results as a JSON Array.
```
https://www.bv-brc.org/api/genome_sequence/?eq(sequence_id,170673.13.con.0100)
```
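
Query URLs like the one above can be assembled programmatically. The sketch below covers only the `eq` and `limit` operators and the `http_accept` parameter that appear in the examples on this page; the helper names are hypothetical, and values are inserted verbatim, so anything containing reserved characters would need encoding first.

```python
BASE_URL = "https://www.bv-brc.org/api/genome_sequence/"


def eq(field: str, value) -> str:
    """RQL equality operator, e.g. eq(genome_id,1765.317)."""
    return f"eq({field},{value})"


def limit(count: int) -> str:
    """RQL limit operator, e.g. limit(5)."""
    return f"limit({count})"


def query_url(*operators: str, http_accept: str = "") -> str:
    """Join RQL operators with '&' and optionally append http_accept."""
    parts = list(operators)
    if http_accept:
        parts.append(f"http_accept={http_accept}")
    return BASE_URL + "?" + "&".join(parts)


# Rebuilds the query shown above.
print(query_url(eq("sequence_id", "170673.13.con.0100")))
```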

Query for the sequences of genome 1765.317, limited to 5 results. Return results as DNA FASTA.

https://www.bv-brc.org/api/genome_sequence/?eq(genome_id,1765.317)&limit(5)&http_accept=application/dna+fasta
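
A FASTA response from a query like this can be split into records with a few lines of Python. This is a generic FASTA reader rather than anything BV-BRC-specific: the layout of the header lines returned by the API is not described on this page, so the sketch keeps each header as an opaque string, and the sample text is invented for illustration.

```python
def parse_fasta(text: str) -> list:
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header = None
    chunks = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical sample; real headers from the API may look different.
sample = ">seq1 example\nACGT\nTTAA\n>seq2\nGG\n"
for name, seq in parse_fasta(sample):
    print(name, len(seq))
```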