FTP Site

BV-BRC FTP Site: Transition from Unencrypted FTP to Encrypted FTP (FTPS)

The BV-BRC FTP site has long been a primary mechanism for distributing large-scale genomic and related datasets. To strengthen security and align with modern best practices, BV-BRC has transitioned from unencrypted FTP (ftp://ftp.bv-brc.org) to encrypted FTP (FTPS). This change ensures that data transfers between BV-BRC and its users are protected from interception or tampering.

Why the Change Was Necessary

  • Security: Traditional FTP transmits usernames, passwords, and data in plain text, making it vulnerable to eavesdropping and unauthorized access.

  • Compliance: Many institutions (including the one hosting BV-BRC) and funding agencies now require encrypted transfer protocols for research data to meet cybersecurity and data protection standards.

  • Reliability: Modern clients and firewalls are increasingly dropping support for plain FTP, while encrypted FTPS remains widely supported and more robust.

What This Means for BV-BRC Users

  • The BV-BRC FTP server is still available at the same host: ftp.bv-brc.org, but connections must now use explicit FTPS (FTP over TLS/SSL).

  • Plain FTP connections (e.g., ftp://ftp.bv-brc.org) are no longer supported.

How to Access the BV-BRC FTP Site

You can connect using most modern FTP clients with FTPS support. Here are a few examples.

Command-Line Access using lftp

lftp -u anonymous,guest ftp.bv-brc.org

Within lftp, ensure that FTPS is enabled:

set ftp:ssl-force true
set ftp:ssl-protect-data true
set ssl:verify-certificate no
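For unattended use, the same lftp settings can be written to a command file and run non-interactively. The following is a minimal sketch, assuming lftp is installed; the function name bvbrc_lftp_script is hypothetical and not part of any BV-BRC tooling.

```shell
# Print an lftp command script that applies the FTPS settings listed above,
# opens an anonymous session, and downloads one remote file.
# The function name and its single argument are illustrative only.
bvbrc_lftp_script() {
    remote_path=$1
    printf '%s\n' \
        'set ftp:ssl-force true' \
        'set ftp:ssl-protect-data true' \
        'set ssl:verify-certificate no' \
        'open -u anonymous,guest ftp.bv-brc.org' \
        "get $remote_path" \
        'bye'
}
```

To use it, save the generated commands and pass them to lftp, for example: bvbrc_lftp_script RELEASE_NOTES/genome_summary > fetch.lftp, followed by lftp -f fetch.lftp.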

Graphical FTP Clients

  • FileZilla: Set protocol to FTP – File Transfer Protocol, and encryption to Require explicit FTP over TLS.

  • WinSCP / Cyberduck / Transmit: Choose FTP with explicit TLS/SSL (FTPS).

Programmatic Access

If you are using scripts or pipelines (e.g., wget or curl), you may need to switch to a client that supports FTPS, or use the FTPS options of lftp or curl (such as curl's --ssl-reqd). Examples:

  • curl --ssl-reqd --user anonymous:guest ftp://ftp.bv-brc.org/<path-to-file>

  • wget ftps://ftp.bv-brc.org/<path-to-file> or

  • wget --ftp-user=anonymous --ftp-password=guest --secure-protocol=auto ftps://ftp.bv-brc.org/<path-to-file> [Recommended]
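In a pipeline it can help to wrap the curl invocation above in a small function so that every download uses the same FTPS options. This is a sketch under stated assumptions: the function name bvbrc_curl and the DRY_RUN variable are hypothetical conveniences, and --remote-name is added only so the file is saved under its remote name.

```shell
# Download one file from the BV-BRC FTP site over explicit FTPS with curl.
# Set DRY_RUN=1 to print the command instead of running it.
# bvbrc_curl and DRY_RUN are illustrative names, not BV-BRC conventions.
bvbrc_curl() {
    remote_path=$1
    # Rebuild the argument list as the full curl command line.
    set -- curl --ssl-reqd --user anonymous:guest --remote-name \
        "ftp://ftp.bv-brc.org/$remote_path"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}
```

For example, DRY_RUN=1 bvbrc_curl RELEASE_NOTES/genome_summary prints the command that would run, which is a quick way to check a script before it touches the network.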

Impact on Existing Workflows

  • Old scripts using ftp:// without encryption will fail and must be updated to use FTPS.

  • Users behind institutional firewalls may need to ensure that outbound FTPS traffic is allowed.

  • The directory structure, file organization, and content remain unchanged – only the connection method is different.
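For wget-based scripts, the update often amounts to changing the URL scheme. The sketch below rewrites plain-FTP BV-BRC URLs to ftps:// in text read from standard input. The function name bvbrc_upgrade_urls is hypothetical, and it only touches URLs on the ftp.bv-brc.org host. Scripts that call curl keep the ftp:// scheme and need --ssl-reqd added instead.

```shell
# Read script text on stdin and write it to stdout with every
# ftp://ftp.bv-brc.org URL rewritten to ftps://ftp.bv-brc.org.
# URLs that already use ftps:// are left unchanged.
bvbrc_upgrade_urls() {
    sed 's#ftp://ftp\.bv-brc\.org#ftps://ftp.bv-brc.org#g'
}
```

A typical use is bvbrc_upgrade_urls < old_script.sh > new_script.sh, after which the new script should be reviewed before it replaces the old one.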

RELEASE_NOTES/

Provides a list of all public genomes, related metadata, and AMR phenotype data in tab-delimited format.

  • genome_summary: Basic summary of genomes and their annotation in tab-delimited format

  • genome_metadata: All genome metadata in tab-delimited format

  • genome_lineage: Taxonomy lineage (presented as taxon ids and taxon names) for all public genomes

  • PATRIC_genome_AMR.txt: AMR phenotype data generated by laboratory methods in tab-delimited format

genomes/

The genomes directory provides access to data for all public genomes in various standard file formats. The data is organized by genome: there is a separate directory for each genome, with the genome_id as the directory name.

For example, below is the genome directory for the Escherichia coli MG1655 genome.

ftps://ftp.bv-brc.org/genomes/511145.12

Each genome directory provides the following data files for PATRIC and RefSeq annotations (when available).

  • .fna: FASTA contig sequences

  • .faa: FASTA protein sequence file

  • .features.tab: All genomic features and related information in tab-delimited format

  • .ffn: FASTA nucleotide sequences for genomic features, i.e. genes, RNAs, and other misc features

  • .frn: FASTA nucleotide sequences for RNAs

  • .gff: Genome annotations in GFF file format

  • .pathway.tab: Metabolic pathway assignments in tab-delimited format

  • .spgene.tab: Specialty gene assignments (i.e. AMR genes, virulence factors, essential genes, etc.) in tab-delimited format

  • .subsystem.tab: Subsystem assignments in tab-delimited format
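Because each genome directory is named after its genome_id, file URLs can be built mechanically. The following sketch assumes files are named <genome_id>.<extension> inside the genome directory, as in the download script later on this page. The function name bvbrc_genome_urls is hypothetical.

```shell
# Print one FTPS URL per requested extension for a single genome.
# Assumes the layout genomes/<genome_id>/<genome_id>.<extension>.
bvbrc_genome_urls() {
    genome_id=$1
    shift
    for ext in "$@"; do
        printf 'ftps://ftp.bv-brc.org/genomes/%s/%s.%s\n' \
            "$genome_id" "$genome_id" "$ext"
    done
}
```

For instance, bvbrc_genome_urls 511145.12 fna PATRIC.faa prints two URLs, one for the contig FASTA file and one for the PATRIC protein FASTA file.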

Downloading data for a large number of genomes

Because of the large number of genomes currently available on the FTP site, downloading them with interactive clients such as FileZilla or lftp, or through the website, is not very efficient. A simple shell script handles this easily, as follows.

You can download a list of the genomes you are interested in from the website and then copy the genome ids into a text file called “genome_list”. Alternatively, you can get a list of all public genomes from the FTP site (link below) and filter it for the genomes / species you are interested in.

ftps://ftp.bv-brc.org/RELEASE_NOTES/genome_summary
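The filtering step can be sketched with awk. This example assumes that genome_summary has a header row containing columns named genome_id and genome_name. Those column names are an assumption and should be checked against the actual file. The function name bvbrc_filter_genomes is hypothetical.

```shell
# Print the genome ids whose genome name contains the given text.
# Assumes a tab-delimited file whose header row names the columns
# "genome_id" and "genome_name"; adjust these if the real header differs.
bvbrc_filter_genomes() {
    pattern=$1
    summary_file=$2
    awk -F '\t' -v pat="$pattern" '
        NR == 1 {
            # Locate the two columns by name rather than by position.
            for (i = 1; i <= NF; i++) {
                if ($i == "genome_id") id_col = i
                if ($i == "genome_name") name_col = i
            }
            if (!id_col || !name_col) {
                print "expected columns not found" > "/dev/stderr"
                exit 1
            }
            next
        }
        # Plain substring match, so no regex escaping is needed.
        index($name_col, pat) > 0 { print $id_col }
    ' "$summary_file"
}
```

Running bvbrc_filter_genomes 'Escherichia coli' genome_summary > genome_list would then produce a file with one genome id per line.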

Once you have copied the genome ids you are interested in into a separate file, “genome_list”, you can use the following one-line shell script to read the genome ids from your file and download the corresponding .fna files from the BV-BRC FTP site. If you are interested in another file type, say .PATRIC.faa or .PATRIC.features.tab, simply replace .fna with that extension.

while read -r i; do wget -qN --ftp-user=anonymous --ftp-password=guest "ftps://ftp.bv-brc.org/genomes/$i/$i.fna"; done < genome_list
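For long lists, it can be useful to skip blank lines and record which genomes failed to download. The following is a sketch of the same loop with those additions, assuming a wget build with FTPS support. The function name bvbrc_fetch_list, the DRY_RUN variable, and the failed_genomes.txt file name are all hypothetical.

```shell
# Download one file type for every genome id in a list file (one id per line).
# Blank lines are skipped; ids whose download fails are appended to
# failed_genomes.txt. Set DRY_RUN=1 to print the URLs instead of downloading.
bvbrc_fetch_list() {
    list_file=$1
    ext=${2:-fna}
    # The "|| [ -n ... ]" clause handles a last line with no trailing newline.
    while IFS= read -r id || [ -n "$id" ]; do
        [ -n "$id" ] || continue
        url="ftps://ftp.bv-brc.org/genomes/$id/$id.$ext"
        if [ "${DRY_RUN:-0}" = 1 ]; then
            printf '%s\n' "$url"
        elif ! wget -qN --ftp-user=anonymous --ftp-password=guest "$url"; then
            printf '%s\n' "$id" >> failed_genomes.txt
        fi
    done < "$list_file"
}
```

Calling bvbrc_fetch_list genome_list PATRIC.faa downloads the protein FASTA files. Afterwards, failed_genomes.txt (if present) can be passed back to the same function to retry.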

Uploading/Downloading Data From Your Private Workspace Using an FTP Client

You can use an FTP client to batch upload/download data to and from your private workspace.
Host: workspace.patricbrc.org
User name: your BV-BRC user name
Password: your BV-BRC password

Support

If you experience issues updating your workflows or clients to support FTPS, please consult the BV-BRC help documentation or contact help@bv-brc.org for assistance.