Protein Structure Prediction — Recipes¶

This is a cookbook for the Protein Structure Prediction service. If you’re new to the service, start with the Tutorial — it walks you through a single-protein prediction end-to-end. Once you’re comfortable with the form, the recipes here cover the patterns that go beyond the defaults: multi-chain complexes, ligands, MSAs you supply or have computed, and reruns.

For the full reference of every form field see the Quick Reference Guide.

Choosing a recipe¶

Your situation	Recipe
Single protein, just want a structure	Tutorial — uses Auto → ESMFold
Multi-chain protein complex	Multi-chain complex with Boltz
Protein + a small-molecule cofactor (ATP, NAD, …)	Protein + ligand by CCD code
Protein + a novel small molecule	Protein + SMILES ligand
Need an MSA but don’t have one	Let BV-BRC compute the MSA
Have an MSA, want best-quality fold	Boltz / OpenFold / Chai with uploaded MSA
Already ran one — try a different engine	Rerun with a different tool

Recipe 1 — Multi-chain protein complex with Boltz¶

Goal: predict the structure of a complex (e.g. an antibody Fv with heavy + light chains, or a homo-tetramer).

Inputs: one FASTA file with multiple > records — each record becomes one chain. Soft limit: 26 chains and 10,000 total residues per job.

>heavy_chain
QVQLVQSGAEVKKPGASVKVSCKASGYTFT...
>light_chain
DIQMTQSPSSLSASVGDRVTITCRASQDIS...

Form settings:

Field	Value
Prediction Tool	`Boltz` (or Auto if you’ve uploaded an MSA)
Protein	the multi-chain FASTA
MSA Source	Use MSA Server or Service (recommended) or Precomputed MSA from Workspace if you already have one
Other inputs	leave empty
Job Name	something descriptive — e.g. `antibody-fv-boltz`

Why Boltz for complexes? Boltz, OpenFold 3, and Chai-1 are all AF3-class diffusion models with strong multimer performance. Boltz is the auto-selector’s first choice when an MSA is available; OpenFold and Chai are reasonable swaps if you want to compare.

Recipe 2 — Protein with a cofactor (CCD code)¶

Goal: fold a protein with a known cofactor in place (e.g. an ATP-binding domain with ATP modeled).

Inputs: the protein FASTA plus a CCD code in the Ligands input.

Form settings:

Field	Value
Prediction Tool	`Boltz` (or Auto)
Protein	your FASTA
Ligands → Notation	`CCD codes`
Ligands → input	`ATP` (one per line if you have more)
MSA Source	Use MSA Server or Service or Precomputed

Common CCD codes: ATP, ADP, GTP, NAD, FAD, HEM, MG, ZN, CA. Glycans use their CCD codes too (NAG, MAN, BMA).

Limitation: the form’s Notation selector carries one notation per submission. To mix CCD ligands and SMILES ligands in the same job, you need the CLI (see CLI Reference).

Recipe 3 — Protein with a SMILES ligand¶

Goal: fold a protein with a novel small molecule (no CCD code yet) in place — e.g. a drug candidate.

Inputs: the protein FASTA plus the molecule as a SMILES string.

Form settings:

Field	Value
Prediction Tool	`Boltz`
Protein	your FASTA
Ligands → Notation	`SMILES strings`
Ligands → input	the SMILES (e.g. `CCO` for ethanol, one per line for multiple)
MSA Source	Use MSA Server or Service

The form validates SMILES as you type; the first invalid line is reported inline. Standard SMILES syntax — branching parentheses must balance, atoms come from the typical organic set + halogens.

Tip: if you have the molecule’s 2D structure in another tool (e.g. RDKit, ChemDraw), export to canonical SMILES first.

Recipe 4 — Let BV-BRC compute the MSA with ColabFold¶

Goal: predict a structure that needs an MSA, but you don’t have one and don’t want to build it yourself.

Form settings:

Field	Value
Prediction Tool	`Boltz`, `OpenFold`, or `Chai`
Protein	your FASTA
MSA Source	Use MSA Server or Service

The service runs MMseqs2 against UniRef + ColabFoldDB and feeds the result to the selected engine. Expect 30 s – 3 min of extra wall time before folding starts.

When not to use this: if you already have an MSA from your own pipeline (JackHMMER, ColabFold local, etc.), upload it instead — your MSA is likely deeper / better filtered than the server’s defaults.

Recipe 5 — Uploaded MSA with Boltz, OpenFold, or Chai¶

Goal: highest-quality fold, with an MSA you’ve curated.

Inputs: a .a3m, .sto, or .pqt MSA file uploaded to your workspace.

Form settings:

Field	Value
Prediction Tool	`Boltz` (default), or `OpenFold` / `Chai` to compare
Protein	the query FASTA matching the MSA’s first sequence
MSA Source	Precomputed MSA from Workspace
MSA File	the uploaded `.a3m` / `.sto` / `.pqt` file

The service auto-converts A3M ↔ Parquet for the engine that needs it. Stockholm is converted to A3M-internally for Boltz / OpenFold.

Tip: a deeper MSA (more sequences) usually helps until ~256 effective sequences, after which gains flatten. ColabFold’s default depth of 5,000 raw → ~256 effective is a reasonable target.

Recipe 6 — Rerun an existing job with a different tool¶

Goal: you’ve run a fold and want to compare a different engine on the same inputs.

Open the completed job in your workspace.
Click the Rerun button in the Action Bar.
The form re-opens pre-filled with the prior submission’s parameters.
Change the Prediction Tool to the engine you want to compare against.
Update the Job Name (the default will collide otherwise — crambin-esmfold → crambin-boltz, etc.).
Submit.

This is the cleanest way to A/B-test engines without rebuilding the form from scratch.

What’s not on the form¶

These advanced knobs are not surfaced on the web form. Reach them via the CLI or JSON-RPC API:

num_samples — how many independent predictions to generate (default 5; useful for assessing ensemble disagreement)
num_recycles — recycling iterations (engine-specific defaults; AlphaFold defaults to 3)
seed — random seed for reproducibility (engine-specific)
output_format — restrict output to a subset (pdb, cif, report, confidence, …)
Per-engine flags — see each engine’s adapter in PredictStructureApp/adapters/

See the CLI Reference and API Reference for the full parameter surface.

Protein Structure Prediction — Recipes¶

Choosing a recipe¶

Recipe 1 — Multi-chain protein complex with Boltz¶

Recipe 2 — Protein with a cofactor (CCD code)¶

Recipe 3 — Protein with a SMILES ligand¶

Recipe 4 — Let BV-BRC compute the MSA with ColabFold¶

Recipe 5 — Uploaded MSA with Boltz, OpenFold, or Chai¶

Recipe 6 — Rerun an existing job with a different tool¶

What’s not on the form¶

See also¶