ESM-2 35M is a 12-layer, 35M-parameter protein language model trained on UniRef50/UniRef90 sequences with a masked language modeling objective. The API exposes two actions: encoder, which returns sequence-level and per-residue embeddings, BOS embeddings, attention-based contact maps, logits, and attentions for up to 8 protein sequences of length ≤2048; and predictor, which performs masked-token prediction on single or multiple <mask> sites. Typical uses include representation learning, mutation scoring via logit differences, and contact-map extraction for structure-aware protein design workflows.
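As a sketch of the mutation-scoring use mentioned above: given masked-token logits at a site, a substitution can be scored by the log-probability difference between mutant and wild-type residues (a common heuristic; the function and argument names here are illustrative, not part of the API).

```python
import numpy as np

def mutation_score(logits_row, vocab_tokens, wt, mut):
    """Score a substitution as log P(mut) - log P(wt) at one masked site.

    logits_row: one row of `logits` from a predict response.
    vocab_tokens: the `vocab_tokens` list from the same response.
    """
    row = np.asarray(logits_row, dtype=float)
    log_probs = row - np.logaddexp.reduce(row)  # log-softmax over the vocabulary
    return log_probs[vocab_tokens.index(mut)] - log_probs[vocab_tokens.index(wt)]
```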
Predict
Masked language modeling with ESM-2 35M: predict the amino acid at one masked site in each sequence
- POST /api/v3/esm2-35m/predict/
Predict endpoint for ESM-2 35M.
- Request Headers:
Content-Type – application/json
Authorization – Token YOUR_API_KEY
Request
params (object, optional) — Configuration parameters:
repr_layers (array of integers, default: [-1]) — Layer indices to include in embeddings
include (array of strings, default: ["mean"]) — Embedding components to return; allowed values: "mean", "per_token", "bos", "contacts", "logits", "attentions"
items (array of objects, min: 1, max: 8) — Input sequences for prediction:
sequence (string, min length: 1, max length: 2048, required) — Protein sequence using standard amino acid codes; may include "-" gap characters
Example request:
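A minimal request sketch in Python, with one <mask> site per sequence as in the description above. The host URL, API key, and sequences are placeholders; substitute your own deployment details.

```python
import requests

# Hypothetical host; substitute your actual BioLM deployment.
URL = "https://biolm.ai/api/v3/esm2-35m/predict/"
HEADERS = {
    "Content-Type": "application/json",
    "Authorization": "Token YOUR_API_KEY",
}

# One <mask> site per sequence, per the schema above.
payload = {
    "params": {"repr_layers": [-1]},
    "items": [
        {"sequence": "MKTAYIAKQRQISFVK<mask>HFSRQLEERLGLIEVQ"},
        {"sequence": "MSILVTRPSPAGEEL<mask>SRLRALGQVAWHFPLIEF"},
    ],
}

resp = requests.post(URL, headers=HEADERS, json=payload, timeout=60)
resp.raise_for_status()
data = resp.json()
```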
- Status Codes:
200 OK – Successful response
400 Bad Request – Invalid input
500 Internal Server Error – Internal server error
Response
results (array of objects) — One result per input item, in the order requested:
logits (array of arrays of floats, shape: [num_masks, vocab_size]) — Per-mask unnormalized prediction scores over the amino acid vocabulary
sequence_tokens (array of strings, length: input sequence length) — Tokenized input sequence, including special tokens such as <mask>
vocab_tokens (array of strings, length: vocab_size) — Amino acid vocabulary tokens corresponding to indices in logits
Example response:
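An illustrative sketch of the response shape, consistent with the fields above; the logits rows and vocabulary list are truncated, and the exact token ordering is model-defined.

```python
import numpy as np

# Illustrative shape only; rows and vocabulary are truncated.
data = {
    "results": [
        {
            "logits": [[-2.31, 0.74, 5.02]],                     # [num_masks, vocab_size]
            "sequence_tokens": ["<cls>", "M", "K", "T", "<mask>", "..."],
            "vocab_tokens": ["<cls>", "<pad>", "..."],
        }
    ]
}

# The predicted residue at each masked site is the vocab token with the
# highest logit in the corresponding row.
for result in data["results"]:
    for row in result["logits"]:
        print(result["vocab_tokens"][int(np.argmax(row))])
```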
Encode
Compute mean, per-token, and BOS embeddings and contact maps from ESM-2 35M for two short protein sequences
- POST /api/v3/esm2-35m/encode/
Encode endpoint for ESM-2 35M.
- Request Headers:
Content-Type – application/json
Authorization – Token YOUR_API_KEY
Request
params (object, optional) — Configuration parameters:
include (array of strings, default: ["mean"]) — Types of embeddings or logits to return; allowed values: "mean", "per_token", "logits"
items (array of objects, min: 1, max: 5) — Input sequences:
sequence (string, min length: 1, max length: 2048, required) — Protein sequence using standard unambiguous amino acid codes; ambiguous amino acids not allowed
Example request:
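A request sketch mirroring the predict example; again, the host URL, API key, and sequences are placeholders. The "mean" and "per_token" values are drawn from the allowed include values above.

```python
import requests

URL = "https://biolm.ai/api/v3/esm2-35m/encode/"  # hypothetical host
HEADERS = {
    "Content-Type": "application/json",
    "Authorization": "Token YOUR_API_KEY",
}

payload = {
    "params": {"include": ["mean", "per_token"]},
    "items": [
        {"sequence": "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"},
        {"sequence": "MSILVTRPSPAGEELVSRLRALGQVAWHFPLIEF"},
    ],
}

resp = requests.post(URL, headers=HEADERS, json=payload, timeout=60)
resp.raise_for_status()
results = resp.json()["results"]
```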
- Status Codes:
200 OK – Successful response
400 Bad Request – Invalid input
500 Internal Server Error – Internal server error
Response
results (array of objects) — One result per input item, in the order requested:
mean (array of floats, length: hidden dimension) — Sequence-level embedding averaged over residues; returned when "mean" is included
per_token (array of arrays of floats, shape: [sequence length, hidden dimension]) — Per-residue embeddings; returned when "per_token" is included
logits (array of arrays of floats, shape: [sequence length, vocab_size]) — Per-position language-model logits; returned when "logits" is included
Example response:
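An illustrative sketch of the response shape, assuming the result fields mirror the requested include values (an assumption, since the upstream schema is not shown here); vectors are truncated for readability.

```python
import numpy as np

# Illustrative shape only; field names assumed to mirror the `include`
# values, and vectors are truncated.
results = [
    {
        "mean": [0.012, -0.334, 0.171],        # length: hidden dimension
        "per_token": [[0.05, -0.21, 0.33],     # shape: [seq_len, hidden dim]
                      [0.11, 0.02, -0.48]],
    }
]

mean_embedding = np.asarray(results[0]["mean"])  # features for downstream ML
```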
Performance
ESM-2 35M is deployed for CPU-only inference with 2 vCPUs and 8 GB RAM; no GPU is required for either the encoder or predictor endpoint. Typical latency is on the order of 1–2 seconds for a single-sequence request and scales approximately linearly with sequence length and batch size, up to the schema limits of 8 sequences per request and 2048 residues each.
On unsupervised structure-related benchmarks, ESM-2 35M shows lower accuracy than larger ESM-2 variants while remaining usable for coarse structural signals:
- Long-range contact precision at L: 0.30, vs. 0.44 (150M), 0.52 (650M), and 0.54 (3B).
- TM-score on CASP14: 0.41, vs. 0.47 (150M), 0.51 (650M), and 0.52 (3B); on CAMEO: 0.56, vs. 0.65, 0.70, and 0.72, respectively.
Within BioLM’s ESM-2 family, the 35M model offers the best throughput and lowest resource footprint, making it well suited for high-volume embedding or masked-prediction workloads where the slight loss in structural signal relative to 150M/650M is acceptable.
Applications
Rapid structural feature extraction from protein sequences for engineering workflows, using embeddings and contact maps from the encoder endpoint to prioritize candidates by foldability, packing, or putative stability before expensive simulations or assays; especially useful when screening many variants up to 2048 residues; less informative for proteins with large conformational changes or long disordered regions where single-sequence models underperform.
High-throughput structural annotation of metagenomic or proprietary protein collections by computing embeddings and contact maps through the encoder API, enabling teams to cluster sequences by inferred structural class, detect remote structural similarity, and flag candidates with novel folds for follow-up characterization; performance may be reduced for sequences with very low similarity to the training distribution.
Single-sequence protein design assessment by embedding designed variants with the encoder endpoint and comparing their representations or contact patterns to known, functional backbones, helping design teams quickly filter unstable or misfolded designs without MSAs; accuracy is typically lower than full structure-prediction pipelines for very large or multidomain proteins.
Embedding-based feature generation for downstream ML models in protein engineering, where mean or per-token embeddings from the encoder are used as inputs to custom predictors of stability, expression, localization, or other assay readouts; valuable when paired with in-house lab data to train task-specific models; embeddings alone are not sufficient for precise prediction of binding affinities or complex interfaces without additional training.
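As a sketch of that last workflow: mean embeddings retrieved from the encode endpoint used as features for a simple regressor on in-house assay data. The host URL, sequences, and labels are placeholders, and Ridge regression is just one reasonable baseline choice.

```python
import numpy as np
import requests
from sklearn.linear_model import Ridge

URL = "https://biolm.ai/api/v3/esm2-35m/encode/"  # hypothetical host
HEADERS = {"Content-Type": "application/json",
           "Authorization": "Token YOUR_API_KEY"}

def fetch_mean_embeddings(sequences, batch_size=5):
    """POST sequences in batches within the schema limit and collect
    mean embeddings (field name assumed to mirror the include value)."""
    embeddings = []
    for i in range(0, len(sequences), batch_size):
        payload = {
            "params": {"include": ["mean"]},
            "items": [{"sequence": s} for s in sequences[i:i + batch_size]],
        }
        r = requests.post(URL, headers=HEADERS, json=payload, timeout=120)
        r.raise_for_status()
        embeddings.extend(res["mean"] for res in r.json()["results"])
    return np.asarray(embeddings)

# Placeholder variant library and assay labels; replace with lab data.
sequences = ["MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ",
             "MKTAYIAKERQISFVKSHFSRQLEERLGLIEVQ"]
labels = np.array([0.72, 0.55])

X = fetch_mean_embeddings(sequences)
model = Ridge(alpha=1.0).fit(X, labels)  # simple task-specific predictor sketch
```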
Limitations
- Maximum Sequence Length: Sequences in items must be at most 2048 amino acids (ESM2Params.max_sequence_len). Longer proteins must be truncated or split across multiple ESM2EncodeRequest/ESM2PredictRequest calls.
- Batch Size: The maximum batch_size is 8 sequences per request (length-constrained by max_sequence_len). Larger datasets require batching across multiple API calls and downstream aggregation.
- Single-Sequence Context Only: Both encoder and predictor operate on individual sequences without multiple sequence alignments or template structures. Tasks that depend critically on deep MSAs (e.g., AlphaFold2-style high-accuracy folding of low-depth or orphan proteins) may perform better with MSA-based models.
- No End-to-End Structure Prediction: The encoder can return contacts (inter-residue distance/contact scores) and rich embeddings (mean, per_token, bos, attentions, logits), but it does not produce full 3D atomic coordinates. For atomic-resolution structures, use dedicated folding models such as ESMFold or AlphaFold2.
- Language-Model-Based Reliability: All outputs (embeddings, contacts, masked-token logits) are derived from a masked language model trained on natural protein sequences. For highly artificial, low-homology, or strongly out-of-distribution sequences, representations and contact maps may be less biologically meaningful, and masked-residue predictions less informative.
- Model Size vs. Accuracy: The 35M-parameter variant is optimized for speed and CPU-only deployment (no gpu required in ESM2_VARIANT_RESOURCE_SPECS). It is less accurate than larger ESM-2 models (e.g., 150M, 650M, 3B, 15B) on structure- and function-related benchmarks, so where accuracy is critical and latency or memory are less constrained, larger ESM-2 variants may be more appropriate.
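One way to work within the 2048-residue cap and 8-sequence batch limit is sketched below: split long sequences into overlapping windows and submit them in batches. The window and overlap sizes are arbitrary illustrative choices, not API parameters.

```python
def split_into_windows(sequence: str, max_len: int = 2048, overlap: int = 128):
    """Split a sequence longer than max_len into overlapping windows."""
    if len(sequence) <= max_len:
        return [sequence]
    step = max_len - overlap
    return [sequence[i:i + max_len]
            for i in range(0, len(sequence) - overlap, step)]

def batched(items, batch_size=8):
    """Yield successive batches no larger than the API's batch limit."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]
```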
How We Use It
ESM-2 35M enables rapid, scalable exploration of protein sequence space in early-stage protein engineering and optimization campaigns. We use its embeddings, attention maps, and contact predictions as standardized API features to drive downstream predictive models and rank large mutational libraries, then combine these with structural metrics from ESMFold and other BioLM models to prioritize designs for synthesis, screening, and multi-round optimization.
- Integrates efficiently into predictive and generative modeling workflows, providing consistent sequence encodings for tasks such as enzyme design, antibody maturation, and epitope optimization.
- Supports fast sequence-based ranking and filtering across large variant pools, reducing wet-lab screening burden and accelerating iteration cycles.
References
Lin, Z., Akin, H., Rao, R., Hie, B., Zhu, Z., Lu, W., Smetanin, N., Verkuil, R., Kabeli, O., Shmueli, Y., dos Santos Costa, A., Fazel-Zarandi, M., Sercu, T., Candido, S., & Rives, A. (2023). Evolutionary-scale prediction of atomic-level protein structure with a language model. Science, 379(6637), 1123–1130.
