ESM-2 150M is a transformer-based protein language model trained on UniRef evolutionary sequence data, exposed here for sequence encoding and masked-token prediction. The API provides GPU-accelerated embeddings (mean, per-residue, BOS), self-attention maps, unsupervised contact scores, and logits for sequences of up to 2048 amino acids, with batched processing of up to 8 sequences per request. Typical uses include feature extraction for downstream ML models, zero-shot mutation scoring, contact-based structure analysis, and design workflows in protein engineering and variant prioritization.
Predict¶
Predict properties or scores for input sequences
- POST /api/v3/esm2-150m/predict/¶
Predict endpoint for ESM-2 150M.
- Request Headers:
Content-Type – application/json
Authorization – Token YOUR_API_KEY
Request
params (object, optional) — Configuration parameters:
repr_layers (array of ints, default: [-1]) — Indices of model layers to include in the output representations
include (array of strings, default: [“mean”]) — Output representation types to compute for each input sequence
items (array of objects, min items: 1, max items: 8) — Input sequences:
sequence (string, min length: 1, max length: 2048, required) — Protein sequence using extended amino acid codes, optionally including "-"; must contain at least one "<mask>" token
Example request:
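A minimal sketch of a Predict call using Python's requests library. The host shown is a placeholder to be replaced with your actual BioLM host, YOUR_API_KEY stands in for a real token, and the sequence and masked position are illustrative.

import requests  # third-party HTTP client

URL = "https://YOUR_BIOLM_HOST/api/v3/esm2-150m/predict/"  # placeholder host
HEADERS = {
    "Content-Type": "application/json",
    "Authorization": "Token YOUR_API_KEY",  # replace with your API key
}

payload = {
    "params": {"repr_layers": [-1], "include": ["mean"]},
    "items": [
        # Illustrative sequence with a single masked position to be predicted.
        {"sequence": "MKTAYIAKQRQISFVKSHFS<mask>QLEERLGLIEVQAPILSRVGDGTQDNLSGAEK"},
    ],
}

resp = requests.post(URL, headers=HEADERS, json=payload, timeout=60)
resp.raise_for_status()
prediction = resp.json()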
- Status Codes:
200 OK – Successful response
400 Bad Request – Invalid input
500 Internal Server Error – Internal server error
Response
results (array of objects) — One result per input item, in the order requested:
logits (array of arrays of floats) — Per-position output scores over the model vocabulary, shape: [L, V] where L is the input sequence length (including <mask> tokens) and V is the vocabulary size
sequence_tokens (array of strings) — Tokenized input sequence, one token per position
vocab_tokens (array of strings) — Vocabulary tokens corresponding to indices in logits
Example response:
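An abbreviated sketch of the response shape, using placeholder numbers and a toy three-token vocabulary; real responses contain one logits row per sequence position and one column per vocabulary entry. The snippet then reads off the highest-scoring token at each masked position, which is the basis of zero-shot mutation scoring.

# Placeholder response body following the documented schema (not real model output).
prediction = {
    "results": [
        {
            "logits": [
                [-6.8, 2.4, 0.1],   # scores for position 0 over the toy vocabulary
                [-7.3, 0.2, 1.9],   # scores for position 1 (the masked position)
            ],
            "sequence_tokens": ["M", "<mask>"],
            "vocab_tokens": ["<cls>", "K", "L"],
        }
    ]
}

item = prediction["results"][0]
for pos, tok in enumerate(item["sequence_tokens"]):
    if tok == "<mask>":
        scores = item["logits"][pos]
        best = scores.index(max(scores))  # index of the highest-scoring vocabulary token
        print(f"position {pos}: highest-scoring token {item['vocab_tokens'][best]}")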
Encode¶
Generate embeddings for input sequences
- POST /api/v3/esm2-150m/encode/¶
Encode endpoint for ESM-2 150M.
- Request Headers:
Content-Type – application/json
Authorization – Token YOUR_API_KEY
Request
params (object, required) — Configuration parameters:
repr_layers (array of ints, default: [-1]) — Indices of model layers to include in the output representations
include (array of strings, default: [“mean”]) — Output representation types to compute, using values from {“mean”, “per_token”, “bos”, “contacts”, “logits”, “attentions”}
items (array of objects, min items: 1, max items: 8) — Input sequences:
sequence (string, min length: 1, max length: 2048, required) — Protein sequence using the extended amino acid alphabet (may include “-” as an extra character)
Example request:
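A minimal sketch of an Encode call requesting mean embeddings and contact maps for two sequences; the host is a placeholder and the sequences are illustrative.

import requests

URL = "https://YOUR_BIOLM_HOST/api/v3/esm2-150m/encode/"  # placeholder host
HEADERS = {
    "Content-Type": "application/json",
    "Authorization": "Token YOUR_API_KEY",  # replace with your API key
}

payload = {
    "params": {"repr_layers": [-1], "include": ["mean", "contacts"]},
    "items": [
        {"sequence": "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVGDGTQDNLSGAEK"},
        {"sequence": "GSHMSLFEFFKQKGDAITAEQLAHVRETLG"},
    ],
}

resp = requests.post(URL, headers=HEADERS, json=payload, timeout=60)
resp.raise_for_status()
encoded = resp.json()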
- Status Codes:
200 OK – Successful response
400 Bad Request – Invalid input
500 Internal Server Error – Internal server error
Response
results (array of objects) — One result per input item, in the order requested:
sequence_index (int) — Zero-based index of the input sequence corresponding to this result
embeddings (array of objects, optional) — Per-layer sequence-level embeddings, included when include contains "mean" or when repr_layers is set:
layer (int) — Identifier of the model layer that produced this embedding
embedding (array of floats) — Mean-pooled embedding vector for the sequence (length depends on model size, e.g., 640 for 150M parameters)
bos_embeddings (array of objects, optional) — Per-layer beginning-of-sequence token embeddings, included when include contains "bos":
layer (int) — Identifier of the model layer that produced this embedding
embedding (array of floats) — Embedding vector for the BOS token at this layer
per_token_embeddings (array of objects, optional) — Per-layer per-residue embeddings, included when include contains "per_token":
layer (int) — Identifier of the model layer that produced these embeddings
embeddings (array of arrays of floats) — Per-residue embeddings with shape [L, D] where L is sequence length and D is embedding dimension
contacts (array of arrays of floats, optional) — Predicted inter-residue contact scores, included when include contains "contacts"; square matrix of shape [L, L]
attentions (array of arrays of floats, optional) — Flattened self-attention weights, included when include contains "attentions"
logits (array of arrays of floats, optional) — Per-position unnormalized token scores, included when include contains "logits"; shape [L, V] where V is vocabulary size
vocab_tokens (array of strings, optional) — Token strings defining the vocabulary order used for logits
Example response:
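An abbreviated sketch of the response shape when include is ["mean", "contacts"]. The layer identifier, vector values, and matrix values are placeholders; real mean embeddings from the 150M model have 640 dimensions and the contact matrix is [L, L] for a length-L sequence.

# Placeholder response body following the documented schema (not real model output).
encoded = {
    "results": [
        {
            "sequence_index": 0,
            "embeddings": [
                {"layer": 30, "embedding": [0.12, -0.03, 0.44]},  # truncated 640-dim mean vector
            ],
            "contacts": [                                         # square [L, L] contact matrix
                [0.02, 0.65, 0.10],
                [0.65, 0.01, 0.08],
                [0.10, 0.08, 0.03],
            ],
        }
    ]
}

mean_vector = encoded["results"][0]["embeddings"][0]["embedding"]
contact_map = encoded["results"][0]["contacts"]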
Performance¶
The 150M-parameter ESM-2 variant is deployed on NVIDIA T4 GPUs with 16 GB of VRAM, optimized for inference-only workloads (FP16 where applicable) to maximize throughput for embedding and masked-token prediction.
Compared to the 650M ESM-2 model served on the same GPU class, ESM-2 150M typically achieves 2–3x higher throughput per GPU (sequences processed per unit time) for both the encode and predict endpoints, owing to its smaller transformer depth and hidden size.
On structure-related benchmarks derived from the ESM-2 paper (unsupervised long-range contact precision and structure-module TM-score), ESM-2 150M underperforms ESM-2 650M by roughly 5–10 percentage points, but is comparable to or better than the earlier 650M-parameter ESM-1b model despite having far fewer parameters.
Embeddings and attention-derived contacts obtained from the 150M model via the encode endpoint are suitable for rapid prototyping, large-scale screening, and downstream models, while larger ESM-2 variants hosted by BioLM are recommended when maximum accuracy on structure- or function-prediction tasks is required.
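As a sketch of the downstream-model pattern described above: mean embeddings retrieved from the encode endpoint can serve directly as a feature matrix for a simple supervised model. The 640-dimensional vectors and assay labels below are random placeholders standing in for real encode responses and experimental measurements; numpy and scikit-learn are assumed to be installed.

import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 640))   # placeholder: one mean embedding vector per sequence
y = rng.normal(size=64)          # placeholder: one assay measurement per sequence

surrogate = Ridge(alpha=1.0).fit(X, y)   # simple surrogate model over embedding features
print(round(surrogate.score(X, y), 3))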
Applications¶
De novo protein sequence design for therapeutic and industrial proteins: ESM-2 150M can propose novel amino acid sequences by filling in masked regions or scoring candidate variants, enabling exploration of protein designs not present in natural databases. This is useful for early-stage design of binders, scaffolds, or enzymes before moving to more specialized structure-based tools.
Fixed-backbone sequence optimization guided by structural priors: Using per-residue embeddings and self-attention-derived contact maps, ESM-2 150M can help prioritize mutations that are consistent with the model's learned structural constraints for a given backbone-like topology. Companies can use this to filter or rank large mutation libraries for stability or foldability before experimental screening or more expensive modeling.
Protein fitness and developability scoring in ML-guided engineering pipelines: The encode endpoint provides sequence embeddings that capture evolutionary and structural signals, which can be used as input features to downstream predictive models (e.g., activity, stability, expression, solubility). This enables higher-quality surrogate models for protein engineering campaigns in biopharma, industrial biotech, and synthetic biology.
Rapid in silico variant exploration for lead optimization: The predict endpoint supports masked language modeling on protein sequences, allowing users to evaluate many single or combinatorial substitutions around a lead sequence (see the scoring sketch after this list). This helps narrow down mutation hotspots and prioritize variants for synthesis when experimental throughput is limited.
Large-scale sequence space mapping for portfolio and IP strategy: By embedding up to 2048-residue sequences and predicting approximate contact patterns, ESM-2 150M can be used to cluster portfolios, detect remote relationships, and identify underexplored sequence regions. This is valuable for scouting novel protein families or differentiated sequence space while avoiding obvious prior art.
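A sketch of the variant-exploration pattern referenced in the lead-optimization item above: mask one position of a lead sequence, call the predict endpoint, and compare the logits of the wild-type and candidate residues at that position. The host, lead sequence, position, and substitution are placeholders, and the logit-difference score is a simple zero-shot heuristic rather than a calibrated prediction.

import requests

URL = "https://YOUR_BIOLM_HOST/api/v3/esm2-150m/predict/"  # placeholder host
HEADERS = {"Content-Type": "application/json", "Authorization": "Token YOUR_API_KEY"}

lead = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVGDGTQDNLSGAEK"  # illustrative lead sequence
pos, wild_type, mutant = 21, lead[21], "A"                      # illustrative substitution

masked = lead[:pos] + "<mask>" + lead[pos + 1:]
resp = requests.post(URL, headers=HEADERS, json={"items": [{"sequence": masked}]}, timeout=60)
resp.raise_for_status()
result = resp.json()["results"][0]

vocab = result["vocab_tokens"]
row = result["logits"][result["sequence_tokens"].index("<mask>")]  # logits at the masked position
# A higher (mutant - wild-type) logit difference suggests a substitution the model tolerates.
print(row[vocab.index(mutant)] - row[vocab.index(wild_type)])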
Limitations¶
Batch Size: The maximum batch size per request is 8 sequences; higher throughput requires multiple requests or client-side batching across calls (see the batching sketch after this list).
Maximum Sequence Length: Each sequence must be at most 2048 characters; longer proteins must be split or truncated before calling either encode or predict.
Input Alphabet: encode accepts standard amino acids plus "-"; predict additionally allows "<mask>" but requires at least one "<mask>" token per sequence. Non-standard tokens are rejected.
Representation Scope: Outputs such as embeddings, per_token_embeddings, contacts, attentions, and logits are model-internal features, not calibrated biophysical quantities (e.g., not experimental ΔΔG, binding affinities, or folding energies).
Structure Prediction: The contacts option exposes coarse inter-residue contact probabilities only; this API does not provide full 3D structure prediction (use dedicated structure models such as ESMFold or AlphaFold2 for atomic coordinates).
Sequence & Family Coverage: As a masked language model trained on UniRef, ESM-2 can underperform on extremely unusual, synthetic, or very low-homology proteins; downstream validation with more specialized or supervised models is recommended for critical design or ranking decisions.
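A sketch of client-side batching around the 8-sequence limit noted in the first limitation: split a larger sequence list into chunks of at most 8 and issue one encode request per chunk. The host and sequences are placeholders.

import requests

URL = "https://YOUR_BIOLM_HOST/api/v3/esm2-150m/encode/"  # placeholder host
HEADERS = {"Content-Type": "application/json", "Authorization": "Token YOUR_API_KEY"}
MAX_ITEMS = 8  # documented per-request maximum

sequences = ["MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"] * 20  # placeholder sequence list

all_results = []
for start in range(0, len(sequences), MAX_ITEMS):
    chunk = sequences[start:start + MAX_ITEMS]
    payload = {
        "params": {"include": ["mean"]},
        "items": [{"sequence": s} for s in chunk],
    }
    resp = requests.post(URL, headers=HEADERS, json=payload, timeout=120)
    resp.raise_for_status()
    all_results.extend(resp.json()["results"])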
How We Use It¶
BioLM uses the ESM-2 150M model as a standardized, API-accessible sequence encoder to accelerate protein engineering campaigns, especially when large-scale screening or limited compute budgets rule out larger models. Its embeddings, attention-derived contact maps, and masked-token predictions integrate into multi-step workflows for enzyme design, antibody optimization, and metagenomic mining, where they inform variant generation, in silico filtering, and ranking alongside downstream structure predictors and biophysical property models. This enables rapid iteration on thousands of candidate sequences while keeping analyses reproducible and scalable across projects.
Combines ESM-2 150M embeddings and contact estimates with separate 3D structure and property predictors for sequence triage and downselection
Supports scalable, batched analysis (up to 8 sequences of length ≤2048) via consistent APIs, simplifying integration into automated lab-in-the-loop pipelines
References¶
Rives, A., Meier, J., Sercu, T., Goyal, S., Lin, Z., Liu, J., Guo, D., Ott, M., Zitnick, C. L., Ma, J., & Fergus, R. (2021). Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. Proceedings of the National Academy of Sciences.
Lin, Z., Akin, H., Rao, R., Hie, B., Zhu, Z., Lu, W., Smetanin, N., Verkuil, R., Kabeli, O., Shmueli, Y., dos Santos Costa, A., Fazel-Zarandi, M., Sercu, T., Candido, S., & Rives, A. (2023). Evolutionary-scale prediction of atomic level protein structure with a language model. Science. (Earlier versions available as bioRxiv preprint: https://www.biorxiv.org/content/10.1101/2022.07.20.500902v2.)
