SoluProt

SoluProt predicts the probability of soluble overexpression of recombinant proteins in *E. coli* directly from amino acid sequence, returning both a calibrated probability (0–1) and a binary solubility call (threshold ≥ 0.5). The model is based on a gradient boosting classifier trained on a curated TargetTrack-derived dataset and evaluated on an independent NESG-based test set (accuracy 58.5%, AUC 0.62). The API supports batched inference on 1–100 sequences of length 20–5000 aa for prioritizing soluble candidates in cloning, expression screening, and enzyme discovery pipelines.
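The limits stated above (1–100 sequences per batch, 20–5000 aa per sequence, soluble call at probability ≥ 0.5) can be checked locally before and after a request. The following is a minimal sketch, not the service's own client: the function names `validate_batch`, `batches`, and `call_soluble` are hypothetical, and restricting input to the 20 standard amino acid letters is an assumption that the description does not confirm.

```python
from typing import Iterator, List, Sequence

# Limits taken from the model description.
MIN_LEN, MAX_LEN = 20, 5000      # allowed sequence length in amino acids
MIN_BATCH, MAX_BATCH = 1, 100    # allowed number of sequences per request
THRESHOLD = 0.5                  # probability >= 0.5 is called soluble

# Assumption: only the 20 standard amino acid letters are accepted.
STANDARD_AA = frozenset("ACDEFGHIKLMNPQRSTVWY")


def validate_batch(sequences: Sequence[str]) -> List[str]:
    """Return upper-cased sequences, or raise ValueError if a limit is violated."""
    if not MIN_BATCH <= len(sequences) <= MAX_BATCH:
        raise ValueError(
            f"batch size {len(sequences)} is outside {MIN_BATCH}-{MAX_BATCH}"
        )
    cleaned = []
    for index, raw in enumerate(sequences):
        seq = raw.strip().upper()
        if not MIN_LEN <= len(seq) <= MAX_LEN:
            raise ValueError(
                f"sequence {index} has length {len(seq)}, "
                f"expected {MIN_LEN}-{MAX_LEN} aa"
            )
        unexpected = set(seq) - STANDARD_AA
        if unexpected:
            raise ValueError(
                f"sequence {index} contains unexpected letters: {sorted(unexpected)}"
            )
        cleaned.append(seq)
    return cleaned


def batches(sequences: Sequence[str], size: int = MAX_BATCH) -> Iterator[List[str]]:
    """Split a longer list into consecutive chunks of at most `size` sequences."""
    for start in range(0, len(sequences), size):
        yield list(sequences[start:start + size])


def call_soluble(probability: float) -> bool:
    """Apply the binary solubility call (threshold >= 0.5) to a probability."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError(f"probability {probability} is outside 0-1")
    return probability >= THRESHOLD
```

With a candidate list longer than 100 sequences, each chunk from `batches(...)` can be passed through `validate_batch(...)` before submission, and `call_soluble(...)` reproduces the binary call from a returned probability. Given the reported test-set performance (accuracy 58.5%, AUC 0.62), ranking candidates by probability may be more useful for prioritization than relying on the binary call alone.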

protein prediction

Capabilities

Predictor
Encoder
Explainer
Generator
Classifier
Similarity

Accelerate your lead generation

BioLM offers tailored AI solutions to meet your experimental needs. We deliver top-tier results with our model-agnostic approach, powered by our highly scalable and real-time GPU-backed APIs and years of experience in biological data modeling, all at a competitive price.

We speak the language of bio-AI

© 2022 - 2025 BioLM. All Rights Reserved.