In a newly released preprint, researchers from the Massachusetts Institute of Technology and collaborators unveiled BoltzGen, a generative AI model that creates custom proteins and peptides capable of binding to a wide range of biological targets, including disordered proteins and small molecules.
BoltzGen’s standout feature is its ability to combine structure prediction and binder design in a single, controllable system, giving scientists new precision in molecular design.
The Problem: Current Binder Design is Powerful but Rigid
Designing de novo protein binders is central to automating drug development, but existing tools come with trade-offs. Most are limited to specific molecule types, such as nanobodies or short peptides. Many also rely on target structures that closely resemble their training data, which limits their usefulness on new targets.
Real-world applications also demand flexible design inputs, such as specifying covalent bonds or avoiding non-specific interactions. Until now, these kinds of constraints were hard to implement in generative models.
BoltzGen directly tackles these issues with a unified, all-atom model that allows researchers to design binders with customizable features, and it has already been validated across a diverse set of experimental campaigns.
How BoltzGen Works: A Two-Part System with Full Design Control
The model’s architecture centered on two components. A Trunk module processed input information (target structures, optional constraints, and supplementary design instructions) and converted it into a detailed mathematical representation. A Diffusion module then used that representation to refine random atomic noise into a complete three-dimensional binder structure across hundreds of iterative steps.
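The idea of refining random atomic noise into a structure over many small steps can be illustrated with a toy loop. This is only a minimal sketch under loud assumptions: `toy_denoiser` is a hypothetical stand-in that returns a fixed set of reference coordinates, whereas the real Diffusion module is a learned network conditioned on the Trunk's representation. The update rule and step schedule here are illustrative and are not taken from the preprint.

```python
import random

def toy_denoiser(coords, conditioning):
    """Hypothetical stand-in for a learned network.

    It ignores the noisy input and returns the conditioning coordinates
    as its guess of the clean structure.
    """
    return conditioning

def refine_from_noise(conditioning, steps=200, seed=0):
    """Iteratively pull random 3D coordinates toward the denoiser's guess."""
    rng = random.Random(seed)
    # Start from pure Gaussian noise: one (x, y, z) triple per atom.
    coords = [[rng.gauss(0.0, 10.0) for _ in range(3)] for _ in conditioning]
    for step in range(steps):
        predicted = toy_denoiser(coords, conditioning)
        # Blend factor grows from 1/steps to 1, so the final step lands
        # on the prediction (illustrative schedule, an assumption).
        blend = 1.0 / (steps - step)
        coords = [
            [c + blend * (p - c) for c, p in zip(atom, pred_atom)]
            for atom, pred_atom in zip(coords, predicted)
        ]
    return coords
```

Calling `refine_from_noise([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]])` returns coordinates that have converged on the reference positions. In the real model, the prediction at each step would change as the network re-evaluates the partially denoised structure.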
A major contribution of the work was the model’s controllability. Researchers could specify covalent bonds to create cyclic peptides, lock specific parts of a structure, or define precise binding regions. These instructions were fed directly into the model and influenced the structure it produced.
To build a general-purpose system, the team trained BoltzGen on a mixture of tasks, including protein folding, binder generation, motif scaffolding, and unconditional structure creation. Randomly masking parts of input structures forced the model to infer the missing regions, allowing it to internalize the physical principles behind molecular interactions from multiple directions.
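The masking strategy can be sketched in a few lines. The example below hides a random subset of positions and records what the model would be asked to recover. It is a simplified illustration, not the authors' code: it masks letters in a sequence rather than atoms in a structure, and the `MASK` token, the `mask_fraction` parameter, and the function name are all hypothetical.

```python
import random

MASK = "X"  # hypothetical placeholder token for a hidden residue

def mask_residues(sequence, mask_fraction=0.3, seed=0):
    """Hide a random subset of positions and return the training pair.

    Returns the masked sequence plus a dict mapping each hidden position
    to the residue the model would have to infer.
    """
    rng = random.Random(seed)
    n_masked = round(len(sequence) * mask_fraction)
    hidden = set(rng.sample(range(len(sequence)), n_masked))
    masked = "".join(
        MASK if i in hidden else residue for i, residue in enumerate(sequence)
    )
    targets = {i: sequence[i] for i in sorted(hidden)}
    return masked, targets
```

Varying which region is hidden (the binder, a motif, or everything) is one way a single model could be exposed to folding, binder generation, and scaffolding tasks during training.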
The complete pipeline included generating binder candidates, refining sequences to improve foldability, and evaluating designs using physics-based metrics such as hydrogen bonding patterns. A final filtering step ranked candidates by predicted quality and interaction confidence, while a diversity mechanism ensured that the shortlisted designs were not redundant.
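The final ranking-plus-diversity step can be pictured as a greedy selection. This sketch assumes each candidate carries a single combined `score` and that redundancy is measured by position-wise sequence identity. Both are simplifying assumptions made for illustration; the preprint's actual metrics and diversity mechanism may differ.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b) or not a:
        return 0.0  # simplification: unequal lengths count as dissimilar
    return sum(x == y for x, y in zip(a, b)) / len(a)

def select_diverse(candidates, k, max_identity=0.8):
    """Pick up to k top-scoring candidates, skipping near-duplicates.

    `candidates` is a list of dicts with hypothetical keys "seq" and "score".
    """
    ranked = sorted(candidates, key=lambda c: c["score"], reverse=True)
    chosen = []
    for cand in ranked:
        # Accept only if sufficiently different from everything chosen so far.
        if all(identity(cand["seq"], c["seq"]) <= max_identity for c in chosen):
            chosen.append(cand)
        if len(chosen) == k:
            break
    return chosen
```

With this scheme, a second-ranked design that differs from the top design by a single residue is passed over in favor of a lower-scoring but distinct one, which keeps the shortlist from collapsing onto one solution.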
Real-World Results: Success Across 26 Biological Targets
BoltzGen was evaluated across eight experimental campaigns targeting 26 biomolecules, demonstrating both breadth and robustness.
In a key test, the system designed nanobodies and general protein binders for nine novel targets that shared less than 30 % sequence identity with the training data in any bound state.
Testing only 15 designs per target yielded nanomolar-affinity binders for six of the nine targets, a success rate of about 66 % for both nanobodies and general protein binders. No binding to human serum albumin was detected, reducing concerns about non-specific activity. On five established benchmark targets with known binders, the success rate reached 80 %.
The model performed well across several challenging modalities. It generated binders to bioactive peptides such as melittin, protegrin, and indolicidin; these binders displayed nanomolar to micromolar affinity and neutralized the peptides’ cytotoxic effects. It also produced a peptide that localized to the nucleolus in live cells, which served as a proxy for binding to the disordered protein nucleophosmin.
For structured targets like RagC GTPase, BoltzGen created both linear and disulfide-cyclized peptides. Additional efforts produced weak binders to small molecules, nanobodies against viral proteins, and antimicrobial peptides that disrupted the bacterial GyrA interaction, with nearly 20 % of designs showing strong inhibitory effects.
Computational analysis confirmed that BoltzGen matched state-of-the-art folding accuracy. It also produced more target-specific and structurally diverse binders than earlier systems such as RFdiffusion, indicating that it conditioned its designs more effectively on the input target.
Limitations Identified by the Authors
Despite its strong performance, BoltzGen has important constraints. High-affinity binding is only the first step toward functional molecules suitable for therapeutic use. Practical drug development also requires selectivity, long-term stability, manufacturability, and other properties that are not yet incorporated into the generative process.
A concrete technical issue also emerged: the model occasionally reproduced ubiquitin-like sequences for binders around 73–76 residues, likely due to their overrepresentation in the training set. The researchers mitigated the problem by downsampling those structures, but it remained a reminder of the risks of dataset imbalance.
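Downsampling an overrepresented group is a common remedy for this kind of imbalance. The sketch below caps how many examples any one cluster may contribute to a training set. The article does not describe how the authors actually performed their downsampling, so the `cluster` labels, the cap, and the function name are hypothetical.

```python
import random
from collections import defaultdict

def downsample_clusters(examples, max_per_cluster, seed=0):
    """Cap the number of training examples drawn from any single cluster.

    `examples` is a list of dicts with a hypothetical "cluster" key, for
    instance a structural family label such as "ubiquitin-like".
    """
    rng = random.Random(seed)
    by_cluster = defaultdict(list)
    for example in examples:
        by_cluster[example["cluster"]].append(example)

    kept = []
    for members in by_cluster.values():
        if len(members) > max_per_cluster:
            # Keep a random subset of the overrepresented cluster.
            members = rng.sample(members, max_per_cluster)
        kept.extend(members)
    return kept
```

Clusters that are already below the cap pass through unchanged, so only the dominant families are thinned.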
The authors emphasized that BoltzGen should not be treated as a fully automated “zero-shot” solution. They encouraged users to carefully inspect generated designs and make use of the flexible control language to guide the system toward optimal outcomes.
Conclusion
BoltzGen represented a significant advance in AI-driven protein binder design. By bringing together all-atom structure prediction with a highly controllable generative framework, the research team was able to create diverse molecules for a wide range of biological targets, including several traditionally considered exceptionally difficult.
Its experimental validation demonstrated real-world practicality, and its open-source release provides a strong platform for future progress in molecular engineering.
Journal Reference
Stark et al. (2025). BoltzGen: Toward Universal Binder Design. bioRxiv (Cold Spring Harbor Laboratory). DOI:10.1101/2025.11.20.689494. https://www.biorxiv.org/content/10.1101/2025.11.20.689494v1