New York, NY, June 25, 2024 – EvolutionaryScale, a frontier AI research lab for biology, launched today with ESM3, a milestone AI model capable of generating novel proteins. ESM3 generated a new Green Fluorescent Protein (GFP) so distant from known ones that the company estimates it would take more than 500 million years of natural evolution to arise. The generative model allows interactive prompting to create proteins, empowering scientists to advance applications from drug discovery and materials science to carbon capture.

EvolutionaryScale’s founding team, the group behind ESM3, are pioneers in applying AI to biology, having built what is widely considered to be the first transformer language model for proteins, ESM1. The ESM models have empowered groundbreaking scientific research, including a breakthrough in protein folding that helped reveal the structures of hundreds of millions of metagenomic proteins, and have been used by scientists across the world to model and understand proteins.

EvolutionaryScale described ESM3 today in a scientific preprint and released an open version of the model for scientific researchers (links below).

The Frontier Language Model for Biology
ESM3 was trained with 1 trillion teraFLOPs of compute (on the order of 10^24 floating-point operations), more than any other known model in biology, on a dataset of 2.78 billion proteins spanning the Earth’s natural diversity. It is the first generative model for biology that simultaneously reasons over the sequence, structure, and function of proteins. This enables scientists to understand and create new proteins, making biology programmable.

“ESM3 takes a step toward a future of biology where AI is a tool to engineer from first principles, the way we engineer structures, machines, and microchips, and write computer programs,” said EvolutionaryScale co-founder and chief scientist, Alexander Rives. “We’ve been working on this for a long time, and we’re excited to share it with the scientific community and see what they do with it.”

With this capability, the model has the potential to accelerate discovery across a broad range of applications, from developing new cancer treatments to creating proteins that could help capture carbon.

Simulating 500 Million Years of Evolution with a Language Model
Prompted through a chain of thought to reason over possible sequences and structures of GFP, ESM3 stepped across 500 million years of evolution to create a new fluorescent protein. GFP is one of the most beautiful and unique proteins in nature, responsible for the glowing of jellyfish and the vivid fluorescent colors of coral. It is the only protein that emits light, and the biological mechanism for this is unique: the protein transforms itself, forming a light-emitting chromophore out of its own atoms.

GFP has become an important tool in molecular biology, helping scientists to see molecules inside cells. The mechanism that powers this phenomenon is incredibly complex, and generating a variant this distant by computational or experimental laboratory techniques has not been scientifically documented. New fluorescent proteins this distant from known ones have only been found through the discovery of new GFPs in the natural world. Our analysis suggests that under natural evolution it could take more than 500 million years for a protein this different to evolve.

ESM3: A Tool for Scientists
ESM3’s success in generating a new GFP underscores the model’s potential for advancements in biological research and life sciences.

EvolutionaryScale is opening an API for closed beta today, and code and weights for a small open version of ESM3 are available for non-commercial use. EvolutionaryScale is also collaborating with Amazon Web Services (AWS) and NVIDIA to accelerate applications from drug discovery to synthetic biology with AI.

By working with AWS, EvolutionaryScale is making the full ESM3 model family easily accessible to hundreds of thousands of researchers around the world and to nine of the top ten global pharma companies, who already use AWS’s generative AI and health services: Amazon SageMaker, Amazon Bedrock, and AWS HealthOmics. This move will make it easier for researchers to fine-tune the ESM3 models securely and at scale using their own proprietary data.

All versions of ESM3 will be optimized for training and inference performance through the company’s ongoing collaboration with NVIDIA. This includes NVIDIA BioNeMo NIMs to accelerate runtime performance, with support available through the NVIDIA AI Enterprise software license and at ai.nvidia.com.

EvolutionaryScale Closes More Than $142 Million in Seed Funding
EvolutionaryScale also announced a seed round of more than $142 million, led by Nat Friedman and Daniel Gross, and Lux Capital, with participation from Amazon, NVentures (NVIDIA’s venture capital arm) and angel investors. Funding will be used to further expand the capabilities of its models.

The ESM3 release blog post, which describes the preprint, is available on our website at https://www.evolutionaryscale.ai/blog/esm3-release.

About EvolutionaryScale
EvolutionaryScale is a frontier AI research lab and Public Benefit Corporation dedicated to developing artificial intelligence for the life sciences. EvolutionaryScale’s models support groundbreaking research and development in health, environmental science, and beyond. The company was founded in July 2023 and has raised more than $142 million in seed funding led by Nat Friedman and Daniel Gross, and Lux Capital, with participation from Amazon, NVentures (NVIDIA’s venture capital arm) and angel investors. For more information, visit https://evolutionaryscale.ai

Working at Meta’s FAIR (Fundamental AI Research) unit, EvolutionaryScale’s founding team built ESM1 in 2019, which is widely recognized to be the first large language model (LLM) for proteins. The team left Meta in April 2023 and started EvolutionaryScale to develop and launch a next-generation model, ESM3.

EvolutionaryScale is committed to developing artificial intelligence for the benefit of human health and society, through open, safe, and responsible research, and in partnership with the scientific community. Members of the scientific team behind the ESM models are among the more than 160 signatories across the world who have committed to advancing a framework for responsible development. For more information, please visit: https://responsiblebiodesign.ai/#supporters