Ex-Meta Scientists Roll Out AI That Designs Novel Proteins Beyond Nature’s Constraints

Former Meta scientists have introduced ESM3, an AI model that designs novel proteins after training on the sequences, structures, and functions of more than 2.7 billion proteins.

Short Summary:

  • ESM3 is a protein language model capable of generating new proteins from scratch.
  • The model was created by EvolutionaryScale, a startup founded by ex-Meta researchers.
  • EvolutionaryScale raised $142 million in seed funding to fuel further AI-driven biological discoveries.

The intersection of artificial intelligence and biology has taken a leap forward: EvolutionaryScale, a startup founded by former Meta researchers, has unveiled an AI model called ESM3. This protein language model is designed to reason over the sequence, structure, and function of proteins, which allows it to generate entirely novel proteins. The potential commercial applications are broad, ranging from drug development to sustainability.

We live in an era where AI tools are becoming increasingly capable, and proteins, often referred to as “nature’s machines,” have not been left out. The concept behind ESM3 is similar to that behind OpenAI’s GPT-4, the model that powers chatbots like ChatGPT. Instead of human language, however, ESM3 deals with the language of proteins. The model was trained on a dataset of 2.7 billion proteins, allowing it to generate new proteins in response to prompts.
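To make the idea of prompting a protein model concrete, here is a toy sketch in Python. It is not ESM3 and does not use EvolutionaryScale's software: the mask symbol, the function name `fill_masked_prompt`, and the made-up sequence fragment are all assumptions for illustration. The prompt fixes some positions and leaves others blank. A real protein language model would fill the blanks with residues it judges likely given the rest of the prompt, whereas this placeholder simply picks uniformly at random.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 standard residues
MASK = "_"  # hypothetical mask symbol used only in this sketch


def fill_masked_prompt(prompt: str, rng: random.Random) -> str:
    """Replace every masked position with a residue; keep fixed positions as-is.

    A trained model would choose each residue from a learned probability
    distribution conditioned on the prompt. The uniform random choice here
    is only a stand-in for that step.
    """
    return "".join(rng.choice(AMINO_ACIDS) if ch == MASK else ch for ch in prompt)


# Made-up fragment, not a real protein: fixed ends, five masked positions.
prompt = "MSK" + MASK * 5 + "TG"
design = fill_masked_prompt(prompt, random.Random(0))
print(prompt)
print(design)
```

The fixed residues at both ends survive unchanged, which is the sense in which a prompt constrains the output; everything interesting about a real model lies in how it chooses the masked positions.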

Raising the Stakes in AI-Driven Biology

In a significant financial boost, EvolutionaryScale secured $142 million in a seed funding round. Prominent investors such as Lux Capital, Amazon Web Services, Nat Friedman, Daniel Gross, and Nvidia’s venture capital arm NVentures participated. According to Josh Wolfe, co-founder and managing partner at Lux Capital, the model signifies a “ChatGPT moment for biology.”

The analogy isn’t just marketing fluff. Because it models sequence, structure, and function together, ESM3 can be prompted to create proteins with specific functions. One notable result is a new green fluorescent protein (GFP). Researchers reported that the GFP generated by ESM3 shares only 58% sequence identity with its closest known natural counterpart, a gap they estimate is equivalent to simulating 500 million years of evolution.
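A common way to express how close two proteins are is percent identity: the share of aligned positions at which both sequences carry the same residue. The Python sketch below shows that arithmetic under stated assumptions. It expects two sequences that have already been aligned to equal length with `-` marking gaps, it counts a gap against a residue as a mismatch, and the function name `percent_identity` and the toy fragments are hypothetical. It is not necessarily the procedure the researchers used to arrive at the 58% figure.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences have a gap ('-') are ignored. A gap
    opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (x, y) for x, y in zip(aligned_a, aligned_b) if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


# Toy, made-up aligned fragments (not real GFP sequences).
toy_a = "MSKGEELFTG"
toy_b = "MSKGAELF-G"
print(percent_identity(toy_a, toy_b))  # 8 matching columns out of 10 -> 80.0
```

Different tools choose different denominators (alignment length, shorter sequence, and so on), so identity figures are only comparable when computed the same way.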

“We want to build tools that can make biology programmable,” stated Alexander Rives, Chief Scientist at EvolutionaryScale and former lead of Meta’s AI protein team, in a conversation with Nature.

The Path to ESM3

The journey to ESM3 began at Meta, where Rives and his team built a database of over 600 million predicted protein structures that could be mined for drug development and other purposes. Earlier members of the ESM family, such as ESMFold, laid the groundwork by using large language models trained on biological data to predict protein structures. However, strategic changes at Meta in 2023 led to the disbandment of the AI protein team as the company shifted its focus toward commercial AI products.

Traditional approaches rely on real-world experimentation and modeling to understand how genes are transcribed and how proteins fold. ESM3 offers a more efficient route: it draws on patterns learned from known proteins to fill in missing pieces and propose new proteins that could accelerate drug discovery.

In a conversation thread, one user pointed out, “When AI produces a result that is surprising, we still have to validate it in the real world, and work backward through hard research to understand why we are surprised.”

Implications and Future Directions

The implications of ESM3 are far-reaching. From aiding drug development to designing proteins that break down plastics, the model promises to be a versatile tool for various industries. Its ability to generate new proteins beyond nature’s constraints makes it especially valuable to companies that rely on bioengineering.

The technology also addresses some of the bottlenecks in existing methods. Protein structures have traditionally been mapped out using X-ray crystallography, a time-consuming and expensive process. Because ESM3 can predict and generate new protein sequences rapidly, researchers can bypass some of these traditional hurdles, although the results still need real-world validation.

To offer open access, EvolutionaryScale has released a smaller version of the model for non-commercial use, called ESM3-open. This version excludes protein sequences associated with viruses and other dangerous pathogens, a precaution intended to make it safer for academic and exploratory research. Commercial researchers will have access to the more powerful model under licensing arrangements.

“From the rate of diversification of GFPs found in nature, we estimate that this generation of a new fluorescent protein is equivalent to simulating over 500 million years of evolution,” reported EvolutionaryScale in their latest publication.

A Glimpse into the Future

As AI continues to evolve, models like ESM3 provide a glimpse into the future of biological research and its potential applications. Beyond the immediate applications, these advances raise ethical questions and carry broader implications for the role of AI in science and industry.

Describing the traditional route, Zachary Voase remarked, “The classical approach was to understand how genes transcribe to mRNA, and how mRNA translates to polypeptides; how those are cleaved by the cell, and fold in 3D space; and how those 3D shapes result in actual biological function. It required real-world measurement, experiment, and modeling in silico using biophysical models. Those are all hard research efforts.”

Nonetheless, the benefits of this AI-driven approach are compelling. Intricate biological puzzles that used to take years and enormous financial investment may now be tackled more efficiently. As the technology matures, it may well redefine the landscape not just of biological research, but of the many industries that rely on biomolecular sciences.

“I mean, it’s more than a sellable product; the reason we’re doing this is to be able to advance medicine. A good understanding of the ‘why’ would be great, but if we can advance medicine quicker in the here and now without it, I think that’s worth doing?” commented one user in a discussion about the model’s implications.

The disruptive potential of ESM3 is emblematic of the larger trajectory of artificial intelligence, which is now opening paths to advances across a range of scientific fields.

As ESM3 and similar technologies spread, they call for a rethinking of prevailing methods in scientific inquiry and beyond. It’s a brave new world, one in which the synthesis of AI and biology could potentially outperform nature, offering innovations once relegated to the realm of science fiction.