Evogene Ltd. has unveiled a first-in-class generative AI foundation model for small-molecule design, marking a breakthrough in how new compounds are discovered. Announced on June 10, 2025, in collaboration with Google Cloud, the model expands Evogene’s ChemPass AI platform and tackles a long-standing challenge in both pharmaceuticals and agriculture: finding novel molecules that meet multiple complex criteria simultaneously. This development is poised to accelerate R&D in drug discovery and crop protection by enabling the simultaneous optimization of properties like efficacy, toxicity, and stability in a single design cycle.
From Sequential Screening to Simultaneous Design
In traditional drug and agricultural chemical research, scientists usually test one factor at a time: first checking whether a compound works, then later testing for safety and stability. This step-by-step method is slow, expensive, and often ends in failure, with many promising compounds falling short in later stages. It also keeps researchers focused on familiar chemical structures, limiting innovation and making it harder to create new, patentable products. The result is high costs, long timelines, and a low success rate: around 90% of drug candidates fail before reaching the market.
Generative AI changes this paradigm. Instead of one-by-one filtering, AI models can juggle multiple requirements at once, designing molecules to be potent and safe and stable from the start. Evogene’s new foundation model was explicitly built to enable this simultaneous multi-parameter design. This approach aims to de-risk later phases of development by front-loading considerations like ADME and toxicity into the initial design.
In practice, it could mean fewer late-stage failures – for instance, fewer drug candidates that show great lab results only to fail in clinical trials due to side effects. In short, generative AI allows researchers to innovate faster and smarter, concurrently optimizing for the many facets of a successful molecule rather than tackling each in isolation.
Inside ChemPass AI: How Generative Models Design Molecules
At the heart of Evogene’s ChemPass AI platform is a powerful new foundation model trained on an enormous chemical dataset. The company assembled a curated database of roughly 40 billion molecular structures – spanning known drug-like compounds and diverse chemical scaffolds – to teach the AI the “language” of molecules. Using Google Cloud’s Vertex AI infrastructure with GPU supercomputing, the model learned patterns from this vast chemical library, giving it an unprecedented breadth of knowledge on what drug-like molecules look like. This massive training regimen is akin to training a large language model, but instead of human language, the AI learned chemical representations.
Evogene’s generative model is built on a transformer neural network architecture, similar to the GPT models that revolutionized natural language processing. In fact, the system is referred to as ChemPass-GPT, a proprietary AI model trained on SMILES strings (a text encoding of molecular structures). In simple terms, ChemPass-GPT treats molecules like sentences – each molecule’s SMILES string is a sequence of characters describing its atoms and bonds. The transformer model has learned the grammar of this chemical language, enabling it to “write” new molecules by predicting one symbol at a time, much as GPT writes text one token at a time. Because it was trained on billions of examples, the model can generate novel SMILES that correspond to chemically valid, drug-like structures.
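To make the "molecules as sentences" idea concrete, here is a minimal sketch of sequence-based generation over SMILES. It is not ChemPass-GPT: a toy bigram model stands in for the transformer, the tokenizer is a simplified regular expression, and the function names (tokenize, fit_bigram, sample) and the three-molecule corpus are assumptions made purely for illustration.

```python
import random
import re
from collections import Counter, defaultdict

# Simplified SMILES tokenizer (an assumption, not a full SMILES grammar):
# bracket atoms first, then two-letter halogens, then any single character.
TOKEN_RE = re.compile(r"\[[^\]]+\]|Br|Cl|.")

START, END = "<s>", "</s>"


def tokenize(smiles):
    """Split a SMILES string into a list of tokens."""
    return TOKEN_RE.findall(smiles)


def fit_bigram(corpus):
    """Count how often each token follows each other token in the corpus."""
    counts = defaultdict(Counter)
    for smiles in corpus:
        tokens = [START] + tokenize(smiles) + [END]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def sample(counts, rng, max_len=40):
    """Generate one string by repeatedly sampling the next token."""
    out, prev = [], START
    for _ in range(max_len):
        options = counts[prev]
        nxt = rng.choices(list(options), weights=list(options.values()))[0]
        if nxt == END:
            break
        out.append(nxt)
        prev = nxt
    return "".join(out)


# Tiny illustrative corpus: ethanol, ethylamine, acetic acid.
corpus = ["CCO", "CCN", "CC(=O)O"]
model = fit_bigram(corpus)
print(tokenize("CC(=O)Cl"))
print(sample(model, random.Random(0)))
```

A bigram model only looks one token back, so its output is not guaranteed to be valid SMILES (parentheses may not balance, for example). A transformer conditions on the whole preceding sequence, which is what lets a model trained at scale learn such long-range rules.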
This sequence-based generative approach leverages the strength of transformers in capturing complex patterns. By training on such a huge and chemically diverse dataset, ChemPass AI overcomes problems that earlier AI models faced, like bias from small datasets or generating redundant or invalid molecules. The foundation model’s performance already far outstrips a generic GPT applied to chemistry: internal tests showed about 90% precision in producing novel molecules that meet all design criteria, versus ~29% precision for a traditional GPT-based model (evogene.com). In practical terms, this means roughly nine in ten of the molecules ChemPass AI suggests are not only new but also hit their target profile, a striking improvement over baseline generative techniques.
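The article does not spell out how Evogene computes the precision figure. One plausible reading, sketched below, is the share of generated molecules that are both absent from the training set and pass every design criterion. The function name, the toy SMILES lists, and the length-based criterion are all hypothetical stand-ins.

```python
def precision(generated, training_set, meets_criteria):
    """Fraction of generated molecules that are novel AND meet the criteria."""
    if not generated:
        return 0.0
    hits = sum(
        1 for mol in generated
        if mol not in training_set and meets_criteria(mol)
    )
    return hits / len(generated)


# Hypothetical data: SMILES strings compared as plain text.
training_set = {"CCO"}
generated = ["CCO", "CCN", "CCCO", "CCCCCCCC", "CNC"]


def small_enough(smiles):
    # Toy criterion standing in for real property predictors.
    return len(smiles) <= 4


score = precision(generated, training_set, small_enough)
print(f"precision = {score:.0%}")
```

In a real pipeline, novelty would be checked on canonicalized structures rather than raw strings, since one molecule can be written as many different SMILES.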
While Evogene’s primary generative engine uses a transformer on linear SMILES, it’s worth noting the broader AI toolkit includes other architectures like graph neural networks (GNNs). Molecules are naturally graphs – with atoms as nodes and bonds as edges – and GNNs can directly reason on these structures. In modern drug design, GNNs are often used to predict properties or even generate molecules by building them atom-by-atom. This graph-based approach complements sequence models; for example, Evogene’s platform also incorporates tools like DeepDock for 3D virtual screening, which likely use deep learning to assess molecule binding in a structure-based context. By combining sequence models (great for creativity and novelty) with graph-based models (great for structural accuracy and property prediction), ChemPass AI ensures its generated compounds are not just novel on paper, but also chemically sound and effective in practice. The AI’s design loop might generate candidate structures and then evaluate them via predictive models – some possibly GNN-based – for criteria like toxicity or synthetic feasibility, creating a feedback cycle that refines each suggestion.
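The graph view of a molecule can be illustrated in a few lines. The sketch below encodes ethanol with atoms as nodes and bonds as edges, then runs one round of neighbor aggregation, the basic operation that GNN layers build on. The feature choice (atomic number) and the unweighted sum are simplifying assumptions; real GNNs use learned weights and richer features, and nothing here reflects Evogene's actual models.

```python
# Ethanol (SMILES: CCO) with hydrogens left implicit.
atoms = ["C", "C", "O"]
bonds = [(0, 1, 1), (1, 2, 1)]  # (atom index, atom index, bond order)

ATOMIC_NUMBER = {"C": 6, "O": 8}


def adjacency(n_atoms, bonds):
    """Build an undirected neighbor list from the bond list."""
    neighbors = {i: [] for i in range(n_atoms)}
    for a, b, _order in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)
    return neighbors


def message_pass(features, neighbors):
    """One aggregation round: each atom adds its neighbors' features to its own."""
    return [
        features[i] + sum(features[j] for j in neighbors[i])
        for i in range(len(features))
    ]


neighbors = adjacency(len(atoms), bonds)
features = [ATOMIC_NUMBER[a] for a in atoms]
updated = message_pass(features, neighbors)
print(neighbors)
print(updated)
```

After one round, each atom's value reflects its immediate chemical environment: the middle carbon, bonded to both a carbon and an oxygen, ends up with a different value from the terminal carbon. Stacking more rounds spreads information further across the molecule.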
Multi-Objective Optimization: Potency, Toxicity, Stability All at Once
A standout feature of ChemPass AI is its built-in ability for multi-objective optimization. Classic drug discovery often optimizes one property at a time, but ChemPass was engineered to handle many objectives simultaneously. This is achieved through advanced machine learning techniques that guide the generative model toward satisfying multiple constraints. In training, Evogene can impose property requirements – such as a molecule must activate a certain target strongly, avoid certain toxic motifs, and have good bioavailability – and the model learns to navigate chemical space under those rules. The ChemPass-GPT system even enables “constraints-based generation,” meaning it can be instructed to only propose molecules that meet specific desired properties from the outset.
How does the AI accomplish this multi-parameter balancing act? One approach is multi-task learning, where the model is not just generating molecules but also predicting their properties using learned predictors, adjusting generation accordingly. Another powerful approach is reinforcement learning (RL). In an RL-enhanced workflow, the generative model acts like an agent “playing a game” of molecule design: it proposes a molecule and then gets a reward score based on how well that molecule meets the objectives (potency, lack of toxicity, etc.). Over many iterations, the model tweaks its generation strategy to maximize this reward. This method has been successfully used in other AI-driven drug design systems – researchers have shown that reinforcement learning algorithms can guide generative models to produce molecules with desirable properties. In essence, the AI can be trained with a reward function that encapsulates multiple goals, for example giving points for predicted efficacy and subtracting points for predicted toxicity. The model then optimizes its “moves” (adding or removing atoms, altering functional groups) to net the highest score, effectively learning the trade-offs needed to satisfy all criteria.
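A reward of the kind described above, with points for predicted efficacy and penalties for predicted toxicity, can be sketched as a weighted sum. Evogene has not published its reward design, so everything below is an assumption for illustration: the Prediction fields, the weights, the candidate names, and the scores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    """Hypothetical predictor outputs for one molecule, each scaled to 0..1."""
    efficacy: float   # higher is better
    toxicity: float   # higher is worse
    stability: float  # higher is better


def reward(p, w_efficacy=1.0, w_toxicity=1.0, w_stability=0.5):
    """Scalar reward: add points for efficacy and stability, subtract for toxicity."""
    return (
        w_efficacy * p.efficacy
        - w_toxicity * p.toxicity
        + w_stability * p.stability
    )


# Made-up candidates with made-up predicted scores.
candidates = {
    "mol_A": Prediction(efficacy=0.9, toxicity=0.7, stability=0.5),
    "mol_B": Prediction(efficacy=0.7, toxicity=0.1, stability=0.8),
    "mol_C": Prediction(efficacy=0.4, toxicity=0.05, stability=0.9),
}

best = max(candidates, key=lambda name: reward(candidates[name]))
best_ignoring_toxicity = max(
    candidates, key=lambda name: reward(candidates[name], w_toxicity=0.0)
)
print(best, best_ignoring_toxicity)
```

With the default weights, the most potent candidate loses to a slightly weaker but much safer one; setting the toxicity weight to zero flips the ranking. In an RL setup, this scalar would be the signal the generator is trained to maximize, and changing the weights changes which trade-offs it learns.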
Evogene hasn’t disclosed the exact recipe behind ChemPass AI’s multi-objective engine, but its reported results suggest that strategies of this kind are at work: the company says each generated compound “simultaneously meets essential parameters” like efficacy, synthesizability and safety. The upcoming ChemPass AI version 2.0 will push this further – it’s being developed to allow even more flexible multi-parameter tuning, including user-defined criteria tailored to specific therapeutic areas or crop requirements. This suggests the next-gen model could let researchers dial up or down the importance of certain factors (for instance, prioritizing brain penetrance for a neurology drug or environmental biodegradability for a pesticide), with the AI adjusting its design strategy accordingly. By integrating such multi-objective capabilities, ChemPass AI can design molecules that hit the sweet spot on numerous performance metrics at once, a feat practically impossible with traditional methods.
A Leap Beyond Traditional R&D Methods
The advent of ChemPass AI’s generative model highlights a wider shift in life-science R&D: the move from laborious trial-and-error workflows to AI-augmented creativity and precision. Unlike human chemists, who tend to stick to known chemical series and iterate slowly, an AI can fathom billions of possibilities and venture into the unexplored 99.9% of chemical space. This opens the door to finding efficacious compounds that don’t resemble anything we’ve seen before – crucial for treating diseases with novel chemistry or tackling pests and pathogens that have evolved resistance to existing molecules. Moreover, by considering patentability from the get-go, generative AI helps avoid crowded intellectual property areas. Evogene explicitly aims to produce molecules that carve out fresh IP, an important competitive advantage.
The benefits over traditional approaches can be summarized as follows:
- Parallel Multi-Trait Optimization: The AI evaluates many parameters in parallel, designing molecules that satisfy potency, safety, and other criteria. Traditional pipelines, in contrast, often only discover a toxicity issue after years of work on an otherwise promising drug. By preemptively filtering for such issues, AI-designed candidates have a better shot at success in costly later trials.
- Expanding Chemical Diversity: Generative models aren’t limited to existing compound libraries. ChemPass AI can conjure structures that have never been made before, yet are predicted to be effective. This novelty-driven generation avoids reinventing the wheel (or the molecule) and helps create differentiated products with new modes of action. Traditional methods often lead to “me-too” compounds that offer little novelty.
- Speed and Scale: What a team of chemists might achieve via synthesis and testing in a year, an AI can simulate in days. ChemPass AI’s deep learning platform can virtually screen tens of billions of compounds rapidly and generate hundreds of novel ideas in a single run. This dramatically compresses the discovery timeline, focusing wet-lab experiments only on the most promising candidates identified in silico.
- Integrated Knowledge: AI models like ChemPass incorporate vast amounts of chemical and biological knowledge (e.g. known structure-activity relationships, toxicity alerts, drug-like property rules) in their training. This means every molecule design benefits from a breadth of prior data no single human expert could hold in their head. Traditional design relies on the experience of medicinal chemists – valuable but limited to human memory and bias – whereas the AI can capture patterns across millions of experiments and diverse chemical families.
In practical terms, for pharma this could lead to higher success rates in clinical trials and reduced development costs, since fewer resources are wasted on doomed compounds. In agriculture, it means faster creation of safer, more sustainable crop protection solutions – for example, an herbicide that is lethal to weeds but benign to non-target organisms and breaks down harmlessly in the environment. By optimizing across efficacy and environmental safety together, AI can help deliver “effective, sustainable, and proprietary” ag-chemicals, addressing regulatory and resistance challenges in one go.
Part of a Broader AI Toolbox at Evogene
While ChemPass AI steals the spotlight for small-molecule design, it’s part of Evogene’s trio of AI-powered “tech-engines” tailored to different domains. The company has MicroBoost AI focusing on microbes, ChemPass AI on chemistry, and GeneRator AI on genetic elements. Each engine applies big-data analytics and machine learning to its respective field.
This integrated ecosystem of AI engines underscores Evogene’s strategy as an “AI-first” life science company. They aim to revolutionize product discovery across the board – whether it’s formulating a drug, a bio-stimulant, or a drought-tolerant crop – by harnessing computation to navigate biological complexity. The engines share a common philosophy: use cutting-edge machine learning to increase the probability of R&D success and reduce time and cost.
Outlook: AI-Driven Discovery Comes of Age
Generative AI is transforming molecule discovery, shifting AI’s role from assistant to creative collaborator. Instead of testing one idea at a time, scientists can now use AI to design entirely new compounds that meet multiple goals—potency, safety, stability, and more—in a single step.
This future is already unfolding. A pharmaceutical team might request a molecule that targets a specific protein, avoids the brain, and is orally available—AI can deliver candidates on demand. In agriculture, researchers could generate eco-friendly pest controls tailored to regulatory and environmental constraints.
Evogene’s recent foundation model, developed with Google Cloud, is one example of this shift. It enables multi-parameter design and opens new areas of chemical space. As future versions allow even more customization, these models will become essential tools across life sciences.
Crucially, the impact depends on real-world validation. As AI-generated molecules are tested and refined, models improve—creating a powerful feedback loop between computation and experimentation.
This generative approach isn’t limited to drugs or pesticides. It could soon drive breakthroughs in materials, food, and sustainability—offering faster, smarter discovery across industries once constrained by trial and error.