MegaSyn

Developing new compounds with desirable drug-like properties, such as increased target activity combined with good ADME (absorption, distribution, metabolism, and excretion), is challenging. During the novel chemical design cycle, chemists are tasked with creating new analogs of a target molecule. Machine learning (ML) is often integrated into this cycle to predict activity and to score new compounds according to a learned data representation. These traditional ML models, however, cannot generate new compounds from the learned data, leaving that task to chemistry experts.

Recently, ML models for generating de novo libraries of compounds have been introduced into the literature, including Recurrent Neural Networks (RNNs). An RNN is a neural network architecture that learns the structure of sequential data: at every step it retains the most salient information from the previously seen steps, and that retained information affects how the current step is interpreted.
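
As a rough illustration of how an RNN carries information from earlier steps into the current one, here is a minimal sketch of a plain (Elman-style) recurrence in pure Python. The weights, dimensions, and function names are made up for the example and say nothing about the architecture MegaSyn actually uses.

```python
import math

def rnn_step(x, h_prev, W_xh, W_hh, b_h):
    """One recurrence: h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h)."""
    h_new = []
    for i in range(len(b_h)):
        total = b_h[i]
        total += sum(W_xh[i][j] * x[j] for j in range(len(x)))
        total += sum(W_hh[i][k] * h_prev[k] for k in range(len(h_prev)))
        h_new.append(math.tanh(total))
    return h_new

def run_rnn(inputs, W_xh, W_hh, b_h):
    """Feed a sequence through the cell; return the hidden state after each step."""
    h = [0.0] * len(b_h)  # the hidden state starts empty
    states = []
    for x in inputs:
        h = rnn_step(x, h, W_xh, W_hh, b_h)
        states.append(h)
    return states

# Toy weights (assumed values): 2-dimensional inputs, 2-dimensional hidden state.
W_xh = [[0.5, -0.3], [0.1, 0.8]]
W_hh = [[0.9, 0.0], [0.0, 0.9]]
b_h = [0.0, 0.0]

# Both sequences end with the same input but have different histories.
seq_a = [[1.0, 0.0], [0.0, 1.0]]
seq_b = [[0.0, 0.0], [0.0, 1.0]]
final_a = run_rnn(seq_a, W_xh, W_hh, b_h)[-1]
final_b = run_rnn(seq_b, W_xh, W_hh, b_h)[-1]
```

The two final hidden states differ even though the last input is identical, which is the sense in which earlier steps affect the interpretation of the current one.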

RNNs have been used as generative models that successfully learn the SMILES representation of chemicals and can generate new, synthetically reasonable compounds with desirable drug-like properties. We have implemented several different RNNs that can benefit from the thousands of datasets and models we have curated in MegaTox, MegaTrans, MegaPredict, and other resources.
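
Before an RNN can learn SMILES, each string has to be split into a sequence of tokens and mapped to integer indices. The sketch below shows one common way to do this with a regular expression; the token pattern and the helper names are assumptions for illustration, not MegaSyn's actual preprocessing.

```python
import re

# Bracket atoms, two-letter halogens, two-digit ring closures (%NN),
# then any remaining single character (atoms, bonds, branches, ring digits).
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")

def tokenize_smiles(smiles):
    """Split a SMILES string into a list of tokens."""
    return SMILES_TOKEN.findall(smiles)

def build_vocab(smiles_list):
    """Map every token seen in the training strings to an integer index."""
    tokens = sorted({t for s in smiles_list for t in tokenize_smiles(s)})
    return {tok: idx for idx, tok in enumerate(tokens)}

def encode(smiles, vocab):
    """Turn a SMILES string into the integer sequence an RNN would consume."""
    return [vocab[t] for t in tokenize_smiles(smiles)]

training = ["CC(=O)Oc1ccccc1C(=O)O", "ClCCBr", "C[C@H](N)C(=O)O"]
vocab = build_vocab(training)
encoded = [encode(s, vocab) for s in training]
```

Treating `Cl`, `Br`, and bracketed atoms such as `[C@H]` as single tokens keeps the model from having to learn that those characters belong together.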

Our RNN implementation, called MegaSyn, uses a state-of-the-art multi-objective optimization algorithm to optimize multiple parameters simultaneously during RNN training. Our in-house automated analog designer scores the generated molecules on their synthetic feasibility, allowing us to select optimized compounds that are also synthetically feasible.
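
The specific multi-objective algorithm is not named here. As a generic illustration of what optimizing several parameters at once involves, the sketch below filters candidates down to the Pareto front, meaning the compounds that no other compound beats on every objective. The compound names and the three scores (activity, ADME, synthetic feasibility, all scaled so that higher is better) are invented for the example.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(candidates):
    """Return the names of candidates that no other candidate dominates."""
    front = []
    for name, scores in candidates.items():
        dominated = any(
            dominates(other_scores, scores)
            for other_name, other_scores in candidates.items()
            if other_name != name
        )
        if not dominated:
            front.append(name)
    return front

# Hypothetical scores: (target activity, ADME, synthetic feasibility).
candidates = {
    "cmpd_A": (0.9, 0.4, 0.7),
    "cmpd_B": (0.6, 0.8, 0.6),
    "cmpd_C": (0.5, 0.3, 0.5),
    "cmpd_D": (0.9, 0.4, 0.9),
}
front = pareto_front(candidates)
```

Compounds on the front represent different trade-offs between the objectives; choosing among them is where expert judgment or additional weighting would come in.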

[Figure: Collaborative feedback cycle diagram]

General Approach

  • Train a generative RNN with optimized target/ADME/Tox parameters.
  • Compounds are generated and retrosynthetic analysis is integrated.
  • Client feedback is integrated into the generative model.
  • Top scoring generated compounds are experimentally validated.
  • A feedback loop between drug discovery experts and our machine learning experts is integrated into our platform to guide each new round of compound generation and testing.
  • The cycle is repeated until a desired compound or set of compounds is generated and validated.
  • The approach can be used to design out ADME/Tox liabilities.
  • MegaSyn can be used to design any molecules from a sequence (e.g. peptides).
  • MegaSyn can be used for polymer design and optimization.
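
The generate, score, validate, and feedback cycle in the list above can be sketched as a simple loop. This is a schematic under stated assumptions: `run_design_cycle` and the toy `generate`, `score`, and `validate` callables are hypothetical stand-ins for the generative model, the scoring models, and experimental testing, and none of them are part of MegaSyn's interface.

```python
def run_design_cycle(generate, score, validate, max_rounds=5, top_k=3):
    """Repeat generate -> score -> validate until a compound passes validation.

    generate(feedback)  -> list of candidates (e.g. SMILES strings)
    score(candidate)    -> number, higher is better
    validate(candidate) -> bool, stands in for experimental validation
    """
    feedback = []
    for round_no in range(1, max_rounds + 1):
        candidates = generate(feedback)
        ranked = sorted(candidates, key=score, reverse=True)[:top_k]
        results = [(c, validate(c)) for c in ranked]
        hits = [c for c, ok in results if ok]
        if hits:
            return round_no, hits
        feedback.extend(results)  # failed results inform the next round
    return max_rounds, []

# Toy stand-ins: the "generator" proposes longer chains after each failed round,
# the "score" is chain length, and "validation" passes chains of five or more atoms.
def toy_generate(feedback):
    return ["C" * (len(feedback) + i) for i in range(1, 4)]

def toy_validate(candidate):
    return len(candidate) >= 5

rounds_used, hits = run_design_cycle(toy_generate, len, toy_validate)
```

Passing the three stages in as callables keeps the loop itself independent of any particular model or assay, so each stage can be swapped out without touching the cycle logic.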

Access

  • We can use MegaSyn on your behalf as fee-for-service work.
  • We can provide an annual license for you to access this software on your own server.
  • We provide maintenance and customization options.