GPT-4 is a generative AI model that responds to text and images. You’re probably familiar with ChatGPT, a product based on GPT-3.5 and GPT-4. GPT-4 is the newest model released by OpenAI and has drawn a variety of reactions, but the key theme is that it represents a significant advance in what is possible with generative AI.
One of the surprises in the GPT-4 technical report was that it seemed able to propose molecules that could become new drugs. I actually devised those examples in the system report and am here to tell you: GPT-4 cannot do drug discovery. However, it can assist in the process by proposing new compounds. It is early, and these ideas are not yet used in practice, but let’s try to glimpse the future of generative AI in drug discovery.
One of the earliest tasks of drug discovery is retrieving and examining known molecules that bind to a target protein. This could lead to a knowledge-based screening approach, where we screen only these molecules. Alternatively, we could use them in some other way, such as fitting a model to them. Let’s allow GPT-4 to do this process, and go a bit further, to see how generative AI could actually impact drug discovery. In this article, I will walk you through the potential of GPT-4 in the field of drug discovery and share an example of how GPT-4 can propose new drugs for the treatment of psoriasis by targeting the known protein TYK2.
Literature Searches and Compound Identification
To start, we’re trying to build a list of plausible compounds, drawn from research papers, that could lead to new drugs. This is one small step in drug discovery. There are many others! Let’s start with an example of proposing a new drug for psoriasis by targeting a known protein, TYK2.
To begin the process, I made tools for GPT-4 to use, instructing it to rely on these tools when working with molecules directly. First, GPT-4 uses one of these tools to conduct literature searches on the target protein, TYK2. It then parses the resulting literature review, which is itself generated by GPT-3.5-turbo, to identify drugs that have been studied in relation to TYK2. At times, GPT-4 may not know which drugs are small molecules and which are antibodies, so it uses another tool to differentiate between the two.
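To make the tool setup above concrete, here is a minimal sketch of how named tools might be registered and dispatched. Everything in it is an assumption for illustration: the tool names (`literature_search`, `molecule_type`), the `tool_name: argument` request format, and the canned return values are hypothetical stand-ins, not the tools used in the example described here.

```python
from typing import Callable, Dict

# Hypothetical tool registry. The names and behaviors are illustrative
# placeholders, not the tools from the original experiment.
TOOLS: Dict[str, Callable[[str], str]] = {}


def tool(name: str):
    """Register a function under a name the model can refer to."""
    def decorator(fn: Callable[[str], str]) -> Callable[[str], str]:
        TOOLS[name] = fn
        return fn
    return decorator


@tool("literature_search")
def literature_search(query: str) -> str:
    # A real tool would retrieve and summarize papers; this returns canned text.
    return f"Summary of papers mentioning {query} (placeholder)."


@tool("molecule_type")
def molecule_type(drug_name: str) -> str:
    # Placeholder lookup; a real tool would consult a chemistry database.
    known = {
        "example-small-molecule": "small molecule",
        "example-antibody": "antibody",
    }
    return known.get(drug_name, "unknown")


def run_tool(request: str) -> str:
    """Dispatch a model request written as 'tool_name: argument'."""
    name, _, argument = request.partition(":")
    name = name.strip()
    if name not in TOOLS:
        return f"Error: no tool named {name!r}"
    return TOOLS[name](argument.strip())


print(run_tool("literature_search: TYK2"))
print(run_tool("molecule_type: example-antibody"))
```

The point of the design is that the language model only chooses a tool and an argument, while the chemistry-specific work happens inside ordinary functions.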
Once GPT-4 has identified a list of potential drugs, it determines which of them are patented using yet another tool. GPT-4 can then propose modifications to these compounds in an effort to create novel compounds that may be effective in treating psoriasis. However, it is important to note that these modifications are simplistic and do not reflect the true complexity of drug discovery. In many cases, a real medicinal chemist would have to conduct much more extensive modifications to develop a viable drug candidate.
Novelty and Synthesis
After proposing modifications to the identified compounds, GPT-4 checks to see if the modified compounds are novel. Novel compounds are those that are not present in the SureChEMBL database, which is an approximation of a real patent search. If GPT-4 determines that a compound is novel, it may propose it for further study. However, it is important to note that just because a compound is novel does not mean it will be effective in treating psoriasis. Many other factors must be considered, such as toxicity and side effects.
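The novelty check described above amounts to a membership test against a set of known compounds. The sketch below assumes a small in-memory set standing in for the patent database; the identifiers are placeholders, and the `normalize` helper is a hypothetical simplification. A real pipeline would canonicalize chemical structures with a cheminformatics toolkit before comparing them.

```python
from typing import Iterable, List, Set


def normalize(identifier: str) -> str:
    """Crude normalization that only strips whitespace.

    This is a stand-in: real structure comparison needs canonicalization
    by a cheminformatics toolkit.
    """
    return identifier.strip()


def novel_compounds(proposed: Iterable[str], known: Set[str]) -> List[str]:
    """Return proposed compounds absent from the known set, in order, without duplicates."""
    known_normalized = {normalize(k) for k in known}
    seen: Set[str] = set()
    result: List[str] = []
    for compound in proposed:
        key = normalize(compound)
        if key not in known_normalized and key not in seen:
            seen.add(key)
            result.append(key)
    return result


# Placeholder identifiers, not real database entries.
known_set = {"compound-A", "compound-B"}
candidates = ["compound-A", " compound-C", "compound-D", "compound-C"]
print(novel_compounds(candidates, known_set))  # ['compound-C', 'compound-D']
```

Passing this check only says a compound was not found in the reference set. It says nothing about efficacy or safety.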
Finally, GPT-4 determines which of the proposed compounds are not purchasable and must be synthesized. GPT-4 may then propose an email for synthesis to be sent to a lab. This is where the proposed compounds begin the path toward becoming a viable drug candidate.
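As a rough sketch of this last step, the code below filters out purchasable compounds and drafts a plain-text synthesis request. The function names, the purchasability lookup, the lab name, and the email wording are all assumptions made for illustration. In the example described here, GPT-4 writes the email itself, so the fixed template is only a stand-in.

```python
from typing import Dict, List


def needs_synthesis(compounds: List[str], purchasable: Dict[str, bool]) -> List[str]:
    """Keep compounds with no known vendor.

    Compounds missing from the lookup are treated as not purchasable.
    """
    return [c for c in compounds if not purchasable.get(c, False)]


def draft_synthesis_email(compounds: List[str], lab_name: str) -> str:
    """Build a plain-text synthesis request listing each compound."""
    lines = [
        f"Hello {lab_name} team,",
        "",
        "We would like a quote for synthesizing the following compounds:",
    ]
    lines += [f"  {i}. {c}" for i, c in enumerate(compounds, start=1)]
    lines += [
        "",
        "Please let us know about feasibility and timelines.",
        "",
        "Thank you",
    ]
    return "\n".join(lines)


# Placeholder data: identifiers and availability are invented for the sketch.
availability = {"compound-C": True, "compound-D": False}
to_make = needs_synthesis(["compound-C", "compound-D", "compound-E"], availability)
print(draft_synthesis_email(to_make, "Example Lab"))
```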
The Impact of GPT-4 on Drug Discovery
While GPT-4’s ability to propose new compounds is impressive, it is important to note that this is just one small step in the complex process of drug discovery. The compounds that GPT-4 proposes must be created and tested to determine if they are effective in treating disease. This requires extensive testing and experimentation through clinical trials, which cannot be fully automated. However, with the help of contract research organizations (CROs) like Vial, the clinical trial process can be expedited through technology that streamlines operations.
So what will the impact be on drug discovery? Unknown. GPT-4’s ability to propose new compounds does open the door to automating more parts of the drug discovery process, which could lead to faster and more efficient drug discovery, as well as the discovery of new drugs that may not have been identified through traditional methods. The example above certainly hints at this, but ultimately shows that GPT-4 will not dramatically change drug discovery just yet. It is important to remember that GPT-4 has potential but is not a substitute for the expertise of medicinal chemists and other experts in the field.
So how much was GPT-4 doing chemistry in the example above? Not much – it’s mostly used for reasoning, selecting tools, and identifying compound names. The potential of GPT-4 in the field of drug discovery is exciting, but it still has a way to go. While GPT-4 cannot directly discover new drugs yet, it can assist in the process by proposing new compounds for further study.
If you’re interested in AI, peptide simulation, and molecules, follow me on Twitter, where I share my thoughts and insights on these topics.
About Andrew White
Andrew White is the VP of AI at Vial, a CRO powered by technology, and an associate professor at the University of Rochester. Andrew was an external consultant (“red teamer”) for OpenAI’s GPT-4 model.
If you want to stay up to date with the latest developments in these topics and have the opportunity to learn from one of the leading minds in AI and Chemistry, follow Andrew White on Twitter.