Exercise 10 - Drug Design (Docking and Generative)

In this exercise, you will examine some of the generative models for small molecules discussed in the lecture and use docking to assess how well they work.

Theory

We talked a lot about generative models for small molecules in the lecture, and Charlie already mentioned some of the pitfalls in the field. Read the two linked blog posts by Pat Walters to get a better understanding of the current problems and answer the following questions:

  • What are some problems with the current datasets used in molecular machine learning, and what could be steps to address them?
  • Will future progress in molecular machine learning be driven by better models or better data? Why?
  • What criteria should publications in the space fulfill to be actually useful for the community?

You can also read through one of the interesting Twitter threads by Keith Hornberger, where he criticizes the reductionist approach many people take to drug discovery via machine learning. What are your thoughts on this?

Practical: Building your own molecule generator

In the lecture, we talked about the different ways to generate molecules. To reinforce the connection between theory and practice, we will now build our own molecule generator (sort of). Follow this tutorial and try to understand what is happening at each step. The final result will not be a realistic molecule; what matters is understanding the logic behind how the model is built.

Practical: Run docking software

Learn more about docking in this practical from the IIBM repository. In that practical, you will learn how to use the docking software AutoDock to dock a small molecule into a protein.

After that, go to the DiffDock HuggingFace space and run that model to dock your ligand and protein: download the PDB file you created in the tutorial, then upload it to the HuggingFace space together with the SMILES string of your ligand. Look at the output: does it look sensible?

What are the differences between the two docking methods? What are the advantages and disadvantages of each?
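
One simple, quantitative way to compare the poses from the two methods is the root-mean-square deviation (RMSD) between matching ligand atoms. The sketch below is a minimal illustration, assuming that both poses list the same atoms in the same order and share the same protein coordinate frame. The function name `pose_rmsd` and the coordinates are made up for the example, and molecular symmetry is not handled.

```python
import math


def pose_rmsd(pose_a, pose_b):
    """RMSD (in the input units) between two poses given as lists of (x, y, z)."""
    if len(pose_a) != len(pose_b):
        raise ValueError("poses must contain the same number of atoms")
    if not pose_a:
        raise ValueError("poses must contain at least one atom")
    # Sum of squared distances between atoms that share the same index.
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(pose_a, pose_b))
    return math.sqrt(squared / len(pose_a))


# Hypothetical coordinates (in angstrom) for a three-atom ligand fragment.
autodock_pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
diffdock_pose = [(0.0, 0.0, 2.0), (1.5, 0.0, 2.0), (1.5, 1.5, 2.0)]

rmsd = pose_rmsd(autodock_pose, diffdock_pose)
print(f"RMSD between the two poses: {rmsd:.2f}")
```

In docking studies, two poses are often treated as equivalent when their RMSD is below roughly 2 Å, so a value like this gives a first indication of whether the two methods agree on the binding mode.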

Practical: Check your docking results

Get familiar with the PoseBusters and/or PoseCheck tools and look at the associated papers. Why were they created, and what problem do they try to address?

Use one of the tools to check the docking results you obtained in the previous practical. What do you find?
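
To build intuition for the kind of physical-plausibility test these tools automate, the sketch below flags ligand atoms that sit implausibly close to protein atoms. It is a toy illustration only, not the PoseBusters or PoseCheck API: the function name `find_clashes`, the 2.0 Å cutoff, and the coordinates are all assumptions chosen for the example, and the real tools apply more careful, element-aware criteria alongside many other checks.

```python
import math


def find_clashes(ligand_atoms, protein_atoms, cutoff=2.0):
    """Return (ligand_index, protein_index, distance) for pairs closer than cutoff."""
    clashes = []
    for i, lig in enumerate(ligand_atoms):
        for j, prot in enumerate(protein_atoms):
            distance = math.dist(lig, prot)
            if distance < cutoff:
                clashes.append((i, j, distance))
    return clashes


# Hypothetical coordinates (in angstrom); the cutoff is an assumed value.
ligand = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
protein = [(0.0, 1.0, 0.0), (6.0, 0.0, 0.0)]

clashes = find_clashes(ligand, protein)
for lig_idx, prot_idx, distance in clashes:
    print(f"ligand atom {lig_idx} is {distance:.2f} from protein atom {prot_idx}")
```

A pose with many such contacts is unlikely to be physically meaningful, even if its docking score or confidence looks good.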