Introduction - Why this course?

Why we offer this course and what you can expect to learn

You might be wondering why we introduced this course this year; the biochemistry courses worked nicely in the past. Well, the world is changing and so are the contents of the curriculum. To dwell on this point a bit more, let’s take a look at the biochemistry of today and tomorrow.

Biochemistry of Today

A biochemist in the present day performs a wide range of tasks, from conducting experiments to analyzing data. Their main focus is on understanding the chemical processes that occur within living organisms and how these processes can be altered to treat diseases. A biochemist might work in a laboratory setting to study the molecular structure of proteins or enzymes, or they may conduct research on the molecular basis of cellular signaling pathways. In addition to hands-on experimentation, biochemists also use computer programs and simulations to help them understand their data and make predictions about future experiments.

Biochemistry of Tomorrow

In the future, biochemists will have even more tools at their disposal for understanding and manipulating biological processes. As technology continues to advance, biochemists will be able to collect and analyze data at a faster pace, which will allow them to make more informed decisions about their research. Additionally, the increasing use of computational techniques in biochemistry will allow biochemists to model complex biological systems and make predictions about how different treatments or interventions will affect these systems. This will be particularly useful in developing new drugs or treatments for diseases.

Importance of Computational Thinking and Coding Skills

Computational thinking and coding skills are becoming increasingly important for biochemists, as the field continues to evolve and become more data-driven. By having the ability to write code, biochemists can automate routine tasks, analyze large amounts of data, and develop new algorithms and models to help them better understand complex biological systems. In addition, they can also collaborate with computer scientists and other professionals to develop new technologies and tools that will allow them to perform their research more effectively. In the future, biochemists who have strong computational thinking and coding skills will be in high demand and will have a significant advantage in their careers.

This course

With the above in mind, I designed this course to give you an insight into some of the most widely used computational tools in biochemistry. We will start with PyMol, a powerful molecular visualization program that is available in a free, open-source version. We will learn how to use PyMol to visualize and analyze protein structures and to study how structure relates to function, and we will also learn how to use other programs to design new proteins. Along the way, we will learn how to use Python to automate routine tasks and make our lives easier, and we will hear some background on machine learning and simulations and how both can be used in biochemistry.

The aim of this course is to take away your fear of coding and show you how easy it is to use code to automate routine tasks. We will start with the basics, work our way up to more advanced topics and hopefully make you hungry for more!
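To give a first taste of what "automating a routine task" can look like, here is a minimal sketch in plain Python. It computes the amino acid composition of a few protein sequences. The function name `residue_composition` and the two example sequences are made up for illustration and are not taken from the course materials.

```python
from collections import Counter


def residue_composition(sequence):
    """Return the fraction of each residue in a one-letter protein sequence."""
    cleaned = sequence.strip().upper()
    counts = Counter(cleaned)
    total = len(cleaned)
    # An empty sequence yields an empty dictionary (no division happens).
    return {residue: count / total for residue, count in counts.items()}


# Hypothetical example sequences, invented for this illustration.
sequences = {
    "peptide_a": "MKTAYIAKQR",
    "peptide_b": "GGSGGSGGSG",
}

for name, seq in sequences.items():
    composition = residue_composition(seq)
    glycine_fraction = composition.get("G", 0.0)
    print(f"{name}: length {len(seq)}, glycine fraction {glycine_fraction:.2f}")
```

Doing this by hand for two short peptides is easy. Doing it for thousands of sequences is where a few lines of code start to pay off.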

Here is a list of the modules that we will cover in this course (with the case studies we will look at in brackets):

  1. Introduction to Structural Bioinformatics (PyMol, Python, Jupyter Notebooks)
  2. ML Basics (Google Colab, PyTorch)
  3. ML Architectures (AlexNet, transformers)
  4. Language, Evolution and Bioinformatics (ESM)
  5. Geometric Deep Learning (GNNs) (PyTorch)
  6. Protein structure prediction (AlphaFold2, ESMFold)
  7. Generative Modelling (VAEs, Diffusion Models)
  8. Protein Design (RFDiffusion, ProteinMPNN, AlphaFold2)
  9. Simulations (GROMACS, Allegro)
  10. Drug Design, Docking and Generative Models (AutoDock, DiffDock, DiffSBDD)
  11. Further topics and conclusion

Prerequisites

There are no formal prerequisites for this course. We will do a lot of things in Python, so if you do not have any previous programming experience, it would be great if you could have a look at the introductory materials we provide in our Python for Scientists course, both the beginner and the advanced version. However, we will also provide a lot of code snippets during the course and explain the crucial concepts along the way.

We will cover how to install required packages on a case-by-case basis. However, you will need to have a computer with PyMol installed.
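If you want to check whether a Python package is visible to the interpreter you are running, a small sketch like the following can help. It only uses the standard library. The helper name `module_is_importable` is made up for this example, and the check assumes that your PyMol installation exposes a Python module named `pymol` to the same interpreter. That is typically the case for open-source installs, whereas bundled installers usually ship their own Python, so a negative result does not necessarily mean PyMol is missing.

```python
import importlib.util


def module_is_importable(module_name):
    """Return True if the current Python interpreter can find the module."""
    return importlib.util.find_spec(module_name) is not None


# "pymol" is the assumed module name; adjust it if your setup differs.
if module_is_importable("pymol"):
    print("PyMol can be imported from this Python interpreter.")
else:
    print("PyMol was not found in this interpreter. It may still be "
          "installed as a standalone application with its own Python.")
```

The same helper works for any other package we install later in the course, for example `module_is_importable("torch")`.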

Generally, a mouse is required to use PyMol. If you are using a laptop, you will need to connect an external mouse. Trust me, using PyMol with a trackpad is just not fun.

You can also download the PyMol cheatsheet. It looks like a lot of information, but you will be surprised how quickly you can find what you are looking for once you have worked with it a bit.

There are two different versions of PyMol: the free version and the commercial version developed by Schrödinger.

Installation of free version

The instructions for installing the open-source version differ depending on your operating system. I have linked the instructions for Windows, Mac, and Linux from the PyMol Wiki, which is in general a good place to start looking if you are stuck with anything regarding PyMol.

Installation of commercial version

The commercial version of PyMol is developed by Schrödinger. It is a very powerful tool with a lot of other functionality, but it is not free. However, you can get a free license for students and teachers via the PyMol website. After obtaining your educational license you can download the correct version for your operating system from the PyMol website and follow the installation instructions.