Epigenomics from the Cyber Valley
Cyber Valley Stuttgart-Tübingen is a European hotspot for artificial intelligence and home to many renowned experts and scientists. They are now joined by Gabriele Schweikert, who heads up the Computational Epigenomics research group in the Cyber Valley’s Division of Computational Biology. Schweikert is interested in exploring epigenetic mechanisms using machine learning methods.
Cyber Valley, founded in 2016, is a unique centre for artificial intelligence (AI) in the Stuttgart-Tübingen region and is considered one of Europe's largest research collaborations in this field. Partners from civil society, politics, science and industry involved in the centre include the Baden-Württemberg government, the Max Planck Society, the universities of Stuttgart and Tübingen as well as high-tech companies such as Daimler, Porsche and Bosch and several foundations. More partners are expected to join Cyber Valley in the coming years.
The topics covered are very broad, ranging from novel numerical algorithms designed to make learning machines faster and more reliable, to intelligent self-driving vehicles, traffic guidance systems, and soft robots modelled on nature. Medical applications are also hugely important. Cyber Valley has already been able to recruit top AI experts from around the world for ten new research groups and two university chairs.
Only a specific part of the programme repertoire is active in the cell
Dr. Gabriele Schweikert, a physicist who recently joined Cyber Valley, has been working on the application of machine learning in the life sciences for many years. Her initial interest was in finding and predicting features of DNA sequences, such as genes. Subsequently, as a postdoc in Edinburgh, Scotland, Schweikert began to study the role of epigenetic DNA modifications. "What fascinates me is that all body cells have the same genetic code, but use it in different ways. This is why we have such a huge variety of highly specialised cells, such as nerve or muscle cells, in our bodies," she says. "Many different genetic programmes are permanently stored in the DNA in the same way as computer programmes are stored on the hard drive of a computer. If we ran all the applications on our computer at the same time, they would risk interfering with each other or slowing the computer down. This is why in each cell at any given time only a specific part of the programme repertoire is active while other applications are closed."
Schweikert goes on to say: "DNA is a very long, tightly packed molecule in the cell nucleus. It stores approximately the same number of letters as an 800-book library. The cell machinery needs to access this information in a targeted way, i.e. find specific words in the vast mountain of texts. It can only do so in areas that are less densely packed; only these areas are accessible to the cell machinery. The packaging of the DNA is therefore not random, but controlled by so-called epigenetic mechanisms. The cell takes advantage of some tricks: chemical modifications, such as methylations, can alter the local physical properties of DNA without altering the genetic code. The DNA and epigenetic modifications can be thought of as a strand of wool with a piece of adhesive tape wrapped around some sections. The areas with adhesive tape are smooth and solid, while the non-taped areas can easily be moulded into a ball. It is then a straightforward matter to search for and detect specific sites in the ball of wool, because the area that needs to be examined is smaller."
Epigenetic modifications are the cause of diseases
Histone modification is another vital epigenetic mechanism of interest to Schweikert. "Inside cells, DNA is wrapped around histones, forming a structure that looks like pearls on a necklace. The histones look a bit like octopuses, and their 'tentacles' can be chemically altered in a variety of ways," she explains. "Depending on these changes, the interactions between histones can be stronger or weaker, which in turn leads to a denser or more open structure. Some epigenetic modifications tend to correlate with the beginning of highly expressed genes, while others appear to mark the beginning of genes that are switched off. It comes as no surprise that epigenetic mechanisms are key for many developmental processes. And we are increasingly coming to understand that epigenetic changes are the cause of a wide variety of diseases."
"During embryonic development, all cells need to be able to change their functions in order to develop from pluripotent into specialised cells," says Schweikert. "These functional changes are accompanied and partly controlled by specific epigenetic changes. Epigenetic dysfunctions are therefore associated with a number of developmental disorders, such as certain forms of autism. However, specialised cells must also reliably maintain their function and thus their epigenetic state over long periods of time. Unwanted epigenetic changes may contribute to tumour development. We know, for example, that a variety of cancers are characterised by dramatic changes in DNA methylation. This is not a local genetic change. The cell suddenly activates genetic programmes that should actually be switched off."
Cracking the epigenetic code with molecular biology and AI
Schweikert emphasises that a great deal is already known about the molecular functioning of epigenetic mechanisms. She comments: "But we do not yet understand whether the connections are causal. To find this out we would first have to crack the epigenetic code, which is very difficult because there are so many possibilities and influencing factors we do not yet understand. Every cell type has its own epigenome, which makes the field very data-intensive, and the epigenomes themselves can also change dynamically."
Schweikert's work in the newly founded Cyber Valley Computational Epigenomics research group focuses on the development of new machine learning methods for analysing epigenetic data in order to better understand these important molecular processes in living cells. The hope is that these insights can one day be used for suitable therapies and diagnostic approaches. Combining molecular methods to sequence the epigenome with machine learning methods to analyse the huge amounts of data generated, the physicist works in cooperation with research institutions in Vienna and Dundee and draws on the results of other major international projects such as the Epigenomics Roadmap. This enables her to train systems and develop prototypes.
The researchers are specifically planning to target proteins involved in the production of epigenetic patterns and observe how disrupting them affects gene expression. In addition to carrying out concrete experiments, the researchers will also generate and use training data to understand the direction of the mechanism, in other words, to infer the direction of causality from the data. Schweikert is also well aware that understanding such epigenetic mechanisms is of immense importance for medical applications.