Entry for: The Peer Prize for Women in Science
1. Summary of your research (150 words max)
Every day in Australia, 8 babies are born with heart defects. This is a heavy burden for them and their families, as the only effective treatment available is invasive surgery in the first year of life. More subtle types of heart defect can remain undetected until adulthood, conferring a risk of sudden heart failure later in life. In a few cases, congenital heart disease (CHD) can be attributed to gene mutations and environmental factors, but to date the majority of cases (80%) are of unknown origin. My laboratory focuses on understanding the origin of CHD, which is crucial for its early diagnosis and for the care of patients who suffer from it. In particular, we are devising bioinformatics approaches to identify mutations in the genome that can trigger CHD, and we are validating our predictions using the zebrafish as a model organism.
2. Describe your approach and broader findings (500 words max)
1- Finding needles in a haystack
Our genome is composed of 3 billion letters, of which only 2% code for genes. The remaining 98% has no known function and has long been referred to as the “dark matter of the genome” or “junk DNA”. However, it has become clear that this junk DNA actually contains vital information for the proper development of an embryo. In particular, it contains specific DNA codes, known as regulatory sequences, that dictate whether genes are switched on or off. The integrity of this code is essential for the formation of a healthy embryo, as it orchestrates a very precise program of temporal and spatial expression of specific combinations of genes. Cracking this code is very difficult because, unlike classical genes, these DNA regions do not have an easily discernible sequence pattern, which makes them much more challenging to locate. To identify these regulatory sequences in the genome, we have developed Trawler, a DNA-motif discovery tool that identifies the precise location of DNA sites bound by proteins, sites that are consequently likely to play a vital role in the formation of the embryo.
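For readers who would like a concrete picture of what "motif discovery" means, the idea of looking for short DNA words that are over-represented in a set of sequences of interest, compared with background sequences, can be sketched in a few lines of Python. This is only a toy illustration and not Trawler's actual algorithm: the function names, the pseudocount and the example sequences are all assumptions made up for this sketch.

```python
from collections import Counter


def count_kmers(sequences, k):
    """Count every DNA word of length k across a list of sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def enriched_kmers(sample, background, k=6, pseudocount=1.0):
    """Rank the k-mers found in `sample` by how much more frequent they are
    there than in `background` (hypothetical scoring, for illustration only).

    The pseudocount avoids division by zero for words absent from the background.
    """
    sample_counts = count_kmers(sample, k)
    background_counts = count_kmers(background, k)
    sample_total = sum(sample_counts.values())
    background_total = sum(background_counts.values())

    scored = []
    for kmer, n in sample_counts.items():
        sample_freq = (n + pseudocount) / (sample_total + pseudocount)
        background_freq = (background_counts[kmer] + pseudocount) / (
            background_total + pseudocount
        )
        scored.append((kmer, sample_freq / background_freq))

    # Highest enrichment first; ties are broken alphabetically.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Invented toy sequences: the word GATAAG is shared by all three sample sequences.
sample_sequences = ["AAGATAAGG", "CCGATAAGC", "TTGATAAGA"]
background_sequences = ["ACGTACGTAC", "TTTTCCCCGG", "GGCATCGATC"]

for kmer, score in enriched_kmers(sample_sequences, background_sequences)[:3]:
    print(kmer, round(score, 2))
```

Real motif-discovery tools use far more careful statistics and allow for variation within a motif; the sketch only conveys the principle of comparing a sample against a background.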
2- The making of Trawler
I embarked on the Trawler journey as a PhD student at the European Molecular Biology Laboratory in Germany. Together with a team of French, British and German scientists, I developed the first code for Trawler. In 2010, we released a “geeky” version of Trawler, which could only be run from the command line and required some programming knowledge to use. I applied Trawler to cardiac datasets for the first time in 2015, as a post-doc at the Victor Chang Cardiac Research Institute. Thanks to Trawler, we discovered a fundamental principle of heart disease: the DNA-binding capability of a protein important for the formation of the heart is changed when the protein is faulty, activating abnormal cellular processes. Finally, when I started my laboratory at the Australian Regenerative Medicine Institute in Melbourne, we made a new version of Trawler that is entirely web-based, user-friendly and accessible to all (biologists and bioinformaticians alike). This new version was released in 2016.
3- The wonders of zebrafish
Trawler predicts which mutations in regulatory elements are able to trigger heart disease. To verify whether these predictions are accurate in vivo, we are recreating these mutations in the zebrafish using the CRISPR/Cas9 genome editing technology. The zebrafish is an ideal model for these experiments for several reasons. First, the zebrafish heart is very similar in structure to the human heart, although simpler, with only 1 atrium and 1 ventricle. Second, zebrafish embryos are transparent and develop externally, so heart defects can be observed directly under the microscope. Third, the zebrafish heart is fully formed after only 2 days, which allows us to quickly screen for and identify the defects caused by each mutation.
3. What is the wider contribution or impact to your scientific field(s)? (300 words max)
Since the release of the first draft of the human genome in 2001, we have known the exact composition of our genetic material. However, the function of the vast majority of these sequences, and the consequences of changes to them, are far less well understood. Our bioinformatics approach showcases how the mining of genome-wide datasets can shed light on the links between genomic sequence and the onset of congenital diseases. The time is right to undertake such approaches, as technological advances in high-throughput sequencing have generated massive amounts of data that can now be leveraged to produce tangible outcomes for CHD. The earlier we exploit these technologies, the sooner we can provide counselling on the predisposition to CHD and advice on the inheritance risks of the disease and, in the longer term, develop treatments.
Moreover, genome-wide association studies have identified several sequence variants that are enriched in the genomes of patients who suffer from a wide range of diseases, from cancer to systemic lupus. Unfortunately, the majority of these variants are not investigated further, as they do not fall within protein-coding genes. Applying our pipeline to these datasets will help address this gap in knowledge for various diseases beyond CHD.
Finally, the power of our approach rests on its cross-disciplinary methods, marrying “dry-lab” techniques (the design of bioinformatics pipelines and software to mine the genome) and “wet-lab” approaches (the use of the zebrafish model organism to rapidly validate our computer predictions). I firmly believe that the future of biological science is intertwined with the advance of computing technologies, and it is my passion to bridge the gap between computational and developmental biology by advocating for a new generation of biologists who are fluent in bioinformatics.
4. Potential ideas you would like to explore to take this research further? (300 words max)
As major advances in surgery have allowed more and more children affected by CHD to survive into adulthood, the risk of transmitting the causative mutation to future generations has increased. Genetic counselling is indispensable for future parents affected by CHD, not only to understand the origin of the disease, but also to understand the risk that their own children will suffer from CHD. Genetic counselling is based on genetic tests, and the more comprehensive these tests are, the more accurate the risk prediction will be. To date, the only genetic test for CHD is limited to a panel of fewer than 100 genes, which does not capture the majority of CHD cases. We plan to design a new screening panel composed of regulatory sequences, the first of its kind, that will complement the existing screening panel. The outcome of this project will contribute to developing more complete genetic tests that will support better counselling regarding the risks of transmitting CHD.
5. Please share a link for researchers to access your article, data-set or thesis
Trawler and Congenital Heart Disease
Trawler web version 2016 release (accessible for all):
Dang LT, Chiu HMH, Revote J, Paten B, Besse F, Quaife-Ryan G, Tano V, Cummings H, Drvodelic M, Hallab J, Nim H, Stolper JS, Tondl M, Bogoyevitch M, Jans D, Porrello E, Hudson J, Ramialison M
Decrypting mechanisms of congenital heart disease using Trawler:
Bouveret R, Waardenberg AJ, Schonrock N, Ramialison M, Doan T, de Jong D, Bondue A, Kaur G, Mohamed S, Fonoudi H, Chen C, Wouters MA, Bhattacharya S, Plachta N, Dunwoodie SL, Chapman C, Blanpain C, Harvey RP. NKX2-5 mutations causative for congenital heart disease retain functionality and are directed to hundreds of targets. eLife. 2015;4:e06942.
Trawler standalone (command line version):
Haudry Y*, Ramialison M*, Paten B, Wittbrodt J, Ettwiller L. Using Trawler_standalone to discover overrepresented motifs in DNA and RNA sequences derived from various experiments including chromatin immunoprecipitation. Nature Protocols. 2010;5(2):323-34. (*Co-first authors.)
Trawler original algorithm:
Ettwiller L, Paten B, Ramialison M, Birney E, Wittbrodt J. Trawler: de novo regulatory motif discovery pipeline for chromatin immunoprecipitation. Nature Methods. 2007;4(7):563-5.