Entry for: The Bioinformatics Peer Prize
Rare inherited diseases exert a massive combined burden of suffering, accounting for around 10% of child hospital admissions and one-fifth of all child deaths. For many families affected by these diseases, the underlying genetic cause is unknown. Without a firm diagnosis, rare disease families live in uncertainty, unable to predict the future course of the disease or to ensure that future children are healthy. In recent years, exome and whole-genome sequencing have become increasingly routine approaches in rare disease diagnosis. Despite their success, the current diagnostic rate for genomic analyses across a variety of rare diseases is approximately 15-50%.
We explored the utility of transcriptome sequencing (RNA-seq) as a diagnostic tool for patients with genetically undiagnosed rare neuromuscular disorders. For this effort, we assembled a cohort of 63 patients: 13 who had already been diagnosed by DNA sequencing, and 50 who had not received a diagnosis despite comprehensive clinical and genetic testing. We first verified that RNA-seq could identify the known pathogenic variants in the 13 diagnosed patients. We then developed an integrated approach to analyzing patient muscle RNA-seq, built on a framework that detects transcript-level changes unique to the patient when compared with over 180 control skeletal muscle samples. We developed methods to detect splice aberrations and allele imbalance that are present in patients and absent in controls, and we performed variant calling from RNA-seq data to identify pathogenic events or to prioritize genes for closer analysis.
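The core idea of flagging splice events that appear in a patient but not in controls can be illustrated with a minimal sketch. This is not the pipeline used in the study; it only assumes that each sample has already been reduced to a table of splice junctions with supporting read counts. The function name, the junction representation, and both thresholds (`min_patient_reads`, `max_control_samples`) are hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple

# A splice junction is identified here by (chromosome, intron start, intron end).
# This representation is an assumption made for the sketch.
Junction = Tuple[str, int, int]
JunctionCounts = Dict[Junction, int]  # junction -> number of supporting reads


def patient_unique_junctions(
    patient: JunctionCounts,
    controls: List[JunctionCounts],
    min_patient_reads: int = 5,    # hypothetical threshold
    max_control_samples: int = 0,  # hypothetical threshold
) -> List[Junction]:
    """Return junctions well supported in the patient but seen in at most
    `max_control_samples` control samples."""
    unique = []
    for junction, reads in patient.items():
        # Ignore junctions with too little read support in the patient.
        if reads < min_patient_reads:
            continue
        # Count the control samples in which this junction has any support.
        n_controls = sum(1 for c in controls if c.get(junction, 0) > 0)
        if n_controls <= max_control_samples:
            unique.append(junction)
    return sorted(unique)


# Toy data: one junction is shared with controls, one is patient-specific,
# and one has too few reads to be considered.
patient = {
    ("chr21", 100, 200): 12,
    ("chr21", 300, 400): 30,
    ("chr2", 50, 90): 2,
}
controls = [
    {("chr21", 300, 400): 25},
    {("chr21", 300, 400): 40},
]
candidates = patient_unique_junctions(patient, controls)
```

In practice, candidate junctions from a filter like this would still need manual review and prioritization against known disease genes, since low coverage or tissue differences can also make a junction look patient-specific.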
RNA-seq led to the diagnosis of 35% of patients in this challenging subset of rare disease patients, for whom extensive prior analysis of DNA data had failed to return a genetic diagnosis. We identified both coding and non-coding pathogenic variants, resulting in a range of splice aberrations. For example, we identified synonymous variants causing exon skipping due to the disruption of splice factor binding motifs, and exonic splice gain caused by missense and synonymous variants in large, variation-rich genes like TTN, which are uninterpretable from DNA sequence alone.
A notable example of the use of RNA-seq for genetic diagnosis was our identification of a deep intronic variant in COL6A1 that caused splice gain in four patients who had previously undergone deletion/duplication testing, fibroblast cDNA sequencing of the collagen VI genes, and clinical whole-exome sequencing (WES) and whole-genome sequencing (WGS). Using this information, we genotyped a larger, genetically undiagnosed collagen VI-like dystrophy cohort for this recurrent variant and identified 27 additional patients carrying it. Based on this cohort analysis, it is now estimated that this mutation is the most common cause of undiagnosed collagen dystrophy in the US, accounting for ~12% of all mutations causing the disorder.
Our results show that RNA-seq is valuable for the interpretation of coding as well as non-coding variants, and can provide a substantial increase in diagnosis rate in patients for whom exome or whole genome analysis has not yielded a molecular diagnosis. The RNA-seq framework developed in this study can be adapted for other rare diseases where patient tissue is available. Overall, this work suggests that RNA-seq is a valuable component of the diagnostic toolkit for rare diseases and can aid in the identification of novel pathogenic variants in known genes as well as new mechanisms for Mendelian disease.
5. Future ideas/collaborators needed to further research?
Going forward, we would like to understand how RNA-seq can be applied to novel disease gene discovery, and to diagnosis through proxy tissue samples when the affected tissue isn’t directly available for biopsy, as in neurodevelopmental disorders. We are currently analyzing fibroblast samples from patients with complex undiagnosed disorders and neurodevelopmental disorders.
We are also continuing to work on rare diseases where biopsies are available, including obtaining renal biopsies from patients with rare kidney disorders, or any other disease-relevant tissue across a range of Mendelian disorders.
We are very interested in collaborations with clinicians and patients with rare diseases where the genetic diagnosis has been elusive. If you are a clinician or research scientist, you can easily submit any candidate patient samples to us through the Broad Institute Center for Mendelian Genomics (https://cmg.broadinstitute.org/).
If you or a loved one has a rare and undiagnosed suspected genetic disorder, you can apply to participate in the Rare Genomes Project. To participate, families visit our website (coming in May!) and complete a simple online form to tell us about their condition. Families that meet criteria will sign an electronic consent, and give us permission to collect their medical information. If the patient has leftover material from a biopsy, we will contact the Department of Pathology at their primary care institution and obtain stored biopsies. If we identify the genetic cause of the patient’s condition, we will provide a clinical report to the patient’s family and their doctor, in hopes of directly impacting the course of a patient’s clinical care. For more information and updates, please contact email@example.com.