Moving Beyond Coding Variants: Challenges and Considerations
Moving beyond coding regions into non-coding territory poses challenges in variant interpretation and power calculations. Quality, scale, and integration of genomic annotations are crucial for prioritizing non-coding variants. Lessons from cancer genomics highlight the dichotomy between coding drivers and non-coding passengers, shedding light on the distribution of high-impact variants. Considerations also include the informative nature of WGS variant calls and the importance of accurate mapping, especially for non-coding regions.
Presentation Transcript
Challenge 4: Moving beyond coding regions
Mark Gerstein, Yale CMG
Lectures.GersteinLab.org
Moving beyond coding regions in CMG
CMG projects have been mostly exome-oriented, with few projects incorporating whole-genome sequencing (WGS) on a subset of samples (including Dubowitz and unsolved cases). Compared to WES, interpreting variants in non-coding regions of the genome remains challenging.
- Quality & scale of coding vs. non-coding annotation & its impact on power calculations
- Lessons from cancer genomics
- Comprehensive variant calls from WGS
Things to consider in moving beyond coding #1: Quality & scale of coding vs. non-coding annotation & the impact of this on statistical power
- ENCODE has developed non-coding annotations, & a number of tools have been developed to synthesize these (e.g., HaploReg, FunSeq, etc.).
- Compared to coding regions, the underlying functional territory of non-coding regions is not as well defined, nor is the differential effect of different mutations.
- This creates power issues in non-coding variant prioritization. More precise (more compact) annotation may be useful.
- Also, integration of tissue-specific genomic annotations & epigenetic data is important for deciphering the impact of non-coding variants [Nature 547: 40].
[Figure labels: Proximal; Distal; Distal element-gene linkage, direct; Distal element-gene linkage, indirect; exon]
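The power issue on this slide can be illustrated with a rough calculation. The sketch below is not from the presentation: it assumes a simple one-sided z-test per annotated element with a Bonferroni correction, and the effect size, sample size, and element counts (20,000 coding genes vs. 1,000,000 non-coding elements) are hypothetical round numbers chosen only to show the trend.

```python
from statistics import NormalDist


def power_after_bonferroni(effect_size, n_samples, n_tests, alpha=0.05):
    """Approximate power of a one-sided z-test at a Bonferroni-corrected threshold.

    effect_size is a standardized per-sample effect (hypothetical here),
    n_tests is the number of annotated elements tested genome-wide.
    """
    std_normal = NormalDist()
    corrected_alpha = alpha / n_tests
    # Critical z-value for the corrected significance level (upper tail).
    z_crit = -std_normal.inv_cdf(corrected_alpha)
    # Expected shift of the test statistic under the alternative.
    noncentrality = effect_size * n_samples ** 0.5
    return 1.0 - std_normal.cdf(z_crit - noncentrality)


# Illustrative assumptions, not values from the slides.
coding_power = power_after_bonferroni(0.2, 500, n_tests=20_000)
noncoding_power = power_after_bonferroni(0.2, 500, n_tests=1_000_000)

print(f"power with 20,000 coding units:        {coding_power:.3f}")
print(f"power with 1,000,000 non-coding units: {noncoding_power:.3f}")
```

Under these assumptions, the larger and less well-defined non-coding test space lowers power at a fixed sample size, which is one way to read the slide's point that a more compact annotation may help.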
Things we need to consider in moving from coding to non-coding #2: The high-impact variants found so far may tend to occur in coding regions (lessons from cancer genomics)
- Somatic coding drivers vs. non-coding passengers are an example of an extreme dichotomy. Or is this a function of ascertainment?
- Despite 1000s of WGS call sets, very few non-coding drivers have been found in cancer genomics [Khurana et al. NRG 16, Rheinbay et al. bioRxiv 17].
- In general, high-impact variants may tend to occur in coding regions & softer regulatory ones may occur in non-coding regions.
[Figure, adapted from Thomas et al., Lancet ('15): effect size axis running from high to low. Labels in order of appearance: Mendelian risk variants found from family studies; drivers found from cohort-level recurrence; common variants found in GWAS; common SNPs w/o clinical utility; passenger mutations; VUS in Mendelian studies]
Things we need to consider in moving from coding to non-coding #3: Variant calls (even coding ones) from WGS may be more informative & accurate
- WGS can detect the full spectrum of variants, including SNPs, INDELs, & SVs. SVs are harder to interpret just in terms of exomes [Yang et al. AJHG 15].
- Accuracy of mapping can be better (even to coding regions), particularly with regard to repeats & pseudogenes [Zhang et al. PLOS Comp. Bio. 17].
- Potentially better uniformity in coverage may lead to better accuracy in exome variant calling [Belkadi et al. PNAS 15].
- WGS also allows for more precise mapping platforms, i.e., individual-specific personal diploid genomes [Rozowsky et al. MSB 11] & population-specific references [Chaisson et al. NRG 15].
[Figure labels: Amplification; Deletion. Slide note: "Arif's/Sushant's example?"]
[Figure credit: Rozowsky et al., MSB ('11)]
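As a rough illustration of the variant spectrum named on this slide (SNPs, INDELs, & SVs), the sketch below sorts a variant call into one of those classes from its REF and ALT allele strings. It is not from the presentation: the function name is hypothetical, the 50 bp cutoff for calling something an SV is a commonly used convention treated here as an assumption, and symbolic alleles such as `<DEL>` are not handled.

```python
# Assumed size cutoff (in bp) separating INDELs from structural variants.
SV_MIN_LENGTH = 50


def classify_variant(ref, alt):
    """Classify a variant by comparing REF and ALT allele lengths.

    Hypothetical helper for illustration; expects plain nucleotide strings.
    """
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"
    size_change = abs(len(ref) - len(alt))
    if size_change == 0:
        # Same-length, multi-base substitution.
        return "MNV"
    if size_change >= SV_MIN_LENGTH:
        return "SV"
    return "INDEL"


examples = [
    ("A", "G"),             # single-base substitution
    ("A", "ATG"),           # 2 bp insertion
    ("A" + "C" * 80, "A"),  # 80 bp deletion
]
for ref, alt in examples:
    print(len(ref), len(alt), classify_variant(ref, alt))
```

An exome-only call set would see the third class only where breakpoints happen to fall in captured regions, which is consistent with the slide's remark that SVs are harder to interpret from exomes alone.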