
Integrating OMIM Mendelian Database to SPOKE for Genetics Research
Explore the integration of OMIM Mendelian database with SPOKE for genetic research, focusing on disease ontology, gene-disease relationships, and inheritance patterns. Discover how raw OMIM data is processed and utilized for mapping gene-disease associations. Dive into the analysis of disease-gene relationships and levels of evidence provided by OMIM.
Presentation Transcript
Integrating OMIM Mendelian database to SPOKE
Xiaoming (Sherman) Jia, MD MEng
Baranzini Lab
4/19/2025 Genetics and SPOKE 1

Data sources
  Disease ontology: disease characteristics
  OMIM: gene-disease relationships
  Processed OMIM: extract relationships with the highest level of evidence (phenotype mapping key = 3), inheritance pattern (Mendelian or other), and modifiers
  Processed Disease ontology: extract OMIM ID to DOID mappings
  Integrate GENE-DOID mappings into SPOKE
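
The OMIM ID to DOID extraction step can be sketched as follows. This is a minimal illustration, not the pipeline used for the slides: it assumes the Disease Ontology is read from an OBO-format file in which each [Term] stanza carries "xref:" lines, and that OMIM cross-references use either an "OMIM:" or a "MIM:" prefix. The function name omim_to_doid and the sample stanza are hypothetical.

```python
def omim_to_doid(obo_text):
    """Collect OMIM ID -> list of DOIDs from OBO-formatted text.

    Assumes OMIM cross-references appear as 'xref: OMIM:<id>' or
    'xref: MIM:<id>' inside [Term] stanzas (an assumption about the file).
    """
    mapping = {}
    current_doid = None
    for raw_line in obo_text.splitlines():
        line = raw_line.strip()
        if line.startswith("[") and line.endswith("]"):
            current_doid = None  # a new stanza starts; forget the previous term
        elif line.startswith("id: DOID:"):
            current_doid = line[len("id: "):]
        elif current_doid and line.startswith(("xref: OMIM:", "xref: MIM:")):
            # 'xref: OMIM:613065 "optional description"' -> '613065'
            omim_id = line.split(":", 2)[2].split()[0]
            mapping.setdefault(omim_id, []).append(current_doid)
    return mapping


# Illustrative stanza only; the OMIM/DOID pair is taken from the formatted-data slide.
sample = """
[Term]
id: DOID:9952
name: acute lymphoblastic leukemia
xref: OMIM:613065
xref: MESH:D054198
"""
print(omim_to_doid(sample))
```

A list is kept per OMIM ID because one OMIM entry may be cross-referenced from more than one ontology term.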

OMIM raw data requires some text parsing

Gene    ENTREZ  ENSEMBL          Disease
CAMTA1  23261   ENSG00000171735  Cerebellar ataxia, nonprogressive, with mental retardation, 614756 (3), Autosomal dominant
PARK7   11315   ENSG00000116288  Parkinson disease 7, autosomal recessive early-onset, 606324 (3), Autosomal recessive
GABRD   2563    ENSG00000187730  {Epilepsy, generalized, with febrile seizures plus, type 5, susceptibility to}, 613060 (3), Autosomal dominant; {Epilepsy, idiopathic generalized, 10}, 613060 (3), Autosomal dominant; {Epilepsy, juvenile myoclonic, susceptibility to}, 613060 (3), Autosomal dominant
KIF1B   23095   ENSG00000054523  ?Charcot-Marie-Tooth disease, type 2A1, 118210 (3), Autosomal dominant; {Neuroblastoma, susceptibility to, 1}, 256700 (3), Autosomal dominant, Isolated cases; Pheochromocytoma, 171300 (3), Autosomal dominant
CTRC    11330   ENSG00000162438  {Pancreatitis, chronic, susceptibility to}, 167800 (3), Autosomal dominant
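
One way to parse a raw Disease field like those above is sketched here. It assumes that entries are separated by semicolons and that each entry has the shape "name, six-digit OMIM number (mapping key), inheritance, ...", which matches the rows shown but is not a full description of the OMIM file. The names PHENOTYPE_RE and parse_phenotypes are hypothetical.

```python
import re

# One entry: "<name>, <6-digit OMIM number> (<mapping key>)[, <inheritance>, ...]"
PHENOTYPE_RE = re.compile(
    r"^(?P<name>.+?),\s*(?P<mim>\d{6})\s*\((?P<key>[1-4])\)"
    r"(?:,\s*(?P<inheritance>.+))?$"
)


def parse_phenotypes(field):
    """Split a raw Disease field on ';' and parse each entry into a dict."""
    records = []
    for entry in field.split(";"):
        match = PHENOTYPE_RE.match(entry.strip())
        if match is None:
            continue  # entries that do not fit the assumed shape are skipped
        inheritance = match.group("inheritance")
        records.append({
            "name": match.group("name"),
            "mim": match.group("mim"),
            "mapping_key": int(match.group("key")),
            "inheritance": [p.strip() for p in inheritance.split(",")] if inheritance else [],
        })
    return records


kif1b = (
    "?Charcot-Marie-Tooth disease, type 2A1, 118210 (3), Autosomal dominant; "
    "{Neuroblastoma, susceptibility to, 1}, 256700 (3), Autosomal dominant, Isolated cases; "
    "Pheochromocytoma, 171300 (3), Autosomal dominant"
)
for record in parse_phenotypes(kif1b):
    print(record)
```

The name is matched lazily, so commas inside the disease name (for example "type 2A1") stay in the name and the first six-digit number followed by a parenthesized key is taken as the OMIM number.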

Disease-gene relationships from OMIM: keep bolded

Inheritance           Count
Autosomal recessive   2828
Autosomal dominant    2390
unknown               1229
X-linked recessive    208
X-linked dominant     90
Multifactorial        69
X-linked              68
Mitochondrial         49
Isolated cases        45
Somatic mutation      23
Digenic recessive     15
Somatic mosaicism     4
Y-linked              1

Mapping code  Count  Level of evidence
1             73     Disorder is placed on the map based on its association with a gene, but the underlying defect is not known.
2             355    Disorder has been placed on the map by linkage; no mutation has been found.
3             6233   The molecular basis for the disorder is known; a mutation has been found in the gene.
4             5      A contiguous gene deletion or duplication syndrome, multiple genes are deleted or duplicated causing the phenotype.
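
As a small worked example, the evidence filter (mapping code 3) and its share of all relationships can be computed from the counts above. The record layout with a "mapping_key" field and the function name keep_molecular_basis_known are assumptions made for illustration.

```python
# Counts per mapping code, copied from the slide.
EVIDENCE_COUNTS = {1: 73, 2: 355, 3: 6233, 4: 5}


def keep_molecular_basis_known(records):
    """Keep only records whose mapping code is 3 (a mutation has been found in the gene)."""
    return [r for r in records if r["mapping_key"] == 3]


total = sum(EVIDENCE_COUNTS.values())
share_key3 = EVIDENCE_COUNTS[3] / total
print(f"{EVIDENCE_COUNTS[3]} of {total} relationships ({share_key3:.1%}) have mapping code 3")
# prints "6233 of 6666 relationships (93.5%) have mapping code 3"
```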

Edits to raw OMIM data

Encode modifiers if disease name contains:
  susceptibility for (299)
  modifier of (27)
  protection against (30)
  resistance to (25)
  reduced risk of (6)

Add to inheritance patterns if disease name contains:
  somatic or somatic mosaic (212)
  digenic (19)
  autosomal recessive (19)
  autosomal dominant (15)
  X-linked (9)
  Y-linked (1)
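
The two name-based edits above could be implemented roughly as below. This is a sketch under assumptions: matching is a case-insensitive substring test, the modifier codes are the ones named on the filtering slide, and the stem "susceptibility" is matched on its own because the example disease names read "susceptibility to" while the list says "susceptibility for". The function annotate_name and both constants are hypothetical names.

```python
# (phrase to look for in the lowercased name, modifier code)
MODIFIER_CODES = [
    ("susceptibility", "SUSCEPTIBILITY"),  # stem only: covers "susceptibility to/for"
    ("modifier of", "MODIFIES"),
    ("protection against", "PROTECTIVE"),
    ("resistance to", "RESISTANCE"),
    ("reduced risk of", "REDUCED"),
]

# Keywords in the name that add an inheritance pattern; "somatic" also covers "somatic mosaic".
NAME_INHERITANCE_KEYWORDS = [
    "somatic",
    "digenic",
    "autosomal recessive",
    "autosomal dominant",
    "x-linked",
    "y-linked",
]


def annotate_name(disease_name):
    """Return (modifier code or '-', inheritance keywords found in the name)."""
    lowered = disease_name.lower()
    modifier = "-"
    for phrase, code in MODIFIER_CODES:
        if phrase in lowered:
            modifier = code
            break  # first matching phrase wins (an assumption)
    hints = [kw for kw in NAME_INHERITANCE_KEYWORDS if kw in lowered]
    return modifier, hints


print(annotate_name("Thromboembolism, susceptibility to"))
print(annotate_name("Leukemia, acute lymphoblastic, somatic"))
print(annotate_name("Robinow syndrome, autosomal dominant 2"))
```

Plain substring matching is the simplest choice; a stricter version would match whole words so that unrelated names containing one of the keywords are not picked up.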

Formatted OMIM data (ready for integration)

GENE     OMIM    DOID    INHERITANCE  MODIFIER        DISEASE
AGRN     615120  110657  AR           -               Myasthenic syndrome, congenital, 8, with pre- and postsynaptic defects
B3GALT6  615349  50802   AR           -               Ehlers-Danlos syndrome, spondylodysplastic type, 2
DVL1     616331  60765   AD           -               Robinow syndrome, autosomal dominant 2
TMEM240  607454  50972   AD           -               Spinocerebellar ataxia 21
GNB1     613065  9952    SOMATIC      -               Leukemia, acute lymphoblastic, somatic
GNB1     616973  70072   AD           -               Mental retardation, autosomal dominant 42
SKI      182212  2340    AD           -               Shprintzen-Goldberg syndrome
CEP104   616781  110994  AR           -               Joubert syndrome 25
NPHP4    606966  111115  AR           -               Nephronophthisis 4
MTHFR    188050  2452    AD           SUSCEPTIBILITY  Thromboembolism, susceptibility to
ALPL     146300  110913  AD,AR        -               Hypophosphatasia, adult

Total: 3,858 mappable gene-disease relationships

Recommended filtering after integration

High-confidence Mendelian relationships (3,220):
  Keep Mendelian inheritance: autosomal dominant (AD), autosomal recessive (AR), X-linked dominant (XLD), X-linked recessive (XLR), X-linked (XL), Mitochondrial (MT), Digenic recessive (DR), or Y-linked (YL).
  May include Mendelian AND SOMATIC (hereditary cancer syndromes).
  Exclude relationships with modifiers (i.e. susceptibility = -).

Moderate-confidence Mendelian relationships (137):
  Mendelian relationships that have modifiers: susceptibility for (SUSCEPTIBILITY), modifier of (MODIFIES), protection against (PROTECTIVE), resistance to (RESISTANCE), reduced risk of (REDUCED).

Low-confidence relationships (335):
  Relationships that don't have a Mendelian inheritance (i.e. inheritance = -).

Somatic (166):
  Inheritance = SOMATIC (i.e. not Mendelian and not unknown).
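
The four tiers above can be expressed as a small classifier over the INHERITANCE and MODIFIER columns. This is a sketch, assuming that the inheritance field is a comma-separated list of codes with "-" meaning unknown, and that any relationship with neither a Mendelian code nor SOMATIC falls into the low-confidence tier. The function name confidence_tier is hypothetical.

```python
# Mendelian inheritance codes as listed for the high-confidence tier.
MENDELIAN = {"AD", "AR", "XLD", "XLR", "XL", "MT", "DR", "YL"}


def confidence_tier(inheritance, modifier):
    """Classify one relationship as 'high', 'moderate', 'somatic', or 'low'."""
    codes = {c.strip() for c in inheritance.split(",")} - {"", "-"}
    if codes & MENDELIAN:
        # Mendelian (possibly together with SOMATIC): split on presence of a modifier.
        return "high" if modifier == "-" else "moderate"
    if "SOMATIC" in codes:
        return "somatic"
    return "low"  # assumption: everything else, including unknown inheritance


# Rows taken from the formatted-data slide.
print(confidence_tier("AR", "-"))               # AGRN  -> high
print(confidence_tier("AD", "SUSCEPTIBILITY"))  # MTHFR -> moderate
print(confidence_tier("SOMATIC", "-"))          # GNB1 (leukemia) -> somatic
print(confidence_tier("AD,AR", "-"))            # ALPL  -> high
```

The four tier counts on the slide (3,220 + 137 + 335 + 166) add up to the 3,858 mappable relationships, which is consistent with the tiers being mutually exclusive as coded here.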