Summary-data-based Mendelian randomization to identify pleiotropic genes

Slide Note

This presentation introduces summary-data-based Mendelian randomization (SMR). It contrasts epidemiological regression with randomized controlled trials, explains how genetic variants can serve as instruments, and covers the factors that affect MR analysis, including the requirement for individual-level data.

Uploaded by medidoc on Feb 13, 2025

Presentation Transcript

Summary-data-based Mendelian randomization to identify pleiotropic genes

Epidemiology study
Regression: Disease ~ Risk factor
Example: BMI ~ type 2 diabetes, cardiovascular disease, hyperlipidemia
Disadvantages:
1. Sample size is probably not large for some diseases.
2. The association might be caused by confounding effects.
The correlation may not be repeatable.

Randomized controlled trial
[Diagram: risk factor, intervention, disease]

Correlation or association?
A successful RCT: LDL -> coronary artery disease (90,056 individuals, 14 randomized trials)
[Figure panels: reduction in major vascular events vs. reduction in LDL; reduction in major coronary events vs. reduction in LDL]
Cholesterol Treatment Trialists (CTT) Collaborators, 2005, Lancet

Correlation or association?
A failed RCT: vitamin supplementation ~ common disease?
Heart Protection Study Collaborative Group, 2002, Lancet

Genetic variants could be instruments
KCTD13 -> head size of zebrafish
[Diagram: 16p11.2 deletion and duplication; KCTD13 suppression and overexpression]

Mendelian randomization

An example of MR analysis
[Diagram: profile score (HDL), HDL, CAD; profile score (LDL), LDL, CAD]
Disadvantage of MR: individual-level data are required

Factors that affect MR analysis
1. R²zx, the variance in exposure (x) explained by the instrument (z): z should be strongly associated with x.
2. R²xy, the variance in outcome (y) explained by the exposure (x): the association may not be significant if R²xy is small. We therefore need a very large sample to identify the association between x (e.g. gene) and y (e.g. trait), but GWAS and eQTL studies are usually not available in the same cohort with a very large sample size.
3. Confounding factors in a cohort
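
The role of the instrument and of confounding can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the analysis from the slides: the effect sizes, allele frequency, sample size, and the function names `covariance` and `simulate` are all hypothetical. It compares the ordinary regression slope of y on x, which absorbs the confounder, with the single-instrument ratio estimate cov(z, y) / cov(z, x).

```python
import random


def covariance(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (n - 1)


def simulate(n=20000, b_zx=0.5, b_xy=0.3, allele_freq=0.3, seed=1):
    """Simulate genotype z, exposure x, and outcome y with a shared confounder u.

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        # Genotype coded 0/1/2 as the sum of two independent alleles.
        zi = int(rng.random() < allele_freq) + int(rng.random() < allele_freq)
        u = rng.gauss(0, 1)  # unobserved confounder of x and y
        xi = b_zx * zi + u + rng.gauss(0, 1)
        yi = b_xy * xi + u + rng.gauss(0, 1)
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y


z, x, y = simulate()
ols_slope = covariance(x, y) / covariance(x, x)    # biased by the confounder
mr_estimate = covariance(z, y) / covariance(z, x)  # ratio (instrumental variable) estimate
print(f"true b_xy = 0.3, OLS slope = {ols_slope:.3f}, MR estimate = {mr_estimate:.3f}")
```

Lowering `b_zx` in this sketch weakens the instrument (smaller R²zx) and makes the MR estimate noisier, which is the first factor listed above.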

Summary-data-based Mendelian randomization
Single instrument
MR vs. SMR:
bxy: the same if the two methods are applied in a single cohort
s.e.: different
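
The slide's formulas did not survive in the transcript. The sketch below shows one common way to compute a single-instrument estimate from summary data alone: the ratio bxy = bzy / bzx, a delta-method standard error, and an approximate 1-d.f. chi-squared statistic. Treat it as an assumption about what the slide showed, not a reproduction of it; the function name `smr_single_instrument` and the input values are hypothetical.

```python
import math


def smr_single_instrument(b_zx, se_zx, b_zy, se_zy):
    """Ratio estimate of the effect of x on y from two sets of summary statistics.

    b_zx, se_zx: SNP effect on the exposure (e.g. from an eQTL study).
    b_zy, se_zy: SNP effect on the outcome (e.g. from a GWAS).
    Returns (b_xy, se_xy, chi2, p_value).
    """
    b_xy = b_zy / b_zx
    # Delta-method variance of a ratio, ignoring any covariance between
    # the two estimates (reasonable when they come from independent samples).
    var_xy = (se_zy ** 2) / (b_zx ** 2) + (b_zy ** 2) * (se_zx ** 2) / (b_zx ** 4)
    se_xy = math.sqrt(var_xy)
    chi2 = b_xy ** 2 / var_xy
    # Upper tail of a chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return b_xy, se_xy, chi2, p_value


b_xy, se_xy, chi2, p_value = smr_single_instrument(
    b_zx=0.5, se_zx=0.05, b_zy=0.1, se_zy=0.02
)
print(f"b_xy = {b_xy:.3f}, s.e. = {se_xy:.4f}, chi2 = {chi2:.2f}, p = {p_value:.2e}")
```

With z-statistics z_zx = b_zx / se_zx and z_zy = b_zy / se_zy, the chi-squared value above equals z_zx² z_zy² / (z_zx² + z_zy²), so it can never exceed z_zy².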

False positive rate and statistical power
Table 1. Mean chi-squared statistics for testing the association between gene expression and trait. The null columns (R²XY = 0 and R²ZY = 0) assess the false positive rate; the other columns assess statistical power.

                      Causality assumption              Pleiotropy assumption
                      R²XY=0   R²XY=0.01  R²XY=0.05     R²ZY=0   R²ZY=0.005  R²ZY=0.01
R²ZX = 0.2
  MR, one sample      0.99     2.44       8.76          0.99     6.09        11.10
  SMR, one sample     0.98     2.35       7.73          0.98     5.87        10.49
  SMR, two samples    0.98     13.98      55.05         1.00     42.23       71.74
R²ZX = 0.1
  MR, one sample      1.01     1.77       5.24          1.01     6.05        11.23
  SMR, one sample     0.98     1.66       4.34          0.98     5.57        9.89
  SMR, two samples    0.99     7.33       27.13         0.96     34.37       52.59
R²ZX = 0.05
  MR, one sample      1.01     1.43       3.51          1.00     6.07        11.19
  SMR, one sample     0.95     1.29       2.59          0.95     5.13        8.68
  SMR, two samples    0.96     4.07       13.68         0.96     25.25       34.17

In one sample, the power of SMR is slightly smaller than that of MR.

Genes associated with trait
Single instrument: we assume that only one causal variant is associated with a single gene.
Multiple instruments: the instruments are not independent, so the association might be caused by close linkage.

Causality or pleiotropy?
Estimates are shown with standard errors in parentheses. The null columns are R²XY = 0 and R²ZY = 0.

                        Causality assumption                              Pleiotropy assumption
                        R²XY=0           R²XY=0.01      R²XY=0.05         R²ZY=0           R²ZY=0.005     R²ZY=0.01
R²ZX = 0.2
  b(Y~GX), one sample   0.0 (0.022)      0.098 (0.022)  0.212 (0.023)     0.0 (0.022)      0.158 (0.023)  0.223 (0.022)
  bXY, one sample       -0.002 (0.070)   0.096 (0.070)  0.212 (0.066)     -0.002 (0.072)   0.157 (0.069)  0.223 (0.068)
  bXY, two samples      0.0 (0.023)      0.098 (0.024)  0.214 (0.031)     0.0 (0.022)      0.159 (0.027)  0.225 (0.031)
R²ZX = 0.1
  b(Y~GX), one sample   0.0 (0.032)      0.097 (0.032)  0.212 (0.032)     0.0 (0.032)      0.224 (0.032)  0.317 (0.031)
  bXY, one sample       -0.003 (0.103)   0.094 (0.098)  0.211 (0.095)     -0.003 (0.102)   0.223 (0.097)  0.314 (0.098)
  bXY, two samples      0.0 (0.032)      0.099 (0.034)  0.215 (0.041)     0.0 (0.032)      0.226 (0.042)  0.321 (0.050)
R²ZX = 0.05
  b(Y~GX), one sample   0.0 (0.044)      0.097 (0.045)  0.213 (0.045)     0.0 (0.045)      0.316 (0.044)  0.447 (0.045)
  bXY, one sample       -0.004 (0.146)   0.091 (0.143)  0.208 (0.136)     -0.004 (0.147)   0.319 (0.143)  0.451 (0.144)
  bXY, two samples      0.0 (0.046)      0.100 (0.050)  0.218 (0.058)     0.0 (0.047)      0.325 (0.071)  0.457 (0.086)

We cannot distinguish pleiotropy from causality, so we interpret all associations as pleiotropy.
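
A short arithmetic check helps show why the estimate cannot separate the two models. Assuming standardized variables and positive effects (an assumption, not stated on the slide), the SNP effects are bzx = sqrt(R²zx) and bzy = sqrt(R²zy), so a purely pleiotropic variant still gives a nonzero ratio bzy / bzx = sqrt(R²zy / R²zx). The sketch below computes that expected value for the pleiotropy settings in the table; the function name is hypothetical.

```python
import math


def expected_bxy_under_pleiotropy(r2_zy, r2_zx):
    """Expected ratio b_zy / b_zx for standardized variables and positive effects."""
    return math.sqrt(r2_zy / r2_zx)


for r2_zx in (0.2, 0.1, 0.05):
    for r2_zy in (0.005, 0.01):
        expected = expected_bxy_under_pleiotropy(r2_zy, r2_zx)
        print(f"R2zx = {r2_zx}, R2zy = {r2_zy}: expected bxy = {expected:.3f}")
```

The printed values can be compared against the pleiotropy columns of the table.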

Pleiotropy or linkage?
[Diagram of three models. Causality: causal variant, transcription, phenotype. Pleiotropy: one causal variant, with transcription and phenotype. Linkage: causal variant 1 and causal variant 2, with transcription and phenotype.]
[Plot: -log10(PeQTL) against bxy]

Heterogeneity in dependent instruments
[Plot: -log10(PeQTL) against bxy]
di = bxy(i) - bxy(top)
We keep the genes for which di = 0 is not rejected, which is the opposite of a traditional hypothesis test. We therefore use 0.05 as the threshold without correcting for multiple tests.
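
The differences di can be computed directly from per-SNP summary statistics. The sketch below covers only that first step and rests on assumptions: the top SNP is taken to be the one with the largest absolute eQTL z-statistic, and the dictionary keys and function name are hypothetical. A full heterogeneity test would also need the covariance among the di, which depends on LD between the SNPs and is not shown here.

```python
def heterogeneity_differences(snps):
    """Return d_i = b_xy(i) - b_xy(top) for every SNP except the top one.

    snps: list of dicts with keys "id", "b_zx", "se_zx", "b_zy".
    The top SNP is chosen by the largest |b_zx / se_zx| (an assumption).
    """
    top = max(snps, key=lambda s: abs(s["b_zx"] / s["se_zx"]))
    b_xy_top = top["b_zy"] / top["b_zx"]
    return {
        s["id"]: s["b_zy"] / s["b_zx"] - b_xy_top
        for s in snps
        if s is not top
    }


example = [
    {"id": "snp_a", "b_zx": 0.5, "se_zx": 0.05, "b_zy": 0.10},
    {"id": "snp_b", "b_zx": 0.4, "se_zx": 0.05, "b_zy": 0.08},
    {"id": "snp_c", "b_zx": 0.3, "se_zx": 0.05, "b_zy": 0.12},
]
for snp_id, d in heterogeneity_differences(example).items():
    print(f"{snp_id}: d = {d:.3f}")
```

In this made-up example, snp_b gives the same ratio as the top SNP, while snp_c gives a different one, which is the pattern expected when a second causal variant is in linkage.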

Power to detect heterogeneity
[Bar plot: proportion of identified probes (%) against LD r² between two causal variants, in bins from 0 to 1. Series: probes that passed the SMR test; probes that failed to pass the heterogeneity test.]

Selecting instruments in HEIDI test
1. Instruments must be associated with the gene expression trait.
2. Including more instruments that have moderate LD with the top SNP
3. SNPs in high LD may reduce the power.
We improved our HEIDI test (to be published):
1. Threshold: 1e-3
2. Removing SNPs that are in LD with the top SNP (LD R² > 0.9)
3. Removing SNPs with pairwise LD R² > 0.9
4. Selecting the top 20 SNPs
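
The four selection steps can be sketched as a greedy filter. Several details are assumptions rather than statements from the slide: the 1e-3 threshold is applied to the eQTL p-value, SNPs are ranked by that p-value, the top SNP counts toward the 20, and pairwise pruning keeps the more significant SNP of each pair. The function name and the `ld_r2` callback are hypothetical.

```python
def select_heidi_snps(snps, ld_r2, p_threshold=1e-3, ld_max=0.9, max_snps=20):
    """Pick instruments for a heterogeneity test.

    snps: list of (snp_id, eqtl_p_value) tuples.
    ld_r2: function (snp_id_a, snp_id_b) -> LD r2 between the two SNPs.
    Returns a list of SNP ids, top SNP first.
    """
    # Step 1: keep SNPs that pass the p-value threshold, most significant first.
    passing = sorted((s for s in snps if s[1] < p_threshold), key=lambda s: s[1])
    if not passing:
        return []
    top_id = passing[0][0]
    kept = [top_id]
    for snp_id, _p in passing[1:]:
        # Step 4: stop once the requested number of SNPs is reached.
        if len(kept) >= max_snps:
            break
        # Step 2: drop SNPs in very high LD with the top SNP.
        if ld_r2(snp_id, top_id) > ld_max:
            continue
        # Step 3: drop SNPs in very high LD with any SNP already kept.
        if any(ld_r2(snp_id, other) > ld_max for other in kept[1:]):
            continue
        kept.append(snp_id)
    return kept


# Made-up example data: symmetric LD lookup with a default of 0.1.
ld_table = {frozenset(("rs1", "rs2")): 0.95, frozenset(("rs3", "rs5")): 0.92}


def example_ld(a, b):
    return ld_table.get(frozenset((a, b)), 0.1)


example_snps = [("rs1", 1e-20), ("rs2", 1e-10), ("rs3", 1e-8), ("rs4", 0.01), ("rs5", 1e-5)]
print(select_heidi_snps(example_snps, example_ld))
```

In this example rs4 fails the threshold, rs2 is dropped for high LD with the top SNP rs1, and rs5 is dropped for high LD with rs3.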

Summary
1. MR requires an instrument that is strongly associated with the exposure.
2. R²zx and R²xy determine the power of Mendelian randomization, so large samples are required.
3. SMR largely increases the power by utilizing summary data from two independent data sets.
4. The false positive rate is well controlled in SMR.
5. All the identified genes are interpreted as pleiotropic ones.
6. The HEIDI test can be applied to distinguish pleiotropy from close linkage.

Thank you!
