Understanding Statistical Analysis Techniques in Medical Research
Explore appropriate statistical analysis techniques in medical research, covering types of studies, data analysis measures, averages, cohort study design, and associations in pregnancy outcomes. Dive into case-control study results and analysis methods.
Presentation Transcript
Appropriate techniques of statistical analysis
  Anil C Mathew, PhD
  Professor of Biostatistics & General Secretary, ISMS
  PSG Institute of Medical Sciences and Research, Coimbatore 641 004
Types of studies
  Case study
  Case series
  Cross-sectional study
  Case-control study
  Cohort study
  Randomized controlled trial
  Screening test evaluation
Data analysis - Case series: Measures of averages (mean, median, mode)
  Length of stay for 5 patients (days): 1, 3, 2, 4, 5
  Mean length of stay: 3 days
  Median length of stay: 3 days
  Mode length of stay: no mode
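The three averages for the length-of-stay example can be checked with a minimal Python sketch. The helper name `mode_or_none` is hypothetical, and treating "every value occurs equally often" as "no mode" is an assumption about the convention the slide uses.

```python
from statistics import mean, median, multimode

length_of_stay = [1, 3, 2, 4, 5]  # days, values from the slide


def mode_or_none(values):
    """Return the list of modes, or None when several distinct values all tie."""
    modes = multimode(values)
    # If more than one distinct value exists and all of them tie,
    # no value is more frequent than the others, so report no mode.
    if len(modes) > 1 and len(modes) == len(set(values)):
        return None
    return modes


avg = mean(length_of_stay)
mid = median(length_of_stay)
mode = mode_or_none(length_of_stay)
print(avg, mid, mode)
```

The median is found after sorting, so the order in which the stays are recorded does not matter.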
Which is the best average?
  Variable   Mean   Median   Mode
  DBP        81     79       76
  Height     180    180      180
  SAL        7.5    7.6      8.1
Data analysis - Case series: Frequency distribution
  RBC            Frequency   Relative frequency
  5.95-7.95      1           0.029
  7.95-9.95      8           0.229
  9.95-11.95     14          0.400
  11.95-13.95    9           0.257
  13.95-15.95    2           0.057
  15.95-17.95    1           0.029
  Total          35          1.000
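Each relative frequency is the class frequency divided by the total count. A short sketch using the frequencies from the slide (the variable names are illustrative only):

```python
# Class intervals and frequencies as given on the slide.
rbc_classes = ["5.95-7.95", "7.95-9.95", "9.95-11.95",
               "11.95-13.95", "13.95-15.95", "15.95-17.95"]
frequencies = [1, 8, 14, 9, 2, 1]

total = sum(frequencies)
# Relative frequency = class frequency / total, rounded to 3 decimals as on the slide.
relative = [round(f / total, 3) for f in frequencies]

for label, f, r in zip(rbc_classes, frequencies, relative):
    print(f"{label:<12} {f:>3} {r:.3f}")
print(f"{'Total':<12} {total:>3}")
```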
Design of Cohort Study
  [Diagram: from a population, people without the disease are classified as exposed or not exposed and followed forward in time (the direction of inquiry); each group ends with disease or no disease.]
Is obesity associated with adverse pregnancy outcomes?
  Women with a Body Mass Index > 30 delivering singletons. Ref: University of Udine, Italy, 2006
  Group    Preterm birth   No preterm birth   Total   % preterm
  Obese    16              35                 51      31.4
  Normal   46              487                533     8.6
  RR = 31.4 / 8.6 = 3.65
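The relative risk is the risk of preterm birth among obese women divided by the risk among women of normal weight. The sketch below computes it from the raw counts; the function name `relative_risk` is hypothetical. Working from the unrounded counts gives about 3.64, slightly below the 3.65 obtained from the rounded percentages.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk in the exposed group divided by risk in the unexposed group."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed / risk_unexposed


# Counts from the slide: obese 16 preterm of 51, normal 46 preterm of 533.
rr = relative_risk(16, 51, 46, 533)
print(round(rr, 2))
```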
Design of Case Control Study
  [Diagram: people with the disease and people without the disease are each classified as exposed or not exposed.]
Results of a Case Control Study
                     Lung cancer (D+)   No lung cancer (D-)   Totals
  Exposed (E+)       80 (a)             30 (b)                a + b
  Non-exposed (E-)   20 (c)             70 (d)                c + d
  Totals             100 (a + c)        100 (b + d)
Analysis of Case-control study
  Odds ratio = (a × d) / (b × c) = (80 × 70) / (30 × 20) = 9.3
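The odds ratio is the cross-product ratio of the 2×2 table. A minimal sketch, with a hypothetical function name and the cell labels a, b, c, d used as on the slides:

```python
def odds_ratio(a, b, c, d):
    """Cross-product ratio: a = exposed cases, b = exposed controls,
    c = non-exposed cases, d = non-exposed controls."""
    return (a * d) / (b * c)


# Lung cancer example from the slides.
or_value = odds_ratio(a=80, b=30, c=20, d=70)
print(round(or_value, 1))
```

The parentheses matter: dividing by `b * c` as a whole is what gives 5600 / 600.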
Data Analysis - Screening Test Evaluation
  Can plasma levels of BCPF (breast carcinoma promoting factor) be used to diagnose breast cancer?
  Positive criterion: BCPF > 150 units, compared with breast biopsy (the gold standard)
              D+     D-     Total
  BCPF T+     570    150    720
  BCPF T-     30     850    880
  Total       600    1000   1600
  TP = 570, FN = 30, FP = 150, TN = 850
Sensitivity = P(T+ | D+) = 570/600 = 95%
Specificity = P(T- | D-) = 850/1000 = 85%
False negative rate = 1 - sensitivity
False positive rate = 1 - specificity
Prevalence = P(D+) = 600/1600 = 38%
Positive predictive value = P(D+ | T+) = 570/720 = 79%
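These measures all follow from the four cells of the screening table. The sketch below recomputes them from TP, FN, FP and TN; the function name and dictionary keys are illustrative, not part of the presentation.

```python
def screening_measures(tp, fn, fp, tn):
    """Return the standard screening-test measures from a 2x2 table."""
    diseased = tp + fn
    healthy = fp + tn
    sensitivity = tp / diseased
    specificity = tn / healthy
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "false_negative_rate": 1 - sensitivity,
        "false_positive_rate": 1 - specificity,
        "prevalence": diseased / (diseased + healthy),
        "ppv": tp / (tp + fp),
    }


# BCPF example: TP = 570, FN = 30, FP = 150, TN = 850.
measures = screening_measures(tp=570, fn=30, fp=150, tn=850)
for name, value in measures.items():
    print(f"{name}: {value:.1%}")
```

Note that the positive predictive value depends on prevalence, so the 79% here applies only to a population in which about 38% of those tested have the disease.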
Tradeoffs between sensitivity and specificity
  Favor high sensitivity when the consequences of missing a case are potentially grave
  Favor high specificity when a false positive diagnosis may lead to risky treatment
Data analysis - Case series: Measures of variation
  Group 1: 29, 30, 31
  Group 2: 25, 30, 35
  Measures: range, standard deviation
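Both groups have the same mean (30) but different spread, which the range and standard deviation make visible. The sketch below assumes the sample standard deviation (divisor n - 1), which is what `statistics.stdev` computes; the slide does not say which divisor is intended.

```python
from statistics import mean, stdev

group_1 = [29, 30, 31]
group_2 = [25, 30, 35]


def spread(values):
    """Return (range, sample standard deviation) for a list of numbers."""
    return max(values) - min(values), stdev(values)


range_1, sd_1 = spread(group_1)
range_2, sd_2 = spread(group_2)
print(mean(group_1), range_1, sd_1)
print(mean(group_2), range_2, sd_2)
```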
Data analysis - Analytical studies: Tests of significance
Case Study 1: Drug A and Drug B
  Aim: To compare the efficacy of two drugs in lowering serum cholesterol levels
  Method: Drug A, 50 patients; Drug B, 50 patients
  Result: At the end of 6 months, the average serum cholesterol level is lower in those receiving Drug B than in those receiving Drug A
A) Drug B is superior to Drug A in lowering cholesterol levels: Possible/Not possible
B) Drug B is not superior to Drug A; instead, the difference may be due to chance: Possible/Not possible
C) The difference is not due to the drug; uncontrolled differences other than treatment between the samples of men receiving Drug A and Drug B account for it: Possible/Not possible
D) Drug A may have been selectively administered to patients whose serum cholesterol levels were more refractory to drug therapy: Possible/Not possible
An observed difference in a study can be due to
  1) Random chance
  2) Biased comparison
  3) Uncontrolled confounding variables
Solutions: A and B
  Test of significance (p value)
  P < 0.05 means that, if there were truly no difference, a difference at least as large as the one observed would arise by random chance less than 5% of the time
  P < 0.01 means that this probability is less than 1%
  The p value does not tell us the magnitude of the difference
Solutions: C and D
  Allocate patients at random, then compare the baseline characteristics of the groups
Table 1 - Baseline Characteristics
  Characteristic                        Vitamin group (n = 141)   Placebo group (n = 142)
  Mean age ± SD, y                      28.9 ± 6.4                29.8 ± 5.6
  Smokers, n (%)                        22 (15.6)                 14 (9.9)
  Mean body mass index ± SD, kg/m2      25.3 ± 6.0                25.6 ± 5.6
  Mean blood pressure ± SD, mm Hg
    Systolic                            112 ± 15                  110 ± 12
    Diastolic                           67 ± 11                   68 ± 10
  Parity, n (%)
    0                                   91 (65)                   87 (61)
    1                                   39 (28)                   42 (30)
    2                                   9 (6)                     8 (6)
    >2                                  2 (1)                     5 (4)
  Coexisting disease, n (%)
    Essential hypertension              10 (7%)                   7 (5%)
    Lupus/antiphospholipid syndrome     4 (3%)                    1 (1%)
    Diabetes                            2 (1%)                    3 (2%)
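t Test
  Ho: There is no difference in mean birth weight of children from HSE and LSE in the population
  \[ CR = t = \frac{|\bar{X}_1 - \bar{X}_2|}{SD \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \]
  \[ SD = \sqrt{\frac{(n_1 - 1) SD_1^2 + (n_2 - 1) SD_2^2}{n_1 + n_2 - 2}} \]
  \[ SD = \sqrt{\frac{14 \times 0.27^2 + 9 \times 0.22^2}{23}} = 0.25 \]
  \[ t = \frac{|2.91 - 2.26|}{0.25 \sqrt{\frac{1}{15} + \frac{1}{10}}} = 6.36 \]
  \[ DF = n_1 + n_2 - 2 = 23 \]
  Calculated t > table value: REJECT Ho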
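The pooled-variance t statistic can be reproduced from the summary figures alone. This is a sketch with a hypothetical function name; it assumes the values read off the slide (means 2.91 and 2.26, SDs 0.27 and 0.22, n = 15 and 10). Carrying the pooled SD unrounded gives t of about 6.33; the slide's 6.36 uses the pooled SD rounded to 0.25.

```python
from math import sqrt


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic with a pooled standard deviation.

    Returns (t, pooled_sd, degrees_of_freedom).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    pooled_sd = sqrt(pooled_var)
    t = abs(mean1 - mean2) / (pooled_sd * sqrt(1 / n1 + 1 / n2))
    return t, pooled_sd, df


t_stat, pooled_sd, df = pooled_t(2.91, 0.27, 15, 2.26, 0.22, 10)
print(round(t_stat, 2), round(pooled_sd, 2), df)
```

The calculated t is then compared with the tabulated critical value for 23 degrees of freedom at the chosen significance level.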
GENERAL STEPS IN HYPOTHESIS TESTING
  1) State the hypothesis to be tested
  2) Select a sample and collect data
  3) Calculate the test statistic
  4) Evaluate the evidence against the null hypothesis
  5) State the conclusion
Commonly used statistical tests
  t test: compare two mean values
  Analysis of variance: compare more than two mean values
  Chi-square test: compare two proportions
  Correlation coefficient: relationship between two continuous variables
Data entry format
  Treatment   Age   Weight   Vomiting   Painscore-b   Painscore-a   Diabetes
  1           21    50       1          9             6             0
  1           24    53       0          10            9             0
  1           25    55       1          9             9             1
  1           28    50       0          10            6             1
  1           29    60       0          10            5             0
  1           20    65       0          10            8             0
  0           26    60       0          9             9             0
  0           25    90       1          9             9             1
  0           24    80       1          9             9             1
  0           28    89       0          10            8             1
  0           22    86       1          10            9             1
  0           22    45       0          10            9             0
Example: t test - Body temperature (°C)
          Simple febrile seizure (N = 25)   Febrile without seizure (N = 25)   P value
  Mean    39.01                             38.64                              P < 0.001
  SD      0.56                              0.45
Example: Analysis of variance - Serum zinc level in simple febrile seizure patients, by duration of seizure
  Duration (min)   n    Mean    SD     P value
  < 5              3    10.27   0.25   P < 0.001
  5 to 10          18   9.02    0.81
  > 10             4    6.90    0.98
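A one-way analysis of variance compares the variation between group means with the variation within groups. As a rough check, the F statistic can be rebuilt from the group sizes, means and SDs in the table. The function name is hypothetical, and the sketch assumes the SDs are sample standard deviations (divisor n - 1).

```python
def anova_from_summary(groups):
    """One-way ANOVA F statistic from (n, mean, sd) tuples.

    Returns (F, df_between, df_within).
    """
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / total_n
    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    # Within-group sum of squares, recovered from each group's SD.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# (n, mean serum zinc, SD) for seizure duration < 5, 5 to 10, > 10 minutes.
zinc_groups = [(3, 10.27, 0.25), (18, 9.02, 0.81), (4, 6.90, 0.98)]
f_stat, df_between, df_within = anova_from_summary(zinc_groups)
print(round(f_stat, 2), df_between, df_within)
```

The resulting F is referred to the F distribution with (2, 22) degrees of freedom to obtain the p value.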
Example: Chi-square test - Characteristics of patients in the two groups
  Duration of fever (hours)   Simple febrile seizure   Febrile without seizure   P value
  < 24                        16                       6                         P < 0.05
  More than 24                9                        19
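For a 2×2 table the chi-square statistic has a closed form, and with one degree of freedom its p value can be obtained from the complementary error function. The sketch below applies this to the fever-duration table; it assumes no continuity (Yates) correction, since the slide does not say whether one was used, and the function name is hypothetical.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Uncorrected chi-square test for a 2x2 table [[a, b], [c, d]].

    Returns (chi_square, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Rows: fever < 24 h, fever > 24 h; columns: seizure, no seizure.
chi2, p_value = chi_square_2x2(16, 6, 9, 19)
print(round(chi2, 2), round(p_value, 4))
```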
Example: Correlation
  We found a negative correlation between serum zinc level and simple febrile seizure event (r = -0.86, p < 0.001)
Type 1 and Type 2 Errors
              Ho true                             Ho false / H1 true
  Accept Ho   Correct decision                    Type 2 error; β = P(Type 2 error)
  Reject Ho   Type 1 error; α = P(Type 1 error)   Correct decision; Power = 1 - β
Multivariate problem
  Main outcome is a continuous variable: linear regression
  Main outcome is a dichotomous variable: logistic regression
Bradford Hill's Questions
  Introduction: Why did you start?
  Methods: What did you do?
  Results: What did you find?
  Discussion: What does it mean?
How to begin writing?
  Data
  Tables
  Methods, Results
  Introduction, Discussion
  Abstract
  Title, Key words, References