
Study Design and Statistical Analysis for Medical Research
"Learn about study design, statistical analysis, and using JMP Pro software for medical research. Understand the importance of inferential statistics and the branches of descriptive and inferential statistics. Explore topics such as research questions, statistical methods, data visualization, and more."
Download Presentation

Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.
E N D
Presentation Transcript
Jumping into Statistics: Introduction to Study Design and Statistical Analysis for Medical Research Using JMP Pro Statistical Software WINTER/SPRING 2021 DR. CYNDI GARVAN & DR. TERRIE VASILOPOULOS
Meet the Instructors CYNTHIA GARVAN, MA, PHD TERRIE VASILOPOULOS, PHD Research Assistant Professor in Anesthesiology and Orthopaedics and Rehabilitation Research Professor in Anesthesiology
Course Objectives Review fundamentals of study design and research methodology Understand how to choose the best statistical test for your research question Practice basic statistical analysis using JMP Pro software
Course Topics
Asking a Good Research Question
How to Choose the Correct Statistical Method and Run Some Analyses
T-tests, ANOVA, Non-Parametric Tests
Chi-square, Odds Ratio, Relative Risk
Regression and Correlation
Survival Analysis
Test Diagnostics (e.g., sensitivity, specificity)
Life Cycle of Research and the Scientific Method
Study Design
Data Types and Database Construction
Descriptive Statistics
Comparing Statistical Modeling and Machine Learning
Data Visualization
Population and Sample, Probability, Statistical Inference
Population and Sample, Probability, Statistical Inference 4/7/2021
Why is this topic important? Not being able to study an entire population is why we need inferential statistics! The GOAL of statistical inference is to understand something about populations using sample data. For example, we may wish to know a population proportion (symbol = π) or a population mean (symbol = μ). Often the goal of clinical trials is to compare groups on a population proportion (e.g., postoperative delirium) or a population mean (e.g., length of stay after surgery).
Statistics is a branch of mathematics dealing with data collection, organization, analysis, interpretation, and presentation. There are two branches of the Statistics discipline: Descriptive statistics and Inferential statistics.
Taxonomy of Statistics
Statistics divides into two branches:
Descriptive: graphs; summary measures
Inferential: estimation (evidence: confidence interval); tests of hypothesis (evidence: effect size, p-value)
POPULATION and SAMPLE
Inferential statistics: use observations from the sample to make inferences about, and answer questions about, the total population.
Descriptive statistics: make observations about, and summarize, the sample.
What is a hypothesis? A statement about a population that can be tested in a sample. Examples: Smokers are more likely to develop lung cancer. Blood pressure is higher in men than in women. Drug A is better than Drug B.
Null hypothesis The null hypothesis (H0) usually states that no difference between test groups really exists. A fundamental concept in research is either rejecting or failing to reject the H0. State the H0: there is no difference in blood pressure between men and women. Then you compare the null to an alternative hypothesis (HA) using statistical testing (p-values): blood pressure is different between men and women.
Courtroom analogy Innocent until proven guilty. The null hypothesis is that the defendant is innocent; the alternative hypothesis is that the defendant is guilty. If the jury acquits the defendant, this does not mean that it accepts the defendant's claim of innocence. It merely means that innocence is plausible because guilt has not been established beyond a reasonable doubt. For a similar reason, in hypothesis testing we conclude either to reject or to fail to reject the null.
Hypothesis testing is based on probability In all hypothesis testing, the numerical result from the statistical test is compared to a probability distribution to determine the probability of obtaining the result if the null hypothesis is true in the population. Examples of two probability distributions: the normal distribution and the t distribution. (Figure: the normal and t-distributions.)
Understanding P-Value In this example, 9.000 is the value of the test statistic and .037 is the p-value. The p-value is the probability of observing the test statistic, or a value that is more extreme.
Probability of heads when flipping 3 coins (H = heads, T = tails; each of the 8 equally likely outcomes has probability 1/8)
Result  Number of Heads  Probability
TTT     0                1/8
TTH     1                1/8
THT     1                1/8
THH     2                1/8
HTT     1                1/8
HTH     2                1/8
HHT     2                1/8
HHH     3                1/8
If we flip 3 coins, we can summarize the probability distribution for the number of heads observed when the coin is fair (i.e., probability of heads = probability of tails = 1/2):
Number of Heads  Probability
0                .125
1                .375
2                .375
3                .125
Suppose we flip 10 coins. The probability distribution of observing a certain number of heads when the coin is fair is given below.
Number of Heads  Probability
0                .001
1                .010
2                .044
3                .117
4                .205
5                .246
6                .205
7                .117
8                .044
9                .010
10               .001
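The course runs analyses in JMP Pro, but the table above can be reproduced directly from the binomial formula. A minimal Python sketch (the helper name `binom_pmf` is ours, not from the slides):

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k heads in n independent flips of a coin
    whose probability of heads is p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Distribution of heads in 10 flips of a fair coin (p = 0.5),
# rounded to 3 decimals to match the slide's table
dist = {k: round(binom_pmf(k, 10, 0.5), 3) for k in range(11)}
print(dist)
```

Running this reproduces the tabled values (.001, .010, .044, .117, .205, .246, ...), and the eleven probabilities sum to 1.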
Probability and Coin Flips Suppose you suspect that the coin is not fair: you suspect that the probability of heads is higher than 1/2. How would you convince yourself? You can obtain evidence through experimentation. Flip the suspect coin ten times and observe the number of heads. What you are testing is whether, in an infinite number of coin flips, the proportion (π) of heads is equal to 1/2.
Null (H0): π = 1/2. The proportion of heads in an infinite number of coin flips is 50% (i.e., the coin is fair).
Alternative (HA): π > 1/2. The proportion of heads in an infinite number of coin flips is greater than 50% (i.e., the coin is biased, with the probability of heads greater than 50%).
Probability and Coin Flips If you observe 9 heads in 10 coin flips, you would strongly suspect that the coin is biased. Using the probability distribution under the null hypothesis (the hypothesis that the coin is fair, tabled on the previous slide), the p-value from your experiment (the probability of what you observed plus more extreme values) = .010 + .001 = .011. You would reject the null hypothesis.
Probability and Coin Flips If you observe 6 heads in 10 coin flips, you would not be convinced that the coin is biased. Using the same probability distribution under the null hypothesis, the p-value from your experiment (the probability of what you observed plus more extreme values) = .205 + .117 + .044 + .010 + .001 = .377. You would fail to reject the null hypothesis.
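The two p-values above are just upper-tail sums of the null distribution. A short sketch (again in Python rather than JMP; the function name `upper_tail_p` is ours):

```python
from math import comb

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def upper_tail_p(observed: int, n: int = 10, p: float = 0.5) -> float:
    """One-sided p-value: probability of `observed` or more heads
    under the null hypothesis that the coin is fair."""
    return sum(binom_pmf(k, n, p) for k in range(observed, n + 1))

print(round(upper_tail_p(9), 3))  # 9 heads -> 0.011, reject the null
print(round(upper_tail_p(6), 3))  # 6 heads -> 0.377, fail to reject
```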
KEY POINT: P-values indicate how compatible the observed data are with the probability distribution under the null hypothesis. If compatible, fail to reject the null. If incompatible, reject the null.
However... The scientific community has become increasingly concerned that a black-and-white comparison of the p-value to the traditional alpha level of .05 is a source of irreproducibility in research. What to do? The American Statistical Association (ASA) has offered some guidelines.
ASA Statement on Statistical Significance and P-Values (2016)
1. P-values can indicate how incompatible the data are with a specified statistical model. (p-values can be useful)
2. P-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone. (don't misinterpret them)
3. Scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold. (don't just compare to α = .05)
4. Proper inference requires full reporting and transparency.
5. A p-value, or statistical significance, does not measure the size of an effect or the importance of a result.
6. By itself, a p-value does not provide a good measure of evidence regarding a model or hypothesis.
Type I and Type II error
Your statistical decision vs. the true state of the null hypothesis:
                                      H0 true                        H0 false
                                      (the drug doesn't work)        (the drug works)
Reject H0                             Type I error (α),              Correct
(you conclude the drug works)         false positive
Do not reject H0                      Correct                        Type II error (β),
(you conclude there is insufficient                                  false negative
evidence that the drug works)
H0 (NULL): You are not pregnant versus HA (ALTERNATIVE): You are pregnant. This example may help you to remember! Type I error is when the test rejects the null hypothesis even though the null hypothesis is true (false positive: the test says pregnant when you are not). Type II error is when the test fails to conclude the alternative hypothesis when the alternative hypothesis is true (false negative: the test says not pregnant when you are).
Relationship between Type I and Type II error There is an inverse relationship between the probabilities of the two types of errors: increasing the probability of a Type I error decreases the probability of a Type II error, and vice versa. (Source: Graduate Workshop in Statistics, Session 4. Hamidieh K. 2006, Univ of Michigan)
Error and Power Type I error rate (or significance level, α): the probability of finding an effect that isn't real (false positive). If we require p-value < .05 for statistical significance, this means that 1 in 20 times we will find a positive result just by chance. Type II error rate (β): the probability of missing an effect (false negative). Statistical power: the probability of finding an effect if it is there (the probability of not making a Type II error), equal to 1 − β. When we design studies, we typically aim for a power of 80% (allowing a false negative rate, or Type II error rate, of 20%).
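These definitions can be made concrete with the coin example: fix a decision rule, then α is the tail probability under the null and power is the same tail probability under a specific alternative. The 70% heads alternative below is an assumed value for illustration only:

```python
from math import comb

def tail_prob(threshold: int, n: int, p: float) -> float:
    """Probability of at least `threshold` heads in n flips
    of a coin whose probability of heads is p."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(threshold, n + 1))

# Decision rule: reject H0 (fair coin) if we see 9 or more heads in 10 flips
alpha = tail_prob(9, 10, 0.5)   # Type I error rate under a fair coin
power = tail_prob(9, 10, 0.7)   # power if the coin truly lands heads 70% of the time
beta = 1 - power                # Type II error rate under that alternative
print(round(alpha, 3), round(power, 3), round(beta, 3))
```

Note how low the power is (about 15%) with only 10 flips: small samples make Type II errors very likely, which is why studies are sized in advance to reach roughly 80% power.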
Online Power Calculators There are online calculators which compute the sample sizes needed for different study designs: http://hedwig.mgh.harvard.edu/sample_size/size.html G*Power (free download for Mac and Windows): https://www.psychologie.hhu.de/arbeitsgruppen/allgemeine-psychologie-und-arbeitspsychologie/gpower.html
Pitfalls of P-values Clinically unimportant effects may be statistically significant if a study is large. Statistical significance ≠ clinical importance.
Effect size Effect size is a measure of the direction and magnitude of an effect of a treatment, or of a difference between two variables or groups. The choice of the effect measure depends on the data type of the variables. Effect size quantifies the magnitude of the differences found; statistical significance examines how likely the findings are to have occurred given the null hypothesis. Effect sizes are comparable across studies; p-values and statistical significance change across studies.
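One widely used effect size for a difference in means (not named on the slide, but standard) is Cohen's d: the mean difference divided by the pooled standard deviation. A sketch with hypothetical blood-pressure data:

```python
from math import sqrt
from statistics import mean, stdev

def cohens_d(a, b):
    """Cohen's d: standardized mean difference between two groups,
    using the pooled sample standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a)**2 + (nb - 1) * stdev(b)**2) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)

# Hypothetical systolic blood pressure readings (mmHg) for two groups
men = [128, 135, 142, 130, 138, 144, 133]
women = [122, 130, 127, 125, 133, 129, 124]
print(round(cohens_d(men, women), 2))
```

Unlike a p-value, d does not shrink or grow with sample size; it answers "how big is the difference?" in standard-deviation units.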
Effect size and clinical relevance Scenario 1. Suppose a new pain reliever works on average 10 minutes faster than a pain reliever that has been used for decades. Would a 10-minute difference be clinically meaningful to the patient? Scenario 2. Suppose a new pain reliever works on average 5 minutes faster than a pain reliever that has been used for decades. Would a 5-minute difference be clinically meaningful to the patient? Scenario 3. Suppose a new pain reliever works on average 1 minute faster than a pain reliever that has been used for decades but costs 20 times more than the older drug. Would a 1-minute difference be clinically meaningful to the patient (especially considering the cost of the medication)?
Effect size is the answer to the "So what?" question. Reporting effect size in a study helps the reader to understand the importance of the study's findings.
Confidence Intervals P-values give no indication about the clinical importance of the observed association. Relying on information from a sample will always lead to some level of uncertainty. A confidence interval is a range of values that tries to quantify this uncertainty. For example, a 95% CI around a mean indicates that, under repeated sampling, in 95 out of 100 samples the CIs would contain the true mean. CIs can also be constructed around other test statistics (coefficients, mean differences, ORs).
The table above shows estimates of POCD for adults in different age groups.
Confidence Intervals Point estimate: we can estimate a population parameter with a single number; a descriptive statistic is a point estimate. Confidence interval (CI): it is preferable to give a more sophisticated estimate. The CI is a plausible range of values for a population parameter, with a measure of precision in the method used to construct the interval.
Suppose 100 researchers each construct a 95% CI for μ. We consider μ to be a fixed value (a truth that is unknown unless we could measure all of the population data). The figure below graphs 10 of the 100 constructed CIs against the possible values for the true value of μ. For a 95% CI constructed by an individual researcher whose interval happens to capture μ, the probability that the true value of μ is in the interval is 1.00; for a 95% CI constructed by a second researcher whose interval misses μ, that probability is 0.
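The "95 out of 100 intervals capture μ" interpretation can be checked by simulation. A sketch assuming a known population SD (so a simple z-interval applies); all the numbers are illustrative choices:

```python
import random
from math import sqrt

random.seed(2021)
TRUE_MU, SIGMA, N, REPS = 100.0, 15.0, 50, 1000

covered = 0
for _ in range(REPS):
    # Draw one sample of size N from the population
    sample = [random.gauss(TRUE_MU, SIGMA) for _ in range(N)]
    m = sum(sample) / N
    se = SIGMA / sqrt(N)                     # known-sigma standard error
    lo, hi = m - 1.96 * se, m + 1.96 * se    # 95% z-interval
    if lo <= TRUE_MU <= hi:
        covered += 1

print(covered / REPS)  # close to 0.95
```

Each individual interval either contains μ or it does not; it is the long-run fraction of intervals that lands near 95%.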
Key points You can also compute CIs for different levels of confidence: 90%, 99%, etc. We need to determine the sample size required to estimate a population parameter to a specified level of precision (or margin of error). The sample size needed depends on the variance of the data, the specified margin of error, and the choice of confidence level. An online calculator to determine the sample size needed to estimate a population mean or population proportion can be found at: https://www.sample-size.net/sample-size-conf-interval-mean/
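Calculators like the one linked above rest on a simple relationship for a mean: solve the CI half-width for n. A z-based sketch, assuming the population SD is roughly known (the function name and example numbers are ours):

```python
from math import ceil

def n_for_mean(sigma: float, margin: float, z: float = 1.96) -> int:
    """Sample size so that a CI for a mean has half-width <= margin,
    given a rough value for the population SD (sigma).
    z = 1.96 corresponds to 95% confidence."""
    return ceil((z * sigma / margin) ** 2)

# e.g., SD of length of stay ~ 3 days; estimate the mean to within +/- 0.5 days
print(n_for_mean(sigma=3, margin=0.5))
```

The formula shows the dependencies listed above: n grows with the variance, shrinks with a looser margin of error, and grows with the confidence level (through z).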
Questions?
Summary Tips Learn as much as you can about data types. Make a data management plan before starting your study! Consult with a statistician when constructing a database.
JMP Pro! https://software.ufl.edu/