Chi-Square Test for Independence: Understanding and Application


Learn about the Chi-Square Test for Independence, a statistical method used to determine if there is a relationship between two categorical variables within a population. Explore theoretical concepts, hypothesis testing, sample selection, and practical examples like analyzing gender and political party affinity in a survey. Discover how to test independence and interpret results effectively.

  • Chi-Square Test
  • Independence
  • Hypothesis Testing
  • Categorical Variables
  • Statistical Analysis


Presentation Transcript


  1. 13.3 Chi-Square Test for Independence. MAT 1372 Stat w/ Prob, NYCCT (CUNY), Ezra Halleck

  2. Testing for Independence of Two Characteristics Within a Population. Consider a population in which each member is classified according to two distinct characteristics, X and Y. Suppose that the possible values of the X characteristic are 1, 2, ..., r and the possible values of the Y characteristic are 1, 2, ..., s. Thus, there are r possible values for the X characteristic and s possible values for the Y characteristic.

  3. Setting the notation. Denote the proportion of the population that has
       X = i and Y = j by Pij,
       X = i by Qi,
       Y = j by Rj.
     Thus, if X and Y denote the values of the X and Y characteristics of a randomly chosen member of the population, then
       P{X = i, Y = j} = Pij,   P{X = i} = Qi,   P{Y = j} = Rj.

  4. What we want. We are interested in developing a test of the hypothesis that the X and Y characteristics are independent. Recall that X and Y are independent if P{X = i, Y = j} = P{X = i} P{Y = j} for all i and j. In terms of our notation, this is equivalent to Pij = Qi*Rj for all i and j.
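The independence condition Pij = Qi*Rj can be checked directly when the population proportions are known. The sketch below is only an illustration: the two tables of proportions are made up (they do not come from the slides), and the helper names `marginals` and `is_independent` are hypothetical.

```python
from math import isclose

def marginals(P):
    """Return (Q, R): Q[i] = P{X = i} (row sums), R[j] = P{Y = j} (column sums)."""
    Q = [sum(row) for row in P]
    R = [sum(col) for col in zip(*P)]
    return Q, R

def is_independent(P, tol=1e-12):
    """True if every cell satisfies Pij = Qi * Rj (up to rounding error)."""
    Q, R = marginals(P)
    return all(
        isclose(P[i][j], Q[i] * R[j], abs_tol=tol)
        for i in range(len(Q))
        for j in range(len(R))
    )

# Hypothetical proportions with r = 2 and s = 3 (assumed values, not survey data).
Q_true = [0.4, 0.6]
R_true = [0.5, 0.3, 0.2]
P_indep = [[q * r for r in R_true] for q in Q_true]   # built as Qi * Rj
P_dep = [[0.30, 0.05, 0.05], [0.20, 0.25, 0.15]]      # same marginals, not a product

print(is_independent(P_indep), is_independent(P_dep))
```

Both tables have the same marginal proportions, so the marginals alone cannot reveal dependence; only the cell-by-cell comparison does.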

  5. Independence Test. Hence, we want to test the null hypothesis
       H0: Pij = Qi*Rj for all i = 1, ..., r, j = 1, ..., s
     against the alternative hypothesis
       H1: Pij ≠ Qi*Rj for some (at least 2, why?) values of (i, j).
     To test this hypothesis of independence, we start by choosing a random sample of n members of the population. Let Nij denote the number of elements of the sample that have X = i and Y = j.
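As a rough illustration of how the counts Nij arise from a sample, the sketch below tallies (X, Y) pairs into an r-by-s table. The sample of 8 pairs is invented for the example, and `contingency_counts` is a hypothetical helper name.

```python
from collections import Counter

def contingency_counts(sample, r, s):
    """Return N with N[i-1][j-1] = number of sample members having X = i and Y = j."""
    tally = Counter(sample)  # maps each (x, y) pair to how often it occurs
    return [[tally[(i, j)] for j in range(1, s + 1)] for i in range(1, r + 1)]

# Hypothetical sample of n = 8 members, each recorded as an (X, Y) pair.
sample = [(1, 1), (1, 2), (2, 1), (2, 3), (1, 1), (2, 2), (2, 3), (1, 3)]
N = contingency_counts(sample, r=2, s=3)

for row in N:
    print(row)
```

The entries of N always add up to the sample size n, which is a quick sanity check on any contingency table.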

  6. Party affiliation and gender example. The results of a survey of gender and political party sympathy of 300 upstate New York adults, given as a contingency table, are:

                 Democrat   Republican   Independent
       Women        68          56            32
       Men          52          72            20

     Do we have enough evidence to show that gender does affect political sympathy?

  7. Party affiliation and gender example (cont.) The given table will serve as our observed values. Our H0 is that gender does not affect political sympathy. We need to find the expected values based on this assumption. First, find the marginal totals for the observed values. You can do so by selecting the data as well as the target cells and clicking on AutoSum. The result is:

                 Democrat   Republican   Independent   Total
       Women        68          56            32        156
       Men          52          72            20        144
       Total       120         128            52        300
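Outside a spreadsheet, the same marginal totals can be computed in a few lines. This is a minimal Python sketch using the observed counts from the survey table; the variable names are illustrative choices, not part of the slides.

```python
# Observed counts from the survey (rows: gender, columns: party).
parties = ["Democrat", "Republican", "Independent"]
observed = {
    "Women": [68, 56, 32],
    "Men": [52, 72, 20],
}

# Row totals: one per gender.
row_totals = {gender: sum(counts) for gender, counts in observed.items()}

# Column totals: zip(*rows) walks the table column by column.
col_totals = [sum(col) for col in zip(*observed.values())]

# Grand total: the sample size n.
n = sum(row_totals.values())

print(row_totals, dict(zip(parties, col_totals)), n)
```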

  8. Party affiliation and gender example (cont.) Copy the table and paste special as text (so that there are no formulas). Clear references to the observed data, leaving just the marginal totals:

                 Democrat   Republican   Independent   Total
       Women                                            156
       Men                                              144
       Total       120         128            52        300

     Fill the inside of this 2nd table by taking the product of each cell's marginal row and column totals and dividing by the grand total. Write the formula for one cell with carefully placed $ signs, then fill right and then down.

  9. Party affiliation and gender example (cont.) Filling the inside of the 2nd table, by taking the product of each cell's marginal row and column totals and dividing by the grand total, gives the expected values:

                 Democrat   Republican   Independent   Total
       Women       62.4        66.56         27.04      156
       Men         57.6        61.44         24.96      144
       Total       120         128            52        300
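The fill-right-and-down step amounts to one formula per cell: expected count = (row total × column total) / grand total. A minimal Python sketch of that formula, using the marginal totals from the example (the variable names are illustrative):

```python
# Marginal totals from the observed table.
row_totals = [156, 144]        # Women, Men
col_totals = [120, 128, 52]    # Democrat, Republican, Independent
n = 300                        # grand total

# Expected count under H0 for each cell: row total * column total / n.
expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]

for label, row in zip(["Women", "Men"], expected):
    print(label, [round(e, 2) for e in row])
```

Each row of expected counts sums to the corresponding row total, and each column to its column total, so the expected table has the same margins as the observed one.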

  10. Party affiliation and gender example (cont.) In a 3rd table, take the square of the difference between each observed and expected value and divide by the expected value:

                 Democrat   Republican   Independent
       Women       0.50        1.68          0.91
       Men         0.54        1.82          0.99

     Sum these to get the test statistic TS. To find the p-value, use CHISQ.DIST.RT with inputs TS and d.o.f. = (r-1)(s-1), where r and s are the numbers of rows and columns, respectively; equivalently, compute 1 - CHISQ.DIST(TS, d.o.f., TRUE). Note that CHISQ.DIST with FALSE as its last input returns the density, not the p-value.
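The whole calculation can be sketched in Python as a cross-check on the spreadsheet. The function `chi2_sf` below is a hand-written helper (an assumption of this sketch, not a library call) that evaluates the right-tail chi-square probability from the standard series for the regularized lower incomplete gamma function; in practice a statistics library would normally be used instead.

```python
import math

def chi2_sf(x, df):
    """Right-tail probability P{chi-square(df) >= x}.

    Uses the series P(a, z) = e^{-z} z^a * sum_{k>=0} z^k / Gamma(a + k + 1)
    with a = df/2 and z = x/2. Adequate for moderate x; loses accuracy far in the tail.
    """
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = math.exp(-z + a * math.log(z) - math.lgamma(a + 1.0))  # k = 0 term
    total, k = term, 0
    while term > 1e-16 * total:
        k += 1
        term *= z / (a + k)   # ratio of consecutive series terms
        total += term
    return max(0.0, 1.0 - total)

observed = [[68, 56, 32], [52, 72, 20]]   # Women, Men by party
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Sum of (observed - expected)^2 / expected over all cells.
ts = sum(
    (observed[i][j] - row_totals[i] * col_totals[j] / n) ** 2
    / (row_totals[i] * col_totals[j] / n)
    for i in range(len(row_totals))
    for j in range(len(col_totals))
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)
p_value = chi2_sf(ts, df)

print(f"TS = {ts:.4f}, d.o.f. = {df}, p-value = {p_value:.4f}")
```

For this table the statistic comes out to about 6.43 with 2 degrees of freedom, and the p-value is about 0.040. If a 5% significance level were chosen (the slides do not fix one), this would count as evidence against independence of gender and party sympathy.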
