Describing Bivariate Data: Graphical Analysis and Interpretation


Bivariate data analysis explores the relationship between two variables through graphical representations such as comparative bar charts, line charts, and scatterplots. These visualizations help reveal patterns, gauge the strength of a relationship, and identify outliers or clusters in the data.

  • Bivariate data
  • Graphical analysis
  • Interpretation
  • Scatterplot
  • Relationships

Uploaded on Feb 20, 2025 | 0 Views



Presentation Transcript


  1. Introduction to Probability and Statistics, Twelfth Edition. Chapter 3: Describing Bivariate Data

  2. Bivariate Data. When two variables are measured on a single experimental unit, the resulting data are called bivariate data. You can describe each variable individually, and you can also explore the relationship between the two variables. Bivariate data can be described with graphs and with numerical measures.

  3. Graphs for Qualitative Variables. When at least one of the variables is qualitative, you can use comparative pie charts or bar charts. Example: Variable #1 = Opinion (Do you think that men and women are treated equally in the workplace?); Variable #2 = Gender (Men, Women).

  4. Comparative Bar Charts. [Figure: a stacked bar chart and a side-by-side bar chart of Percent by Opinion (Agree, Disagree, No Opinion), grouped by Gender (Men, Women).] Describe the relationship between opinion and gender: More women than men feel that they are not treated equally in the workplace.

  5. Line Charts.
     Year   Income ($000)   Expenses ($000)
     1998   42              33
     1999   44              35
     2000   47              40
     2001   51              43
     2002   53              44
     [Figure: line chart "Income vs Expenses", $000 by Year (1998 to 2002), one line per variable (Income, Expense).]

  6. Two Quantitative Variables. When both of the variables are quantitative, call one variable x and the other y. A single measurement is a pair of numbers (x, y) that can be plotted using a two-dimensional graph called a scatterplot. [Figure: the point (2, 5) plotted on x and y axes, with x = 2 and y = 5 marked.]

  7. Describing the Scatterplot. What pattern or form do you see? A straight line (upward or downward), a curve, or no pattern at all. How strong is the pattern? Strong or weak. Are there any unusual observations? Clusters or outliers.

  8. Lines & Slopes (Recall). The line y = -1 + 0.5x goes upward (uphill) and has a positive slope. The line y = 1 - x goes downward (downhill) and has a negative slope. [Figure: graphs of the two lines.]

  9. Examples. [Figure: four scatterplots showing a strong positive linear pattern, a weak negative linear pattern, a curved pattern, and no relationship.]

  10. Numerical Measures for Two Quantitative Variables. Assume that the two variables x and y exhibit a linear pattern or form (all data points lie around a line). There are two numerical measures to describe: (1) the strength and direction of the linear relationship between x and y; (2) the form of the relationship.

  11. The Correlation Coefficient. The strength and direction of the linear relationship between x and y are measured using the correlation coefficient r:
     $$r = \frac{s_{xy}}{s_x s_y}, \qquad s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1} = \frac{\sum x_i y_i - \dfrac{(\sum x_i)(\sum y_i)}{n}}{n - 1}$$
     where $s_{xy}$ is the covariance, $s_x$ is the standard deviation of the x's, and $s_y$ is the standard deviation of the y's.

  12. Example. Living area x and selling price y of 5 homes.
     Residence   x (hundred sq ft)   y ($000)
     1           14                  178
     2           15                  230
     3           17                  240
     4           19                  275
     5           16                  200
     The scatterplot indicates a positive linear relationship.

  13. Example.
     x     y      xy
     14    178    2492
     15    230    3450
     17    240    4080
     19    275    5225
     16    200    3200
     81    1123   18447   (sums)
     Calculate: $\bar{x} = 16.2$, $s_x = 1.924$, $\bar{y} = 224.6$, $s_y = 37.360$.
     $$s_{xy} = \frac{\sum x_i y_i - \dfrac{(\sum x_i)(\sum y_i)}{n}}{n - 1} = \frac{18447 - \dfrac{(81)(1123)}{5}}{4} = 63.6$$
     $$r = \frac{s_{xy}}{s_x s_y} = \frac{63.6}{1.924(37.36)} = .885$$
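The hand calculation above can be checked with a short Python sketch. It is only an illustration, not part of the original slides: the function names are made up for this example, and it uses plain Python rather than any particular statistics library. It computes the covariance both from the deviations and from the sums, then the correlation coefficient.

```python
from math import sqrt

# Living area (hundred sq ft) and selling price ($000) of the 5 homes.
x = [14, 15, 17, 19, 16]
y = [178, 230, 240, 275, 200]


def sample_std(values):
    """Sample standard deviation (divides by n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))


def covariance_definition(xs, ys):
    """s_xy from deviations: sum((x_i - xbar)(y_i - ybar)) / (n - 1)."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    return sum((xi - xbar) * (yi - ybar) for xi, yi in zip(xs, ys)) / (n - 1)


def covariance_shortcut(xs, ys):
    """s_xy from sums: (sum(x_i y_i) - sum(x_i) sum(y_i) / n) / (n - 1)."""
    n = len(xs)
    sum_xy = sum(xi * yi for xi, yi in zip(xs, ys))
    return (sum_xy - sum(xs) * sum(ys) / n) / (n - 1)


s_xy = covariance_shortcut(x, y)
r = s_xy / (sample_std(x) * sample_std(y))

print(round(covariance_definition(x, y), 1))  # deviation form
print(round(s_xy, 1))                         # shortcut form
print(round(r, 3))
```

Both covariance forms should agree with the slide's 63.6, and r should agree with .885 up to rounding.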

  14. Interpreting r. $-1 \le r \le 1$. The sign of r indicates the direction of the linear relationship. $r \approx 0$: weak relationship; random scatter of points. $r \approx 1$ or $-1$: strong relationship; either positive or negative. $r = 1$ or $-1$: all points fall exactly on a straight line.
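The boundary cases can be illustrated with a small Python sketch. This is an assumption-laden illustration rather than slide content: the helper name `correlation` and the choice of x values 0 to 4 are made up here, and the two lines are the ones recalled on slide 8. Points lying exactly on an uphill line should give r = 1, and points lying exactly on a downhill line should give r = -1.

```python
from math import sqrt


def correlation(xs, ys):
    """Correlation r = s_xy / (s_x * s_y); the (n - 1) factors cancel."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(xs, ys))
    sxx = sum((xi - xbar) ** 2 for xi in xs)
    syy = sum((yi - ybar) ** 2 for yi in ys)
    return sxy / sqrt(sxx * syy)


xs = [0, 1, 2, 3, 4]                    # illustrative x values (assumed)
uphill = [-1 + 0.5 * v for v in xs]     # points exactly on y = -1 + 0.5x
downhill = [1 - v for v in xs]          # points exactly on y = 1 - x

print(correlation(xs, uphill))    # positive slope, perfect line
print(correlation(xs, downhill))  # negative slope, perfect line
```

The sign of r follows the sign of the slope, while its magnitude says only how tightly the points cluster around a line, not how steep that line is.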

  15. Example. r = -0.1: very weak negative linear relationship. r = 0.93: very strong positive linear relationship. r = -0.4: weak negative linear relationship. r = -1: all points fall exactly on a downward straight line.

  16. Bivariate Data & Scatterplot. x = Number of Residents (independent variable); y = Monthly Utility (dependent variable).
     x   y
     1   50
     2   80
     3   105
     4   120
     [Figure: scatterplot of Utility (y) against Number of Residents (x), showing a strong positive linear pattern.] Best fitting line? [Figure: fitted line plot, y = 30.00 + 23.50x.]

  17. The Regression Line. The form of the linear relationship between x and y can be described by fitting a line as well as we can through the points. This best-fitting line is the (least squares) regression line, y = a + bx (intercept-slope form), where a is the y-intercept of the line and b is the slope of the line.

  18. The Regression Line. To find the slope and y-intercept of the best-fitting line, use:
     $$b = r\,\frac{s_y}{s_x}, \qquad a = \bar{y} - b\,\bar{x}$$
     The least squares regression line is y = a + bx.

  19. Example.
     x    y
     14   178
     15   230
     17   240
     19   275
     16   200
     Recall: $\bar{x} = 16.2$, $s_x = 1.9235$, $\bar{y} = 224.6$, $s_y = 37.3604$, $r = .885$.
     $$b = r\,\frac{s_y}{s_x} = (.885)\,\frac{37.3604}{1.9235} = 17.189$$
     $$a = \bar{y} - b\,\bar{x} = 224.6 - 17.189(16.2) = -53.86$$
     Regression line: y = -53.86 + 17.189x

  20. Example. Predict the selling price for another residence with 1,600 square feet of living area, that is, x = 16 (x is measured in hundreds of square feet).
     Predict: y = -53.86 + 17.189x = -53.86 + 17.189(16) = 221.16, or $221,160.
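The fit and the prediction can be reproduced with a minimal Python sketch. It assumes nothing beyond the slide formulas b = r(s_y / s_x) and a = ȳ - b x̄; the helper names (`mean`, `sample_std`, `correlation`, `fit_line`) are hypothetical and chosen only for this example.

```python
from math import sqrt

# Living area (hundred sq ft) and selling price ($000) of the 5 homes.
x = [14, 15, 17, 19, 16]
y = [178, 230, 240, 275, 200]


def mean(values):
    return sum(values) / len(values)


def sample_std(values):
    """Sample standard deviation (divides by n - 1)."""
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def correlation(xs, ys):
    """r = s_xy / (s_x * s_y)."""
    xbar, ybar = mean(xs), mean(ys)
    s_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(xs, ys)) / (len(xs) - 1)
    return s_xy / (sample_std(xs) * sample_std(ys))


def fit_line(xs, ys):
    """Return (a, b) with b = r * s_y / s_x and a = ybar - b * xbar."""
    b = correlation(xs, ys) * sample_std(ys) / sample_std(xs)
    a = mean(ys) - b * mean(xs)
    return a, b


a, b = fit_line(x, y)
predicted = a + b * 16  # x = 16 means 1,600 sq ft

print(round(b, 3), round(a, 2), round(predicted, 2))
```

Because the code carries full precision through every step, its results may differ from the hand calculation in the last digit; the slide values 17.189, -53.86, and 221.16 are rounded at each stage.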

  21. Key Concepts.
     I. Bivariate Data
        1. Both qualitative and quantitative variables
        2. Describing each variable separately
        3. Describing the relationship between the variables
     II. Describing Two Qualitative Variables
        1. Side-by-side pie charts
        2. Comparative line charts
        3. Comparative bar charts: side-by-side, stacked

  22. Key Concepts.
     III. Describing Two Quantitative Variables
        1. Scatterplots: linear or nonlinear pattern; strength of relationship; unusual observations (clusters and outliers)
        2. Covariance and correlation coefficient
        3. The best-fitting line: calculating the slope and y-intercept; graphing the line; using the line for prediction

  23. Question. x = Number of Residents; y = Monthly Utility.
     x   y
     1   50
     2   80
     3   105
     4   120
     [Figure: scatterplot of Utility (y) against Number of Residents (x).]
     1. Find the covariance between x and y.
     2. Find the correlation between x and y.
     3. Find the regression line.

  24. Solution.
     x    y     xy
     1    50    50
     2    80    160
     3    105   315
     4    120   480
     10   355   1005   (sum)
     Calculate: $\bar{x} = 2.5$, $s_x = 1.29$, $\bar{y} = 88.75$, $s_y = 30.7$.
     $$s_{xy} = \frac{\sum x_i y_i - \dfrac{(\sum x_i)(\sum y_i)}{n}}{n - 1} = \frac{1005 - \dfrac{(10)(355)}{4}}{3} = 39.17$$
     $$r = \frac{s_{xy}}{s_x s_y} = \frac{39.17}{1.29(30.7)} = .99$$

  25. Solution (using deviations from the means).
     x    y     $(x_i - \bar{x})(y_i - \bar{y})$
     1    50    58.125
     2    80    4.375
     3    105   8.125
     4    120   46.875
     10   355   117.5   (sum)
     Calculate: $\bar{x} = 2.5$, $s_x = 1.29$, $\bar{y} = 88.75$, $s_y = 30.7$.
     $$s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1} = \frac{117.5}{3} = 39.17$$
     $$r = \frac{s_{xy}}{s_x s_y} = \frac{39.17}{1.29(30.7)} = .99$$

  26. Solution: Regression Line. The line has the form y = a + bx, with $b = r\,\frac{s_y}{s_x}$ and $a = \bar{y} - b\,\bar{x}$.
     $$b = r\,\frac{s_y}{s_x} = .99\left(\frac{30.7}{1.29}\right) \approx 23.5$$
     $$a = \bar{y} - b\,\bar{x} = 88.75 - 23.5(2.5) = 30$$
     Regression line: y = 30 + 23.5x. [Figure: fitted line plot of Utility (y) against Number of Residents (x), y = 30.00 + 23.50x.]
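Since r = s_xy / (s_x s_y), the slope formula b = r(s_y / s_x) simplifies algebraically to b = s_xy / s_x². The slides do not state this form; the Python sketch below uses it only as a cross-check on the utility data, with variable names chosen for this example. It also shows what the rounded inputs (.99, 30.7, 1.29) give, which helps explain why the hand calculation lands near, but not exactly on, 23.5.

```python
# Number of residents (x) and monthly utility (y).
x = [1, 2, 3, 4]
y = [50, 80, 105, 120]

n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n

# Sample covariance and sample variance of x (both divide by n - 1).
s_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / (n - 1)
var_x = sum((xi - xbar) ** 2 for xi in x) / (n - 1)

b = s_xy / var_x      # equivalent to r * s_y / s_x
a = ybar - b * xbar   # intercept from the means

# Same slope formula, but using the rounded values from the slide.
b_rounded = 0.99 * 30.7 / 1.29

print(round(s_xy, 2), round(b, 2), round(a, 2))
print(round(b_rounded, 2))
```

The unrounded route should reproduce the fitted line y = 30 + 23.5x shown in the plot; the small gap in `b_rounded` comes purely from rounding r, s_y, and s_x before dividing.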
