Understanding Types of Data and Data Collection Methods

Learn about different types of data - qualitative and quantitative, secondary and primary data, continuous and discrete data. Explore methods of data collection like surveys, observations, experiments, and registration to gather valuable information for research.

  • Data Types
  • Data Collection
  • Qualitative
  • Quantitative
  • Research


Presentation Transcript


  1. MAT 254 Probability and Statistics, 2016-2017 Spring

  2. TYPES of DATA

  3. TYPES of DATA: Quantitative (or numerical) data is numerical, e.g. height in cm, weight in kg, blood pressure in mmHg. Qualitative (or categorical) data is data that is not given numerically, e.g. favourite colour, place of birth, favourite food, type of car.

  4. Example: Identify each of the following as a qualitative or quantitative variable.
     1. The amount of gasoline pumped by the next 10 customers. Quantitative
     2. The amount of radon in the basement of each of 25 homes in a new development. Quantitative
     3. The color of the baseball cap worn by each of 20 students. Qualitative
     4. The length of time to complete a mathematics homework assignment. Quantitative
     5. Birthplaces of the students in the class. Qualitative

  5. TYPES of DATA: Secondary data is data collected by someone other than the user. Common sources of secondary data for social science include censuses, organizational records, and data collected through qualitative methodologies or qualitative research. Primary data, by contrast, is collected by the investigator conducting the research.

  6. TYPES of DATA: Continuous data can take any value (within a range). Ex: a person's height could be any value within the range of human heights, not just certain fixed heights; time in a race could even be measured to fractions of a second. Discrete data can only take certain values. Ex: the number of students in a class (you can't have half a student). Discrete data is counted; continuous data is measured.

  7. Data Collection Methods: Data collection is the process of gathering and measuring information on variables of interest, in an established systematic fashion that enables one to answer stated research questions, test hypotheses, and evaluate outcomes.
     Survey Method: Standardized paper-and-pencil or phone questionnaires that ask predetermined questions.
     Observation Method: The engineer observes the process or population, disturbing it as little as possible, and records the quantities of interest.
     Experimental Method: The engineer designs an experiment and makes deliberate or purposeful changes in the controllable variables of the system or process.
     Registration Method: Registers and licenses are particularly valuable for complete enumeration.
     Use of Existing Studies

  8. When data collection entails selecting individuals or objects from a frame, the simplest method for ensuring a representative selection is to take a simple random sample.

  9. Probability sampling: Every element in the target population or sampling frame has an equal probability of being chosen in the sample for the survey being conducted. Scientific, operationally convenient and simple in theory. Results may be generalized.
     Non-probability sampling: Not every element in the sampling frame has an equal probability of being chosen in the sample. Operationally convenient and simple in theory. Results may not be generalized.

  10. Simple Random Sampling: Selected by using chance or random numbers. Each individual subject (human or otherwise) has an equal chance of being selected. Examples: drawing names from a hat; random numbers. MAT234 - Probability & Statistics, Section #3

  11. Simple Random Sampling Example: I want one of the students in the MAT254 class to answer my question, so I must make a selection among the students. I select a student randomly using my signature list. There are 54 students, numbered 01, 02, 03, and so on up to 54. All students have an equal chance of being selected.
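
The selection described in this example can be sketched in a few lines of Python. This is only an illustration: the roster is assumed to be the numbers 1 to 54 from the signature list, and the seed value is arbitrary, chosen so the draw can be repeated.

```python
import random

# Assumed roster: students numbered 1..54, as on the signature list.
student_numbers = list(range(1, 55))

rng = random.Random(254)  # arbitrary fixed seed so the draw is repeatable

# One student; every number has probability 1/54 of being picked.
chosen = rng.choice(student_numbers)

# Five distinct students, drawn without replacement.
sample_of_five = rng.sample(student_numbers, k=5)

print(f"Selected student: {chosen:02d}")
print("Sample of five:", [f"{s:02d}" for s in sample_of_five])
```

Removing the seed gives a different draw on every run, which is what one would want in class.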

  12. Stratified Random Sampling: Divide the population into at least two different groups with common characteristic(s), then draw SOME subjects from each group (each group is called a stratum; the plural is strata). Basically, randomly sample each stratum. This results in a more representative sample.
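
A minimal sketch of stratified random sampling follows. The population, the stratum names, and the 20% sampling fraction are all hypothetical and only serve to show the idea of drawing a simple random sample inside every stratum.

```python
import random

def stratified_sample(strata, fraction, seed=None):
    """Draw a simple random sample of the given fraction from every stratum."""
    rng = random.Random(seed)
    sample = {}
    for name, members in strata.items():
        # Take at least one subject per stratum so no group is left out.
        k = max(1, round(fraction * len(members)))
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical population: student IDs grouped by year of study.
strata = {
    "first_year": list(range(1, 31)),    # 30 students
    "second_year": list(range(31, 51)),  # 20 students
    "third_year": list(range(51, 61)),   # 10 students
}

picked = stratified_sample(strata, fraction=0.2, seed=1)
for name, members in picked.items():
    print(name, sorted(members))
```

Sampling the same fraction from each stratum (proportional allocation) is one possible design choice; other allocations are also used in practice.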

  13. Systematic Sampling: Select a random starting point and then select every kth subject in the population, where k = N/n, n is the sample size, and N is the population size. It is simple to use, so it is used often. The systematic technique has some inherent dangers when the sampling frame is repetitive or cyclical in nature. In these situations the results may not approximate a simple random sample.
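
The systematic procedure can be sketched as below. The frame of 54 numbered students and the sample size of 6 are assumptions made for illustration, and the interval is taken as the integer part of N/n, which is one common convention when N/n is not a whole number.

```python
import random

def systematic_sample(population, n, seed=None):
    """Pick every k-th element after a random start, with k = N // n."""
    N = len(population)
    if not 0 < n <= N:
        raise ValueError("sample size must be between 1 and the population size")
    k = N // n                                  # sampling interval
    start = random.Random(seed).randrange(k)    # random start in 0..k-1
    return [population[start + i * k] for i in range(n)]

population = list(range(1, 55))  # assumed frame: 54 numbered students
sample = systematic_sample(population, n=6, seed=3)
print(sample)  # six numbers spaced k = 9 apart
```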

  14. HOW CAN YOU VISUALIZE A HUGE AMOUNT OF DATA? Data presentation

  15. Data Presentation: Presentation is the process of organizing data into logical, sequential, and meaningful categories and classifications so that they can be studied and interpreted. Analysis and presentation put data into proper order and into categories, reducing them to forms that are intelligible and interpretable, so that the relationships between the specific research questions and their intended answers can be established. Data presentation means putting the results of experiments into graphs, charts and tables so that the data is easy to understand.

  16. Data Presentation: There are three ways of presenting data.
      Textual: presented in paragraph form to allow description.
      Tabular: arranging data in rows and columns.
      Graphical: pictorial representation of data to highlight certain trends.

  17. MAT234 - Probability & Statistics, Section #3

  18. Textual Methods - Rearrangement
      Raw (original) data: 38 17 50 44 46 23 39 25 48 38 28 45 34 49 34 35 37 38 9 42 27 24 39 44 38 43 43 20 39 29 50 45 46 23 46 26 35 18 50 45
      Rearranged data: 9 17 18 20 23 23 24 25 26 27 28 29 34 34 35 35 37 38 38 38 38 39 39 39 42 43 43 44 44 45 45 45 46 46 46 48 49 50 50 50

  19. Each item in the sample is divided into two parts: the stem, consisting of the leftmost one or two digits, and the leaf, which consists of the next digit.

  20. Textual Methods - Stem-and-Leaf Plot
      Raw (original) data: 38 17 50 44 46 23 39 25 48 38 28 45 34 49 34 35 37 38 9 42 27 24 39 44 38 43 43 20 39 29 50 45 46 23 46 26 35 18 50 45
      Stem & Leaf Plot:
      Stem   Leaves
      0      9
      1      7,8
      2      0,3,3,4,5,6,7,8,9
      3      4,4,5,5,7,8,8,8,8,9,9,9
      4      2,3,3,4,4,5,5,5,6,6,6,8,9
      5      0,0,0
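
As a rough illustration, the stem-and-leaf plot for these 40 scores can be built with a short Python sketch. The function name `stem_and_leaf` is made up for this example; the stem is taken as the tens digit and the leaf as the units digit.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (tens digit) to its sorted list of leaves (units digit)."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        plot[stem].append(leaf)
    return dict(plot)

scores = [38, 17, 50, 44, 46, 23, 39, 25, 48, 38,
          28, 45, 34, 49, 34, 35, 37, 38, 9, 42,
          27, 24, 39, 44, 38, 43, 43, 20, 39, 29,
          50, 45, 46, 23, 46, 26, 35, 18, 50, 45]

plot = stem_and_leaf(scores)
for stem in range(min(plot), max(plot) + 1):
    leaves = ",".join(str(leaf) for leaf in plot.get(stem, []))
    print(f"{stem} | {leaves}")
```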

  21. Tabular Methods: A sample table with all of its parts is shown below. [Figure: sample table with labelled parts: Table Number, Table Title, Column Header, Row Classifier, Body, Source Note.]

  22. Tabular Methods - Frequency Distribution Table: A frequency distribution table is a table which shows the data arranged into different classes (or categories) and the number of cases (or frequencies) which fall into each class.
      Stem & Leaf Plot of Data:
      Stem   Leaves
      0      9
      1      7,8
      2      0,3,3,4,5,6,7,8,9
      3      4,4,5,5,7,8,8,8,8,9,9,9
      4      2,3,3,4,4,5,5,5,6,6,6,8,9
      5      0,0,0
      Frequency Distribution Table of same Data:
      Scores    Frequency
      1 - 10    1
      11 - 20   3
      21 - 30   8
      31 - 40   12
      41 - 50   16

  23. Tabular Methods Frequency Distribution Table Guidelines For Frequency Tables 1. Be sure that the classes are mutually exclusive. 2. Include all classes, even if the frequency is zero. 3. Try to use the same width for all classes. 4. Select convenient numbers for class limits. 5. Use between 5 and 20 classes. 6. The sum of the class frequencies must equal the number of original data values.

  24. Tabular Methods - Frequency Distribution Table: Relative FDT & Cumulative FDT
      Score     Frequency   Relative Frequency   Cumulative Frequency
      1 - 10    1           2.5%                 1
      11 - 20   3           7.5%                 4 (=1+3)
      21 - 30   8           20%                  12 (=4+8)
      31 - 40   12          30%                  24 (=12+12)
      41 - 50   16          40%                  40 (=24+16)
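
The frequency, relative frequency and cumulative frequency columns can be recomputed from the 40 raw scores of the stem-and-leaf example. The sketch below assumes the class limits 1-10, 11-20, ..., 41-50 are inclusive at both ends, as in the table.

```python
from itertools import accumulate

scores = [38, 17, 50, 44, 46, 23, 39, 25, 48, 38,
          28, 45, 34, 49, 34, 35, 37, 38, 9, 42,
          27, 24, 39, 44, 38, 43, 43, 20, 39, 29,
          50, 45, 46, 23, 46, 26, 35, 18, 50, 45]

# Classes 1-10, 11-20, ..., 41-50 (both limits inclusive).
classes = [(lo, lo + 9) for lo in range(1, 50, 10)]

frequencies = [sum(lo <= s <= hi for s in scores) for lo, hi in classes]
total = sum(frequencies)
relative = [100 * f / total for f in frequencies]  # percentages
cumulative = list(accumulate(frequencies))         # running totals

print(f"{'Score':<8}{'Freq':>6}{'Rel. %':>8}{'Cum.':>6}")
for (lo, hi), f, r, c in zip(classes, frequencies, relative, cumulative):
    print(f"{f'{lo}-{hi}':<8}{f:>6}{r:>8.1f}{c:>6}")
```

The last cumulative frequency equals the number of original data values (40), which matches guideline 6 for frequency tables.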

  25. Tabular Methods - Frequency Distribution Table Example: Guests staying at Hilton Hotel were asked to rate the quality of their accommodations as being excellent, above average, average, below average, or poor. The ratings provided by a sample of 20 guests are shown below.
      Below Average, Average, Above Average, Above Average, Above Average, Above Average, Below Average, Below Average, Average, Poor, Above Average, Excellent, Average, Above Average, Average, Above Average, Poor, Above Average, Average, Above Average

  26. Frequency Distribution Table
      Rating          Frequency
      Poor            2
      Below Average   3
      Average         5
      Above Average   9
      Excellent       1
      Total           20
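
For categorical data such as the hotel ratings, the tally can be sketched with `collections.Counter` from the Python standard library. The ratings are the 20 responses listed in the example; the display order is chosen here to run from worst to best.

```python
from collections import Counter

ratings = [
    "Below Average", "Average", "Above Average", "Above Average",
    "Above Average", "Above Average", "Below Average", "Below Average",
    "Average", "Poor", "Above Average", "Excellent", "Average",
    "Above Average", "Average", "Above Average", "Poor",
    "Above Average", "Average", "Above Average",
]

# Display order from worst to best rating.
order = ["Poor", "Below Average", "Average", "Above Average", "Excellent"]

counts = Counter(ratings)
for rating in order:
    print(f"{rating:<15}{counts[rating]:>3}")
print(f"{'Total':<15}{sum(counts.values()):>3}")
```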

  27. Graphical Methods

  28. Graphical Methods: Graphic presentations are used to illustrate and clarify information. Tables are essential in the presentation of scientific data, and diagrams complement them by summarizing these tables in an easy, attractive and simple way. The diagram should be: simple; easy to understand; able to save a lot of words; self-explanatory; given a clear title indicating its content; fully labeled. The y axis (vertical) is usually used for frequency.

  29. Graphical Methods - Dot Plot: One of the simplest graphical summaries of data is a dot plot. A horizontal axis shows the range of data values. Then each data value is represented by a dot placed above the axis. [Figure: a dot plot example from the textbook.] The o values represent the nitrogen data and the x values represent the no-nitrogen data.

  30. Graphical Methods - Scatter Plot: It is useful for representing the relationship between two numeric measurements, each observation being represented by a point corresponding to its value on each axis. [Figure: a scatter plot example from the textbook.]

  31. Graphical Methods - Line Diagram: It is a diagram showing the relationship between two numeric variables (as in the scatter plot), but the points are joined together to form a line (either a broken line or a smooth curve). [Figure: Number of doctors working in each clinic during years 1995-1998; vertical axis: number of doctors (0 to 6); horizontal axis: year (1995 to 1998); series: Clinic 1, Clinic 2, Clinic 3.]

  32. Graphical Methods- A cumulative frequency graph (ogive) A cumulative frequency graph or ogive, is a line graph that displays the cumulative frequency of each class at its upper class boundary.

  33. Graphical Methods - Bar Charts: The data presented is categorical. Data is presented in the form of rectangular bars of equal breadth. Each bar represents one variant. A suitable scale should be indicated, and the scale starts from zero. The width of the bars and the gaps between the bars should be equal. The length of the bar is proportional to the magnitude/frequency of the variable. The bars may be vertical or horizontal.

  34. Graphical Methods - Bar Charts Multiple Bar Charts Also called compound bar charts More than one sub-variant can be expressed

  35. Graphical Methods - Bar Charts Component Bar Charts When there are many categories on X-axis and they have further subcategories, then to accommodate the categories, the bars may be divided into parts, each part representing a certain item and proportional to the magnitude of that particular item.

  36. Graphical Methods - Pie Charts: One of the most common ways of presenting data. The value of each category is divided by the total of all values and multiplied by 360; each category is then allocated that angle (in degrees), which represents its proportion of the whole.
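
The angle rule can be checked with a short sketch. Reusing the hotel rating frequencies from the earlier frequency distribution table is a choice made here for illustration; the slide itself gives no data for the pie chart.

```python
# Frequencies taken from the hotel rating table (20 guests in total).
frequencies = {
    "Poor": 2,
    "Below Average": 3,
    "Average": 5,
    "Above Average": 9,
    "Excellent": 1,
}

total = sum(frequencies.values())

# angle = (category value / total of all values) * 360 degrees
angles = {category: 360 * f / total for category, f in frequencies.items()}

for category, angle in angles.items():
    print(f"{category:<15}{angle:>6.1f} degrees")
print(f"{'Sum':<15}{sum(angles.values()):>6.1f} degrees")
```

The angles always add up to 360 degrees, which is a quick way to verify the arithmetic before drawing the chart.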

  37. Graphical Methods - Histogram: It is very similar to the bar chart, with the difference that the rectangles or bars are adjacent (without gaps). It is used for presenting a class frequency table (continuous data). Each bar represents a class; its height represents the frequency (number of cases) and its width represents the class interval.

  38. Graphical Methods - Frequency Polygon: Derived from a histogram by connecting the midpoints of the tops of the rectangles in the histogram. The line connecting these midpoints is called a frequency polygon. We can draw the polygon without the rectangles to get a simpler form of line graph.

  39. Graphical Methods - Frequency Polygon Ex:
      Age in Years   Sex: Males   Sex: Females   Mid-point of interval
      20-30          3            2              (20+30)/2=25
      30-40          5            5              (30+40)/2=35
      40-50          7            8              (40+50)/2=45
      50-60          4            3              (50+60)/2=55
      60-70          2            4              (60+70)/2=65
      Total          21           22
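
The points of the two frequency polygons are the class midpoints paired with the class frequencies. A small sketch using the table's numbers (the variable names are chosen here for illustration):

```python
# Age classes and frequencies from the table.
intervals = [(20, 30), (30, 40), (40, 50), (50, 60), (60, 70)]
males = [3, 5, 7, 4, 2]
females = [2, 5, 8, 3, 4]

# Mid-point of each interval: (lower limit + upper limit) / 2
midpoints = [(lo + hi) / 2 for lo, hi in intervals]

# Each polygon joins the points (midpoint, frequency) in order.
male_points = list(zip(midpoints, males))
female_points = list(zip(midpoints, females))

print("Males:  ", male_points)
print("Females:", female_points)
print("Totals: ", sum(males), sum(females))
```

Plotting each list of points and joining consecutive points with straight lines gives the frequency polygon for that group.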

  40. Graphical Methods - Frequency Polygon Ex:

  41. Graphical Methods - Box & Whisker Plot (or Box Plot): Box plots are another way of representing all the same information that can be found on a cumulative frequency graph. [Figure: box plot labelled with Lowest value, Lower Quartile, Median, Upper Quartile, Highest value, Inter-Quartile Range, and Range.] Note: The minimum value is the lowest possible value of your first group, and the maximum value is the highest possible value of your last group.
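
The quantities marked on a box plot can be computed from raw data as sketched below. Two assumptions are made here: the data are the 40 raw scores from the stem-and-leaf example rather than grouped data (so the note about group limits does not apply), and the quartiles are taken as the medians of the lower and upper halves, which is only one of several conventions in use.

```python
from statistics import median

def five_number_summary(values):
    """Return (lowest, lower quartile, median, upper quartile, highest)."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    upper = data[half + len(data) % 2:]  # skip the middle value if the count is odd
    return data[0], median(lower), median(data), median(upper), data[-1]

scores = [38, 17, 50, 44, 46, 23, 39, 25, 48, 38,
          28, 45, 34, 49, 34, 35, 37, 38, 9, 42,
          27, 24, 39, 44, 38, 43, 43, 20, 39, 29,
          50, 45, 46, 23, 46, 26, 35, 18, 50, 45]

lowest, q1, med, q3, highest = five_number_summary(scores)
iqr = q3 - q1                  # inter-quartile range
data_range = highest - lowest  # range

print("Lowest:", lowest, "Q1:", q1, "Median:", med, "Q3:", q3, "Highest:", highest)
print("Inter-quartile range:", iqr, "Range:", data_range)
```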

  42. Ex: Table 1.4 from textbook (Car battery life)

  43. Ex: Table 1.5 Stem and Leaf plot for Car battery life

  44. Double stem and leaf plot: the stems corresponding to leaves 0 through 4 have been coded by the symbol and the stems corresponding to leaves 5 through 9 by the symbol .

  45. Ex: Table 1.6 Double Stem and Leaf plot for Car battery life

  46. Ex: Table 1.7 Relative Frequency Distribution for Car battery life Note that: total observation :40

  47. Ex: Figure 1.6 Relative Frequency Histogram for Car battery life

  48. Ex: Figure 1.6 Relative Frequency Histogram for Car battery life. Rotating the stem and leaf plot counterclockwise through an angle of 90 degrees gives a figure similar to the histogram. The distribution is skewed to the left.

  49. Ex: Exercise 1.20 from textbook
