Applied Statistics: Problem Definition, Data Analysis, Examples


Applied Statistics can often be described as a 4-step process: problem definition, data collection, description/summarization, and analysis/interpretation. Practical problems include new drug testing, quality control, measuring animal populations, agricultural production, marketing, and economics/finance. Key concepts include populations vs. samples, statistical inference, and scientific studies such as designed experiments. Different methods are used for different data types and research questions.

  • Applied Statistics
  • Data Analysis
  • Statistical Inference
  • Examples
  • Scientific Studies

Uploaded on Feb 28, 2025 | 0 Views


Presentation Transcript


  1. Chapter 1 Introduction

  2. Introduction
     In many cases, Applied Statistics can be generalized into a 4-step process:
       • Problem Definition/Description
       • Data Collection
       • Data Description/Summarization
       • Data Analysis/Interpretation/Communication
     Various methods exist to complete this process for different types of data, experimental settings, and research questions.

  3. Examples of Practical Problems
       • New Drug Testing: Does a new drug improve patient condition more than an existing drug or placebo?
       • Quality Control: Does a manufacturing facility continue to have an acceptable rate of defectives?
       • Measuring Animal Populations: How many fish of a species live in a lake?
       • Agricultural Production: Which of 4 varieties of fertilizer gives the best results in grain production?
       • Marketing: What levels of pricing and advertising produce the highest profits for a new product?
       • Economics/Finance: Does an investment/betting strategy provide profitable returns on average?

  4. Populations and Samples
       • Population: All measurements of interest to the researcher
       • Sample: Subset of the population that is obtained and is to be analyzed

  5. Statistical Inference
     Goal: Make statements regarding a population (or true state of nature) based on observations from a sample.
       • What can be said of the average measurement in a population when we obtain the average of a sample from that population?
       • Can we conclude that success rates differ for two treatments in nature (across a conceptual population of units), based on a certain difference in success rates in two samples of units that were assigned the treatments?
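
The two questions on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the method the slides prescribe: it assumes a normal-approximation 95% confidence interval for the mean and a pooled two-proportion z statistic, and the simulated population and the success counts are made-up values.

```python
import math
import random
import statistics

def mean_confidence_interval(sample, z=1.96):
    """Return (mean, lower, upper): an approximate 95% CI for the population mean."""
    n = len(sample)
    mean = statistics.fmean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    return mean, mean - z * standard_error, mean + z * standard_error

def two_proportion_z(successes_1, n_1, successes_2, n_2):
    """Return the z statistic for the null hypothesis of equal success rates."""
    p1 = successes_1 / n_1
    p2 = successes_2 / n_2
    pooled = (successes_1 + successes_2) / (n_1 + n_2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    return (p1 - p2) / standard_error

rng = random.Random(0)
# Hypothetical population of 10,000 measurements (assumed, not from the slides).
population = [rng.gauss(50, 10) for _ in range(10_000)]
sample = rng.sample(population, 100)

mean, lower, upper = mean_confidence_interval(sample)
print(f"sample mean {mean:.2f}, approx. 95% CI ({lower:.2f}, {upper:.2f})")

# Made-up counts: 45/100 successes under treatment 1, 30/100 under treatment 2.
z = two_proportion_z(45, 100, 30, 100)
print(f"two-proportion z statistic: {z:.2f}")
```

A large |z| (roughly above 2) suggests the observed difference in sample success rates would be unusual if the two treatments had the same success rate in the population.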

  6. Scientific Studies
     Designed Experiment: Investigation to obtain/compare measurements from subjects under various conditions
     Elements of Experiments:
       • Factors: Variables to be controlled by the experimenter
       • Measurements/Observations: Responses that are recorded (but not controlled) by the experimenter; the outcome of interest
       • Treatments: Conditions constructed from factor(s) to be assigned to units. Control is the benchmark condition.
       • Experimental Unit: Physical entity receiving treatment
       • Replication: Treatments are assigned to more than one unit so that experimental error/variation can be measured
       • Measurement Unit: Unit on which observation is made. Could be the experimental unit or a smaller part (e.g. student in class)

  7. Treatment Designs
       • 1-Factor: Completely Randomized Design
       • Multi-Factor: Factorial Treatment Design
           • Full factorial: All combinations of factor levels are observed in the experiment.
           • Fractional factorial: Subset of all possible factor level combinations observed (when too many exist)
       • Randomized Block Design: Experimental units broken into multiple measurement units (blocks), and treatments assigned at random to measurement units within blocks
       • Latin Square Design: Similar to Randomized Block Design, except positions within blocks have effects to be controlled (e.g. tire positions on an automobile)
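
As a rough sketch of how the randomization differs between a Completely Randomized Design and a Randomized Block Design, the Python below assigns four hypothetical fertilizers to plots. The field and plot names are illustrative assumptions, and the block version assumes a complete block design in which every treatment appears exactly once per block.

```python
import random

def completely_randomized(units, treatments, rng):
    """Shuffle all units, then deal treatments out in turn for equal replication."""
    shuffled = list(units)
    rng.shuffle(shuffled)
    return {unit: treatments[i % len(treatments)] for i, unit in enumerate(shuffled)}

def randomized_block(blocks, treatments, rng):
    """Within each block, assign every treatment once, in a fresh random order."""
    design = {}
    for block, units in blocks.items():
        if len(units) != len(treatments):
            raise ValueError("each block needs exactly one unit per treatment")
        order = list(treatments)
        rng.shuffle(order)
        design[block] = dict(zip(units, order))
    return design

treatments = ["A", "B", "C", "D"]  # hypothetical fertilizer varieties

# CRD: 12 plots, no blocking, 3 replicates per treatment.
units = [f"plot {i}" for i in range(1, 13)]
crd = completely_randomized(units, treatments, random.Random(1))

# RBD: 3 fields (blocks) with 4 plots each; randomization is restricted to each field.
blocks = {
    f"field {b}": [f"field {b} / plot {p}" for p in range(1, 5)]
    for b in range(1, 4)
}
rbd = randomized_block(blocks, treatments, random.Random(1))

for block, assignment in rbd.items():
    print(block, assignment)
```

In the blocked version, differences between fields cannot be confused with differences between treatments, because every treatment is observed in every field.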

  8. Factorial Treatment Design in CRD
       • 2 Factors: A and B (A has a levels, B has b levels)
       • 1-at-a-Time Approach: Vary levels of Factor A while holding Factor B constant, and vice versa. Can obtain main effects for each factor, but not interaction.
       • Interaction: When effects of levels of one factor depend on the level of the other factor, and vice versa
       • Factorial Treatment Structures: Generate all ab combinations of levels of Factors A and B. Randomly assign experimental units to these treatments as in a Completely Randomized Design with one factor.
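
The factorial structure described on this slide can be sketched as follows. The factor names (price and advertising), the unit labels, and the cell means are made-up values for illustration, and the interaction contrast shown applies only to the 2x2 case.

```python
import itertools
import random

def factorial_treatments(levels_a, levels_b):
    """Return all a*b combinations of the levels of factors A and B."""
    return list(itertools.product(levels_a, levels_b))

def assign_units(units, treatments, rng):
    """Randomly assign units to treatments with equal replication, as in a CRD."""
    if len(units) % len(treatments) != 0:
        raise ValueError("number of units must be a multiple of the number of treatments")
    replicates = len(units) // len(treatments)
    shuffled = list(units)
    rng.shuffle(shuffled)
    return {unit: treatments[i // replicates] for i, unit in enumerate(shuffled)}

def interaction_contrast(cell_means, a_levels, b_levels):
    """2x2 case: (effect of B at the second A level) minus (effect of B at the first)."""
    a_first, a_second = a_levels
    b_first, b_second = b_levels
    effect_at_second = cell_means[(a_second, b_second)] - cell_means[(a_second, b_first)]
    effect_at_first = cell_means[(a_first, b_second)] - cell_means[(a_first, b_first)]
    return effect_at_second - effect_at_first

# Hypothetical factors: A = price (a = 3 levels), B = advertising (b = 2 levels).
treatments = factorial_treatments(["low", "medium", "high"], ["none", "heavy"])
units = [f"store {i}" for i in range(1, 13)]  # 12 units, so 2 replicates per treatment
assignment = assign_units(units, treatments, random.Random(7))
print(len(treatments), "treatments")

# Made-up cell means for a 2x2 layout; a nonzero contrast indicates interaction.
cell_means = {("low", "none"): 10, ("low", "heavy"): 12,
              ("high", "none"): 15, ("high", "heavy"): 23}
contrast = interaction_contrast(cell_means, ("low", "high"), ("none", "heavy"))
print("interaction contrast:", contrast)
```

A 1-at-a-time experiment would observe only some of these cells, so the contrast above could not be computed from it.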

  9. Observational Studies
       • Sometimes experimental units cannot be assigned to treatments due to nature or ethics
           • Gender, race, religion cannot be assigned to subjects
           • Items cannot be assigned at random to a manufacturer (they are built by the firm)
       • Would like to compare factor levels anyway
       • More difficult to assess causal relationships, since external factors related to the factors identified in the study may be what cause the observed differences
       • Often will attempt to control for other factors in analysis

  10. Surveys
        • Instruments used to obtain demographic characteristics and attitudes or behavioral tendencies from subjects
        • Passive in nature, obtaining naturally occurring information
        • Many fields conduct surveys regularly:
            • Public Opinions: Gallup, CNN, WSJ, TV Networks
            • Government Bureaus: Census, Labor Statistics
            • Business: Customer satisfaction, Quality, Practices
            • Recreation: State parks and wildlife area usage

  11. Sampling Methods
        • Simple Random Sampling: Frame listing all N elements of the population exists. Random numbers are used to obtain a sample of n elements such that all samples of size n have an equal chance of selection.
        • Stratified Random Sampling: Population split into homogeneous groups (strata) based on auxiliary variable(s) such as gender, income, race. Simple random samples taken from each stratum.
        • Cluster Sampling: Population broken into a set of clusters (often based on location), and a sample of clusters is selected, with all elements in each sampled cluster measured
        • Systematic Sampling: Element selected at random near the top of the list, then every kth element subsequently measured
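
The four sampling methods above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the frame of 100 numbered elements, the even/odd strata, and the region names are hypothetical, and the stratified version assumes the same sample size in every stratum.

```python
import random

def simple_random_sample(frame, n, rng):
    """Draw n elements without replacement; every size-n subset is equally likely."""
    return rng.sample(frame, n)

def stratified_sample(frame, stratum_of, n_per_stratum, rng):
    """Split the frame into strata, then take a simple random sample from each."""
    strata = {}
    for element in frame:
        strata.setdefault(stratum_of(element), []).append(element)
    return {name: rng.sample(members, n_per_stratum) for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Select whole clusters at random and measure every element in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [element for name in chosen for element in clusters[name]]

def systematic_sample(frame, k, rng):
    """Pick a random start among the first k elements, then take every kth element."""
    start = rng.randrange(k)
    return frame[start::k]

rng = random.Random(3)
frame = list(range(100))  # hypothetical frame listing N = 100 elements

srs = simple_random_sample(frame, 10, rng)
strat = stratified_sample(frame, lambda x: "even" if x % 2 == 0 else "odd", 5, rng)
clusters = {f"region {r}": frame[r * 20:(r + 1) * 20] for r in range(5)}
clus = cluster_sample(clusters, 2, rng)
syst = systematic_sample(frame, 10, rng)

print("simple random:", sorted(srs))
print("systematic:", syst)
```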

  12. Survey Problems
        • Nonresponse: If people who do not respond tend to differ systematically from responders, results will be biased
        • Measurement Problems:
            • Recall: Tendency to forget occurrences of certain things or be unable to give accurate counts of frequency of occurrence
            • Leading Questions: Wording of questions can lead to certain responses that can bias survey results
            • Unclear Wording: Different people can interpret the same question in different ways, making results inaccurate when responses depend on interpretations
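
A small simulation can illustrate the nonresponse point. Everything here is assumed for illustration: the population is a set of simulated satisfaction scores on a 1 to 5 scale, and the response model (probability of answering equal to 0.1 times the score) is a hypothetical choice made so that responders differ systematically from nonresponders.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population: 10,000 satisfaction scores, uniform on 1..5.
population = [rng.randint(1, 5) for _ in range(10_000)]

def responds(score, rng):
    """Assumed response model: more satisfied people are more likely to answer."""
    return rng.random() < 0.1 * score

# Everyone is sent the survey, but only some people answer it.
respondents = [score for score in population if responds(score, rng)]

population_mean = statistics.fmean(population)
respondent_mean = statistics.fmean(respondents)

print(f"population mean: {population_mean:.2f}")
print(f"respondent mean: {respondent_mean:.2f}")
print(f"response rate:   {len(respondents) / len(population):.0%}")
```

Under this assumed model the respondent mean overstates the population mean, and collecting more responses does not remove the gap, because the bias comes from who answers rather than from how many answer.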

  13. Survey Techniques
        • Personal Interviews: In-person, face-to-face meetings between interviewer and interviewee. Biases can occur due to the interaction.
        • Telephone Interviews: Interview over the phone. Less costly than personal interviews. Bias can occur due to unlisted numbers and different schedules for different people.
        • Self-administered Questionnaire: Inexpensive, but notoriously low response rates. Can be done by mail or on the internet.
        • Direct Observation: Measurements made directly using monitoring equipment or public records

  14. Variable Types
      Variables are attributes that are observed on experimental/observational units. Methods of data description and analysis differ among variable types.
        • Categorical Variables: Attributes that describe aspects of units that are not enumerated. Two sub-types are Nominal and Ordinal.
        • Numeric Variables: Attributes that describe aspects of units that are enumerated. Two sub-types are Discrete and Continuous.

  15. Categorical Variable Types
        • Nominal: Variables with levels that have no inherent ordering. Examples include:
            • Auto Brand: Toyota, Honda, Ford, Chevrolet, BMW, ...
            • Hair Color: Black, Brown, Red, Blond, ...
            • College at UF: CLAS, Engineering, Education, HHP, CALS, ...
        • Ordinal: Variables with levels that have an inherent ordering. Examples include (Low to High levels):
            • Home Soccer Team Game Outcome: Lose, Draw, Win
            • Attitude Toward Online Service Experience: Very Poor, Poor, Neutral, Good, Very Good
            • Rank of Student in Law Class of 100 Students: 100, 99, ..., 2, 1

  16. Numeric Variable Types
        • Discrete: Variables with levels that take on either a finite or countably infinite set of possible numeric outcomes. Examples include:
            • Number of correct answers on a Multiple-Choice Exam
            • Number of car accidents at an intersection in a month
            • Number of winning numbers on a purchased Lotto ticket
        • Continuous: Variables with levels that take on values along a continuum of numeric values. Examples include:
            • High temperature observed at a weather station on a day
            • Time for a collegiate swimmer in the 100m freestyle
            • Distance covered by a race car in a 24-hour race
      Note: Some discrete variables with many levels are treated as continuous, and many continuous variables are reported discretely.
