
Generalizability and Decision Studies in Observational Research on Teaching
Explore the importance of observational data in educational research, focusing on generalizability and decision studies for teaching practices. Learn about the design, validation, and reliability assessment using observational protocols and statistical techniques.
Presentation Transcript
When Seeing is Believing: Generalizability and Decision Studies for Observational Data in Research on Teaching. Timothy J. Weston, Charles N. Hayward, Sandra L. Laursen. University of Colorado, Boulder.
Inquiry-based teaching & learning, Research-Based Instruction (RBI): Reform efforts in the past 20+ years have advocated for more engaging and active teaching & learning. Research efforts to learn if instructors are teaching this way often involve observing classrooms with structured observational protocols.
How we started the study: We designed a survey called the TAMI-S asking teachers to report on their teaching over a semester. We validated the survey with observations because we wanted to know if teachers accurately reported their teaching practices. However, we didn't know how many observations we needed for the comparison. Many studies had only 2 or 3 observations for each teacher, which didn't seem like enough.
Observations for research purposes: Observations were made with a modified TDOP observational protocol, the Toolkit for Assessing Mathematics Instruction (TAMI-OP). We coded what teachers and students did during each 2-minute segment, including codes for lecture, group work, student presentation, and question and answer. As part of the study we visited 15 teachers and made class observations of 297 undergraduate math courses at three institutions. Currently we have around 800 classes and 76 teachers in our database.
Studying reliability: We used complex statistical techniques from Generalizability Theory and Decision Theory to calculate the reliability of semester-level measures. These techniques are similar to Analysis of Variance (ANOVA) and account for different sources of signal and error in a measure. Most discussions of reliability for observational protocols examine interrater reliability (IRR). For research purposes, high levels of IRR are established through training, so many reliability studies are checking to see if agreement is good enough.
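As a rough illustration of the ANOVA-like logic described above, the sketch below estimates two variance components (between teachers, and between class occasions within a teacher) and then runs a small decision study. It is a deliberately simplified one-facet design with balanced data; the function names and the 0.8 reliability target are assumptions for illustration, not the authors' actual model, which likely included more facets such as raters.

```python
from statistics import mean


def variance_components(scores):
    """Estimate teacher and occasion variance from balanced data.

    scores: one list per teacher, each holding the same number of
    per-class measurements (e.g. minutes of lecture). Hypothetical layout.
    """
    n_teachers = len(scores)
    n_occ = len(scores[0])
    if any(len(row) != n_occ for row in scores):
        raise ValueError("this sketch assumes a balanced design")
    grand = mean(x for row in scores for x in row)
    teacher_means = [mean(row) for row in scores]
    # One-way random-effects ANOVA mean squares.
    ms_between = n_occ * sum((m - grand) ** 2 for m in teacher_means) / (n_teachers - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(scores, teacher_means) for x in row
    ) / (n_teachers * (n_occ - 1))
    # Negative estimates are truncated at zero, a common convention.
    var_teacher = max((ms_between - ms_within) / n_occ, 0.0)
    return var_teacher, ms_within


def g_coefficient(var_teacher, var_occasion, n_classes):
    """Reliability of a teacher's mean over n_classes observed classes."""
    return var_teacher / (var_teacher + var_occasion / n_classes)


def classes_needed(var_teacher, var_occasion, target=0.8, max_classes=1000):
    """Smallest number of classes whose mean reaches the target reliability."""
    if var_teacher <= 0:
        raise ValueError("no teacher variance: target cannot be reached")
    for n in range(1, max_classes + 1):
        if g_coefficient(var_teacher, var_occasion, n) >= target:
            return n
    raise ValueError("target not reached within max_classes")
```

The decision-study step is just the last two functions: once the variance components are estimated, the same numbers can be reused to ask how reliability would change under a different number of observed classes.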
Q1: How did rater agreement vary for each activity code in our study?
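One common way to look at agreement code by code is Cohen's kappa, computed separately for each activity code from two raters' segment-level codes. The slides do not say which agreement statistic the authors used, so the sketch below is only an assumed example; the function name and the 0/1 data layout (one entry per 2-minute segment) are hypothetical.

```python
import math


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' binary codes (1 = activity present).

    rater_a, rater_b: equal-length lists of 0/1, one entry per segment.
    Returns NaN when chance agreement is 1 and kappa is undefined.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must code the same non-empty set of segments")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    p_a = sum(rater_a) / n
    p_b = sum(rater_b) / n
    # Agreement expected by chance from each rater's base rate.
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        return math.nan
    return (observed - expected) / (1 - expected)


def kappa_by_code(codes):
    """codes: dict mapping an activity code to a (rater_a, rater_b) pair."""
    return {name: cohens_kappa(a, b) for name, (a, b) in codes.items()}
```

Running `kappa_by_code` over codes such as lecture, group work, and question and answer would give one agreement value per code, which is the kind of per-code comparison Q1 asks about.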
Q2: How many classes do you need to observe to get a reliable measure? For observations used in a research context, reliability also depends on the number of occasions or classes for each teacher. The amount of class time devoted to an activity such as lecture varies from class to class. Our study focused on this source of variation.
The real problem with observational data. [Figure: lecture time across forty-five class sessions; x-axis: class session (1 to 45); y-axis labeled "Lecture Proportion" (0 to 60); average = 33 minutes.]
A sample of 5 classes. [Figure: the same forty-five class sessions with a sample of 5 classes; x-axis: class session (1 to 45); y-axis: minutes (0 to 60); average = 40 minutes.]
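The gap between the full-semester average and the 5-class average can be explored with a small resampling sketch. The lecture minutes below are randomly generated placeholders, not the study's data, and the function name, seed, and number of draws are all assumptions for illustration.

```python
import random
from statistics import mean, pstdev

# Hypothetical lecture minutes for 45 class sessions (NOT the study's data).
_rng = random.Random(42)
minutes = [_rng.randint(5, 60) for _ in range(45)]


def sample_mean_spread(values, n_classes, n_draws=2000, seed=0):
    """Draw n_classes sessions without replacement, many times over.

    Returns the lowest sample mean, the highest sample mean, and the
    standard deviation of the sample means across all draws.
    """
    rng = random.Random(seed)
    means = [mean(rng.sample(values, n_classes)) for _ in range(n_draws)]
    return min(means), max(means), pstdev(means)


if __name__ == "__main__":
    print("semester average:", round(mean(minutes), 1))
    for n in (5, 11):
        low, high, spread = sample_mean_spread(minutes, n)
        print(f"{n} classes: means range {low:.1f} to {high:.1f}, sd {spread:.1f}")
```

With only 5 classes the sample means wander much further from the semester average than with 11, which is the practical reason a handful of observations can mischaracterize a teacher.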
For teachers in our dataset, we needed to observe 11 classes to have a reliable measure. Other data we have show lower variability, where as few as 8 classes are enough.
We also compared our study to other G-studies in the literature. Ours was near the top, but not too unusual. [Figure: number of observations needed for one teacher (axis 0 to 13), comparing our study with other G-studies of elementary, elementary & middle, middle school, middle school language arts, middle school math, high school history, and K-12 teachers, and of infant behavior.]
Implications: Researchers need many more observations than is current practice to reliably characterize teaching at the semester level in research studies. Observing multiple classes is costly and impractical for many researchers, so it's a good idea to factor this in when planning research studies with observational data. We found that rater disagreement and bias can be almost eliminated through training, but that some activities, such as question & answer, are more difficult to agree on than others.