
Construct Validity in Educational Research
Discover the importance of construct validity in educational research, where evidence is crucial to ensure that tests accurately measure their intended constructs. Learn about convergent and discriminant validity and their role in establishing the validity of assessments. Explore a practical example from digital learning focused on predicting boredom in educational settings.
Presentation Transcript
What is construct validity? Construct validity refers to the degree to which a test, tool, or model actually measures what it is intended to measure (the underlying construct). For example, consider a knowledge test about fractions administered to primary school students. If the test is designed to assess knowledge of arithmetic operations with fractions (adding, subtracting, multiplying, and dividing), the test questions should be designed to measure those skills; this strengthens the test's construct validity. But if the test questions are embedded in long, complex reading passages, the test may not be fully measuring knowledge about fractions, which is a threat to its construct validity.
What is needed to establish construct validity? Construct validity requires multiple sources of evidence. To demonstrate construct validity, we need:
- Evidence that the test, tool, or model measures what it is purported to measure
- Evidence that the test, tool, or model does not measure irrelevant attributes
Both sources of evidence are required.
Convergent validity Evidence that the test, tool, or model measures what it is purported to measure is known as convergent validity. To show convergent validity, two tests that are believed to measure closely related skills or types of knowledge should correlate strongly with each other. As a result, the two tests should rank students in a similar order.
Discriminant validity Evidence that the test, tool, or model does not measure irrelevant attributes is known as discriminant validity. To show it, we provide evidence that two tests believed to measure unrelated skills or types of knowledge are not strongly correlated with each other. Thus, a test about fractions should primarily measure constructs related to arithmetic operations with fractions, not reading or literacy constructs. To establish the construct validity of a particular fractions-based test, a researcher would need to demonstrate that scores on that test correlate more strongly with scores on other fractions-based tests than with scores on reading and literacy tests, as the sketch below illustrates.
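To make the fractions example concrete, here is a minimal sketch in Python of how a researcher might compare convergent and discriminant correlations. All of the data are simulated and the test names are hypothetical; real evidence would come from scores on actually administered tests.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200  # number of students (simulated)

# Latent abilities: fraction arithmetic skill and reading skill (hypothetical).
fractions_skill = rng.normal(size=n)
reading_skill = rng.normal(size=n)

# Observed test scores = latent skill plus measurement noise.
fractions_test_a = fractions_skill + rng.normal(scale=0.5, size=n)
fractions_test_b = fractions_skill + rng.normal(scale=0.5, size=n)
reading_test = reading_skill + rng.normal(scale=0.5, size=n)

# Convergent validity: two tests of the same construct should correlate strongly.
convergent_r = np.corrcoef(fractions_test_a, fractions_test_b)[0, 1]

# Discriminant validity: a test of an unrelated construct should correlate weakly.
discriminant_r = np.corrcoef(fractions_test_a, reading_test)[0, 1]

print(f"convergent r   = {convergent_r:.2f}")   # expect high
print(f"discriminant r = {discriminant_r:.2f}")  # expect near zero
```

Evidence for construct validity requires the convergent correlation to be clearly higher than the discriminant one.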
Example in digital learning For over a decade, one of the most central constructs in educational and digital learning research has been boredom (Miller et al., 2014). Evidence suggests that boredom is associated with negative learning outcomes, more so than other affective states such as frustration and confusion. Boredom can be detected with physical sensors or from students' interactions with educational software. An important question is: how can we develop a model that accurately predicts the affective state of boredom? Miller, W. L., Petsche, K., Baker, R. S., Labrum, M. J., & Wagner, A. Z. (2014). Boredom across activities, and across the year, within Reasoning Mind. In Workshop on Data Mining for Educational Assessment and Feedback (ASSESS 2014).
Example continued In many areas of research on learning systems, it is common to use machine learning algorithms to extract features that are assumed to be strongly associated with an outcome and to build analytic models that yield accurate predictions. To build such models or detectors of affective states, researchers collect human assessments of disengagement and boredom and use machine learning approaches to develop models that replicate the human judgements. The machine learning approach requires that the data be split into a training dataset and a validation dataset. The training dataset is used to construct the model: in our boredom example, the algorithm assesses the association between candidate features and the boredom labels and selects the features that are strongly correlated with them. Once the features are selected and the model is fit, it is evaluated on the validation dataset to assess its predictive accuracy.
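As a sketch of this workflow, the following example (using scikit-learn) splits simulated data into training and validation sets, selects the features most strongly associated with the labels, fits a simple classifier, and evaluates it on held-out data. The feature matrix, the labels, and the choice of k = 10 are all hypothetical stand-ins; real detectors would use features extracted from interaction logs and human-coded boredom labels.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)

# Simulated stand-in for log-derived interaction features and
# human-coded boredom labels (1 = bored, 0 = not bored).
X = rng.normal(size=(500, 30))
y = (X[:, 0] + X[:, 1] + rng.normal(size=500) > 0).astype(int)

# Hold out a validation set that the model never sees during training.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Keep the features most strongly associated with the boredom labels,
# using the training data only.
selector = SelectKBest(f_classif, k=10).fit(X_train, y_train)

model = LogisticRegression(max_iter=1000).fit(
    selector.transform(X_train), y_train)

# Predictive accuracy is assessed on the held-out validation data.
val_scores = model.predict_proba(selector.transform(X_val))[:, 1]
print(f"validation AUC = {roc_auc_score(y_val, val_scores):.2f}")
```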
Example continued One way to strengthen the construct validity of these models or detectors is to filter out features that are deemed irrelevant or only weakly correlated with the outcome. The idea is to strengthen the discriminant validity of the model by pre-emptively removing irrelevant features. Prior work (Sao Pedro et al., 2012) has shown that removing irrelevant features yields models with better overall predictive performance, even with the resulting data reduction; removing the features that were weakly related to the construct contributed the most to the gains in predictive accuracy. Sao Pedro, M. A., Baker, R. S. D., & Gobert, J. D. (2012). Improving construct validity yields better models of systematic inquiry, even with less information. In User Modeling, Adaptation, and Personalization: 20th International Conference, UMAP 2012, Montreal, Canada (pp. 249-260). Springer Berlin Heidelberg.
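The filtering idea can be sketched as follows, again with simulated data: features whose correlation with the labels falls below a (hypothetical) threshold are dropped before fitting, and the filtered model is compared against a model that uses every feature. This is only an illustration of the pre-emptive removal described above, not the exact procedure of Sao Pedro et al. (2012).

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(2)

# Simulated features and construct labels, as in the previous sketch.
X = rng.normal(size=(500, 30))
y = (X[:, 0] + X[:, 1] + rng.normal(size=500) > 0).astype(int)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Correlation of each feature with the construct labels (training data only).
feature_r = np.array([np.corrcoef(X_train[:, j], y_train)[0, 1]
                      for j in range(X.shape[1])])

# Treat weakly correlated features as construct-irrelevant and drop them.
relevant = np.abs(feature_r) >= 0.1  # hypothetical threshold

for name, cols in [("all features", slice(None)), ("filtered    ", relevant)]:
    model = LogisticRegression(max_iter=1000).fit(X_train[:, cols], y_train)
    auc = roc_auc_score(y_val, model.predict_proba(X_val[:, cols])[:, 1])
    print(f"{name}: AUC = {auc:.2f} with {X_train[:, cols].shape[1]} features")
```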
Concluding remarks Construct validity plays an important role in both testing and model prediction. Determining the strength of a model's construct validity requires assessing both the convergent and discriminant validity of the tool. Prior research suggests that centering model development on strengthening construct validity leads to improvements in predictive accuracy, particularly when it combines automated and manual feature selection.