Essentials of Test Quality in Psychometrics

Key factors that define a good test: reliability and validity. Learn about sources of error, types of reliability, and estimation methods in test assessment.

  • Test quality
  • Psychometrics
  • Reliability
  • Validity
  • Error sources


Presentation Transcript


  1. What Makes A Test Good? Psychometrics

  2. What Makes a Test Good? A test must be: reliable and valid. Reliable = consistent; valid = measuring what it is supposed to measure.

  3. Reliability and Validity. Types of reliability: Test-Retest, Alternate Forms, Internal Consistency (IC), Inter-rater. Types of validity: Face, Content, Criterion, Construct (Convergent, Discriminant, Factorial).

  4. Reliability and Validity Reliable but not valid

  5. Reliability and Validity Neither reliable nor valid

  6. Reliability and Validity Reliable and valid

  7. Definition of Reliability: reliability refers to a test's consistency. A reliable test minimizes error and produces repeatable, consistent results.

  8. Sources of Error: 1) characteristics of the instrument used (low precision); 2) state of the participants (depressed, moody, irritable); 3) state of the experimenter (depressed, moody, irritable); 4) state of the environment (noise, heat, etc.).

  9. Classical Test Theory: Observed score = True ability + Random error, i.e., X = T + e.
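A small simulation can make the X = T + e decomposition concrete. The Python sketch below is purely illustrative (the means, standard deviations, and sample size are arbitrary assumptions): it generates true scores and random error, then recovers reliability as the ratio of true-score variance to observed-score variance.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
true_score = rng.normal(loc=100, scale=15, size=n)  # T: stable ability
error = rng.normal(loc=0, scale=5, size=n)          # e: random error, mean 0
observed = true_score + error                       # X = T + e

# In CTT, reliability is the ratio of true-score variance to observed variance.
reliability = true_score.var() / observed.var()
print(f"simulated reliability = {reliability:.2f}")  # about 15**2 / (15**2 + 5**2) = 0.90
```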

  10. Estimation of Reliability: correlations are used. A correlation normally ranges from -1.0 to +1.0; a reliability coefficient ranges from 0 to 1.0.

  11. Types of Reliability: Test-retest, Alternate Forms, Internal Consistency, Inter-rater.

  12. Test-Retest Reliability: the coefficient of stability is the correlation between the scores from two administrations of the same test.
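As a minimal sketch, the coefficient of stability is just a Pearson correlation between the two administrations; the scores below are hypothetical.

```python
import numpy as np

# Hypothetical scores from two administrations of the same test.
time1 = np.array([12, 15, 9, 20, 14, 18, 11, 16])
time2 = np.array([13, 14, 10, 19, 15, 17, 12, 15])

# The coefficient of stability is the Pearson r between the two score sets.
stability = np.corrcoef(time1, time2)[0, 1]
print(f"test-retest r = {stability:.2f}")
```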

  13. Alternate Forms: two versions of the same test with similar content; the forms must be equivalent. The correlation between them is the coefficient of equivalence.

  14. Internal Consistency: Split-half, KR-20, Cronbach's alpha.
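As one worked example, Cronbach's alpha can be computed directly from its standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the response matrix below is hypothetical.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a persons-by-items score matrix."""
    k = items.shape[1]                         # number of items
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical responses: 6 people x 4 items on a 1-5 scale.
scores = np.array([
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 2, 3],
    [1, 2, 1, 2],
    [4, 4, 4, 3],
])
print(f"alpha = {cronbach_alpha(scores):.2f}")
```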

  15. Inter-rater Reliability: measures scorer reliability. Do the judges agree?

  16. Inter-rater reliability: the test is scored by two independent judges and the two sets of scores are correlated. Agreement can also be quantified with Cohen's kappa (the κ statistic).
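A minimal sketch of Cohen's kappa from its definition, kappa = (p_observed - p_chance) / (1 - p_chance); the pass/fail judgments below are hypothetical.

```python
import numpy as np

def cohens_kappa(r1, r2) -> float:
    """Cohen's kappa for two raters assigning categorical codes."""
    r1, r2 = np.asarray(r1), np.asarray(r2)
    p_observed = np.mean(r1 == r2)  # observed proportion of agreement
    # Chance agreement: product of the raters' marginal proportions per category.
    p_chance = sum(np.mean(r1 == c) * np.mean(r2 == c)
                   for c in np.union1d(r1, r2))
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical pass (1) / fail (0) judgments by two independent judges.
judge1 = [1, 1, 0, 1, 0, 1, 1, 0, 0, 1]
judge2 = [1, 0, 0, 1, 0, 1, 1, 1, 0, 1]
print(f"kappa = {cohens_kappa(judge1, judge2):.2f}")
```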

  17. How Reliable? In research, r between .70 and .80 is acceptable; in clinical settings, r > .90 is expected.

  18. Validity: Basic Concepts

  19. Reliability and Validity (recap). Reliability: Test-Retest, Alternate Forms, Internal Consistency (IC), Inter-rater. Validity: Face, Content, Criterion, Construct (Convergent, Discriminant, Factorial).

  20. Reliability and Validity: Highly Reliable and Valid; Highly Reliable but Not Valid; Neither Reliable nor Valid.

  21. Overview: Definition, Face Validity, Content Validity, Criterion-Related Validity, Construct Validity.

  22. Definition of Validity: the extent to which a test measures the trait that it claims to measure. Establishing this involves validation.

  23. Face Validity: a superficial, non-statistical examination of the domain, established by non-experts. Does the test "look valid"?

  24. Content-related Validity: does the test assess the domain of the construct that it is supposed to measure?

  25. Content-related Validity: involves examination of the literature; the domain must be systematically analyzed; consult with experts in the field.

  26. Content Validity: Limitations. Consider the biases of the panelists and their level of expertise. Are they really experts?

  27. Criterion-related validity: the relationship between a test and a criterion. Predictor (X, the test): SAT; Criterion (Y, the behavior): GPA. SAT = Scholastic Assessment Test; GPA = Grade Point Average.
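As a sketch, the validity coefficient is simply the correlation between predictor and criterion; the SAT and GPA values below are invented for illustration.

```python
import numpy as np

# Hypothetical predictor (SAT scores) and criterion (later GPA) values.
sat = np.array([1100, 1350, 980, 1420, 1200, 1050, 1300, 1150])
gpa = np.array([3.0, 3.6, 2.5, 3.8, 3.2, 2.8, 3.5, 3.1])

# Criterion-related validity: the correlation between predictor X and criterion Y.
r_xy = np.corrcoef(sat, gpa)[0, 1]
print(f"criterion validity r = {r_xy:.2f}")
```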

  28. Two Kinds of Criterion Validity: Concurrent and Predictive.

  29. Concurrent Criterion-related Validity: the test and criterion are measured at approximately the same time. (In predictive validity, by contrast, the criterion is measured later, after the test.)

  30. Construct-related validity: how does a test correlate with similar tests? (Diagram: Your Test compared against a Gold Standard test.)

  31. Construct-related validity: two types, Convergent validity and Discriminant validity.

  32. Convergent validation: a test should correlate highly with another test to which it is theoretically related; r should be positive and strong. Example: Your Test (an IQ measure) vs. the Gold Standard, the WAIS (Wechsler Adult Intelligence Scale).

  33. Discriminant validation: a test ought not to correlate with a theoretically unrelated test; r should be close to zero. Example: Your Test (an IQ measure) vs. an Extraversion scale.
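A simulation can illustrate both checks at once. In the hypothetical sketch below, your_test and wais share a latent IQ factor while extraversion is independent, so the convergent correlation comes out strong and positive and the discriminant correlation lands near zero.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

latent_iq = rng.normal(size=n)                         # shared construct
your_test = latent_iq + rng.normal(scale=0.5, size=n)  # your IQ measure
wais = latent_iq + rng.normal(scale=0.5, size=n)       # gold-standard IQ test
extraversion = rng.normal(size=n)                      # unrelated construct

convergent = np.corrcoef(your_test, wais)[0, 1]            # expect strong, positive
discriminant = np.corrcoef(your_test, extraversion)[0, 1]  # expect near zero
print(f"convergent r = {convergent:.2f}, discriminant r = {discriminant:.2f}")
```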

  34. Multi-trait Multi-method Matrix: Campbell & Fiske (1959). Correlations of two or more traits measured by two or more methods. Methods: e.g., self-report vs. peer observation. Traits: e.g., Extraversion and Warmth.

  35. Multitrait-Multimethod Matrix (E and A are the two traits; s = self-report, p = peer rating):

             Es                 Ep    As    Ap
        Es   MM (reliability)   MH    HM    HH
        Ep                      MM    HH    HM
        As                            MM    MH
        Ap                                  MM

  36. Monotrait-Monomethod (MM) = reliability: same trait, same method. A high r means little error; a low r means lots of error.

  37. Monotrait-Heteromethod (MH): same trait, different method. Validity coefficients expected to be high; a high r is evidence of construct validity, a low r indicates the opposite.

  38. Heterotrait-Monomethod (HM): different trait, same method. Coefficients expected to be low to moderate; a high r suggests poor construct validity, a low r good construct validity.

  39. Heterotrait-Heteromethod (HH): different trait, different method. Coefficients expected to be the lowest; a high r suggests poor construct validity, a low r good construct validity.

  40. Multitrait-Multimethod Matrix (recap):

             Es                 Ep    As    Ap
        Es   MM (reliability)   MH    HM    HH
        Ep                      MM    HH    HM
        As                            MM    MH
        Ap                                  MM
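To see how such a matrix is assembled, here is a minimal simulated sketch (all data invented; the labels Es/Ep/As/Ap mirror the slides) that builds the 4 x 4 correlation matrix for two traits each measured by two methods.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300
e, a = rng.normal(size=n), rng.normal(size=n)  # two independent latent traits
data = {
    "Es": e + rng.normal(scale=0.4, size=n),  # trait E, self-report
    "Ep": e + rng.normal(scale=0.4, size=n),  # trait E, peer rating
    "As": a + rng.normal(scale=0.4, size=n),  # trait A, self-report
    "Ap": a + rng.normal(scale=0.4, size=n),  # trait A, peer rating
}

labels = list(data)
matrix = np.corrcoef([data[k] for k in labels])
for lab, row in zip(labels, matrix):
    print(lab, np.round(row, 2))
# Expect high r in the MH cells (Es-Ep, As-Ap): same trait, different method.
# Expect low r in the HM/HH cells: different traits.
```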
