
Examining U.S. Students' International Test Results 2013
In this analysis, Martin Carnoy and Richard Rothstein examine how average national and state test scores can mislead, emphasizing the importance of accounting for students' family academic resources in any fair assessment. They discuss how different tests, such as TIMSS and PISA, can yield divergent results, and they explain how they define family academic resources when evaluating educational outcomes. Country comparisons focus on 7 selected nations, offering insight into how education systems vary globally.
INTERPRETING THE RESULTS OF INTERNATIONAL DATA: A MORE CAREFUL EXAMINATION OF U.S. STUDENTS' RESULTS ON PISA AND TIMSS MARTIN CARNOY, STANFORD UNIVERSITY RICHARD ROTHSTEIN, ECONOMIC POLICY INSTITUTE NOVEMBER 2013
AVERAGE NATIONAL OR STATE SCORES ON ASSESSMENT TESTS CAN BE MISLEADING
Average test scores for countries and states reflect not only the quality of schooling but also students' family academic resources (F.A.R.). In every country of the world, children from disadvantaged families (in terms of their families' and communities' cultural, social, and human capital) score much lower, on average, than their advantaged counterparts. This is true even for disadvantaged students attending excellent schools, and it is true on all international and state assessment tests. The quality of education systems varies, and we want to know why; but to begin to get at that quality, we have to compare students who bring similar resources to school.
SCORES ON A SINGLE TEST CAN ALSO BE MISLEADING
Different tests may measure different types of cognitive knowledge. The TIMSS purports to be curriculum-based. Like any time-limited test, it measures only certain subsets of subject-area skills and, in the U.S., is more closely aligned than the PISA with the National Assessment of Educational Progress (NAEP). The TIMSS is administered to students in a given grade (4th and 8th). The PISA not only attempts to be a test of general knowledge within a subject area but also purports to assess more intensively the application of skills, not only their acquisition; it is administered to 15-year-olds in whichever grade they are studying. Relative average scores among countries on each test may vary from year to year, and relative country rankings may change over time, in part because the family-academic-resource composition of the sampled students may change differently in different countries. When we compare students with similar family academic resources, changes over time in average PISA scores may diverge even more sharply from changes in average TIMSS scores.
DEFINING FAMILY ACADEMIC RESOURCES
We define family academic resources (F.A.R.) by cultural capital (books in the home) rather than by human capital or consumer durables, for various reasons. We test how our achievement-score estimates by F.A.R. group differ when we use other measures of students' home resources that could and do affect academic performance (mother's education, parents' highest education, the PISA social class index). When we compare results across these different definitions of F.A.R., the differences in achievement scores by F.A.R. group are very small.
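To make the books-in-home definition concrete, here is a minimal sketch (not the authors' code; the records and category labels are hypothetical) of how students might be binned into F.A.R. groups and averaged:

```python
# Sketch only: grouping students into F.A.R. categories by a
# books-in-home survey response (the cultural-capital proxy described
# above) and averaging test scores within each group.
from statistics import mean

# Hypothetical student records: (books-in-home category, test score).
students = [
    ("0-10", 430), ("0-10", 455), ("11-25", 470),
    ("26-100", 505), ("101-200", 540), ("more than 200", 560),
]

def far_group_means(records):
    """Average score within each books-in-home (F.A.R.) group."""
    groups = {}
    for category, score in records:
        groups.setdefault(category, []).append(score)
    return {cat: mean(scores) for cat, scores in groups.items()}

print(far_group_means(students))
```

Swapping the grouping variable for mother's education or the PISA social class index changes only the first element of each record, which is the kind of substitution the authors use to check that their results are not sensitive to the choice of F.A.R. measure.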
A NOTE ON OUR COUNTRY COMPARISONS
Because of the intense analysis required, we selected 7 countries whose test scores we examine: the United States; three similar post-industrial countries (France, Germany, and the U.K.); and three countries usually characterized as top-scoring (Canada, Finland, and Korea). We believe these countries are reasonably typical of countries like them. The discussion that follows is based on comparisons between the U.S. and these 6 countries alone.
POINT 1: WHEN WE ADJUST AVERAGE PISA RESULTS FOR THE MUCH LOWER F.A.R. OF STUDENTS IN U.S. SCHOOLS THAN IN COMPARISON COUNTRIES, U.S. STUDENTS DO BETTER THAN CLAIMED
IN MATH, THE DIFFERENCE BETWEEN THE U.S. AND THE TOP SCORING COUNTRIES IS REDUCED BY ABOUT ONE-THIRD WHEN WE ADJUST FOR FAMILY ACADEMIC RESOURCES (F.A.R.)
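One simple way to picture such an adjustment (a sketch under assumed numbers, not the authors' exact procedure) is a reweighted average: compute each country's F.A.R.-group means separately, then average both countries' group means under a common F.A.R. distribution, so that differences in family-resource composition no longer drive the gap.

```python
# Illustrative composition adjustment; all group means and weights
# below are hypothetical, not values from the report.

def adjusted_average(group_means, weights):
    """Average of F.A.R.-group means under a given set of weights."""
    assert abs(sum(weights.values()) - 1.0) < 1e-9
    return sum(group_means[g] * w for g, w in weights.items())

# Hypothetical F.A.R.-group means for the U.S. and a top-scoring country:
us = {"low": 450, "mid": 500, "high": 550}
top = {"low": 490, "mid": 530, "high": 560}

us_weights = {"low": 0.4, "mid": 0.4, "high": 0.2}  # more low-F.A.R. students
common = {"low": 0.25, "mid": 0.45, "high": 0.3}    # shared reference weights

raw_us = adjusted_average(us, us_weights)   # roughly 490
adj_us = adjusted_average(us, common)       # roughly 502
adj_top = adjusted_average(top, common)     # roughly 529
print(raw_us, adj_us, adj_top)
```

Because the hypothetical U.S. sample contains proportionally more low-F.A.R. students, moving to common weights raises the U.S. average and shrinks the gap to the top scorer, which is the direction of the adjustment described in Point 1.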
POINT 2: U.S. STUDENTS, PARTICULARLY LOWER F.A.R. STUDENTS, ARE MAKING GREATER GAINS IN PISA READING THAN STUDENTS EVEN IN STELLAR PERFORMERS SUCH AS FINLAND
[Chart: PISA reading scale scores, 2000-2009, for disadvantaged and advantaged students in Finland and the U.S.]
DISADVANTAGED U.S. STUDENTS' PISA READING SCORES ARE HIGHER THAN COUNTERPARTS' SCORES IN FRANCE, GERMANY, & U.K.
[Chart: PISA reading scale scores, 2000-2009, for disadvantaged and advantaged students in Germany and the U.S.]
IN MATH, U.S. ADVANTAGED STUDENTS UNDERPERFORM THEIR COUNTERPARTS IN THE BIG EUROPEAN ECONOMIES
[Chart: PISA math scale scores, 2000-2009, for disadvantaged and advantaged students in Germany and the U.S.]
IN MATH, ADVANTAGED AND DISADVANTAGED STUDENTS UNDERPERFORM THEIR COUNTERPARTS IN HIGH-SCORING COUNTRIES, YET THE DISADVANTAGED ARE MAKING LARGER GAINS SINCE 2000
[Chart: PISA math scale scores, 2000-2009, for disadvantaged and advantaged students in Finland and the U.S.]
YET OVERALL, WHEN CONTROLLING FOR F.A.R., U.S. STUDENTS ARE NOT MAKING MATH GAINS ON PISA COMPARED TO THE HIGH-SCORING PISA COUNTRIES
[Chart: PISA math scale scores for 2000 and 2009, each reweighted to Finland's 2000 books-in-home (BH) breakdown: Finland's adjusted scores are 538 and 536; U.S. adjusted scores are 496 and 500.]
POINT 3: THE PISA AND TIMSS TESTS SHOW VERY DIFFERENT RESULTS IN MATH FOR U.S. STUDENTS. WHEN CONTROLLING FOR F.A.R., U.S. MATH GAINS ARE MUCH LARGER ON THE TIMSS
U.S. TIMSS AND NAEP MATH SCORES RISE SIMILARLY IN 2000-2011, BUT U.S. PISA MATH SCORES DO NOT
[Chart: cumulative U.S. math gains, in standard deviations, 1999-2011, on the Main NAEP, NAEP LTT, TIMSS, and PISA.]
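Gains on tests with different scales are comparable only after standardization, which is why the chart plots gains in standard deviations. A minimal sketch (the scores and the 100-point SD are illustrative assumptions, not report values):

```python
# Sketch: cumulative score gains expressed in standard-deviation units
# so that NAEP, TIMSS, and PISA trends can share one axis.

def cumulative_gain_sd(scores, sd):
    """Cumulative gain from the first administration, in SD units."""
    baseline = scores[0]
    return [(s - baseline) / sd for s in scores]

timss_us = [502, 504, 508, 509]  # hypothetical U.S. math averages
print(cumulative_gain_sd(timss_us, sd=100))  # [0.0, 0.02, 0.06, 0.07]
```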
POINT 4: THERE IS LARGE VARIATION IN THE PERFORMANCE OF APPARENTLY SIMILAR F.A.R. STUDENTS IN THE SCHOOLS OF DIFFERENT STATES (2011 TIMSS TEST). DIFFERENCES IN SCHOOL SYSTEMS MAY HELP EXPLAIN THESE DIFFERENCES
[Table: 2011 TIMSS math scale scores by books-in-home category (0-10, 11-25, 26-100, 101-200, more than 200 books) for Finland, the U.S., and nine U.S. states: Alabama, California, Colorado, Connecticut, Florida, Indiana, Massachusetts, Minnesota, and North Carolina. In every system, scores rise steadily with each additional books category, from roughly 434-503 at 0-10 books to roughly 502-598 at more than 200 books.]
POINT 5: ON U.S. NATIONAL ASSESSMENTS, PUPILS HAVE ALSO MADE BIG GAINS IN THE PAST 20 YEARS, BUT DIFFERENCES EXIST BETWEEN STATES
[Chart: NAEP 8th-grade math scale scores, 1992-2011, for students whose mothers completed high school or less, in Alabama, California, Colorado, Connecticut, Florida, Indiana, Massachusetts, Minnesota, and North Carolina.]
MATH GAINS (AS MEASURED BY THE NAEP 8TH GRADE TEST) ARE RELATED TO STATE MATH SCORE STARTING POINT, BUT EVEN SO, GAINS VARY GREATLY ACROSS U.S. STATES
[Chart: 1996-2011 state mathematics gains versus beginning score in 1996, NAEP 8th-grade math, for students with mothers who completed high school or less.]
LOWER F.A.R. STUDENTS IN SOME STATES MADE BIG GAINS ON THE 8TH GRADE MATH NAEP; LOWER F.A.R. STUDENTS IN OTHER STATES MADE SMALL GAINS
Controlling for starting score, the highest gainers among students with lower family academic resources (mother's education HS complete or less*) in 1996-2011 were Texas, Massachusetts, Montana, Delaware, Virginia, Vermont, and North Carolina. Controlling for starting score, the lowest gainers among students with lower family academic resources (mother's education HS complete or less) in 1996-2011 were Utah, Nebraska, West Virginia, California, Alabama, and Michigan.
*Lower F.A.R. students defined this way represent about 30-35 percent of the students taking the NAEP test.
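"Controlling for starting score" can be read as ranking states by the residuals from a regression of the 1996-2011 gain on the 1996 score. A sketch with made-up values (the four states' scores and gains are invented; only the sign pattern echoes the rankings above, and this is not the authors' exact model):

```python
# Sketch: rank states by how much more (or less) they gained than their
# 1996 starting score predicts, using least-squares residuals.
# All numbers below are invented for illustration.
import numpy as np

states = ["Texas", "Massachusetts", "Utah", "Nebraska"]
start_1996 = np.array([255.0, 272.0, 262.0, 268.0])  # hypothetical NAEP scores
gain_2011 = np.array([20.0, 14.0, 6.0, 7.0])         # hypothetical gains

slope, intercept = np.polyfit(start_1996, gain_2011, 1)
residuals = gain_2011 - (slope * start_1996 + intercept)

# Positive residual: the state gained more than its starting score predicts.
for state, r in sorted(zip(states, residuals), key=lambda t: -t[1]):
    print(f"{state}: {r:+.1f}")
```

Ranking by raw gains would penalize states that started high (they have less room to grow); the residual removes that mechanical relationship before comparing states.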
HIGH F.A.R. STUDENTS IN SOME STATES MADE BIG GAINS ON THE 8TH GRADE MATH NAEP; HIGH F.A.R. STUDENTS IN OTHER STATES MADE SMALL GAINS
The highest-gaining states for students with high family academic resources (mother's education college complete*) in 1996-2011 were Texas, Colorado, Maryland, Massachusetts, Vermont, Virginia, and Minnesota. The lowest-gaining states for students with high family academic resources (mother's education college complete) in 1996-2011 were Michigan, West Virginia, Nebraska, Iowa, Alabama, and New Mexico.
*High F.A.R. students defined this way represent about 40-45 percent of the students taking the NAEP test.
THE LARGE DIFFERENCES IN MATH SCORES AND GAINS ACROSS STATES FOR STUDENTS WITH SIMILAR F.A.R. SHOULD TELL US MORE ABOUT OUR EDUCATIONAL SYSTEM THAN LOOKING AT FINLAND OR KOREA
States such as Massachusetts not only had high levels of math achievement in the 1990s but have made large gains since, now reaching performance levels on the TIMSS near those of Japanese students. This has happened without sending students to intensive after-school programs, as in Japan and Korea. Students in states such as Texas, North Carolina, Virginia, and Arkansas began with lower scores in the 1990s and also made big gains; North Carolina, like Massachusetts, scored high on the 2011 TIMSS across F.A.R. groups. But other states, whether they began with low or high scores, have made relatively little progress.
A CHALLENGE FOR RESEARCHERS
The question is: why did students with similar family academic resources, particularly students with high family academic resources, score so differently on the TIMSS/NAEP in different states and, more importantly, make such different gains? Some of the differences may be due to differences between states in more finely specified definitions of students' F.A.R.; some may be due to changes over time in the F.A.R. or ethnic/racial/English-language-learner composition of the student sample; and some may be due to differences in, or changes in, educational effectiveness. We challenge our colleagues in educational research to investigate these issues rather than repeat oversimplified comparisons of point-in-time average scores that can mislead more than they clarify.
WHERE TO GET OUR RESEARCH ON U.S. STUDENTS AND INTERNATIONAL TEST SCORES
For the earlier report, see "What do international tests really show about U.S. student performance?" by Martin Carnoy and Richard Rothstein (January 2013): http://www.epi.org/publication/us-student-performance-testing/. For questions or comments, write to us at carnoy@stanford.edu or riroth@epi.org. To receive a copy of our forthcoming report (early 2014), with the most recent international test data, write to us.