
Interactive Models

Helping Students Learn and Helping Teachers Understand Student Learning

By Paul Horwitz, Janice Gobert, and Barbara Buckley

If a picture is worth a thousand words, then for science learning an interactive model may well be worth a thousand pictures. Why are models such powerful learning tools? And what can we learn by observing how students experiment with them?

In October 2001, in partnership with Harvard University, Northwestern University, and Massachusetts public schools in Lowell and Fitchburg, the Concord Consortium launched Modeling Across the Curriculum (MAC), a groundbreaking, five-year research project. Since then we have developed dozens of model-based activities that cover such diverse topics as Newtonian mechanics, gas laws, atomic structure, and genetics. We have also built an extensive suite of tools for collecting and analyzing the data generated as students use the models. Over the last three years approximately 400 schools in over 20 countries have registered with us and downloaded the MAC software. Each time students ran an activity, if they were online, we collected data—over 1.5 GB of log files. In all, over 18,000 students have contributed to our research in this way. As we wrap up the project this fall, our data analysis algorithms are converting those logs into useful information for researchers and teachers.

So what are we learning from all that data? First of all, students learned the science content from models. For instance, 98% of the physical science classes that used our Dynamica activities (which model Newtonian mechanics) showed significant learning gains from pre-test to post-test. More important, the students who ran more models also learned more: in our genetics classes, the number of activities attempted by the students in a class accounted for 17% of the learning gains. That's not too surprising, but it's gratifying nonetheless, a "proof of principle," if you will. But the really interesting part comes when you consider how the students used the models.

Actions Mirror Understanding: Performance Predicts Learning

Let's distinguish "process" data from "outcome" data. Outcome data describes what a student learns. It often appears in the form of answers to questions or solutions to numerical problems, whether presented as a post-test or embedded within an activity. Process data, on the other hand, describes how a student goes about solving a problem. For example, one of our genetics activities requires students to figure out the genotypes of two dragons given that all of their offspring have two legs. Students could perform "thought experiments" on the computer, altering the genes of either parent, and then breeding them and observing the result. They could also use a special kind of "magnifying glass" to view the chromosomes of any organism. We logged what steps the students took and what tools they used.

Looking over the process data from the two-legged dragon challenge, we found a wide variation in the way students went about the task. Some students bred the same two parents over and over, apparently hoping for success just by chance; others perseverated on an incorrect model (e.g., if both parents have two legs, so will their offspring), seemingly unable to think outside the box. Still others approached the task systematically, examining the chromosomes of parents and offspring, varying only one parent at a time, and reasoning their way to the correct answer.

Once we had identified these different behaviors, we developed algorithms that enabled us to classify every student's investigations along a spectrum from "systematic" to "haphazard." We found a statistically significant correlation between a student's process score on this task and her subsequent learning gains as determined from the post-test. Moreover, this correlation persisted whether or not the student actually succeeded; in other words, the process variables predicted learning gains (see note 1) even when the student failed to accomplish the task. Even more surprising, and gratifying, was the extent of the "transfer effect" from this task. The post-test included some items that were directly related to the content underlying the two-legged dragon task (i.e., monohybrid inheritance, or the inheritance of a single characteristic), together with many other items that were not. One might expect a student's process score to correlate more strongly with the proximal items than with the test as a whole. In fact, the reverse was true: performance on the task was more predictive of the overall post-test score than of the score on the directly relevant items.
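To make the idea of a process score concrete, here is a minimal sketch in Python. It is not the project's actual algorithm, which this article does not describe. The log format, the genotype strings, the function name `process_score`, and the 0.8/0.2 weighting are all assumptions chosen for illustration. The sketch rewards two of the systematic behaviors mentioned above: varying only one parent at a time and examining chromosomes.

```python
from typing import List, Optional, Tuple

# Hypothetical log format (an assumption, not the MAC log schema):
# ("breed", (mother_genotype, father_genotype)) records one cross,
# ("inspect", None) records one use of the chromosome viewer.
Action = Tuple[str, Optional[Tuple[str, str]]]


def process_score(actions: List[Action]) -> float:
    """Return a score in [0, 1]; higher means more systematic.

    Illustrative heuristic only: 80% of the score is the fraction of
    consecutive crosses in which exactly one parent was changed, and
    20% is awarded if the student inspected chromosomes at least once.
    """
    crosses = [p for kind, p in actions if kind == "breed" and p is not None]
    if not crosses:
        return 0.0

    # Count crosses that changed exactly one parent relative to the previous cross.
    controlled = 0
    for prev, cur in zip(crosses, crosses[1:]):
        changed = (prev[0] != cur[0]) + (prev[1] != cur[1])
        if changed == 1:
            controlled += 1

    steps = max(len(crosses) - 1, 1)  # avoid dividing by zero for a single cross
    controlled_fraction = controlled / steps
    inspected = any(kind == "inspect" for kind, _ in actions)
    return 0.8 * controlled_fraction + 0.2 * (1.0 if inspected else 0.0)


# A student who repeats the same cross, hoping for a lucky result.
haphazard = [("breed", ("Ll", "Ll"))] * 4

# A student who inspects chromosomes and then varies one parent at a time.
systematic = [
    ("inspect", None),
    ("breed", ("Ll", "Ll")),
    ("breed", ("LL", "Ll")),
    ("breed", ("LL", "ll")),
]

print(process_score(haphazard), process_score(systematic))
```

With a score of this kind in hand for every student, the correlation with learning gains is an ordinary statistical calculation. A real classifier would need to handle far messier logs than this toy example.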

The task described above is not unique; many of our activities contain similar "hot spots": tasks that are open-ended and complex enough to engage students in authentic inquiry while giving us a glimpse into their cognitive processes. We have identified such "teachable moments" in each of the science areas covered by the project. And each of the hot spots examined so far shows the same intriguing correlation with learning gains, as measured by conventional assessments. The next step is to look for "learning progressions." Do students improve their model-based reasoning skills as they progress from one activity to the next? And do such skills transfer between scientific domains? Does the tenth grader who learns to reason with models in genetics apply that skill when he encounters the model-based gas laws unit in junior year chemistry? We don't know yet, but we have enough data to find out.

Any Number Can Play

The IERI program that funded this work imposed two nearly contradictory constraints. The research was to be methodologically rigorous and carefully controlled, yet it had to be potentially scalable to very large numbers of students and schools. We addressed these contrasting requirements by constraining our intervention to be entirely computer-based, thus minimizing local variability and the need for extensive professional development. We then made it freely available on our website to any school. When schools downloaded the software, they were given the option to register with us and allow us to collect data from them. In return, we reported to each of these "contributing" schools regarding the performance of their students (see note 2).

Over the period of the project, nearly 400 schools, located in over 20 countries, took advantage of this offer. However, many of these appear to have used the software offline (thus generating no data), and many others did not administer the pre- and post-tests. Nevertheless, 41 of the contributing schools did comply with these requirements, and we have subjected their data to the same analysis that we used for the 10 "member" schools that were officially part of the project. We were surprised to discover, in fact, that the contributing schools actually performed better than the member schools. For example, the 18 contributing classes that used the genetics model averaged a learning gain of 6.8 points, versus 5.8 for the member schools. This is a remarkable achievement, considering that the contributing schools received no support of any kind from the project (see note 3). Even allowing for the fact that the contributing schools were self-selected, their stunning success demonstrates the scalability of the technology and pedagogy.

Implications

In Massachusetts, students are not permitted to graduate from high school unless they achieve a minimum score on a largely multiple-choice test called the Massachusetts Comprehensive Assessment System, or MCAS. Many other states have similar requirements. Reliance on such traditional assessment tools can have two adverse effects: (1) the tests are artificial and at best serve as imperfect markers for real-world skills and abilities, and (2) an over-emphasis on improving test scores is causing many schools to spend so much time getting students ready for the MCAS that they have precious little left for teaching.

The experimentation with models demanded of students in the MAC project is a task much more analogous to real-world scientific methods than the act of answering a collection of unrelated multiple-choice questions. The inferences we make by observing how students perform model-based tasks turn those tasks into insightful formative assessments that can guide teaching without disrupting it.

Some day, perhaps, the MCAS will be redesigned and the question-and-answer items of today will be recast as tasks involving manipulable models. When that happens, "teaching to the test" will be the right and proper thing to do.


Paul Horwitz (paul@concord.org) is the Principal Investigator of the MAC project. Janice Gobert (jgobert@concord.org) is a Co-Principal Investigator and Research Director of the MAC project. Barbara Buckley (bbuckley@concord.org) is a Research Scientist on the MAC project.

2006 Fall @Concord Newsletter





Notes

  1. The process variable score on this task accounted for approximately 10% of the variance in learning gains.
  2. We encrypted all log files before transferring them from the schools to our server, and we maintained strict anonymity by assigning each student a unique ID number and stripping names from our database.
  3. We conducted professional development workshops and provided monetary compensation to teachers at the member schools.
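The name-stripping step described in note 2 might look something like the following Python sketch. It is an illustration under stated assumptions, not the project's code: the record layout, the field names `student_name` and `student_id`, and the function `anonymize` are all hypothetical, and the encryption of log files during transfer is not shown. The point is that each student receives a stable ID, so that pre-test, activity, and post-test records can still be linked after names are removed.

```python
import itertools
from typing import Dict, Iterable, List


def anonymize(
    records: Iterable[Dict[str, object]],
    name_field: str = "student_name",  # hypothetical field name
) -> List[Dict[str, object]]:
    """Replace each student's name with a stable numeric ID and drop the name."""
    counter = itertools.count(1)
    ids: Dict[object, int] = {}
    cleaned: List[Dict[str, object]] = []
    for record in records:
        name = record[name_field]
        if name not in ids:
            ids[name] = next(counter)  # first appearance gets the next ID
        # Copy every field except the name, then attach the ID.
        stripped = {k: v for k, v in record.items() if k != name_field}
        stripped["student_id"] = ids[name]
        cleaned.append(stripped)
    return cleaned


# Made-up example records; the names and scores are not real data.
logs = [
    {"student_name": "Student A", "test": "pre", "score": 11},
    {"student_name": "Student B", "test": "pre", "score": 9},
    {"student_name": "Student A", "test": "post", "score": 17},
]
anonymized = anonymize(logs)
print(anonymized)
```

In practice the mapping from names to IDs would itself need to be protected or discarded, since anyone holding it could reverse the anonymization.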