The Score, Issue #3: Digital SAT Scoring Is Less Transparent. Should You Be Worried?
Navigating the Transition to the Computer-Based SAT: Understanding the Shift in Scoring Methods and Potential Concerns for Parents
As the College Board prepares to transition from the paper-based SAT to the digital SAT (dSAT) in 2024, many parents and students are left wondering how this change will impact scoring. While the scale itself will stay the same (400 to 1600), the computer-based SAT will offer significantly less transparency in how scores are calculated.
In this article, we'll explore the following:
Differences between Classical Test Theory (CTT), which underpins the paper-based SAT, and Item Response Theory (IRT), the foundation of the upcoming dSAT
Key pros and cons of each approach
The reasons behind the reduced scoring transparency
What parents can do to ensure accuracy in scoring and manage the transition
The Digital SAT: Stage-Adaptive Testing
Unlike the paper SAT, the dSAT will be adaptive, employing a “stage-adaptive” approach to better assess a student's proficiency. In adaptive testing, the difficulty of the test is adjusted based on the test-taker's performance: the dSAT will present questions that are more or less challenging depending on how well the student performs.
There will be two Reading and Writing modules and two Math modules. For both subjects, the first module will not be adaptive; it serves as a baseline to gauge the student's ability level. Module 2, however, will be adaptive, with its difficulty determined by the student's performance on Module 1. The second module will come in just two versions: one easier and one more difficult.
This adaptive approach aims to provide a more personalized assessment. By adjusting to each student's demonstrated range of competency, it can offer a more granular measure of their abilities.
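For readers curious about the mechanics, here is a minimal sketch of how that module routing might work, written in Python. The College Board has not published its actual routing rule, so the percent-correct cutoff below is a hypothetical placeholder, not the real threshold.

```python
# A minimal sketch of stage-adaptive module routing. The 60% cutoff is a
# hypothetical stand-in; the College Board's actual routing rule is not public.

def route_module_2(module_1_correct: int, module_1_total: int,
                   cutoff: float = 0.60) -> str:
    """Choose which Module 2 variant a student receives, based on Module 1."""
    fraction_correct = module_1_correct / module_1_total
    return "harder" if fraction_correct >= cutoff else "easier"

# Example: a student answers 18 of 27 Module 1 questions correctly (~67%).
print(route_module_2(18, 27))  # -> harder
```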
Now that we've explored how the digital SAT adapts to a student's performance level, let's delve into the foundational theories that underpin this approach. The shift from Classical Test Theory (CTT) to Item Response Theory (IRT) marks a significant evolution in the way standardized tests are constructed and scored.
CTT vs. IRT: The Key Differences
The paper-based SAT is based on Classical Test Theory, which focuses on the overall test score and assumes that each test “item” (i.e., question) contributes equally to the final score. In contrast, the digital SAT employs Item Response Theory, which models how a test-taker's ability stacks up against question properties such as difficulty and “discrimination.” (“Discrimination” refers to how precisely a question differentiates between high and low performers on a particular ability or proficiency.)
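To make difficulty and discrimination concrete, here is a small sketch of the two-parameter logistic (2PL) model, a standard textbook form of IRT. The College Board has not disclosed which IRT model or parameter values the dSAT uses, so treat this strictly as an illustration.

```python
import math

def p_correct(theta: float, difficulty: float, discrimination: float) -> float:
    """Two-parameter logistic (2PL) model: the probability that a test-taker
    with ability theta answers an item correctly, given the item's
    difficulty (b) and discrimination (a) parameters."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))

# A highly discriminating item separates strong and weak performers sharply:
print(p_correct(theta=1.0, difficulty=0.0, discrimination=2.0))   # ~0.88
print(p_correct(theta=-1.0, difficulty=0.0, discrimination=2.0))  # ~0.12

# A weakly discriminating item barely separates them at all:
print(p_correct(theta=1.0, difficulty=0.0, discrimination=0.5))   # ~0.62
print(p_correct(theta=-1.0, difficulty=0.0, discrimination=0.5))  # ~0.38
```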
Pros and Cons of CTT and IRT
CTT has its strengths, such as simplicity and the ease of interpreting raw scores. However, it has limitations, including the assumption that all questions should be equally weighted.
The Problem of Skipped Questions
Another notable disadvantage of Classical Test Theory is its susceptibility to bias from missing data. When a test-taker skips a question or does not complete the test, their observed score might not accurately represent their true ability.
For instance, consider two test-takers with the same true ability in a particular domain. Suppose one of them, through poor time management, spends too long on a challenging question and runs out of time before attempting a few easier questions at the end of the test. Even though that test-taker might have answered those easier questions correctly with just a little more time, their score will be lower because of the skipped questions. In such cases, the score fails to accurately reflect the test-taker's true ability.
In contrast, Item Response Theory can handle incomplete data more effectively: it models the probability of a correct response to each question based on the test-taker's estimated ability and the question's properties, so ability can be estimated from the questions that were actually answered. This yields a more accurate estimate of the test-taker's ability even when some questions go unanswered.
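Here is a sketch of that idea: a maximum-likelihood ability estimate computed only from the items a student actually attempted. The item parameters are invented for illustration, and this is the textbook 2PL approach, not the College Board's actual (unpublished) procedure.

```python
import math

def p_correct(theta: float, difficulty: float, discrimination: float) -> float:
    """2PL probability of answering an item correctly."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))

def mle_theta(answered):
    """Maximum-likelihood ability estimate from answered items only.
    `answered` holds (difficulty, discrimination, was_correct) triples;
    skipped items simply never enter the likelihood."""
    grid = [i / 100 for i in range(-300, 301)]  # candidate abilities, -3 to 3
    def log_likelihood(theta):
        total = 0.0
        for b, a, was_correct in answered:
            p = p_correct(theta, b, a)
            total += math.log(p if was_correct else 1.0 - p)
        return total
    return max(grid, key=log_likelihood)

# A student attempts 3 of 5 items (2 correct) and runs out of time on the
# other 2. Only the attempted items inform the estimate:
answered = [(-0.5, 1.0, True), (0.0, 1.2, True), (0.5, 1.0, False)]
print(mle_theta(answered))  # ~0.76; unattempted items aren't counted as wrong
```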
The Problem of Mismatched Question Difficulty
Another advantage of IRT is that it enables stage-adaptive testing, which, as mentioned above, tailors the test to an individual's level of ability. For example, if a test-taker performs well on the first Reading and Writing module, they will receive the more difficult version of the second module. If they struggle with the first module, they will receive the easier version of the second module.
Stage-adaptive testing does not make the test unfair or easier for students with lower abilities. The purpose of this approach is to measure each test-taker's ability more precisely, not to give them an advantage over others.
The stage-adaptive design helps ensure that test questions are neither too easy nor too difficult for a test-taker, providing more accurate estimates of their ability. When an item is overwhelmingly difficult for someone, it reveals little about their ability, because they are likely to answer it incorrectly no matter where exactly they fall within their ability range.
Likewise, when a test presents a test-taker with a question that is far below their demonstrated ability level, their answer to that question yields limited additional information about their ability. If your goal is to get precise measurements of someone’s abilities, why waste time presenting them with problems you already know they can easily solve?
Stage-adaptive testing aims to present test-takers with items that are neither too easy nor too difficult for them. By focusing on items that are the most informative about a test-taker's ability, this approach allows for a more granular estimation of how they compare to others in their general percentile range. The result is scoring that is more precise and efficient.
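In IRT terms, this is captured by an item's “information,” which peaks when an item's difficulty matches the test-taker's ability. Below is a short sketch using the standard 2PL information formula; again, the parameters are illustrative, not the dSAT's.

```python
import math

def item_information(theta: float, difficulty: float, discrimination: float) -> float:
    """Fisher information of a 2PL item at ability theta:
    I(theta) = a^2 * P * (1 - P). It peaks where the item's difficulty
    matches the test-taker's ability (P = 0.5)."""
    p = 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))
    return discrimination ** 2 * p * (1.0 - p)

# An item of difficulty 0.0 tells us the most about abilities near 0.0:
for theta in (-2.0, 0.0, 2.0):
    print(theta, round(item_information(theta, difficulty=0.0, discrimination=1.0), 3))
# -2.0 -> 0.105, 0.0 -> 0.25, 2.0 -> 0.105
```

Because well-matched items each carry more information, fewer of them are needed to pin down a score, which is what makes a shorter test feasible.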
Speaking of efficiency, another significant advantage of IRT is that it allows for a shorter test duration: the dSAT will take 2 hours and 14 minutes, compared to 3 hours for the paper-based SAT.
Trade-Offs: Less Transparency, Fewer Prep Materials
Transparency and the dSAT
One of the trade-offs of the shift to the dSAT is a reduction in scoring transparency. With the paper-based SAT, the scoring scales are publicly available, making scores easy to verify. However, with the dSAT, the scales will not be published. This will make it significantly more challenging, if not impossible, for test-takers and parents to assess the accuracy and fairness of the scoring process.
The primary reason dSAT scoring won't be transparent is the use of IRT and the stage-adaptive nature of the test. The scoring algorithm in IRT is complex, taking into account not only the number of correct answers but also the difficulty and discrimination parameters of each item. This complexity, combined with the stage-adaptive testing format that adjusts the second module based on the test-taker's performance on the first module, makes it difficult to provide a simple, straightforward scoring scale.
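A small numeric example shows why no simple conversion table can be published. Under CTT, a single table maps each raw score to a scaled score. Under IRT, two test-takers who each answer two of three questions correctly can receive different ability estimates, because which questions they got right matters. (The item parameters below are invented, and how the College Board maps ability estimates onto the 400-to-1600 scale is not public.)

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def theta_estimate(items, correct, lo=-4.0, hi=4.0):
    """Solve the 2PL maximum-likelihood equation
    sum(a_i over correct items) = sum(a_i * P_i(theta)) by bisection.
    `items` is a list of (difficulty, discrimination) pairs; `correct`
    is a parallel list of booleans."""
    target = sum(a for (b, a), c in zip(items, correct) if c)
    def expected(theta):
        return sum(a * sigmoid(a * (theta - b)) for b, a in items)
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if expected(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Same three items, same raw score (2 of 3 correct), different patterns:
items = [(-1.0, 0.8), (0.0, 1.2), (1.0, 2.0)]  # (difficulty, discrimination)
print(round(theta_estimate(items, [True, True, False]), 2))  # missed the hard item: ~0.56
print(round(theta_estimate(items, [False, True, True]), 2))  # missed the easy item: ~1.5
```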
Additionally, each test-taker may receive a unique combination of questions, which further complicates the scoring process and makes it harder to compare scores between test-takers. Along with helping to reduce cheating, this can allow for a more personalized and precise assessment. However, it comes at the cost of reduced transparency.
Say Goodbye to QAS Exams
Another disadvantage of the dSAT is that students will no longer have access to their tests using the College Board's Question and Answer Service (QAS), as was possible with the paper-based SAT. Consequently, students will not be able to use official exams from previous test administrations for practice and preparation. Instead, they will have to rely exclusively on the practice tests published by the College Board.
The College Board's Credibility and Past Challenges
While the College Board aims to maintain its credibility, it has faced challenges in the past, particularly during the last SAT overhaul in 2015-2016 and in subsequent administrations. This period saw several notable incidents that garnered significant media attention, including the June 2018 SAT administration controversy, in which a harsh scoring curve for the Reading and Writing sections left many students with lower scores despite fewer incorrect answers than on previous administrations.
Additionally, during the same time frame as the 2015-2016 redesign, the College Board had to withdraw questions from the SAT scoring process. In the summer of 2015, two sections were not scored due to a printing error that led to incorrect time allotments for students. In June 2016, four questions were removed from scoring because of concerns about potential flaws.
The College Board has noted that withdrawn questions do not affect scores, as they are not included in the final calculation. However, this response may not be entirely satisfactory for some, as flawed questions can still impact a student's performance.
Encountering a confusing or flawed question can disrupt a test-taker's focus, affect their mindset, and cause them to spend more time than necessary trying to decipher it, ultimately leaving less time for subsequent items.
What Can Parents Do?
Parents can help hold the College Board to a high standard by advocating for independent oversight or third-party audits of the scoring process, as well as staying informed about the College Board's testing methodologies. One great way to stay updated on the latest developments is by subscribing to this newsletter!
Additionally, parents should be assertive in urging transparent communication from the College Board and never hesitate to provide feedback when issues arise.
The ACT Alternative
As a test prep expert who has been tutoring both the SAT and ACT since before the 2015 SAT redesign, my best guess is that, overall, the benefits offered by the dSAT will outweigh any potential drawbacks. Nonetheless, the switch to the dSAT is a huge change. The early implementation period will introduce new challenges and uncertainty to what is already a stressful experience for many.
For parents and students who are apprehensive about the transition to the digital SAT, one option is to take the ACT in 2024 instead. This buys time for the College Board to address any issues that arise during dSAT implementation, and it may help ensure that your child’s college applications are not negatively affected by unforeseen challenges during the transition.
Conclusion
The shift to the digital SAT comes with both benefits and trade-offs. While the adoption of IRT and stage-adaptive testing offers potential improvements in test precision and efficiency, the reduced transparency in scoring, limited access to previous test materials, and potential challenges of implementation may be a concern for some parents. By staying informed, advocating for transparency, and considering alternatives such as the ACT, parents can help ensure that their children have the best chances of success in their pursuit of higher education.
~ Dave