TeacherServer AI Tools: AI in Education

Article Summary

May 18, 2025

Higher education's response to GenAI has relied on telling students how they should use AI in assessments rather than redesigning assessments themselves, creating an unenforceable "enforcement illusion" that undermines assessment validity.

Objective: This paper introduces a novel conceptual distinction between "discursive" and "structural" assessment changes to analyze current approaches to GenAI in higher education, arguing that most existing frameworks fail because they focus on communicating rules rather than redesigning the actual mechanics of assessment tasks.

Methods: The researchers conducted a critical analysis of prominent assessment frameworks developed in response to GenAI, including "traffic light" systems, the AI Assessment Scale (AIAS), and declarative approaches. They examined how these frameworks attempt to address the challenge of assessment validity when students can use GenAI to complete tasks without demonstrating genuine capability. The authors then developed and applied a conceptual distinction between discursive and structural changes to assessment to evaluate why these approaches may not effectively ensure assessment validity.

Key Findings:

Current approaches to GenAI in assessment primarily rely on "discursive changes": modifications that depend entirely on students understanding and complying with instructions, without changing the underlying assessment mechanics.
Most institutional responses (such as traffic light systems categorizing permitted AI use levels) create an "enforcement illusion" by borrowing language from structural systems like traffic lights while lacking their enforcement capabilities.
Discursive approaches face three critical limitations: students may not clearly understand what is permitted, students may not voluntarily comply with guidelines, and educators lack meaningful mechanisms to verify compliance.
The alternative "structural changes" directly alter the nature, format, or mechanics of assessment tasks so that their effectiveness doesn't rely on student compliance with instructions.
Structural approaches include shifting from product-focused to process-focused assessment, incorporating real-time demonstration of skills, and designing interconnected assessments where later tasks build on earlier authenticated work.
The metaphor of traffic lights is particularly misleading because real traffic lights work precisely because they are structural interventions embedded within robust systems of enforcement, while educational "traffic light" systems lack any meaningful enforcement mechanisms.

Implications: The distinction between discursive and structural changes provides a conceptual toolkit for educators and institutions to develop more effective assessment practices in the GenAI era. Rather than focusing on communicating rules about AI use, assessment design should shift toward building validity into the structure of assessments themselves. This may require viewing assessment validity at the unit or module level rather than the task level, designing interconnected assessments that build on authenticated work, and capturing students' development of understanding over time rather than just evaluating final products. The paper suggests that the University of Sydney's "two-lane approach" (distinguishing between "Secure" in-person assessments and "Open" assessments) provides a potential structural framework, though its effectiveness still depends on how assessments within each lane are designed.

Limitations: The paper acknowledges that although the distinction between discursive and structural changes is a useful framework, concrete structural solutions will vary significantly by discipline, learning outcomes, and task. The authors note there is no one-size-fits-all prescriptive model for assessment redesign, as validity means different things in different contexts. They also do not claim that their examples of structural changes offer perfect assessment security.

Future Directions: The authors suggest that as AI capabilities continue to advance, the challenge of maintaining assessment validity will only intensify. Future research and development should focus on fundamentally redesigning how assessments are structured to demonstrate student capability in an AI-enabled world, requiring significant effort and creativity from educators. The authors stress that long-term solutions require moving beyond increasingly sophisticated rules about AI use toward genuine structural changes to assessment design.

Title and Authors: "Talk is cheap: why structural assessment changes are needed for a time of GenAI" by Thomas Corbin, Phillip Dawson, and Danny Liu.

Published On: May 15, 2025

Published By: Assessment & Evaluation in Higher Education (Taylor & Francis Group)
