AI and assessment redesign: a four-step process

If GenAI tools have ushered in an era in which institutions can no longer assure the integrity of each individual assessment, the sector must focus on assuring the integrity of awards, write Samuel Doherty and Steven Warburton


29 Apr 2024

Created in partnership with the University of Newcastle


Risks to the academic integrity of HE assessments have increased significantly with the wider availability and sophistication of generative artificial intelligence (GenAI) tools. Addressing these requires a genuine institutional commitment to staff development, support and time for transformative work.

If GenAI tools have ushered in an era in which institutions can no longer assure the integrity of each individual assessment, the sector must focus on assuring the integrity of awards. There is a growing consensus that to do this, assessments should demonstrate a systemic approach to programme learning outcomes by securing academic integrity at meaningful points across a programme of study rather than at individual course/unit level. But how can we do this sustainably, given the existing pressures on academic and professional staff? ­

Review and categorise

Step one requires academics to complete a self-review of each assessment task, assessing how likely it is that AI could complete the task. Based on their knowledge and subject-matter expertise, staff must consider several risk factors to establish a low/medium/high rating:

Risk factor and description:

Type: Some task types (for example, a written essay) will have a higher level of risk than others (eg, an oral presentation or invigilated exam).

Context: Tasks based on very specific material, or that involve authentic application to novel real-world situations, are likely to have lower risk than tasks based on more generally available information or basic concepts.

Conditions: Fully online assessments may involve higher levels of risk than those that involve some in-person component.

Output: Tasks that involve the creation of an artefact (for example, an essay or report) may involve higher levels of risk than tasks that involve a performance component.

Submission: Tasks that end at submission may have a higher level of risk than those that involve a post-submission discussion or follow-up (for example, a Q&A post presentation).

Quality of AI output: High-quality AI output is a risk to academic integrity because it may not be recognised as AI output. Staff should test their assessment via a secure GenAI platform and consider the output quality.

Invigilation: Fully invigilated tasks have a comparatively low risk.
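As a purely illustrative sketch, the self-review above could be recorded as a simple rating exercise. The factor names follow the table, but the aggregation rule (taking the highest individual rating as the overall rating) is an assumption for illustration, not part of the authors' published process:

```python
# Hypothetical sketch of the step-one self-review: each risk factor from
# the table is rated low/medium/high, and the overall rating is taken as
# the highest individual rating. The "take the maximum" rule is an
# assumption, not part of the published process.
from enum import IntEnum


class Risk(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


FACTORS = ["type", "context", "conditions", "output",
           "submission", "ai_output_quality", "invigilation"]


def overall_risk(ratings: dict) -> Risk:
    """Return the overall rating for one assessment task."""
    missing = set(FACTORS) - set(ratings)
    if missing:
        raise ValueError(f"unrated factors: {sorted(missing)}")
    return max(ratings[f] for f in FACTORS)


# Example: a fully online written essay with no post-submission follow-up.
essay = {
    "type": Risk.HIGH,              # written essay
    "context": Risk.MEDIUM,         # generally available material
    "conditions": Risk.HIGH,        # fully online
    "output": Risk.HIGH,            # artefact, no performance component
    "submission": Risk.HIGH,        # ends at submission
    "ai_output_quality": Risk.HIGH, # AI produces a plausible essay
    "invigilation": Risk.HIGH,      # not invigilated
}
print(overall_risk(essay).name)  # HIGH
```

Even a lightweight record like this makes the ratings comparable across tasks, which is what the mapping in step two depends on.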

Map and analyse

Step two involves producing maps detailing the use of assessment tasks (with risk ratings) across programmes. Discipline groups then identify the tasks that are key to students’ achievement of programme-level learning outcomes. They must make reasoned decisions about whether the potential use of AI compromises the academic integrity of these key programme assessments, or whether there are gaps that assessment redesign needs to fill.

They can then prioritise, for assessment reform initiatives, the critical programme assessments identified as at risk. Prioritisation should include triangulation with other factors, such as the number of students enrolled in the programme and its overall strategic importance. Any assessment tasks identified as being of lesser importance to the achievement of programme outcomes may offer opportunities to incorporate the use of AI in learning and assessment, helping students engage responsibly and ethically with AI.
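To make the mapping and prioritisation concrete, here is a minimal sketch under stated assumptions: the field names, the "high risk and key to outcomes" filter, and the use of programme enrolment as the sort key are illustrative choices, not a specification from the article:

```python
# Hypothetical sketch of step two: map tasks to programmes, then surface
# high-risk tasks that are key to programme learning outcomes, largest
# programmes first. All field names and the sort key are assumptions.
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    programme: str
    risk: str            # "low" / "medium" / "high" from step one
    key_to_outcomes: bool


@dataclass
class Programme:
    name: str
    enrolment: int


def prioritise(tasks, programmes):
    """Return high-risk, outcome-critical tasks, largest programmes first."""
    size = {p.name: p.enrolment for p in programmes}
    at_risk = [t for t in tasks if t.risk == "high" and t.key_to_outcomes]
    return sorted(at_risk, key=lambda t: size[t.programme], reverse=True)


tasks = [
    Task("Final essay", "BA History", "high", True),
    Task("Weekly quiz", "BA History", "low", False),
]
programmes = [Programme("BA History", 320)]
print([t.name for t in prioritise(tasks, programmes)])  # ['Final essay']
```

The low-risk, non-critical tasks that this filter leaves behind are exactly the ones the paragraph above flags as candidates for incorporating AI into learning and assessment.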

Reform assessment

Step three involves planning assessment reform initiatives that secure the resources (human and technical) required to implement meaningful assessment reform on a potentially large scale. At this stage, academic staff must be supported by learning designers who can help identify and implement appropriate measures to ensure the integrity of key programme assessments. Approaches will be sensitive to disciplinary norms but are likely to involve a mix of: moving assessments to secure testing platforms (for example, online proctoring); adding components to existing tasks, such as a presentation or debate; or introducing entirely new task types, such as interactive oral assessments.


Finally, in step four, given the effort required to undertake large-scale assessment reform, teaching and learning committees should review associated governance mechanisms to ensure an ongoing quality cycle.

Samuel Doherty is the education and innovation coordinator, and Steven Warburton is pro vice-chancellor for education and innovation at the University of Newcastle, Australia.

