AI and the future of assessment
I had the pleasure of giving a talk at the SFCA Summer Conference about generative artificial intelligence (AI) and its implications for the future of educational assessment. This blog post is a condensed version of that talk, providing an overview of the technology, its limitations, and opportunities.
Understanding generative AI
At a high level, creating a text generation system such as ChatGPT is simple. First, collect vast amounts of real-life text data from diverse sources such as websites, books, and Wikipedia. Then, feed the data into an AI programme that uses a neural network algorithm to learn to mimic human language. The resulting software functions like text autocomplete, predicting the next word in a sentence. We can see this when we peek under the hood. In the example output below, we see that the model considered other words instead of ‘inspiring’. If the model chose the determiner ‘a’ instead of the adjective ‘inspiring’, we could have ended up with a different sentence structure, and perhaps also different content for the paragraph.
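The prediction step described above can be illustrated with a minimal sketch. The word list and probabilities below are invented for illustration; a real model scores tens of thousands of candidate tokens using a neural network, not a hand-written table.

```python
import random

# Toy next-word prediction (invented probabilities, not a real model).
# A language model assigns a probability to every candidate next word;
# generation picks one and repeats. After "The talk was ...", the model
# might weigh these candidates:
next_word_probs = {
    "inspiring": 0.41,
    "a": 0.22,
    "about": 0.17,
    "great": 0.12,
    "boring": 0.08,
}

def pick_next_word(probs, greedy=True):
    """Greedy decoding takes the most likely word; sampling can pick a
    less likely one, steering the sentence in a different direction."""
    if greedy:
        return max(probs, key=probs.get)
    words, weights = zip(*probs.items())
    return random.choices(words, weights=weights)[0]

print(pick_next_word(next_word_probs))  # greedy → "inspiring"
```

If we sample instead of choosing greedily, the model may pick ‘a’ rather than ‘inspiring’, which is exactly how the same prompt can lead to different sentence structures and content on different runs.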
One of the main questions in the debate on text-generating AI models is whether they are truly intelligent. Some would argue that these models can only regurgitate information but lack genuine understanding, much like a student who has memorised all the world’s textbooks. Others respond that if the model has memorised so much material and can answer questions, even without human-like reasoning, then perhaps that is good enough to qualify as a new type of non-human intelligence. For a more in-depth discussion, I recommend reading The AI revolution in medicine by Lee et al. (2023), which provides intuitive insights into the inner workings of large language models and relatable examples from medicine.
Overcoming current challenges
While text generation models are powerful, they also present numerous challenges when applied in education. For example, they are known for inadvertently producing biased or toxic outputs. This is being addressed by AI researchers by fine-tuning models and realigning them to societal expectations of good behaviour through techniques such as reinforcement learning from human feedback. If you have ever rated the output of ChatGPT by clicking the thumbs up/down icon, then you may have helped in that process.
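The raw material behind that fine-tuning process can be sketched very simply. The data below is invented for illustration; real reinforcement learning from human feedback trains a separate reward model on millions of such comparisons, but the starting point is pairwise preferences like these.

```python
from collections import Counter

# Toy sketch of human preference data (invented examples, not any
# vendor's actual pipeline). Each record says which of two candidate
# responses to the same prompt a human rater preferred — the kind of
# signal a thumbs up/down click contributes.
feedback = [
    ("polite_answer", "rude_answer"),   # (preferred, rejected)
    ("polite_answer", "rude_answer"),
    ("rude_answer", "polite_answer"),
]

# Tally how often each response wins, as a crude stand-in for the
# reward a trained reward model would assign.
wins = Counter(preferred for preferred, _ in feedback)
total = len(feedback)
win_rate = {response: wins[response] / total for response in wins}
print(win_rate)
```

Aggregated over many raters, these win rates become the reward signal that nudges the model toward outputs people actually prefer.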
A related issue is reliability in generating accurate and trustworthy content. At Pearson, we are addressing this by aligning language models with our expert-curated content and following best practices for eliciting and evaluating reliable model outputs.
Finally, the increased reliance on AI in automated decision-making raises several ethical considerations. AI-based products should be built with appropriate transparency and explainability measures in mind. Teachers and students should be able to understand why and how a system made a given decision. And if they are not convinced, they should be able to contest the outcome.
Envisioning the opportunity
In late 2022, the initial reactions to generative AI in schools were cautious. In early 2023, however, guidelines such as the JCQ AI brief offered a more balanced approach and practical tips on helping students make the most of free generative AI tools in appropriate ways.
In another report, the U.S. Department of Education painted a longer-term vision for an AI-enabled education. The table below from this report summarises what the world of tomorrow may look like once teachers and learners are able to interact with machines in more natural ways. The overarching sentiment of the report is that AI will be able to help reduce educators’ workloads, allowing them to spend more time 1:1 with students, focusing on high-value teaching activities.
At Pearson, we are particularly thrilled about the prospects of AI in improving the validity of educational assessments. For example, rather than simply evaluating the final answer to a maths question, AI can analyse the reasoning process and identify misconceptions along the way. While the technology is still nascent, it already allows us to design more authentic assessment experiences.
Bridging the gap
My follow-up discussions with SFCA Summer Conference participants revealed that they are excited about the opportunity of integrating AI in education. They appreciate that they need to embrace AI and encourage students to use it because it is becoming a 21st-century job skill. Indeed, generative AI may impact a wide range of white-collar jobs by automating cognitive skills, because the technology is being embedded in everyday tools, such as Microsoft Office. By helping our students embrace AI as a supplement to learning, and not a replacement of it, we will help future-proof them for AI-enabled careers.
If you would like to get more ideas about implementing AI in your teaching, then this blog has recently published posts on the topic here and here. Please also see our recent Pearson webinar.
Kacper Łodzikowski is VP of AI Capabilities at Pearson, focusing on natural language processing, computational psychometrics, human-computer interaction, and ethical AI governance. He is also a researcher and lecturer in AI at Adam Mickiewicz University in Poznań, Poland.