The game is up for essays as a general assessment strategy

Back in 2020, I wrote a post on this blog stating that GPT-3 was the beginning of the end for academic essays. Until recently, I assumed the bell would toll once GPT-4 arrived, perhaps in early 2023. I was wrong.

Last week OpenAI launched ChatGPT, a chatbot built on a model in the GPT-3 family and fine-tuned with an additional machine learning technique, reinforcement learning from human feedback. In OpenAI's related InstructGPT work, that technique allowed a model with just 1.3 billion parameters to produce outputs that human raters preferred over those of the 175-billion-parameter GPT-3.

The results read well and could easily pass for the work of an undergraduate who can compose reasonable-sounding sentences connected to the topic of the question at hand. It does not do critical thinking, but that is a skill that, unfortunately, is often absent from undergraduate essays anyway.

And herein lies the crux of the problem, particularly for UK universities, which are heavily reliant on essays as an assessment strategy. Over the last decade or so, the standard and difficulty of questions have dropped, as can be seen in the grade inflation from 2013 onwards, after fees were raised to £9,000 in 2012. Say what you may about students being better prepared, but the incentives are aligned for grades to rise and for fails to be kept to a minimum, since universities are mostly dependent on student fees. I still remember the 3% rule (i.e., a 3% failure rate in your module would mean some explaining to do) as well as the emphasis on "good academic results".

If UK universities were unable to hold the line on standards after the 2012 fee rise, or to react appropriately to the use of essay mills*, how will they do so now that AI is good enough to effectively mimic the answers of the average (and not-so-average) student?

Strategies

As for strategies to deal with this AI arms race between students and assessors, there are a couple of ideas to float. First, narrowly defined, analytical questions that require a clear connection to the specific content covered in the course: the opposite of the classic essay question, where students are expected to seek out information beyond the confines of the course materials.

Second, moving to different types of assessment, such as timed exams without internet access or oral exams. The former is a fairly traditional type of assessment, and one that universities have progressively moved away from since it relies on student recall and memory. But it gets the job done. The latter is also a fairly traditional form of assessment, and one which relies on memory too.

On oral exams specifically, I quite like the format I have used at CBS. Students submit a five-page synopsis on a topic of their choice from the course and are then examined orally for 15 minutes. The synopsis is the starting point for the oral exam and a softer way for them to ease into it, but the formal assessment is their performance in the oral exam itself, which is expected to cover more ground than the synopsis.

What does this mean for me? For now, an immediate change to my exam for the common law of contract, where I had adopted the traditional essay question, a format that is not common at CBS at least. Back to a timed exam with a narrowly focused question, probably based on a specific case. My other exams seem less exposed to ChatGPT risks for now, but I would not be surprised to have to reconsider them next year.

*How many students were expelled for plagiarism offences? And if any were, how was that information relayed to future cohorts as a deterrent? Again, alignment of incentives matters.