Would ChatGPT need to resign as Minister of Higher Education?

These days, Norwegian cabinet ministers are under fire for uncritically copying others' work. But would a cabinet minister named GPT UiO have to resign as Minister of Higher Education? A very simple experiment suggests that the answer is yes.


Edited image of VG's article about Sandra Borch's resignation as Minister of Research and Higher Education. The image is intended to be humorous.

What is Cheating?


To answer whether GPT UiO would be caught cheating, we must first discuss what cheating is. This is described on UiO's website about cheating, but to me, the page seems more concerned with describing what can raise "suspicion" of cheating than with what cheating actually is. Perhaps this is intentional, because what constitutes cheating is a discretionary decision that varies from subject to subject and context to context. But as I read these guidelines, cheating is one of three things (the points are my own summary):

1: Plagiarism


This is when you present other people's text and results as your own. In principle, it is allowed to refer to others' work, but the distinctions between what is yours and what is others' work should always be clear.

2: Incorrect Source Handling


This is when you cite sources in an incorrect way* or present non-existing sources for claims in your text.

3: Use of Unauthorized Aids in Assessment Situations


Different assessment situations have different rules for what is allowed. In some situations, textbooks, collaboration, or GPT UiO are allowed, and in others, they are not. If you use GPT UiO in a situation where it is not an approved aid, it is considered cheating, end of story.

*The line between negligence and cheating is probably not crystal clear. In principle, someone assessing whether a piece of work involves cheating might overlook incorrect source handling if it seems to be without malicious intent. At the same time, I do not believe one can hide behind an excuse of negligence if the entire written work is characterized by systematic incorrect source handling. The problem here is that cheating is actually not so easy to define, which becomes evident when university staff have to discuss it in the national debate.

The Study


The more interesting question is thus: If GPT UiO is allowed as a tool, can you end up cheating by accident? To test this, I wrote six (Norwegian) texts of about 200 words each about my field of study - physics - in quick succession: "Photoelectric Effect", "Collisions", "Quantum Physics", "Magnetism", "Thermophysics", and "Time Dilation". Each text was written very uncritically in about 5 minutes.

Then, I sent all the texts into GPT UiO with the instruction: "Improve this text." followed by [space], title, and text.


Figure 1: Using GPT UiO to improve my text.

After that, I gave GPT UiO, in a new chat session, the following prompt: "Write a text of about 200 words about the physical topic: [TOPIC]". Thus, I got a total of 3 texts per topic: my own, GPT UiO's improved version, and GPT UiO's own. All texts are attached to this article, at the bottom.

Then, I sent all the documents into Copyleaks [1,2] to see what percentage of plagiarism it detected in each of the variants. When I first saw the overview, I was surprised by the numbers: several of the texts seemed to come out with 80 to 90% plagiarism or more! Closer inspection showed that the high percentages came from similarities among the texts I had uploaded myself. For example, "thermophysics - uiogpt improved" had a plagiarism percentage of 96%, but this stemmed from "thermophysics - my text", which is not unreasonable. After filtering out these hits, I ended up with almost no plagiarism in any of the texts. Puzzled and somewhat suspicious of the result, I chose to repeat the process after translating all the texts to English (also attached). This time, the plagiarism check came back with many hits [3], and the result is given below.
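The filtering step can be sketched in a few lines of Python. This is only an illustration under assumptions: the `Hit` record, its fields, and the `external_hits` function are hypothetical and do not reflect how Copyleaks reports or exports its results. Only the 96% match between the two thermophysics texts comes from the study; the second hit is made up to show what survives the filter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One match reported for a checked text (hypothetical structure)."""
    document: str   # the uploaded text that was checked
    source: str     # where the matching passage was found
    percent: float  # share of the document matched by this source


def external_hits(hits, uploaded):
    """Keep only hits whose source is not one of our own uploaded texts."""
    return [h for h in hits if h.source not in uploaded]


# Names of texts uploaded in the same batch (two of them, for brevity).
uploaded = {"thermophysics - my text", "thermophysics - uiogpt improved"}

hits = [
    # Reported in the article: the improved text matches my own text.
    Hit("thermophysics - uiogpt improved", "thermophysics - my text", 96.0),
    # Made-up external hit, for illustration only.
    Hit("thermophysics - uiogpt improved", "example.org/heat", 2.0),
]

remaining = external_hits(hits, uploaded)
for h in remaining:
    print(f"{h.document}: {h.percent}% from {h.source}")
```

With the internal match removed, only the external source remains, which mirrors how the 96% figure disappeared once the uploaded texts were no longer compared against each other.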


Figure 2: Plagiarism and paraphrasing using GPT UiO.

I have chosen to differentiate between Plagiarism and Paraphrasing. Copyleaks differentiates between three categories: "Identical", "Minor changes", and "Paraphrased", and for simplicity, I have placed the first two categories under "Plagiarism".
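The grouping can be written out as a small Python sketch. It assumes that the percentages of the three Copyleaks categories can simply be added, which is my simplification here; the names `CATEGORY_TO_LABEL` and `summarize` are hypothetical, and the numbers in `example` are made up rather than taken from the study.

```python
# Map each Copyleaks category to the label used in this article.
CATEGORY_TO_LABEL = {
    "Identical": "Plagiarism",
    "Minor changes": "Plagiarism",
    "Paraphrased": "Paraphrasing",
}


def summarize(percent_by_category):
    """Sum per-category percentages into the article's two labels."""
    totals = {"Plagiarism": 0.0, "Paraphrasing": 0.0}
    for category, percent in percent_by_category.items():
        totals[CATEGORY_TO_LABEL[category]] += percent
    return totals


# Made-up percentages for one text, for illustration only.
example = {"Identical": 3.0, "Minor changes": 1.5, "Paraphrased": 10.0}
summary = summarize(example)
print(summary)
```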

Paraphrasing


Paraphrasing is very similar to what many of us millennials did in middle school in the early days of the internet; namely, take a source from the internet and paraphrase it so that we would not be "caught" by the teacher. Below is an example, where Copyleaks has color-coded with orange what it considers to be paraphrasing.


Figure 3: Paraphrasing in GPT UiO's own text about magnetism. Copyleaks quantifies this as 39.8% of the text being paraphrased from the source shown on the right. Including the other hits, Copyleaks concluded that 52.5% of this text was paraphrased from other sources.

Is paraphrasing cheating? Paraphrasing will not necessarily trigger a sanction, but it is walking on thin ice. In principle, if you paraphrase a source, you are required to cite it, and failing to do so falls in a gray area between points 1 and 2 of what constitutes cheating. But the lines are not super clear - and discretion comes into play. The example above paraphrases the definition of magnetism - something that might be general enough to be acceptable. If you paraphrase opinions, or less generally accepted truths, without citing sources, it could become a more serious issue. In the example above, it is also clear that GPT UiO has "borrowed" figures of speech from this internet source, and if I had seen this in my work as an examiner, it would definitely have hurt the overall impression - and perhaps the grade.

Plagiarism


What is clearer, however, is that plagiarism is cheating. If you copy others' words into your own work without quotation marks and citation, it is cheating. But here too, some discretion must be applied. For example, even in the short texts I wrote for this study, some of the formulations were flagged as plagiarism. Closer inspection showed that some of these were sentences that anyone could easily produce independently, such as this one:


Figure 4: Example of plagiarism in my text about collisions. The example shows that "plagiarism" is not necessarily plagiarism.

You would have to search long and hard before you find an examiner who considers this obvious sentence to be a particularly serious case of plagiarism. A few percentage points on a plagiarism test can thus be unproblematic - especially if the topic is as general as "collisions". In GPT UiO's own text about collisions, however, we see a more serious example:


Figure 5: When GPT UiO plagiarizes, it quickly becomes actual plagiarism.

In this case, my examiner glasses would have become somewhat more foggy.

Some Advice for Essay Writing


From these investigations, it strikes me that the Minister of Education GPT UiO would have been in hot water - and would probably have been dismissed. But what should a poor, weary student think? In conclusion, I will offer three pieces of writing advice in the era of language models.

1. Think of Your Assignment Like Rakfisk


Rakfisk is mostly safe to eat, but in rare cases, the dreadful bacterium Clostridium botulinum can grow and cause botulism, a potentially deadly disease. This bacterium grows if the rakfisk is contaminated with soil and stored in an unsuitable place [4,5].

Your assignment is like rakfisk. If you start by copying someone else's text (contaminated soil) directly into the assignment, even with the intent to "remove it later", it can cause problems further down the road. Therefore, keep your rakfisk clean of such bacteria - make sure you never paste someone else's text directly into your work, and do not write in your document while reading another source.

2. Use GPT UiO as a Writing Coach, Not a Secretary


From what we have seen, asking GPT UiO to improve a text does not, as of today, entail a great risk of plagiarism or paraphrasing. Getting it to write the entire text for you, on the other hand, increases the chance of plagiarism or thoughtless paraphrasing. To reduce the chance of cheating, try not to succumb to the temptation to let it write for you "from scratch".

3. Take Ownership of the Entire Text


The most important piece of advice is that you must take ownership of the entire text. Even if you use GPT UiO to help you write, you must still stand by the content and continuously ask yourself "is this an uncontroversial claim?", "do I have the academic backing to assert this?", and "does this feel like my own words?" If the answer to any of these questions is "no", you should take a moment to check whether there are sources that can support what you are writing, and cite them clearly. Then perhaps the job as Minister of Research and Higher Education is not out of reach, after all.

The Texts


If you want to conduct your own analysis on the texts, you can find them all attached here.

Folder with the Norwegian texts.

Folder with the English texts.

Footnotes and Sources


[1] One might wonder why I did not use UiO's own tool, Ouriginal, here. The answer is that Ouriginal is reserved for examiners and administrative staff, who use it to check for actual plagiarism upon submission. The reason is partly that files uploaded there are added to a source database against which future works are tested, so the rules for what gets uploaded are strict.

[2] Copyleaks is a tool external to the organization UiO, but since I only sent in green data (see UiO's page for data classification), I consider its use justified in this case.

[3] I am not sure why Copyleaks gives almost no plagiarism hits in Norwegian, but there are several possible explanations. Maybe GPT UiO plagiarizes less in Norwegian than in English because there is less training data. Maybe Copyleaks has less access to Norwegian sites to check against, or is worse at detecting plagiarism in Norwegian. I encourage curious readers to explore the matter further.

[4] Eldrid Borgan, How to Avoid Getting Sick from Rakfisk (translated), forskning.no. Retrieved 24.01.24. https://www.forskning.no/mat-og-helse-mikrobiologi-naturvitenskap/slik-unngar-du-a-bli-syk-av-rakfisken/1610995

[5] The Norwegian Food Safety Authority, Botulism. Retrieved 24.01.24. https://www.mattilsynet.no/mat-og-drikke/matproduksjon/lokalmat/lokalmat-innlandsfisk/botulisme-hos-innlandsfisk

By Vidar Skogvoll - senior lecturer at KURT
Published Feb. 7, 2024 2:50 PM - Last modified Feb. 7, 2024 2:53 PM