ChatGPT is an online tool that uses artificial intelligence (AI) to generate text that reads much as a human might write it.
It’s pretty powerful, and the latest version has sparked lots of discussion about the various ways people might use it to make their jobs easier.
Perhaps the most obvious use of ChatGPT is creating new content. But it can do other things too, like writing code, composing music and even penning limericks.
As I’m an editor and proofreader, I wanted to know how ChatGPT got on with that sort of thing.
Part of me was curious about whether a computer program could do what I do.
But more realistically, I was interested to know how an editor like me might use ChatGPT to become more efficient.
Can ChatGPT replace editors and proofreaders?
Can it help editors and proofreaders in their work?
What I mean by proofreading
A quick note before we start. The terms ‘proofreading’ and ‘editing’ mean different things to different people (read more about the difference between editing and proofreading).
In this article, I’m interested in whether ChatGPT can:
- find and remove errors
- make the text stylistically consistent
- suggest ways to make the content clearer and easier to read.
ChatGPT is more accurate than Grammarly and ProWritingAid
I’ve previously tested Grammarly and ProWritingAid, two tools that use AI to help writers and editors improve their content.
I was unimpressed by both programs. They missed several errors of grammar and punctuation, totally failed to detect stylistic inconsistencies and fared poorly with clumsy and confusing sentences. See my analysis of Grammarly and ProWritingAid.
(Note: I tested Grammarly and ProWritingAid in September 2021 and haven’t tested them since, so it’s possible they’ve improved in the meantime.)
By contrast, ChatGPT found numerous errors that Grammarly and ProWritingAid missed, including:
- a semicolon incorrectly used instead of a colon
- inconsistent capitalisation within the article
- inconsistent abbreviation
- non-standard capitalisation of headings
- the non-English phrase ‘must-to-learn’
- incorrect tense (‘I decided that I will…’ instead of ‘I decided that I would…’).
ChatGPT didn’t find all the errors
But it still wasn’t as accurate as a human editor. Out of a total of 75 errors that I found, ChatGPT found 54. That’s an accuracy rate of 72%.
It had mixed success with stylistic consistency. It correctly standardised the capitalisation of headings. But it missed the fact that the author had written ‘search engine optimisation’ in one place and ‘Search Engine Optimisation’ in another.
It didn’t check any brand names, so errors with how the author had styled the company names Semrush and Ahrefs remained uncorrected.
It corrected some punctuation errors but retained an incorrect question mark, a comma splice and some incorrect hyphens.
And while it corrected ‘must-to-learn’ to ‘must-learn’, it didn’t correct ‘hundreds of traffic’ – a phrase that you wouldn’t find in correct English.
ChatGPT also left several cumbersome and redundant phrases untouched.
ChatGPT introduced some errors
Like Grammarly and ProWritingAid, ChatGPT did introduce some errors.
The original text consistently referred to the ‘subtitle’ of an article. ChatGPT introduced an inconsistency by changing one reference to the hyphenated ‘sub-title’.
Elsewhere, it correctly noticed that a phrase needed a colon or a dash inserted to make sense. But where I’d suggested a colon, ChatGPT inserted a spaced hyphen. This is incorrect – it should have suggested either an en-rule (–) or em-rule (—), depending on house style.
And on the subject of style, it assumed without checking that I wanted the text returned with US English spelling and punctuation.
Big problem: ChatGPT doesn’t show its working
Normally an editor or proofreader will show their working. If they correct an error, they’ll use track changes (or proofreading marks) so the author can see exactly what changes they’re recommending.
An editor will also normally compile a project style sheet to record all stylistic decisions made.
This means the author or client can check that they’re happy with the changes the editor is recommending. On rare occasions, an editor may suggest a change that the client isn’t happy with – for example, if something has been deliberately done in a non-standard way. That’s why it’s good practice to check through the work an editor has done.
The biggest problem with ChatGPT is that it doesn’t show its working. It simply returns the edited text with no marks to show what has been changed.
I asked ChatGPT to summarise the changes it had made. It showed very little understanding of what it had done and could not accurately list or quantify the changes it had made. It told me it had made ‘about 12’ changes, whereas in fact it had made many more.
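One workaround is to generate your own record of the changes. Here’s a minimal sketch in Python using the standard library’s difflib module to compare the text you pasted in with the text ChatGPT returned (the sample sentences are invented for illustration):

```python
import difflib

# Hypothetical before/after texts: what you sent and what ChatGPT returned.
original = "The subtitle was clear. I decided that I will rewrite the intro."
edited = "The sub-title was clear. I decided that I would rewrite the intro."

# Compare the two versions word by word and print a unified diff,
# so every insertion and deletion is visible for review.
diff = difflib.unified_diff(
    original.split(),
    edited.split(),
    fromfile="original",
    tofile="edited",
    lineterm="",
)
print("\n".join(diff))
```

It’s a crude substitute for track changes, but it at least gives you a record of what was altered, so you can accept or reject each change yourself.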
Queries and subjective changes
Editors and proofreaders need to distinguish between concrete errors and subjective suggestions.
They also need to know when they can be confident making a change, and when they should check with the author or client before doing so.
When I’m working with clients, there are three main ways I might intervene in the text:
- ‘Silent’ changes, with no track changes or comments. I’ll do this only for completely uncontroversial changes, like removing double spaces.
- Track changes for errors of grammar, style or spelling where there’s an obvious solution to an issue of clarity or consistency.
- Comments in the margin for issues where I need more information from the client to be able to suggest a solution, or where I have a solution in mind but it’s of a subjective nature (‘This might read more clearly if…’).
ChatGPT doesn’t make these distinctions. It simply makes all the edits it sees fit.
The risks of ChatGPT’s over-confidence
I did a separate test with ChatGPT, using a piece of content where I’d deliberately introduced errors.
It was an article about the author Roald Dahl. I’d deliberately doctored a figure so that the article wrongly claimed that Dahl ‘attended boarding schools from the age of 90’.
If I encountered something like this on an editing job, I’d definitely flag it. It looks like an error, because it’s unusual for someone to go to boarding school at the age of 90. And it seems likely that the correct figure is ‘9’, because it’s a common typing error to get two digits the wrong way round or accidentally type an extra digit.
But I couldn’t confidently make this change without some external validation – either checking with the author or, if it’s part of the brief, doing my own fact-checking.
ChatGPT confidently changed ‘90’ to ‘9’ without querying it.
I asked ChatGPT to explain why it made this change, and it gave very sound reasons. But when I asked if it knew the difference between concrete errors and issues that would need to be queried, it became very confused and started to give inconsistent answers.
ChatGPT’s over-confidence is an issue that’s been raised by other users. For example, some of the content it generates is factually inaccurate, yet it presents that content just as confidently as anything else.
The thing to remember here is that ChatGPT is not a person and doesn’t have memories of what it has done.
If you asked me to explain why I’d changed ‘90’ to ‘9’ in an editing job, I’d have a memory of the incident and be able to explain my reason. If you told me that you’d prefer me to record that type of suggestion in a different way, I’d be able to take that on board and apply it to future work I did for you.
ChatGPT just doesn’t work like that. This is a risk because it might be introducing factual errors into your content, and they would be hard to spot because there’s no record of the changes it has made.
The interface doesn’t lend itself to editing or proofreading
Another detail that makes ChatGPT difficult to use for proofreading or editing is that it hasn’t been designed for that purpose.
As a plain-text chat tool, it can’t look at issues with formatting or spacing.
When it returns the edited text, it’s just plain text with no annotations.
And every time I’ve used it to test its editing or proofreading abilities, it’s clipped the end off the text, so I’ve had to paste in the last part of the content and ask it to have another go.
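If you’re moving text in and out of ChatGPT regularly, one way to work around the clipping is to split the document into smaller pieces before you send it. A minimal sketch, again in Python (the size limit is my own assumption, as ChatGPT doesn’t publish one):

```python
MAX_CHARS = 2000  # assumed safe size per message; not a documented limit

def chunk_paragraphs(text: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Group paragraphs into chunks of at most max_chars characters."""
    chunks: list[str] = []
    current = ""
    for para in text.split("\n\n"):
        # Start a new chunk if adding this paragraph would overflow it.
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be submitted separately, which also makes it easier to spot where an answer has been cut short.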
Conclusion: ChatGPT has potential as a proofreading tool, but it’s not ready yet
When I tested Grammarly and ProWritingAid in late 2021, I was quite surprised by how poorly they performed. I knew there would be limitations to using computer software to proofread and edit content, but they detected fewer errors than I expected.
I expected that AI proofreading software would improve over time. My tests with ChatGPT show that the technology has definitely progressed – it performed significantly better than Grammarly and ProWritingAid.
The main problems with ChatGPT for editing and proofreading are:
- it still misses errors – it’s not as thorough as an editor or proofreader
- it doesn’t record its changes, so there’s no way to check its work
- the interface isn’t designed for this type of work
- it introduces unnecessary changes and occasional errors.
My view of AI tools is that they can help us work more efficiently, but they’re nowhere near ready to replace humans altogether.
As an editor, I use lots of tools to help me work more efficiently. Word’s spell check, PerfectIt (software that checks for stylistic consistency), the Hemingway readability tool – these all make it easier for me to spot potential issues with the text.
But they don’t replace me – to be effective, they need a competent user. I still decide what changes to make based on my knowledge, experience, training and skills.
When I tested ProWritingAid and Grammarly, I concluded that they weren’t really good enough to be helpful for an editor or proofreader. My recent test of ChatGPT shows that the technology is improving. But ChatGPT itself doesn’t work as a tool for editing or proofreading because it doesn’t record its recommendations and enable the writer or editor to choose which suggestions to adopt.
In its current form, ChatGPT cannot reliably proofread content. But with a few tweaks, it could be a powerful tool to help, rather than replace, human proofreaders and editors.