Methodology

This page covers what I mean by AI-Generated and AI-Proofread, the issues with both, and the tools and techniques I use to form my opinion. References are included at the bottom of the page.



It can be difficult to tell AI-Generated and AI-Proofread text apart. Because of this, I err on the side of caution and will rarely make a definitive statement that something is fully AI-Generated.

While AI can be used to quickly publish content, it is also prone to error, leading to problems with inaccuracy. Misinformation caused by AI can have real-world impacts, at both small and large scales.

Using AI/LLMs in this fashion can also be a sign that the author is not knowledgeable about the subject. This is particularly problematic, as they may not be able to tell that the generated article contains errors or misinformation. I have already run into this with one subject on this site, where the response to being advised of inaccuracies was… less than acceptable.

It’s also just lazy as hell. Spend some time on actual research. Read through sources critically! AI tends not to do this; it just takes the sources at face value. That’s bad journalism.

People still seem to have a fundamental misunderstanding of how these LLMs work. The model isn’t reading and actually understanding what you’ve provided it. It’s simply following the prompt to generate the most statistically likely text. You think that’s enough to actually write something intelligently? It’s as if someone did brain surgery on a billion monkeys, injected them with meth, and you’re seeing the result of that.

An additional consideration: what happens if the AI generates libelous text? I have yet to find a court case where this is the defense, but my hunch is that “Well, the AI wrote it so I can’t be held liable!” is not going to be an acceptable defense.

Do I have issues with AI-Proofreading as well? Yes, for a few reasons.

It can be difficult to tell the difference between generated and proofread content. That should make the users of AI proofreading nervous, as it opens them up to accusations that their writing is AI generated, and they don’t have much of a provable defense. It’s one of the reasons why I don’t use AI proofreading in my writing.

Similar to how AI generated writing can have issues with context being lost, errors being created, etc., AI proofreading can enable the same errors. It’s too much of a shortcut to be trusted in the hands of most people, when it comes to something as important as news and factual information.



There are several tools and methods one can use to detect whether writing may have been generated by an LLM. The efficacy of these tools varies, and can be impacted by the model used (if AI-written), the person’s writing style (if human-written), among other factors.

Tools for detecting LLM writing that I may use include:

  • DetectGPT
  • Copyleaks
  • GPTKit
  • GPTZero
  • Undetectable.ai
  • My brain

When determining how much weight to put behind a specific method, I first do my own testing. I run various samples of my own writing through the method, including both formal and informal writing. I then run writing generated by several different AI models through it. Finally, I run various samples of the subject’s writing through the method. If I have access to the author’s historical writing, I analyze that text as well, to rule out the possibility that the author simply writes in a style the method flags as AI. Because there are different approaches to detecting AI writing, I make sure the tools I use cover a variety of them.
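As a rough sketch of that weighting step, here is how one might compare a detector’s false-positive rate on known-human samples against its detection rate on known-AI samples. The `detect` function below is a stand-in heuristic of my own invention (flagging suspiciously uniform sentence lengths), not any of the actual tools listed above:

```python
import re
import statistics

def detect(text: str) -> bool:
    """Stand-in detector: flag text whose sentence lengths barely vary.

    A real test would call one of the actual detection tools instead.
    """
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s]
    if len(sentences) < 2:
        return False
    lengths = [len(s.split()) for s in sentences]
    return statistics.pstdev(lengths) < 2.0

def score_detector(human_samples, ai_samples):
    """Return (false_positive_rate, detection_rate) for the detector."""
    fp = sum(detect(t) for t in human_samples) / len(human_samples)
    tp = sum(detect(t) for t in ai_samples) / len(ai_samples)
    return fp, tp
```

Running known-human and known-AI corpora through each tool this way is what lets me say whether a flagged author might simply write in a style the tool mistakes for AI.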

If there is too much variation between different tools, I will make that clear when writing about the subject.

DetectGPT was first detailed in this scientific paper, where the authors claim it is notably better at detecting fake news than other tools that use the same detection method. It does not detect AI by comparing text against known samples; instead, it uses probability curvature, measuring how the likelihood a given LLM assigns to the text changes when the text is slightly perturbed.
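To make the probability-curvature idea concrete, here is a toy sketch of DetectGPT’s core scoring step. Everything model-related is faked: `VOCAB` is a made-up unigram “language model” and `perturb` is a crude stand-in for the paper’s mask-and-refill perturbations, included only to show the shape of the computation:

```python
import math
import random

# Made-up unigram "language model"; a real implementation queries an LLM.
VOCAB = {"the": 0.2, "on": 0.15, "a": 0.15, "cat": 0.1, "sat": 0.1,
         "mat": 0.1, "dog": 0.1, "ran": 0.1}

def log_prob(words):
    # Sum of per-word log probabilities; unknown words get a tiny floor.
    return sum(math.log(VOCAB.get(w, 0.001)) for w in words)

def perturb(words, rng):
    # Crude stand-in for DetectGPT's mask-and-refill step:
    # swap one random word for a random vocabulary word.
    out = list(words)
    out[rng.randrange(len(out))] = rng.choice(sorted(VOCAB))
    return out

def curvature_score(words, n=200, seed=0):
    # DetectGPT's key observation: model-generated text sits near a local
    # maximum of the model's log probability, so small perturbations lower
    # it on average. Human text shows no such consistent drop.
    rng = random.Random(seed)
    perturbed = [log_prob(perturb(words, rng)) for _ in range(n)]
    return log_prob(words) - sum(perturbed) / n
```

A higher score means the scoring model “prefers” the exact text over its perturbed neighbours, which DetectGPT treats as evidence of machine generation.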

Copyleaks is a commercial service that can detect AI-generated text from a variety of popular models. Research (see references) comparing various detection methods usually finds Copyleaks to be among the most accurate at detecting AI text, with a low false-positive rate.

GPTKit uses several techniques for detecting AI-generated content, and claims an accuracy rate of 93%.

GPTZero is another detector that research studies have found to have a reasonable detection rate. However, it can have a higher false-positive rate than other methods.

Undetectable.ai is another commercial service that research usually finds reliable. I treat it similarly to Copyleaks. I am less enthused by its option to ‘humanize’ text, so I do not give the service any money.

I may use other services not mentioned on this page.

If you’ve read enough writing, especially AI writing, you can sometimes just tell something is off. It can be as simple as AI being wordy as hell, using a lot of words to make a simple point. Different models and AI tools may have other tells that humans can pick up on.
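As one concrete example of a human-detectable tell, filler phrases are easy to count mechanically. The phrase list below is my own ad-hoc selection for illustration, not taken from any published detector:

```python
# Ad-hoc list of filler phrases often associated with wordy AI prose.
FILLERS = ["it is important to note", "in today's fast-paced world",
           "delve into", "a testament to", "in conclusion"]

def filler_rate(text: str) -> float:
    """Filler phrases per 100 words; higher suggests padded writing."""
    lowered = text.lower()
    hits = sum(lowered.count(phrase) for phrase in FILLERS)
    words = len(text.split())
    return 100.0 * hits / max(words, 1)
```

A metric like this is far too crude to prove anything on its own; it just formalizes one of the gut-level signals a human reader picks up on.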

https://www.eweek.com/artificial-intelligence/ai-detector-software/#chart

https://www.zdnet.com/article/i-tested-10-ai-content-detectors-and-these-3-correctly-identified-ai-text-every-time

https://guides.library.ttu.edu/artificialintelligencetools/detection

https://www.technologyreview.com/2022/12/19/1065596/how-to-spot-ai-generated-text