Does ChatGPT dream of electric lawyers?

Sam Falco
3 min read · Dec 28, 2023

Ken White, AKA “Popehat,” posted a screenshot on Bluesky last night of a ChatGPT conversation.

His prompt: What does Ken White think about RICO?

ChatGPT’s response:

“As of my last knowledge update in January 2022, Ken White… has not expressed a singular or uniform opinion…” followed by some general summary of varying opinions about RICO, and finishing with “To get the most accurate and up-to-date information on Ken White’s views on RICO…I recommend checking his recent writings…”

White’s comment was, “Still some bugs to work out.”

A screenshot of Ken White’s Bluesky post showing the conversation summarized above, with the comment, “Still some bugs to work out.”

White is a lawyer with a very public and long record of commentary on RICO. It’s not surprising that he’d view the response as buggy. I had a similar reaction when I first experimented with ChatGPT. I asked it what it knew about me as a writer. It provided a list of my published works. None were real. I thought, how useless is this thing if it just makes things up?

That understandable reaction highlights a few common, interconnected misconceptions about ChatGPT:

  • That it is a search engine,
  • With access to all information (or at least everything on the Internet),
  • And that its so-called hallucinations (AKA “making stuff up”) therefore prove that it doesn’t work.

All three beliefs are wrong.

ChatGPT is not a search engine. It is not a tool for retrieving information, although sometimes it will work for that purpose. You may have seen an explanation that it generates its text via probabilities of words occurring near each other. There is much more to it, but that explanation is good enough for our purposes right now. When you prompt ChatGPT, it doesn’t retrieve information. It determines the most likely first word or phrase in a response, then the next word or phrase, and so on.
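That word-by-word process can be illustrated with a toy bigram model — a drastic simplification of what ChatGPT actually does, with an invented miniature corpus, but the core idea is the same: pick each next word based on how often it followed the previous one in the training data.

```python
import random
from collections import defaultdict

# Toy "training data" (invented for illustration). Note it contains both the
# true claim ("no more than one month") and the common false one ("two to
# four weeks") -- the model has no way to tell which is which.
corpus = ("a sprint is no more than one month . "
          "a sprint is two to four weeks .").split()

# Count how often each word follows each other word (a bigram model).
counts = defaultdict(lambda: defaultdict(int))
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_word(prev):
    """Pick the next word in proportion to how often it followed `prev`."""
    options = counts[prev]
    words = list(options)
    weights = [options[w] for w in words]
    return random.choices(words, weights=weights)[0]

# Generate a "response" one word at a time, starting from a prompt word.
random.seed(0)
word, output = "sprint", ["sprint"]
for _ in range(6):
    word = next_word(word)
    output.append(word)
print(" ".join(output))
```

Run it a few times with different seeds and it will sometimes say a Sprint is “no more than one month” and sometimes “two to four weeks” — whichever continuation its counts favor. Nothing in the mechanism checks truth; it only reflects frequency.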

The quality of that output depends on the quality of the training data, which shapes every response. You are never going to get 100% reliable information from it, because the training data isn’t 100% true.

OpenAI has been coy about the exact content of the training data, so there’s no way to know for sure what the tool has digested. But we do know that fiction is included. That’s thanks to some Google researchers who tricked it into divulging some of its training data. They found blocks of published novels in the output. So fiction can creep into responses.

Beyond that, some of its data consists of false information. When I ask it questions about Scrum, its responses draw on the Scrum Guide. They also draw on articles written by people who didn’t know what they were talking about. The Guide says that a Sprint is no more than one month. A lot of people have written that it is “two to four weeks.” ChatGPT doesn’t know that the latter statement is false, only that it has seen it made frequently.

Garbage In, Garbage Out

Furthermore, ChatGPT hasn’t been trained on all data, everywhere. It hasn’t even been trained on everything that’s available on the Internet. To return to Ken White’s example — was ChatGPT trained on all of his writing? On some? On none? We can only guess. Given the response, it’s probably a very small amount of “some.”

ChatGPT responses remind me of an extemporaneous speaking competition I participated in during high school. You’d be given a random object (one was an internal part of a vacuum cleaner) and a few minutes to gather your thoughts. Then you had to give a five-minute speech about some doodad you probably knew nothing about. The contest was judged on style, presence, and audience engagement. If the criteria had been, “Did the presenter give me accurate/useful information about the object,” no one could have won.

You can elicit valuable output from ChatGPT. It requires going beyond information retrieval and learning something about the art of prompt engineering. But that’s a topic for another time.

ChatGPT has limitations and drawbacks, and there are serious ethical considerations about what its training data contains. I’m not qualified to address that topic. But based on what I’ve learned so far, I know that it’s pointless to criticize ChatGPT because it fails at a task it isn’t designed to do.
