Fact-Checking Auto-Generated AI Hype

Anna Mills
Cañada College

I presented first-year composition students with a series of claims that ChatGPT generated when asked to tell me surprising things about AI and back each up with a source. The output largely consisted of glowing praise of AI's capabilities, including claims that “AI can dream” and “AI can read minds.” I demonstrated how to fact-check ChatGPT’s representation of the sources and claims, uncovering inaccuracies. Then I invited students to do similar fact-checking. One goal was to foster curiosity and a healthy skepticism about what AI can do and about the accuracy of chatbots in particular. The other goal was to give students practice evaluating and summarizing sources.


Learning Goals

  • Verify if a source exists and confirm key details such as authors, publication name, and date
  • Evaluate source credibility
  • Identify the main idea of an article from a brief scan
  • Identify where a summary misrepresents an article’s content or tone
  • Develop a healthy skepticism about plausible-sounding AI-generated claims

Original Assignment Context

My first-year composition students had written summaries and critical assessments of articles and were moving into a unit on research skills. They had completed a library workshop on searching academic databases as well as textbook readings on types of sources and source credibility from my open textbook How Arguments Work: A Guide to Writing and Analyzing Texts in College. In addition, they had read several introductory articles on AI in general and large language models in particular, since the course was organized around an AI theme. This assignment asked them to apply the summary skills they had already practiced with longer texts to a research context where they were skimming multiple articles and assessing one-sentence summaries.

Materials Needed

The assignment requires the three chat session transcripts linked in the Preparation section below, Internet access, and Google Docs or Word for commenting.

Access to social annotation software such as Hypothesis or Perusall, integrated with a learning management system, will allow for easier grading of student work. I used Hypothesis integrated with Canvas to assign students to write their fact-checking comments in the margins of the online ChatGPT session transcript.

If students have not yet been introduced to core critical AI literacy concepts, consider assigning an introductory reading or video before sharing the ChatGPT transcripts.

Extensions of the Assignment

The instructor may wish to generate new chat sessions or modify the assignment so that students generate their own examples of AI hype by prompting a system such as ChatGPT, Claude, or Llama. Any system not connected to web browsing will currently produce frequent examples of fabricated or inaccurately described sources (unless, of course, the sources are uploaded with the chat session). Systems connected to web search, such as ChatGPT with browsing, PerplexityAI, Bing, Bard, or Poe.com, will reference real sources, but the summaries and reported details about those sources may well still be inaccurate.
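Instructors who would rather script the generation of fresh transcripts than copy them from the chat interface could use something like the minimal Python sketch below. It assumes the OpenAI Python SDK with an API key set in the OPENAI_API_KEY environment variable; the model name and output filename are illustrative placeholders, and the prompt is the one quoted in the Overview below.

# A minimal sketch for generating a new "surprising facts" transcript, assuming
# the OpenAI Python SDK (openai>=1.0) and an OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()  # reads the API key from the environment

prompt = (
    "Teach me at least five surprising things about artificial intelligence. "
    "For each, give a credible source with author, publication, and date. "
    "Describe why the source is credible."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; any chat model without web browsing will do
    messages=[{"role": "user", "content": prompt}],
)

# Save the prompt and response so the transcript can be shared for annotation.
with open("chat_transcript.txt", "w", encoding="utf-8") as f:
    f.write("Prompt:\n" + prompt + "\n\nResponse:\n")
    f.write(response.choices[0].message.content)

The saved transcript can then be pasted into a Google Doc or an annotation tool such as Hypothesis for students to fact-check.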

Time Frame

The assignment should take students about two hours in total. I assigned homework that took about an hour, and we spent one hour-long class session on the activity. 

If students have not already been introduced to AI and large language models, including key concepts like predictive text and fabrication/hallucination, I would allow an additional hour for a short reading and/or video with brief discussion. 

Overview

In Fall 2023 in a community college first-year composition course, I invited students to check ChatGPT-generated claims and source references related to AI. The goals of the assignment were to build skepticism of AI hype and awareness of inaccuracies in AI text and to practice finding, skimming, and accurately summarizing academic sources. 

To prepare students, I shared my own fact-checking of a sample ChatGPT session. I had prompted ChatGPT very simply, asking it to tell me five surprising facts about AI and back up each one with a credible source. Sample auto-generated claims included “AI Can Dream and Visualize Its Own Imaginations” and “AI Can Create Original Art and Music.” Students read my comments on the chat transcript, where I quoted from the source texts I found online to show that many of the AI-generated summaries overstated the sources’ descriptions of impressive AI capabilities.

Next, students worked in small groups to do their own fact-checking of a different chat session transcript. In that session, I had prompted, “Teach me at least five surprising things about artificial intelligence. For each, give a credible source with author, publication, and date. Describe why the source is credible.” I followed this up with a request for ten more of the same. Most students ended up working individually rather than collaboratively; next time I would guide the group process by suggesting steps and roles for group members and by encouraging students to read aloud from the actual source abstracts and discuss whether the AI summaries are accurate.

Overall, while the students’ annotations did show they were gaining familiarity with academic sources, they seemed more interested in discussing the claims themselves than in the fact-checking process. Reading the twenty claims about AI across the two chat sessions contributed to the class’s larger discussions of AI hype and open questions about AI capabilities.

Our discussion of the flaws in the output emphasized how unreliable plausible-looking outputs can be. Students seemed interested in the example of a fabricated author, “Yann Bengio,” an amalgam of the famous and heavily cited AI scientists Yoshua Bengio and Yann LeCun. Neither had written on the topic of the nonexistent journal article that ChatGPT cited.

We also discussed how the ChatGPT output was heavily slanted toward a positive view of AI; out of the twenty “surprising” claims about AI, only one was in any way negative. Students found it both funny and suspect that ChatGPT justified a Facebook company blog post as credible, saying, “While a company blog might not be a traditional academic source, it's a primary source in this case because it's directly from the team that conducted the research.” Next time I would place even more emphasis on the disconnect between a verifiable primary source used in academic research and a company’s marketing and self-reporting on a product.

Next time, to increase engagement, I will try inviting students to discuss the claims about AI and rate them on an AI hype meter before and after they fact-check them. This might spur discussion of both the prevalence of AI hype and the ways automation bias leads us to overestimate the accuracy of generated text.


Assignment

Assignment Purpose

This assignment gives you a chance to practice researching and summarizing sources and identifying errors and bias in AI-generated text.

Preparation

Here's an optional video introduction to this assignment! 

  1. First, read the sample chat session where a user asks ChatGPT for five surprising things about artificial intelligence. (Alternate Google Docs version of the chat session)
  2. Next, read the comments where an instructor fact-checked the assertions in ChatGPT's output. 
  3. Now, read this alternate sample chat session where the user asks ChatGPT for fifteen surprising things about artificial intelligence (but does not mention bias). This is the one you are going to help fact-check.

Main Activity

As a class, we will research whether the sources ChatGPT mentions exist and whether they really support the assertions in its output. Do the ChatGPT summaries of research findings accurately reflect what the researchers wrote?

You should make at least two comments that do not duplicate comments your classmates have already made. As you work, consider the following questions:

  • Can we find links to each source?
  • Can we find any inaccuracies in the date, author, title, or publication name for a source?
  • Read or skim one or more of the sources. Can we find a quote that supports an assertion in the ChatGPT output, or one that doesn't match it?
  • If the source is real and available, read a little of it. Does the ChatGPT summary (including ChatGPT's bold headline) accurately reflect what the source claims? Are there any differences in the claim itself or in its tone?

Assignment by Anna Mills, shared under a CC BY NC 4.0 license.