Alan Knowles
Wright State University
This assignment asks students to train a large language model (LLM) to generate Twitter posts in the style of specific accounts via a process known as few-shot learning, which trains the LLM on a small number of sample posts. Students use the trained LLM to generate tweets, then rhetorically analyze the results. The assignment was originally developed for an entry-level Professional and Technical Writing (PTW) course, but it can be easily adapted to other disciplines and course levels.
Learning Goals:
Original Assignment Context: Intermediate-level Digital Writing & Rhetoric course
Materials Needed: Accessible text generators (Hugging Face’s GPT-2 Large interface is used in this assignment)
Time Frame: ~2 weeks
Introduction
When I introduce students to large language models (LLMs), I emphasize three of the technology’s most salient features: its tokenization of language, its autoregressive generation of text, and its capacity to be trained by users to perform specific tasks via few-shot learning. This assignment helps students develop an understanding of these features while building a broader literacy with the technology (see the “Goals” section below). As is the case with much of the discourse surrounding artificial intelligence (AI), these technical terms can be intimidating to teachers who are not steeped in LLM research. I have spoken to many teachers who have avoided incorporating LLM activities into their teaching because of this jargon barrier. The good news is that the concepts are not as complicated as they seem at first glance. Here is a brief overview.
Tokenization
Think of tokenization as the way LLMs see text: not as words or sentences, but as chunks of symbols that occur in sequence. A single token for current LLMs (e.g., GPT-3 and GPT-4) often roughly equates to one complete word, while older models (e.g., GPT-2) tend to represent words as multiple tokens, which often results in generated text that ends with partial words. Figure 1 demonstrates how GPT-3 would tokenize a sample sentence, with each token marked by a different color.
Figure 1. Tokenized Sentence
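For teachers who want to see tokenization outside the web interface, the short sketch below uses Hugging Face’s transformers library in Python with the GPT-2 tokenizer. The sentence and the sample output in the comments are my own illustrations, not the text shown in Figure 1.

```python
# A rough sketch of tokenization using the Hugging Face transformers library
# and the GPT-2 tokenizer (GPT-3 uses a similar scheme).
# Requires: pip install transformers

from transformers import GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

sentence = "Large language models see text as tokens, not words."

tokens = tokenizer.tokenize(sentence)     # the chunks the model "sees"
token_ids = tokenizer.encode(sentence)    # the integer IDs it computes with

print(tokens)      # e.g., ['Large', 'Ġlanguage', 'Ġmodels', ...] -- 'Ġ' marks a leading space
print(token_ids)
print(f"{len(sentence.split())} words -> {len(tokens)} tokens")
```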
Autoregressive Text Generation
Most current LLMs are autoregressive, meaning they use a statistical model to predict the probability of tokens (output) occurring after a given sequence of tokens (input). In other words, after an LLM sees some words, it predicts which words will come next based on patterns learned in its initial training. During this initial training, the model is given massive amounts of internet text and directed to learn to predict the next tokens in a sequence. This results in base models that are good at a large number of text-based tasks. However, to get the most out of LLMs, users can further train them to perform very specific tasks better.
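To make “predicting the next token” a bit more concrete, here is a minimal sketch (assuming the transformers and torch Python libraries are installed) that asks the base GPT-2 model for its most probable next tokens after a short prompt. The prompt is my own, and the exact probabilities will vary.

```python
# A minimal sketch of autoregressive prediction with GPT-2. Given a prompt,
# the model scores every possible next token; the top candidates are printed.

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "The weather today is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits        # scores for every vocabulary token

# Turn the scores for the next position into probabilities and show the top 5
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, 5)

for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(token_id.item())!r}: {prob.item():.3f}")
```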
Few-shot Learning
LLMs can be trained via a process known as few-shot learning, in which users provide a few sample inputs and corresponding outputs for a task they would like the LLM to complete. In other words, the user teaches the LLM what to do by demonstrating the task a few times. Following this training, the user need only provide an input, and the LLM will provide an output based on patterns it deduces from the user’s training samples. This, in effect, alters the probability of occurrence the LLM ascribes to various tokens. As you will see from the assignment below, this can result in a trained LLM that imitates not only structure but also individual voices (i.e., tone and style) from relatively few samples.
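In practical terms, few-shot learning is just a carefully built prompt. The toy sketch below shows the shape of such a prompt; the task and demonstrations are invented for illustration, and generate() is a hypothetical stand-in for whatever LLM interface you use, not a real library call.

```python
# A bare-bones illustration of few-shot learning as prompting. The
# demonstrations are invented, and generate() is hypothetical.

examples = [
    ("input: antonym of 'hot'", "output: cold"),
    ("input: antonym of 'big'", "output: small"),
]
new_input = "input: antonym of 'fast'"

# The "training" is nothing more than the prompt itself: a few demonstrations
# of the task, followed by a new input for the model to complete.
prompt = "\n".join(f"{inp}\n{out}" for inp, out in examples)
prompt += f"\n{new_input}\noutput:"

print(prompt)
# completion = generate(prompt)  # hypothetical call; the model continues the pattern
```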
Like the AI jargon, this training process likely sounds more technically challenging than it actually is, which is one reason I believe students should encounter LLMs in the classroom. The following assignment is easily taught by inexperienced teachers (who, of course, try it once before teaching it) and completed by inexperienced students.
Overview
For this assignment, students train an LLM via few-shot learning to generate tweets in a specific style or voice by giving it a dataset of formatted sample tweets. The activity asks students to use a free online GPT-2 API (more on this in the “Materials Needed” section below). After training the LLM, students rhetorically analyze the tweets they write with its assistance. I have used this assignment to introduce students in my college writing courses to LLMs since the spring semester of 2021. It was originally designed for an introductory-level Professional and Technical Writing (PTW) course, but I have since adapted it for use in first-year composition courses and an upper-level PTW course. The version shared here is the most general and can be taught as a group activity in a single class period.
To prepare for this activity, a teacher must first create the two training datasets for students to use during the activity.
Creating Datasets
The first time I taught this, choosing tweets for the training data was easy. It was February 2021, just a month after the January 6 Capitol riots, and I wondered if LLMs might provide a novel way to analyze the Twitter activity of the political parties during that time. I decided to create 2 datasets of January 2021 tweets: a Donald Trump dataset and a Nancy Pelosi dataset. These datasets worked well for a few reasons:
I suggest teachers create datasets that adapt this activity to the content of their courses. I have taught versions of this that use datasets of tweets from individuals, organizations, and even competing hashtag campaigns. They all work well, so teachers should do what makes sense for their course. For example, a course focused on social justice rhetorics might try the activity with a #BLM dataset and an #AllLivesMatter dataset. Your choice of datasets here can also be a teachable moment, as deciding what tweets to include in the datasets can have important ethical implications (see question 6 in the “Discussion and Analysis” portion of the assignment for an example of this).
Formatting Datasets
After you choose the tweets to include in your two datasets, you’ll need to format them for the activity. Once formatted, students can simply copy/paste them into the chosen LLM interface and begin generating text.
Figure 2. Screenshot of a Formatted Tweet Dataset
Note: This figure shows a small portion of the January 2021 Donald Trump dataset. The full dataset contains roughly 35 posts. The more samples, the better. You can use any word processor or text editor to format the dataset.
To format the tweets as training datasets, paste them into a word processor (each dataset should be in its own document), add a short topic description to each tweet, then separate the samples with triple pound/hash signs (see Figure 2). Remember, few-shot learning involves providing sample inputs and outputs to train the LLM. In this case, the outputs are the copied tweets and the inputs are the short descriptions of those tweets’ topics. If all goes well, students will be able to provide a topic and get the LLM to generate text for tweets about that topic in the rhetorical style of the author(s) of each training dataset.
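If you would rather script this step than format each document by hand, a rough sketch along the following lines works. The topic labels, placeholder tweet text, and file name are my own assumptions; match the exact layout to what you see in Figure 2.

```python
# A small sketch of assembling a formatted dataset file, assuming you have
# already collected the tweets and written a topic label for each.
# The topics and tweets below are placeholders, not real posts.

samples = [
    ("topic: the economy", "PLACEHOLDER TWEET TEXT ONE"),
    ("topic: the election", "PLACEHOLDER TWEET TEXT TWO"),
]

# Pair each input (topic) with its output (tweet) and separate the samples
# with triple hash marks, as described above.
dataset = "\n###\n".join(f"{topic}\n{tweet}" for topic, tweet in samples)

with open("dataset_1.txt", "w", encoding="utf-8") as f:
    f.write(dataset + "\n###\n")

print(dataset)
```

Either way, the result is the same plain-text dataset that students will paste into the LLM interface.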
Goals
This activity is designed to introduce students to LLMs in a way that promotes functional, critical and rhetorical LLM literacies.
For my upper-level courses that cover more advanced rhetorical theory, I assign readings on multiliteracies before doing the activity. However, the activity works just as well in lower-level courses where I often do not assign these readings.
Outcomes
This activity has been largely successful each time I have taught it. Students learn to use the LLM interface quickly and are usually surprised at how well the LLM imitates the rhetorical style of the tweets in the training datasets. Some recurring outcomes:
Materials Needed
There are now many different LLM interfaces that students can use for this activity. For this introductory assignment, I use Hugging Face’s GPT-2 Large interface. It is a free, less-capable web interface that generates only a few tokens at a time. I find it instructive because it generates less text at once, often beginning and/or ending with partial words, which I am convinced makes students more likely to consider the important difference between words and tokens. Like many LLM interfaces, Hugging Face’s looks similar to a standard word processor. However, pressing the Tab key at any time will prompt it to generate three text recommendations wherever the cursor is positioned. The length of these recommendations varies, but on average, expect 2-5 words per suggestion.
This activity can also be completed with more advanced LLMs, such as OpenAI’s GPT-3 API (known as the “Playground”). In this case, the LLM will generate entire tweets given only a topic. If you choose to do this, adjust the discussion questions accordingly.
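For teachers who prefer to script the GPT-3 route instead of typing into the Playground, a rough sketch using the openai Python library’s legacy completion interface might look like the following. The model name, file name, and topic line are assumptions on my part, so check OpenAI’s current documentation before relying on them.

```python
# A rough sketch of sending the formatted few-shot prompt to OpenAI's
# completion API via the legacy openai library interface. The Playground
# itself needs no code; this is only for teachers who prefer scripting.

import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

# The prompt is the formatted dataset (topics, tweets, ### separators)
# followed by a new topic line for the model to complete.
with open("dataset_1.txt", encoding="utf-8") as f:
    few_shot_prompt = f.read() + "topic: the economy\n"

response = openai.Completion.create(
    model="text-davinci-003",   # a GPT-3-era completion model; may be retired
    prompt=few_shot_prompt,
    max_tokens=60,
    temperature=0.9,
)

print(response.choices[0].text.strip())
```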
Acknowledgments
I developed this activity for a course at Miami University (Oxford, OH) in 2021. I am grateful to the faculty mentors and members of my dissertation committee who gave me the freedom to mold what was, at the time, a new version of the Digital Writing & Rhetoric course to suit my own research interests. A sincere thank you to Jim Porter, Heidi McKee and Tim Lockridge.
Below is a copy of the document I give to first-year composition students for this activity. I recommend sharing this document as a Google Doc, since the cloud features help to facilitate the “Share Generated Tweets” part of the activity.
Overview
A large language model (LLM) is a type of artificial intelligence (AI) that generates natural language, or text that reads like it is human-written. Most of today’s LLMs are called autoregressive models, which means they generate natural language by predicting what text will come next given what came before. A primary feature of these language models is their ability to be further trained by users to generate specific types or styles of text. For this activity, you will practice training the GPT-2 LLM to generate tweets in the style of 2 different Twitter accounts using training datasets provided by your professor.
Use the LLM to Generate Tweets
To prepare for the activity, you must first follow these steps:
Before you start generating text, I suggest adjusting a few settings on the website. Changing the Temperature will affect how predictable the generated text is. In other words, a lower temperature will cause the LLM to generate higher probability text (more words like the, and, etc.). Raising the temperature will cause it to generate lower probability text, essentially making it more creative. Raising the Max Time setting will enable the LLM to take more time to offer suggestions, often resulting in text suggestions that contain more tokens/words. Here are my suggestions for where to start with these settings, but you are encouraged to experiment with them as you go:
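(You do not need any code for this activity, but if you are curious what the Temperature setting does behind the scenes, the small illustration below, with made-up numbers, shows how it rescales the probabilities the model assigns to candidate tokens.)

```python
# An illustration (with made-up scores) of how temperature rescales token
# probabilities: low temperature concentrates probability on the likeliest
# token (predictable text); high temperature spreads it out (more surprising text).

import math

def softmax_with_temperature(scores, temperature):
    scaled = [s / temperature for s in scores]
    exps = [math.exp(s) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

scores = [4.0, 3.0, 1.0]  # hypothetical raw scores for three candidate tokens

for t in (0.5, 1.0, 1.5):
    probs = softmax_with_temperature(scores, t)
    print(t, [round(p, 2) for p in probs])
```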
To begin generating text suggestions, press the Tab button on your computer. You can continue pressing Tab to get new suggestions until you get something you like. A few tips:
Share Generated Tweets
In groups, generate 3-5 tweets using both of the provided training datasets. When you are finished, copy/paste your two best topic/tweet pairs from each dataset into the tables below. Make sure you paste the tweets into the correct table so we know which dataset was used to train the AI.
Table 1. Tweets Generated with Dataset 1

Table 2. Tweets Generated with Dataset 2
Discussion & Analysis