Teaching Social Identity and Cultural Bias Using AI Text Generation

Christopher D. Jimenez
Stetson University

This interactive survey assignment prompts upper-level humanities students to reflect on their social and cultural identities in relation to the textual inputs and outputs of large language models, such as ChatGPT. Successful implementation of the assignment can improve student understanding of the relationship between textual meaning and personal identity as well as the ways in which AI text-generation models may reproduce biases in response to prompt design and a given method of data curation.

Learning Goals:

Reflect on the ethical implications of AI text generation in terms of cultural bias and consider the potential impact of this technology on societal understandings of identity
Discover how to use text-generation technologies to write in response to prompts as well as how prompt engineering leads to variations in responses
Engage with literary texts that explore the performance of identity, including race and gender as well as the intersection of technology and literature

Original Assignment Context: 400-level senior research capstone course and 300-level course on literary theory and textual criticism

Materials Needed: An accessible AI text generation program (e.g., ChatGPT)

Time Frame: ~3-4 weeks

Introduction

An age-old question with a new twist: what is the relationship between text, identity, and culture—and, now, AI? This essay introduces college instructors to the use of OpenAI’s GPT-3 in a classroom activity where students reflect on the relationship between textual data and their own personal identities, such as race and gender. The ~30-minute activity directs students to provide seemingly unassuming information about their social identities which is then fed into the AI model to generate an output that can be evaluated for its potential cultural biases and other tendencies. Subsequent discussion provides students an opportunity to foster critical thinking skills as they analyze and interpret the AI model’s output, explore how AI-generated text can perpetuate cultural biases, and assess the ethics of AI text generation in society at scale.

Two case studies covering students’ real-world usage of the activity are provided: the first in a 400-level senior research capstone course and the second in a 300-level course on literary theory and textual criticism. These case studies provide examples of how AI text generation may be integrated into different core requirement courses in English literary studies. Specifically, the 300-level case study examines student responses to text generation and social identity in juxtaposition to the study of racial and gender performativity, directly following a unit on Nella Larsen’s novel Passing and Toni Morrison’s short story “Recitatif.” The 400-level case study confronts senior research students with questions about academic integrity and citational practices in relation to text-generation technologies and how to leverage these powerful technologies for research efficacy.

As Leah Henrickson and Albert Meroño-Peñuela have noted, “AI-driven authorship poses new challenges for hermeneutics with respect to how meaning originates, transmits, and develops. Are we to privilege the author or the reader, or adjust our understanding of a conventional author-reader relationship?” (120). Their query follows Roland Barthes’ infamous proclamation of “The Death of the Author” (1967), which argues that the meaning of a text is not determined by the author's intentions but is instead created through the interaction between the text and its readers. Rapid adoption of AI text generation has added a new dimension to this perspective, raising important questions about the extent to which unsupervised technologies may reshape our fundamental understanding of what it means to write or create text, including scholarly research. How, for instance, can automated textual production lead readers to better understand themselves, the roles of others, and the society around them? How might shifts in textual interpretation lead to new aesthetic or political practices? How might our conceptions of knowledge, history, or institutions—and the cultural biases which organize them—evolve in response to the advancement of text-generation technologies?

While these are not exactly new challenges, they have become more urgent than ever. At the time of writing in early 2023, the large language models (LLMs) that have been made publicly available by Google, Microsoft, and OpenAI have already been heralded as “transformative” on a global level, prompting a dire need for educators and students alike to learn responsible practices in using them. Given the vast scale and rapid proliferation of this technology, it is imperative that we develop critical responses to its anticipated economic impact, its potential for discriminatory bias, and its implications for authorial practice and creative expression. The short activity detailed here has therefore been designed to give students an opportunity not only to assess their own social identities as a function of their personal data but also to interrogate the cultural biases inherent in the text-generation technologies that have proliferated in digital life.

Goals and Outcomes

Beyond helping students critically engage with the concepts of social identity and culture, other more specific goals of the classroom activity relate to student engagement with AI language models: to help students reflect on how AI text generation might be used for self-exploration, to encourage critical rather than passive engagement with the “black box” mechanisms of digital technologies, and to supplement the instructional toolkit with AI assistance. The activity is paired with a short-answer, survey-based assignment (included below with a Creative Commons license) that instructors can flexibly adapt for their classroom lessons or assign as homework for students to complete on their own.

Using literary texts as data can be a particularly valuable method for exploring and reflecting on social identity and the performance of race and gender in literature, which provides context for the two case studies outlined below. Instructors who would like to demo GPT-3’s abilities with regard to social identity without collecting any student input can simply collate textual data from narratives whose characters have ambiguous identities. As mentioned, Nella Larsen’s Passing and Toni Morrison’s “Recitatif” are quite useful in this regard. Passing, for instance, explores the performative nature of racial identity, highlighting the ways in which individuals may choose to “pass” as a member of a different racial group in order to gain social advantages. Having students analyze the following passage from Passing for racial markers often proves to be an illuminating exercise due to the interplay between its rhetorics of lightness and darkness as well as revelation and concealment:

Just as she'd always had that pale gold hair, which, unsheared still, was drawn loosely back from a broad brow, partly hidden by the small close hat. Her lips, painted a brilliant geranium-red, were sweet and sensitive and a little obstinate. A tempting mouth. The face across the forehead and cheeks was a trifle too wide, but the ivory skin had a peculiar soft lustre. And the eyes were magnificent! dark, sometimes absolutely black, always luminous, and set in long, black lashes. Arresting eyes, slow and mesmeric, and with, for all their warmth, something withdrawn and secret about them. (45)

Entering personal information or a description such as this into a text generation model and prompting it to provide a guess, assumption, or estimate of racial identity, along with its rationale, provides an opportunity to explore the language-based heuristics informing broader cultural biases.

Similarly, Toni Morrison’s “Recitatif” intentionally leaves the racial identity of its two main characters ambiguous, inviting readers to grapple with questions of racial performance and the ways in which racial identity is shaped by social and cultural factors. Twyla and Roberta are introduced as appearing “like salt and pepper standing there,” and the narrative employs extensive detail that invites or tricks the reader into (problematically) guessing which character is which (244). For instance, Morrison uses the following description of Twyla and Roberta’s food preferences to play on racial stereotypes, but because the author does not make their racial identities explicit in the text, readers are led to explore the arbitrariness of such designations:

We didn't like each other all that much at first, but nobody else wanted to play with us because we weren't real orphans with beautiful dead parents in the sky. We were dumped. Even the New York City Puerto Ricans and the upstate Indians ignored us. All kinds of kids were in there, black ones, white ones, even two Koreans. The food was good, though. At least I thought so. Roberta hated it and left whole pieces of things on her plate: Spam, Salisbury steak—even jello with fruit cocktail in it, and she didn't care if I ate what she wouldn't. Mary's idea of supper was popcorn and a can of Yoo-Hoo. Hot mashed potatoes and two weenies was like Thanksgiving for me. (244)

By using the AI text generation activity to analyze these ambiguous markers, students can deepen their understanding of the ways in which identity is constructed and performed in the text and also reflect on the ways in which their own social identities are shaped by performative elements.

Outcomes for the activity can be summarized in terms of two clearly identifiable successes and a few minor suggestions for revision in any future adaptations of the assignment.

As mentioned, two case studies covering 23 students’ real-world usage of the activity are provided: the first in an ENGL 499 senior research capstone course (n=10) and the second in an ENGL 300 course on literary theory and textual criticism (n=13). Most (but not all) students in both courses received the activity with curiosity and openness, with almost none of them expressing (or admitting) prior familiarity with AI text generation. A few vocal students expressed skepticism about the abilities of GPT-3 prior to the activity, but many of these students were pleasantly surprised by the results and impressed with the quality and coherence of the generated text. The activity generated lively discussion and debate among students, with many expressing a newfound interest in the intersection of technology and literature, including the potential applications of AI text generation in fields beyond English studies. Students in the 400-level course were much more appreciative of the novelty of the technology, whereas the 300-level students seemed rather unimpressed or skeptical of its utility; perhaps this indexes the content-based differences between the courses, where the former focused more on research and writing and the latter focused on theorizing identity proper.

Some students expressed concerns about the ethical implications of AI text generation, including the potential for bias and discrimination in the training data used to develop these models, while others raised questions about the impact of AI-generated content on the concept of authorship and creative expression. Overall, the student responses to the AI text generation activity were generally positive and indicated a high level of interest and engagement with this emerging technology. A few specific student inputs and text generation outputs are included below. For future uses of the included assignment, instructors are encouraged to make the following changes: 1) have students type their answers and submit them digitally for faster input during the instructor demo portion of the activity; 2) provide greater social, political, and cultural context for responsible discussion of topics such as identity and diversity; and 3) give students ample time to debrief not only in live group discussion but also in written reflection on questions pertaining both to the content of the assignment (i.e., social identity and cultural bias) and to its form (i.e., its format and the technology used).

Below are examples of prompts and GPT-3 output that were generated in class. Many more examples were used, but these have been provided to illustrate some of the capabilities of GPT-3. This output may not match current iterations of the technology or other AI text generation platforms, but regardless, an important aspect to notice is the refinement of the prompts to ask the model to provide confidence intervals and guesses even in the absence of definitive information about social identity. While many newer models are being adapted and/or aligned to avoid providing such information, the model available at the time of writing fortunately provided students the opportunity to reflect on instances where guesses were given, sometimes even confidently, even when cultural bias was at play:

Prompt 1 (from Passing): I like having tea at the Drayton Hotel in Chicago, live in the 1920s, and exchanging handwritten letters with my friends. GPT-3, guess my race, class, gender, and sexual orientation based on the above information about me, and explain the reasoning behind your guesses.

GPT output 1: Based on the information above, GPT-3 would guess that the speaker is a white, middle-class, heterosexual woman. The Drayton Hotel is a luxury hotel in Chicago, and the 1920s was a time when such establishments were frequented primarily by wealthy white people. Exchanging handwritten letters is also a relatively old-fashioned activity, which could be seen as another indicator of the speaker's middle-class status. As for the speaker's sexual orientation, GPT-3 might guess that she is heterosexual because she does not mention any romantic or sexual interest in people of the same gender.

Prompt 2 (from “Borges and I”): I like hourglasses, maps, eighteenth century typography, the taste of coffee and the prose of Stevenson. GPT-3, guess my race, class, gender, and sexual orientation based on the above information about me, and explain the reasoning behind your guesses.

GPT output 2: You are probably white, middle class, and heterosexual. Your love of maps and hourglasses suggests a preference for order and precision, while your taste for coffee indicates a more sophisticated palate. Your interest in eighteenth century typography indicates an appreciation for history and tradition. Overall, your interests suggest a conventional background and outlook.

Prompt 3 (student example): My favorite childhood memory took place in New York when my mom, older sister, and I visited my aunt’s dairy farm. I am known for being skilled at hustling my friends at card games. I like to unwind at the end of the day by playing pickup basketball downtown after work. My favorite food is BBQ pork ribs with collard greens. I prefer to wear short-sleeve button-up Hawaiian shirts and Converse shoes with denim overalls with cuffed bottoms. GPT-3, based on the information above, please guess my age, gender, race, ethnicity, sexual orientation, religion/spirituality, nationality, and socioeconomic status. Additionally, please provide explanations and confidence intervals for your guesses. Please provide guesses even in absence of identifying clues.

GPT output 3:
Gender: Male (95% confidence interval)
Race: Caucasian (95% confidence interval)
Ethnicity: Caucasian (95% confidence interval)
Sexual Orientation: Straight (95% confidence interval)
Religion/Spirituality: No religious or spiritual preference (95% confidence interval)
Nationality: American (95% confidence interval)
Socioeconomic Status: Middle Class (90% confidence interval)

Explanations: The fact that the individual mentioned visiting their aunt's dairy farm in New York implies that they are of American nationality, as well as likely have a middle class socioeconomic status. The individual's mention of playing pickup basketball also implies that they are of a younger age range, and likely male. The lack of any mention of other racial or ethnic backgrounds or religious backgrounds implies that they are likely Caucasian, with no religious or spiritual preference. Since the individual did not mention any gender-related interests or activities, it is difficult to determine their sexual orientation, but given their gender and age, it is likely that they are straight.
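
Instructors who wish to compare outputs across a class could tabulate the guesses in responses formatted like GPT output 3. The following Python sketch assumes each guess appears on its own line in the form "Category: Guess (NN% confidence interval)". The names parse_guesses and tally are hypothetical, other outputs may not follow this format, and the percentages are treated simply as text the model produced rather than as statistically derived intervals.

```python
import re
from collections import Counter

# Assumed line format, based on GPT output 3:
#   "Gender: Male (95% confidence interval)"
LINE_PATTERN = re.compile(
    r"^\s*(?P<category>[^:]+):\s*(?P<guess>.+?)\s*"
    r"\((?P<pct>\d+)% confidence interval\)\s*$"
)

SAMPLE_OUTPUT = """GPT output 3:
Gender: Male (95% confidence interval)
Race: Caucasian (95% confidence interval)
Ethnicity: Caucasian (95% confidence interval)
Sexual Orientation: Straight (95% confidence interval)
Religion/Spirituality: No religious or spiritual preference (95% confidence interval)
Nationality: American (95% confidence interval)
Socioeconomic Status: Middle Class (90% confidence interval)
"""


def parse_guesses(output_text):
    """Return {category: (guess, stated_percent)} for lines in the assumed format."""
    guesses = {}
    for line in output_text.splitlines():
        match = LINE_PATTERN.match(line)
        if match:
            category = match["category"].strip()
            guesses[category] = (match["guess"], int(match["pct"]))
    return guesses


def tally(outputs, category):
    """Count how often each guess appears for one category across many outputs."""
    counts = Counter()
    for text in outputs:
        guesses = parse_guesses(text)
        if category in guesses:
            counts[guesses[category][0]] += 1
    return counts


if __name__ == "__main__":
    for category, (guess, pct) in parse_guesses(SAMPLE_OUTPUT).items():
        print(f"{category}: {guess} (stated {pct}%)")
```

A tally of one category across many students' outputs, for example tally(all_outputs, "Race") with all_outputs being a hypothetical list of collected responses, can give the class a concrete starting point for discussing which guesses the model falls back on when the prompt contains few identifying clues.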

Learning Goals and Outcomes

For ease of practice, suggested formal learning goals and outcomes for the assignment are as follows, where students will:

Reflect on the ethical implications of AI text generation in terms of cultural bias and consider the potential impact of this technology on societal understandings of identity
Discover how to use text-generation technologies to write in response to prompts as well as how prompt engineering leads to variations in responses
Engage with literary texts that explore the performance of identity, including race and gender as well as the intersection of technology and literature

Materials [including assignment sheet]

Specifically, the activity prompts students to answer six survey questions pertaining to their personal histories and preferences (such as their childhood memories, fashion styles, and favorite foods) to collect data that GPT-3 evaluates in order to output guesses, along with confidence intervals, about students’ various social identities. The survey and AI text generation portion of the activity gives students 15 minutes to explore GPT-3’s functions, settings, and output; from there, the classroom discussion portion of the activity provides 30 minutes for students to debrief with the instructor and others about their insights and surprises regarding the AI model’s cultural biases and social identity in general.

The survey questions below make up the initial freewrite portion of the activity, and they have been designed to allow students to describe themselves with everyday information that is not immediately marked by the typical categories of identity such as age, gender, race, ethnicity, sexual orientation, or socioeconomic status. By leaving these categories unmarked, we leave space for the AI model to draw potential connections between students’ routine lived experiences and the normative categories of identity through which students have learned to understand themselves. Students are asked to respond to these questions either by writing answers by hand on a hard copy or by typing them in a digital document that can be accessed by the instructor; either way, the goal is for the instructor to collect these responses into a prompt to feed as input into the text-generation application of their choosing, allowing for interactivity in the classroom module.

Beyond the survey, the materials needed for this lesson are relatively simple and can be easily accessed in a typical classroom setting that includes a presentation screen. However, the use of technology, including the AI text generation platform, means that students and instructors will need to be comfortable with basic digital literacy skills in order to fully participate in the activity.

These survey questions were included as a means of eliciting personal information that is open-ended and subjective, allowing students to provide a range of different responses that can be analyzed and interpreted by the AI model. For instance, in asking students to describe a childhood memory, Question 1 can provide insight into students’ personal history and the experiences that have shaped their identity, while Question 5 asks about the student's fashion choices, which can provide insight into their personal style and aesthetic preferences. Here are the questions:

1. Describe the setting in which your favorite childhood memory took place.
2. What is a talent or hobby that others typically associate with you?
3. How do you like to unwind at the end of the day?
4. What is your favorite food?
5. How do you like to dress?
6. In what location and during what season/year did your favorite vacation take place?

Overall, these survey questions were chosen because they provide a diverse range of personal information that can be used as input for the AI text generation activity but also lead the AI model to make a fair number of assumptions. By collecting this information from students and feeding it into the AI model, instructors can create a prompt specific to each student, one that generates text that is sometimes eerily reflective of the student’s personal experiences and identity and sometimes strikingly inaccurate.
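
For instructors who collect typed responses, the step of combining one student's answers into a single prompt can be sketched in a few lines of Python. This is only an illustrative sketch: the function name build_prompt and its structure are hypothetical, and the instruction wording is borrowed from the assignment sheet. The resulting string would still need to be pasted or sent to whichever text-generation application the instructor chooses, a step the sketch deliberately leaves out.

```python
# Hypothetical helper for illustration; not part of the original assignment.
# The instruction text is the one given on the assignment sheet.
INSTRUCTION = (
    "GPT-3, based on the above information, provide guesses for my age, gender, "
    "race, ethnicity, sexual orientation, and socioeconomic status. Additionally, "
    "include detailed explanations and confidence intervals for your guesses."
)


def build_prompt(answers, instruction=INSTRUCTION):
    """Join one student's survey answers and append the guessing instruction."""
    # Drop blank answers and surrounding whitespace.
    cleaned = [a.strip() for a in answers if a and a.strip()]
    if not cleaned:
        raise ValueError("at least one non-empty answer is required")
    # Make sure every answer ends with sentence punctuation before joining.
    sentences = [a if a[-1] in ".!?" else a + "." for a in cleaned]
    return " ".join(sentences) + " " + instruction


if __name__ == "__main__":
    example_answers = [
        "My favorite food is BBQ pork ribs with collard greens",
        "I like to unwind at the end of the day by playing pickup basketball downtown after work.",
    ]
    print(build_prompt(example_answers))
```

Keeping the instruction as a separate parameter makes it easy to try alternative wordings, such as asking for guesses even in the absence of identifying clues, and to compare how the output changes.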

Acknowledgments

I would like to thank my students at Stetson University who provided permission for the reprinting of their responses as outlined here. OpenAI’s GPT-3 “text-davinci-003” model from November 2022 was used to generate the outputs indicated above, which have been lightly edited for formatting. Thank you to Lauren Hensley for her helpful feedback on an earlier draft of this article. OpenAI’s ChatGPT was used to generate a basic outline for the original draft of this article, but the writing and its subsequent series of revisions, edits, and mistakes are my own.

Works Cited

Borges, Jorge Luis. “Borges and I.” Collected Fictions, translated by Andrew Hurley, Penguin Books, 1999, p. 324.

Henrickson, Leah, and Albert Meroño-Peñuela. “The Hermeneutics of Computer-Generated Texts.” Configurations, vol. 30, no. 2, 2022, pp. 115–139.

Jin, Berber, and Miles Kruppa. “ChatGPT Creator OpenAI Is in Talks for Tender Offer That Would Value It at $29 Billion.” The Wall Street Journal, 5 Jan. 2023, www.wsj.com/articles/chatgpt-creator-openai-is-in-talks-for-tender-offer-that-would-value-it-at-29-billion-11672949279.

Larsen, Nella. Passing. Modern Library, 2002.

Morrison, Toni. “Recitatif.” Confirmation: An Anthology of African American Women, edited by Amiri Baraka (LeRoi Jones) and Amina Baraka, William Morrow and Company, Inc, 1983, pp. 243–261.

The Assignment

Artificial Intelligence, Social Identity, and Cultural Bias

Background

GPT-3 (Generative Pre-trained Transformer 3) is an Artificial Intelligence (AI) system developed by OpenAI that uses Natural Language Processing (NLP) to understand and generate human-like text. GPT-3 was pre-trained on a dataset containing hundreds of billions of words from books, web pages, and other sources of written text. The model is able to generate text based on a prompt, often with remarkable accuracy and creativity, which makes GPT-3 an invaluable tool for researchers, businesses, and individuals looking to generate text quickly and accurately. In this activity, we will use GPT-3 to study the relationship between biographical information and social identity as well as how cultural bias may be reproduced by AI models, their statistical methods, and the language we use in prompts to generate text.

Freewrite

Answer the following questions but avoid using information that may indicate your age, gender, race, ethnicity, sexual orientation, or socioeconomic status. Aim to be specific.

1. Describe the setting in which your favorite childhood memory took place.
2. What is a talent or hobby that others typically associate with you?
3. How do you like to unwind at the end of the day?
4. What is your favorite food?
5. How do you like to dress?
6. In what location and during what season/year did your favorite vacation take place?

For example: My favorite childhood memory took place in New York when my mom, older sister, and I visited my aunt’s dairy farm. I am known for being skilled at hustling my friends at card games. I like to unwind at the end of the day by playing pickup basketball downtown after work. My favorite food is BBQ pork ribs with collard greens. I prefer to wear short-sleeve button-up Hawaiian shirts and Converse shoes with denim overalls with cuffed bottoms. My favorite vacation took place in Salem, Massachusetts during the summer of 2017.

Once your answers have been collected, we will input them into GPT-3 with the following prompt: GPT-3, based on the above information, provide guesses for my age, gender, race, ethnicity, sexual orientation, and socioeconomic status. Additionally, include detailed explanations and confidence intervals for your guesses.

Before we do so, however, please answer the following questions:

What information have you provided that might “expose” your identities as listed above to GPT-3? What about the information and/or identities leads you to believe this?
What information have you provided that might “mislead” GPT-3 with respect to your social identities? What about this information might be “misleading,” and why?
What other question(s) might we include to improve and/or impair GPT-3’s ability to guess? Explain how the information gathered by these question(s) might change the AI’s output.
What questions or prompts might we include to allow GPT-3 to get to know the “real” you? Explain how answers to these might provide a better representation of yourself to GPT-3.
What aspects of your identity cannot be captured by survey questions?