TextGenEd: Teaching with Text Generation Technologies

Edited by Annette Vee, Tim Laquintano, and Carly Schnitzler 

Generative AI is the most influential technology in writing in decades—nothing since the word processor has promised as much impact. Publicly accessible large language models (LLMs) such as ChatGPT have enabled students, teachers, and professional writers to generate writing indirectly, via prompts, and this writing can be calibrated for different audiences, contexts, and genres. On the cusp of this moment defined by AI, TextGenEd collects early experiments in pedagogy with generative text technology, including but not limited to AI. The fully open-access and peer-reviewed collection features 34 undergraduate-level assignments that support students' AI literacy, rhetorical and ethical engagement, creative exploration, and professional writing with text generation technology, along with an Introduction to guide instructors' understanding and their selection of what to emphasize in their courses. TextGenEd enables teachers to integrate text generation technologies into their courses and respond to this crucial moment.


An Introduction to Teaching with Text Generation Technologies

by Tim Laquintano, Carly Schnitzler, and Annette Vee

When we issued the CFP for this collection, teaching and research in automated writing were still rather niche. In the language arts, such work existed in critical code studies and creative domains such as computational poetry and, more broadly, electronic literature. In writing studies, interest in automated writing existed in corners of technical writing, computers and writing, and rhetoric. Most writing teachers are comfortable with word processing, content management systems, search, and email, and it has been possible to run a writing class with little else. Now, with the introduction of ChatGPT, it might soon become difficult to research and teach writing without thinking about, or addressing, automated writing technologies and artificial intelligence (AI). As Big Tech rushes ahead in its AI arms race with the intention of having large language models (LLMs) mediate most of our written communication, writers and teachers are forced to consider issues of prompt engineering, alignment, data bias, and even such technical details as language model temperature alongside issues of style, tone, genre, and audience.

On the cusp of this moment defined by generative AI, TextGenEd collects early experiments in pedagogy with generative text technology, including but not limited to AI. The resources in this collection will help writing teachers to integrate computational writing technologies into their assignments. Many of the assignments ask teachers and students to critically probe the affordances and limits of computational writing tools. Some assignments ask students to generate Markov chains (text generators that choose each next word according to probabilities drawn from a source text) or to design simple neural networks, and others ask students to use AI platforms in order to critique or gain fluency with them. A few assignments require teachers to have significant familiarity with text generation technologies in order to lead students, but most are set up to allow teachers and students to explore together. Regardless of their approach, all of these assignments speak to the contemporary writing landscape being shaped by generative AI. Put another way, the assignments in this collection offer initial answers to urgent calls for AI literacy.

We hope this collection offers something for teachers at all levels of comfort with technology—from teachers seasoned with digital writing technologies to teachers approaching the entire domain with trepidation. To that end, we have made the teaching resources in this collection as accessible as possible. The WAC Clearinghouse is publishing the collection as fully open access, and all of the assignments are licensed as Creative Commons Attribution-NonCommercial (CC BY-NC), which means that nonprofit educators are free to adapt, use, and share them (with credit to the source) as they see fit. We hope they will!

One requirement of every assignment accepted for this collection was that instructors had taught it at least once. So, all assignments include a description of how students responded along with reflections from the instructors. Each assignment is accompanied by a short abstract and detailed implementation notes. Assignments are organized according to learning goals relevant to writing: rhetorical engagements; AI literacy; ethical considerations; creative explorations; and professional writing. We hope instructors treat this as a living collection, adapting the assignments to local conditions and new technologies as they evolve.

As context for this collection of assignments, we provide below a brief introduction to past, present, and future attempts to automate writing. This general framework can guide instructors' understanding and their selection of what to emphasize in their courses, especially given the hype that surrounds contemporary generative AI. This collection works alongside many emerging resources for instructors, including panels sponsored by CCCC and by MLA, a working paper authored by an MLA/CCCC joint task force, a recent forum in Composition Studies, a WAC Clearinghouse resource curated by Anna Mills, and published research across many academic disciplines, from sociology to rhetoric. Many of the scholars whose assignments appear in this collection also publish on generative AI and other text generation technologies.

It will take all of us to respond to this moment. As editors of this collection, we believe that generative AI is the most influential technology in writing in decades—nothing since the word processor has promised as much impact. And generative AI is moving much faster. Although generative technology for text has been quite good for the last five years, it has been less than a year since the watershed release of ChatGPT in November 2022, which by many measures has been one of the fastest-growing technologies in the history of humanity. A technology this consequential for education requires collective response and collaboration from teachers. This collection has allowed us to put our heads together with some of the most thoughtful and innovative writing teachers across English studies and beyond. May their ideas invigorate your teaching as much as they have ours.

A Brief History of Automated and Computational Writing 

While conversations about text generation with AI sometimes present it as a fully new phenomenon, automated writing has its origins much earlier. In the seventeenth century, mathematician G.W. Leibniz invented a cryptographic cipher machine that “would serve to encipher and decipher letters, and do this with great swiftness and in a manner indecipherable by others” (Rescher). In Swift's Gulliver's Travels (1726), a Lagado professor engineered an automated system of writing including young scholar-laborers, blocks of wood, wires, and cranks “so that the most ignorant person, at a reasonable charge, and with a little bodily labour, might write books in philosophy, poetry, politics, laws, mathematics, and theology, without the least assistance from genius or study.” Automata that ran on complex clockwork mechanisms proliferated in the 18th and 19th centuries, largely as a way for mechanics and clockmakers to show off their technical prowess (Riskin). These automata, powered by the winding of gears, could variously dance, write, draw, breathe, and, in the case of one mechanical duck, defecate.

The automation of writing—a uniquely human activity—accompanied conversations about artificial intelligence even in the early modern era, long before the term came about. With the invention of the computer in the 20th century, the connections between writing and AI grew tighter, most clearly illustrated in Alan Turing's 1950 article in the philosophy journal Mind, "Computing Machinery and Intelligence." At the time, computers were humans (mostly women), and digital computers were primarily used for complex calculations, especially in wartime military contexts. Amid the ballistic calculations, Turing speculated on a prompt from his teacher, philosopher Ludwig Wittgenstein: Can machines think? Both men thought it was a ridiculous question—Wittgenstein because he thought machines were nothing like humans, and Turing because he wasn't even sure we knew what humans thought. But Turing argued that if a machine could fool a human into thinking it was a human, then it could be said to think. The machine—a computer—would naturally use writing for this deception. Writing, in other words, is thinking—and the automation of writing is machine thinking.

By the early 1950s, computation had advanced to the point where programs could be written to generate text. While awaiting his first assignment at Britain’s National Research and Development Corporation in the summer of 1952, British computer scientist Christopher Strachey—a collaborator and friend of Turing who also invented a precursor to the programming language C—created a program that generated campy, over-the-top love letters, all signed by M.U.C., the Manchester University computer. One letter, reproduced below, was later printed in the arts magazine Encounter in 1954: 

Honey Dear 

My sympathetic affection beautifully attracts your affectionate enthusiasm. You are my loving adoration: my breathless adoration. My fellow feeling breathlessly hopes for your dear eagerness. My lovesick adoration cherishes your avid ardour. 

Yours wistfully M. U. C. (Campbell-Kelly 25)

Strachey’s love letter generator is widely cited as the first work of electronic literature, a more flexible, fun, and digital version of the mechanical writing automata that preceded it (Rettberg). The emergence of e-literature and of generative creative texts in the decades that followed Strachey’s generator established a sensibility of subversion, play, and critique. Even non-computational work by artist groups such as Oulipo was influenced by the combinatorial work done by those working on computers. Following Turing, Strachey, and others, a small number of artists and programmers were going against the grain of what computation was generally designed to be used for—things like crunching census data and calculating the trajectory of ammunition in wartime. Instead, they were using computation to generate literature and art.

Early text generation worked with templates or with statistical models such as Markov chains, in which each next word is chosen according to probabilities derived from a source text. Even as computing became more accessible in the 1980s and 90s, text generation remained a niche practice for determined experimental artists or computational linguists huddling together in the AI winter, when funding for such work dropped in response to greater needs in basic literacy programs and defense (NCEE). Natural language processing—including understanding and generation—remained an active research area with significant implications for transcription, translation, surveillance, and support for people with disabilities. Advances in machine learning, statistical methods, and word embeddings, along with dramatic increases in available compute and data from the web, have driven text generation technologies forward from the 2000s to the present. In this collection, assignments by Boyd and Egan are particularly helpful in providing students with context for this history of text generation.
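
For instructors who want to see the idea in miniature, the sketch below builds a word-level Markov chain in Python. It is a toy illustration rather than any particular assignment from the collection: the chain records which words follow which in a source text, then generates new text by sampling from those observed followers.

```python
import random
from collections import defaultdict

def build_chain(text, order=1):
    """Map each word (or run of words) to the words observed to follow it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])
        chain[key].append(words[i + order])
    return chain

def generate(chain, length=20):
    """Walk the chain, picking each next word at random from its observed followers."""
    key = random.choice(list(chain.keys()))
    output = list(key)
    for _ in range(length):
        followers = chain.get(tuple(output[-len(key):]))
        if not followers:
            break
        output.append(random.choice(followers))
    return " ".join(output)

sample = "the cat sat on the mat and the cat slept on the rug"
print(generate(build_chain(sample), length=10))
```

Because the model knows only local word-to-word probabilities, its output is locally plausible but quickly loses any larger coherence, which is one reason later approaches turned to neural networks.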

The Current State of Text Gen Tech: Large Language Models (LLMs)

While earlier models of text generation leaned on grammatical rules, current models are probabilistic, predicting the next word in a sequence based on patterns in their training data. For text generation, large language models (LLMs) train on massive datasets gleaned primarily from the Web using machine learning techniques; they are then subjected to fine-tuning and reinforcement learning from human feedback (RLHF) to hone desired output. Over the last ten years, and particularly since late 2017, these techniques have catapulted the field of generative AI forward, producing so-called "foundation models" that can generate text, images, video, or sound across generalized contexts. Developments have been so dramatic that in technology news, AI podcasts, and social media, the story told of generative AI is about our relentless march toward artificial general intelligence (AGI). Amid the distortion from overblown claims—no research field has promised so much and delivered so little as AI—there are real potentials and limits to generative AI. Yet even when the hype is dislodged from reality, these remain extremely difficult technologies to grasp: even AI scientists and engineers do not fully understand them or their implications. Below, we outline briefly how generative AI works for text generation and what variables might shape the future of text generation technologies. AI's dominant role in text generation right now means that soon even engagement with word processing might require a basic understanding of how contemporary LLMs work.

Large language models are so called because they model language. That is, they take examples of language and then use certain processes to attempt to reproduce it. We can therefore approach an understanding of LLMs by breaking down the processes they use and then the data they draw from.

Processes 

Contemporary LLMs are built with neural networks, souped-up versions of what Warren McCulloch and Walter Pitts introduced in 1943. McCulloch and Pitts borrowed the concept of a neuron from the human brain, which comprises billions of interconnected neurons acting as tiny processors. The mathematical model of neurons fell out of favor in AI for decades, but it has been revived with current "deep learning" techniques, so called because contemporary artificial neural networks are many, many layers deep with simulated neurons that respond to information signals. Convolutions, backpropagation, and transformers—techniques associated with deep learning, the last of which has accelerated generative AI since 2017—add layers of complexity to neural networks and shape their outputs.
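
As a rough illustration (not the original McCulloch-Pitts formulation, which used binary threshold units), the Python sketch below shows the kind of simulated neuron that modern networks stack into layers: each neuron computes a weighted sum of its inputs and passes it through a nonlinearity, and the outputs of one layer become the inputs of the next.

```python
import math

def neuron(inputs, weights, bias):
    """One simulated neuron: weighted sum of inputs passed through a nonlinearity (sigmoid)."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-total))  # a "firing" strength between 0 and 1

# A toy two-layer network: two hidden neurons feed one output neuron.
# The weights here are made up; in a real network they are learned from data.
inputs = [0.5, 0.8]
hidden = [neuron(inputs, [0.4, -0.6], 0.1),
          neuron(inputs, [0.9, 0.3], -0.2)]
output = neuron(hidden, [1.2, -0.7], 0.05)
print(round(output, 3))
```

A deep network is simply many such layers stacked together; training adjusts the weights so that the network's outputs better match its training data.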

The ways that contemporary neural networks recursively feed information back into the models have helped them to produce more coherent text across greater lengths of passages. Early text generation models could only generate short passages before they began to lose earlier details that were needed for coherence. In 2017, Google researchers published the now-famous paper "Attention Is All You Need" (Vaswani et al.), which enabled AI scientists to use transformer models to develop current LLMs. For writing instructors, the relevant detail to know about this advancement is that it effectively enabled language models to retain relevant information from earlier parts of the input and to weigh which parts of that input matter most. This is another way of saying that language models built using transformers could now sustain arguments, narratives, or discussions for thousands of words without "forgetting" crucial ideas from earlier in the prose. The expanded context window of LLMs is not infinite, however, which is why the LLMs that consumers can now access tend to be capable of writing stories of only a few thousand words at a time. Some newer models have larger context windows but for the time being remain difficult to access. Regardless of the specifics of the models, it is also important to note that because the networks are so complex, with so many hidden layers, and because models adjust their parameters based on feedback, even the programmers and engineers who design the models cannot fully trace the path from language input to output.
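
The core mechanism of that paper, scaled dot-product attention, can be written in a few lines. The sketch below is a minimal NumPy version, with toy random vectors standing in for token representations; it is meant only to show that "attention" is a weighted mixing of every position's information, not a model anyone would run in practice.

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: every position scores its relevance to every
    other position, and each output is a weighted blend of all the value vectors."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # relevance of each token to each other token
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax: each row of weights sums to 1
    return weights @ V

# Four "tokens," each represented by an 8-dimensional vector of toy numbers.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
print(attention(x, x, x).shape)  # (4, 8): each token's output draws on all four tokens
```

Because every token can draw on every other token in the context window, information from early in a passage remains available when the model generates text much later in that same passage.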

Data

Large language models are called "large" because of the massive datasets they draw on to model language and the enormous amounts of parameters they have that the model uses to make predictions. AI scientists and engineers draw from large, open datasets such as Common Crawl (petabytes of text scraped from the Web) and websites such as Wikipedia. OpenAI's GPT-3 used CommonCrawl, outbound links from Reddit, Wikipedia, and text from books out of copyright in its dataset (Brown, et al.). (OpenAI has not revealed the data sources for more current versions of GPT, both for what they claim are safety reasons and to retain a competitive edge.) The datasets for contemporary LLMs such as GPT-3 are so large, they are relatively uncurated and unlabeled, although they've been "lightly filtered" (Brown, et al.). This filtering removes some of the most toxic language from a dataset, but the datasets tend to be so large that it has been difficult to clean all unsavory language, and Bender, et. al. also note that the inherent ambiguity of language means that scrubbing certain terms from datasets can preclude the perspectives of marginalized groups. Perhaps more importantly, if a dataset is so large that it can only be read through computational means, then it becomes extremely difficult to account for, or even understand, many of the possible worldviews in the data—although a variety of fields are now hard at work measuring the various kinds of bias embedded in LLMs through various benchmarks (mostly through more computational means). The problem of embedded bias is one of the reasons Bender, et al. have argued that LLMs can be too large. Each of the sources of data for GPT-3, for instance, over-represents men, white people, Western viewpoints and English language patterns. A language model built on that foundation is inevitably going to represent dominant perspectives. Datasets such as "The Pile" have been developed to attend to more diverse uses of language, and LLMs such as BLOOM include large amounts of non-English language training data in order to counter some of these biases. 

Recent Evolution of LLMs

Earlier language models needed to be fine-tuned to particular tasks in order to produce text that resembled good human writing—for instance, models that acted as chatbots in customer service. When GPT-3 came on the scene in 2020, it proved remarkably good at "few-shot learning" tasks—that is, it didn't need fine-tuning to a specific task in order to produce coherent results that hit established NLP (natural language processing) benchmarks for quality. OpenAI achieved impressive results by scaling up both the parameters and the data they used, and in doing so ushered in a new era of excitement about LLMs.
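
Few-shot learning here simply means putting a handful of worked examples into the prompt itself; the model's weights are not changed. The block below is a hypothetical illustration of what such a prompt might look like (the task and sentences are invented for this example), written out as a Python string so it could be passed to any LLM interface.

```python
# A hypothetical few-shot prompt: two worked examples, then a new case for the model
# to complete in the same pattern. No fine-tuning or retraining of the model occurs.
prompt = """Rewrite each sentence in plain language.

Original: The committee will endeavor to ameliorate the situation forthwith.
Plain: The committee will try to fix the problem right away.

Original: Utilization of the facility is restricted to authorized personnel.
Plain: Only authorized staff may use the building.

Original: The aforementioned remuneration shall be disbursed biannually.
Plain:"""

# Sent to a capable model, a typical (though not guaranteed) completion would be
# something like: "The payment mentioned above will be made twice a year."
print(prompt)
```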

It is important to remember that the task of LLMs is simply to predict the next token given an input; it so happens that if you train them with enough data and enough compute, you begin to see emergent capabilities arise from the act of token prediction (e.g., the ability of LLMs to write computer code and simulate reasoning). This reliance on prediction is also the reason Emily Bender and colleagues insist that LLMs are tools of natural language generation and not natural language understanding, even if the performance of some models is so good that it feels to the user as if the models understand. These models do not operate with an understanding of the world, or any "ground truth"; they work statistically. They model language based on associated terms and concepts in their datasets, always predicting the next word (in units called "tokens") from what's represented in their data. This prediction of the next token is also the reason language models can convey false information or "hallucinate." They don't know false from true—only statistical relationships between tokens.
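
A toy sketch of that prediction step, with an invented five-word vocabulary and made-up scores, might look like the following. The point is simply that the model converts scores into probabilities and samples a token; nothing in the procedure consults the world to check whether the sampled continuation is true.

```python
import numpy as np

# Hypothetical scores (logits) a model might assign to candidate next tokens after a
# prompt such as "The Eiffel Tower is in ..." -- all of these numbers are invented.
vocab = ["Paris", "London", "pizza", "1889", "the"]
logits = np.array([4.2, 2.1, -1.0, 3.5, 0.3])

probs = np.exp(logits) / np.exp(logits).sum()   # softmax: scores become probabilities
rng = np.random.default_rng()
next_token = rng.choice(vocab, p=probs)

print(dict(zip(vocab, probs.round(3))))
print("sampled next token:", next_token)
# A fluent but false continuation is just another statistically plausible sample,
# which is one way to think about "hallucination."
```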

Hallucination has not been the only problem with LLMs. When GPT-3 was released in 2020, researchers used adversarial testing to coax all manner of toxic and dangerous outputs from the model. This became something of a social media game when ChatGPT was released, as users tried to "jailbreak" it into saying nasty things. Numerous reports and swirling internet rumors suggested LLMs might provide workable instructions for making methamphetamine or chemical weapons using ingredients available from Home Depot.

Engineers have developed a number of ways to try to mitigate these issues, including fine-tuning, implementing safety guardrails (e.g., blocking certain terms from being input and certain topics from being output), and reinforcement learning from human feedback (RLHF). In RLHF, humans help to train models by giving them question-answer pairs, rating the model's responses for accuracy and appropriateness, and identifying toxic responses (work sometimes performed in the Global South for very little pay; see Perrigo). These methods have improved safety, eliminated some toxicity (a common joke is that the models have been through the corporate diversity training program), and improved the accuracy of responses. However, the models are still not perfectly accurate and, given the philosophical complexity of representing "truth," likely never will be. The current hope of model designers and users seems to be that the accuracy of the models will be improved through add-on technologies and plug-ins (e.g., linking an LLM to a database of curated content to help prevent misinformation).
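
Guardrails in production systems involve trained classifiers, moderation models, and RLHF itself, but the basic idea of filtering inputs and outputs can be caricatured in a few lines. The sketch below is a deliberately crude, hypothetical example (the blocked phrases and the generate_fn callable are stand-ins, not any vendor's actual mechanism).

```python
# A toy input/output filter wrapped around some text generator. Real guardrails are
# far more sophisticated; this only illustrates the general shape of the approach.
BLOCKED_PHRASES = {"synthesize methamphetamine", "build a chemical weapon"}  # hypothetical list

def guarded_generate(prompt, generate_fn):
    """Refuse prompts that match blocked phrases, and screen the output the same way."""
    refusal = "I can't help with that request."
    if any(phrase in prompt.lower() for phrase in BLOCKED_PHRASES):
        return refusal
    response = generate_fn(prompt)
    if any(phrase in response.lower() for phrase in BLOCKED_PHRASES):
        return refusal
    return response

# Example with a placeholder generator that just echoes the prompt.
print(guarded_generate("Summarize the history of automata.", lambda p: f"Echo: {p}"))
```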

Developments in LLMs are coming at such a pace that it's difficult to keep up. But we can see a few trends: as machine learning techniques improve, the datasets and computation needed appear to be shrinking. Models with fewer parameters are producing more accurate outputs, and the resources needed to run them continue to fall, although as of this writing the best models are still resource hogs. This means that we have entered a time of “model proliferation” that will lead to models with different purposes, politics, and values. We may soon see accessible models fine-tuned on personalized datasets (e.g., one’s own emails), which might help language models better mimic the voice of the writer instead of producing the general, bland voice that has become relatively identifiable to some teachers of writing. AI plug-ins and apps will extend the capabilities of LLMs and be used in search as well as a host of other writing tasks, as language models begin linking the various applications we use on a daily basis through a single interface. And, while many writers have been using ChatGPT as a standalone application, Google and Microsoft have begun embedding language models in their word processing systems and office software, a feature that will soon be rolled out at massive scale. Our writing environments will inevitably be shaped by these AI integrations, but it's unclear what effects this integration will have on our writing or writing processes. The only thing certain here is change—rapid change.

Despite Big Tech’s insistence that these technologies will sweep the world, there are a number of variables that will affect their trajectory as writers decide the extent to which writing with AI is viable. These variables include: 

  1. Scale and access: Can engineers create language models that achieve decent performance without using extraordinary computing resources? If the technologies remain expensive to use and operate, what does it mean for access? Data at a large scale is impossible to review for accuracy or bias. As Bender et al. ask: Can language models be too big?
  2. Security and privacy: To what extent do language models leave users vulnerable to breaches of personal information, either in using the models or in having their data as part of the training set for the models? What security is possible in locally-run instances of language models?
  3. Legality: Who will be liable for the harms created by the output of generative AI? Is it fair use for generative AI to mimic the styles of living authors and artists? How will copyright case law develop?
  4. Implementation and user experience: How seamlessly will AI writing applications be integrated into now-standard technologies such as word processors and email clients? To what degree will writers or educators be able to decide on the level of integration or visibility of use for these language models? 
  5. Fact and ground truth: What methods will be developed to decrease inaccuracies (such as "hallucinations" of scholarly references or historical facts) in language models? Can reinforcement learning or connections to established databases prevent language models from their tendency to produce incorrect information? 
  6. Complementary technologies: What will language models be capable of when other applications become bolted onto them? To what degree will AI language models shape our digital discourse? 
  7. Abuse by malicious actors: Will the benefits of generative AI outweigh the potential harms they can create such as supporting disinformation campaigns?
  8. Identification and disclosure: Software for detecting AI-generated text has not proven to be particularly effective. A variety of solutions have been proposed, but for the time being detection remains a cat-and-mouse game that seems to be initiating a crisis of social trust around certain kinds of writing.
  9. Social stigma: Upon its arrival, ChatGPT received intense press coverage that framed it as a cheating technology for students. To what extent will collective impressions of the technology shape its trajectory? 
  10. Style and language bias: Language models write with "standard" grammar in languages that are well-represented in the dataset, such as English. Given significant bias against "accented" writing in educational and professional contexts, how will language models affect writers' or readers' perceptions of "accent" in writing? 
  11. Lesser-known or minoritized languages: How will languages and discourse with little or minoritized representation in the training data be reflected in language models? Will smaller language models be tailored for use by these discourse or language communities? Will supervised learning or synthetic data supplement training to enhance representation? To what degree can or will minoritized discourse communities embrace language models?

While generative AI with language models is the overwhelming force and background in the contemporary writing scene as well as this collection, it is not the whole picture. The spirit of early creative computational writing, for example, is still very much alive, both apart from and alongside uses of LLMs. Creative uses of computation have evolved alongside the technologies themselves. A wide variety of tools exist to make creative text generation accessible in a pedagogical context. Educators in this collection employ user-friendly tools and libraries like Tracery, RiTa.js, and Markovify to teach about text generation technologies and about creative constraint as a practice that predates and contextualizes AI text generation.
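
Markovify, for instance, is a small Python library that makes the Markov-chain approach sketched earlier usable in a classroom in a few lines. The example below assumes the library is installed (pip install markovify) and that a plain-text corpus has been saved locally as corpus.txt (the filename is our placeholder); state_size controls how many preceding words the chain considers.

```python
import markovify

# Load a plain-text corpus, e.g., a public-domain novel saved locally.
with open("corpus.txt", encoding="utf-8") as f:
    corpus = f.read()

# Build a word-level Markov model that looks at the two preceding words.
model = markovify.Text(corpus, state_size=2)

# Generate a handful of sentences; make_sentence() can return None if the model
# cannot assemble a sufficiently novel sentence, so we skip those cases.
for _ in range(5):
    sentence = model.make_sentence()
    if sentence:
        print(sentence)
```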

Shaping Writing’s Futures

Regardless of the power of new language models, nothing happens in the writer’s life without implementation. And implementation is often a messy process. Implementation is when we learn whether or not tools are useful to us, when we adjust to new and clunky interfaces, and when we suss out exactly how hollow or flush the promises of Big Tech’s marketing language are. Implementation is also an obfuscatory process. The environmental impact of AI, the potential for it to induce extensive job loss, and the potential for it to remove thought and care from human work will not be altogether apparent to the average user of a Google Doc who clicks a "Help me write" button and has the tone of their paragraph changed. To the first generation of AI users, it might feel like magic. To the second generation, it might feel ordinary.

For many writers, the near future will be an experiment in implementation. Like literacy practices themselves, the implementation of new writing tools will be highly sensitive to context as writers assess their needs, and their organizations’ needs, to automate rhetorical practices against the backdrop of questions about data security, privacy, resources, and goals. Writing instructors and higher education as a whole will also be working to determine how implementation will happen in our lives and in the lives of our students. If there is one benefit to the otherwise harrowing “AI arms race,” it is that many of these tools have already come online undercooked and with a clunky or creepy user experience that might stall their adoption. A potential delay in widespread use could buy us some time to learn more about them, understand them, and generate research about how they are used. 

Potential Paths

Even if the variables above restrict the spread of AI, it will be widespread enough that writing teachers need to prepare. We’ve seen the hazy outlines of four responses begin to emerge:

  1. Prohibition: We are skeptical that this will be a viable model. In the near future, any writing done in a word processor will likely be difficult to do without some AI intervention, whether tacit or explicit. Moreover, we are not convinced by any current research that accurate labeling of AI-generated prose—which is currently unreliable—will ever be available. A student "honor code" could sidestep the labeling challenge in a prohibition path, but only if students understand when AI intervenes in their prose. As of this writing, Grammarly has integrated an LLM into its interface; Google has a "Help me write" feature that obscures that it's an LLM, and Microsoft is on track to implement a similar feature in Word. Big Tech plans to integrate AI into its next generation of search technologies, and complete prohibition might very well lead to an eventual de-skilling of students, something Antonio Byrd has recognized in a recent forum on AI and Writing in Composition Studies. While turning back the clock to before generative AI is not an option, some restrictions on students using LLMs may be beneficial. Well-crafted assignments can create conditions in which students might receive only minimal advantage from engagement with AI.
  2. Leaning In: Some professors have argued that generative AI is the future of writing and that we should lean into the use of language models, having them assist with most if not all stages of the writing process. This might be where we all wind up, but it is crucial to note that an uncritical stance that accepts the discourse of inevitability is unlikely to empower students or educators, and the open issues we mention above can disrupt any full embrace of language models. Yet treating AI as a collaborator—as some assignments in this collection advocate—can equip students to prepare for and even shape a future with AI writing.
  3. Critical exploration: Students can probe the limits of the technology while learning how to use it. This is the direction we believe to be the most beneficial to our students and which is implied by many of the assignments in this collection that require LLM use. These assignments ask students to flush out data bias, rhetorically examine the output of LLMs, compare their writing to the writing of language models, and discover the limits of the technology. 
  4. A chaotic blending of all the options: This is the current scenario, and the most likely path of the near future. Institutions of higher education are not homogeneous, and many of them are pedagogically conservative. We also don’t know what the uptake of generative AI will be in secondary schools and the workplace, two forces that sandwich higher education and shape our teaching-scape in subtle ways. If these technologies continue to spread, and if they experience rapid uptake, it is clear that we face a serious challenge: We have a narrow path to travel as we try to augment student learning without displacing it.

Writing Teachers Are Invested in Writing

While we consider these paths forward, we writing instructors must confront our own investments and biases in this future of AI and writing. One variable that obscures the future of writing for us is our affinity for writing. Even if we find it difficult at times and drudgery at others, the writers and readers connected to this collection appreciate acts of writing and have their livelihoods bound to it. We collectively believe writing is a form of thinking, learning, and communicating. We believe students should write to empower themselves and to prepare themselves to be ethical citizens.

Not everyone has such investment in writing, of course. Most people who write do so with limited time, skill, or interest. Writing is stressful and is often done under duress, in high-pressure educational and workplace settings. Automation often promises to deliver us from drudgery and disadvantage and yet rarely delivers. But perhaps automating some aspects of writing will free some writers to choose other forms of expression more inspiring to or effective for them. 

We need to be mindful of our investment in writing as we try to determine which parts of the writing process we might yield to AI—and to what extent we have a choice in the matter. Which parts of the writing process can we cede to AI while retaining what we value about writing? We will soon learn if it is tenable to allow students to use AI for some parts of the writing process (e.g., brainstorming and grammar/style checkers) but not for others (e.g., text generation). We may want to embed constraints in our assignments so as not to offload too much of students’ cognitive work to AI. The open question is whether or not these constraints will be possible as AI language models are increasingly integrated into standard writing workflows, or whether students, employers, readers or writers will care about the human or AI origin of prose. 

AI and Economies of Authorship 

Research into professional writing has shown that the kinds of writing done in workplace and civic contexts and the kinds of inquiry-based writing done in higher education are at odds with each other. The tensions include length (short form versus sustained argumentation), intellectual property and citation conventions, collaboration versus individual learning, and a host of other issues. To some extent, we have a gap in values and practices between writing in higher education and writing in workplace/civic/personal spheres. Many of us value that gap, but we also observe that it can devalue our work in higher education, as we are accused of not preparing students for the writing they will “actually” do. We need to address the question of whether these tools open a much wider rift between the writing we do in higher education and writing in the wider world. Writing outside the university is often transactional. While McKee and Porter (2020) rightly point out that AI writing hides or ignores the social and rhetorical contexts of writing in favor of an information-transmission model of communication, many writing contexts are satisfied well enough with this stripped-down model of communication. Will a potential misalignment between writing inside and outside of higher ed further devalue the writing for critical inquiry that we assign and practice?

The European Network for Academic Integrity (ENAI) recently published guidelines on the ethical use of AI that show how vexing some of these issues will become (Foltynek et al.). The document focuses on education for students and faculty, and its guidelines center on authorization and acknowledgment. Following the lead of a number of major journals (e.g., Nature and Science; see Thorp), the guidelines state that AI cannot be an author and that “all persons, sources, and tools that influence the idea or generate the content should be properly acknowledged” (2), which includes documenting “input” to the tool, or prompts (3). The guidelines also state that “appropriate use of services, sources, and tools that only influence the form is generally acceptable (e.g., proofreaders, proofreading tools, spelling checkers, thesaurus)” (3). Crucially, the guidelines hold that AI cannot be an author because only humans can take responsibility for writing.

This position of the ENAI reflects a growing consensus within academic research and teaching about AI collaboration: it's a tool, not an author. And to some extent, these recommendations are simply an extension of the status quo. To preserve the integrity of authorship and academic economies of citation and prestige, disciplines have developed specific and nuanced protocols for acknowledging influence: help from mentors, peer reviewers and editors may go in an endnote or an acknowledgements page, intellectual and research precursors will go in a citation system, and some labor remains invisible. Some disciplines have a history of citing constitutive instrumentation—especially in science. No author writes alone, and technological tools have always been part of the entangled materialities that shape writing (Baron). The extent to which AI is constitutive to writing—or acknowledged as such—will depend on disciplinary conventions, individual writing processes, and specific implementations of the technology. 

Finally, we want to call attention to discrepancies in theories and practices of authorship between academic and professional spaces because we see AI potentially heightening the tension between them. In the last decades, we’ve seen academic theories of authorship that have concentrated on influence, remix, materialism, and the messiness of human writing experience. These theories have not always aligned well with the neater and more artificial economies of authorship in higher education (e.g., the preservation and veneration of individual authorship) that we use to measure professional advancement. In addition to that discrepancy, for the sake of education we have not structured economies of student authorship in the same ways as economies of professional authorship. Professional academic authors work in ways that do not always square with academic honesty policies for undergraduates: professional writers have access to proofreaders and editors; they outsource intellectual processes to research assistants or support staff; and they sometimes publish in teams of dozens. Some of the artificiality in student authorship practices is warranted as it provides a practice ground for burgeoning writers. And now undergraduates will have access to a variety of assistive technologies that mimic work that we often outsource (e.g., copyediting), and we see the potential for AI to be integrated into every step of the writing process. Will higher education be able to discipline AI to bring it into alignment with academic economies of authorship? Or, as writers adapt to working with large language models, will AI destabilize the detente between academic and professional economies of authorship and expose the artificiality of writing practices in the academy? 

What This Collection Does

The answers to many of these open questions will take years to understand, but writing teachers are poised to help steer the discourse and paths of generative AI technology. This collection serves to orient writing teachers in that essential work. This section will explain how the assignments have been grouped, but before we outline each theme, we would like to say a bit about student privacy and data collection, as a number of assignments ask students to employ commercial language models, which require them to register for a Gmail account, a Microsoft account, or an OpenAI account. We’ve already seen a number of corporations ban employees from using language models for fear that employees will divulge proprietary information. Until the technology companies producing the models offer much more stringent protections, industries such as finance, higher education, and medicine will not be able to use them in any large measure. Thus we expect that in the next few years (if not months), Microsoft and Google will introduce models with greater privacy protections built in for organizations. That said, we are temporarily in a state where access to models requires one of three things: 1) registering with commercial companies, which often requires divulging personal information (such as a phone number) and then further divulging information through prompting (best practices for the protection of student privacy would frown at this); 2) installing an open source language model on a private or institutional server and providing students with access, a step that requires a bit more technical know-how; or 3) using what is likely a smaller model hosted and accessible for free on a site like Hugging Face.

If you are bound by law or personal ethics to protect student privacy at all costs, you may need to help students use an open source version or wait until technology companies implement organizational solutions. For those instructors who do not mind asking students to experiment with commercial applications, we should note that most can do this without divulging much new personal information (e.g., if they already have a Google account they can use Bard). If students do express privacy concerns, instructors can work with them on a number of privacy protection strategies. Depending on the model, it might be possible for students to register with burner accounts (always a good idea with social media experiments in the classroom) and employ data pollution strategies to frustrate surveillance capitalism’s attempts to invade their privacy. We should also note that a number of applications allow students to connect to ChatGPT anonymously and without signing in. These applications come and go, and any we recommend may be defunct by the time of publication; they also sometimes require registering for another commercial service (e.g., Telegram or Discord). We trust students and instructors to work together, and we recommend that instructors provide alternate assignments if a student objects to using a commercial application.

Turning to the assignments themselves, we have grouped them into five categories to provide instructors with an orientation to the collection and to themes that will likely emerge as they begin integrating computational writing activities into their classrooms. The categories are: rhetorical engagements, AI literacy, ethical considerations, creative explorations, and professional writing. Most of the assignments are user-friendly and require minimal technical competencies. A few require both students and the instructor to have more prior knowledge and technical competence.

The assignments we have grouped under rhetorical engagements ask students to consider how computational machines have already become, and will continue to become, enmeshed in communicative acts, and how we work with them to produce symbolic meaning. Many of these assignments have comparative dimensions and/or ask students to analyze and work with the output of large language models. Aryal asks students to chat with a chatbot on a subject they're familiar with to analyze its "thinking" patterns, and Pardo-Guerra has students revise and annotate an AI-generated passage to consider how it excels and fails in its treatment of course concepts. Byrd’s assignment recognizes the current limitations of LLMs as text generators and has students experiment with automating processes of revision, while Booten’s work with prompt engineering provides students with the opportunity to develop “synthetic metacognition” via “iterating and tinkering with the instructions that guides the output of the LLM.” These assignments help students build the new rhetorical competencies enabled by LLMs and explore the possibility of using LLMs to enhance more traditional literacies.

The AI literacy grouping helps students to develop a crucial suite of critical thinking skills needed to work with emerging technologies: functional awareness, skepticism about claims, and critical evaluation of outputs. In a preliminary report on how language models might influence the labor market, researchers from OpenAI concluded that “critical thinking skills show a negative correlation with exposure [to automation], while programming and writing skills are positively associated with LLM exposure” (3). In other words, LLMs can automate writing tasks but not critical thinking tasks, a message that is not always clear in the over-hyped language now circulating. LLMs produce text, but without a user to prompt them with the right questions, and without a user to assess their output, they are deceptively worthless. Critical thinking matters more than ever, and sometimes this means peeking under the hood of the machines.   

Assignments from this group tend to focus on concepts that will help students understand how the machines work. Some of them require instructors to have some technical skills or familiarity with concepts from natural language processing. They all support instructors learning AI literacy alongside students. Egan asks students to produce a Markov chain to learn more about how probabilistic text generators work. Goodman takes students through the process of training an LLM and has them view its processes through a neuroqueer framework. Beshero-Bondar’s assignment introduces students to fundamental concepts of natural language processing, with an emphasis on word embeddings.

In the ethical considerations category, assignments are split between two primary foci—the first engages students in the institutional ethics of using LLMs in undergraduate classrooms, and the second attends to the ethical implications of LLMs and their outputs. In the first focus area, Fyfe takes a playful approach to serious questions of academic integrity, asking students to write a term paper using an LLM with the express purpose of fooling their instructor in a “Term Paper Turing Test.” Watkins emphasizes the production of an AI Standards of Conduct Framework with his students, creating clear ethical boundaries around LLM use in first-year writing courses. Relatedly, Frazier and Henley discuss how they adapted a pre-LLM assignment for a post-LLM world with an eye towards academic integrity, providing a model for other instructors looking to do the same. In the second focus area, the attention turns to the ethical implications of the general use of these tools. The opacity of the production, training, and outputs of LLM-driven software is among their biggest shortcomings (if not the primary shortcoming), prompting a necessary engagement with each of these opaque processes. Writers working with these systems should think carefully about what they are enabling when they use these tools. Jimenez asks students to look at their own social and cultural identities as they are represented (or not) in the outputs of LLMs, with an eye towards these systems’ tendencies to reproduce biases in response to prompt design. Whalen positions his creative assignment as a thoughtful rejection of LLMs for reasons of opacity, opting instead for a text generation assignment that is minimalist and fully transparent in its operations. The assignment also opens up ethical questions about why and why not to use different types of text generation technologies.

Creative explorations play around the edges of text generation technologies, asking students to consider the technical, ethical, and creative opportunities, as well as the limitations, of using these technologies to create art and literature. Many of these assignments look beyond our contemporary scene of LLM text generation and lend valuable context to our current moment, drawing from earlier technologies or historicizing connections. Emphasizing the constraints of LLMs, Luman draws an explicit connection between prompt engineering and the literary work of the Ouvroir de littérature potentielle (“Oulipo”) to articulate the need for precision in human writing, specifically in our role as instructors for the machine. Wu locates text generation in a larger tradition of found art and writing, asking students to create with found materials first using analog processes, then using the RiTa.js Markov library. Calhoun proposes a connection between Hoodoo as a Black Southern American spiritual practice and AI writing platforms, asking students to make conjuring toolkits and compare their own poetic spells with those generated by ChatGPT. In his “Curveship-js” assignment, Montfort uses a JavaScript framework to interrogate narrative discourse and variation. Easter and Sample both examine different creative genres with their text generation assignments: Easter asks students to use text and image AI software to generate a children’s book, while Sample prompts students to engage in combinatory writing using Tracery to make substantive social critiques through their poetry.

Finally, the section on professional writing presents assignments that enable students to understand how computational writing technologies might be integrated into workplace contexts. Unlike academic discourse, professional writing is not grounded in an ethos of truth-seeking and critical inquiry; it tends to be grounded in an ethos of efficacy as well as constraints of legality and workplace ethics. The pivot to orient around technologies of automation could therefore be more aggressive and the ground more fertile for uptake of AI, but this will also hinge on variables such as legal compliance, security concerns, and accuracy. Many professional writers hope to complete their own tasks as rapidly and efficiently as possible while retaining quality standards. If they can produce a document of similar quality with AI and it reduces time to completion, they will most likely adopt the technology, if allowed. But if quality is inconsistent, or if AI output requires more human intervention than human-generated text, or if a stigma around AI-generated text degrades its value, or if search engines can detect and downgrade AI-generated text, then professional writers may think twice or even be disallowed from adopting the technology.

However, instructors of professional writing still have openings for critical and ethical intervention as we prepare students to be effective communicators in the world of work and the civic sphere, especially as students begin adopting new writing technologies. Among this group, Eyman asks students to research and evaluate a range of text analysis and summarization tools to determine how capable the tools are at summarizing technical documents. McKee explores the use of AI in an assignment that asks students to make medical journal findings intelligible for lay audiences. Ranade helps students understand the tools AI provides in an assignment designed for a course on technical editing. Laquintano pits students against AI in an assignment to lower the reading level of a document, and students learn what's lost in translation as well as what's challenging about this common professional writing practice. Crider's assignment asks students to write then evaluate their peers' writing as AI text detectors, but with a twist. Ding helps students hone prompt engineering skills while they summarize, synthesize, and edit AI writing alongside doing their own research. Taken together, the assignments in this grouping provide an opening to help students respond to the trend toward seamless interaction between human and AI assistance in workplace writing.  

Conclusion

On the whole, the collection demonstrates that instructors (ourselves included) and students have much to learn and (re)learn if indeed we are on the brink of a paradigm shift in how writing gets produced. We need to be aware, though, that we have as yet few established best practices and few data-driven studies about how writers will implement these tools in their processes. Corporations release models on a far faster timeline than universities can revise policies, courses, and training—especially with little funding or energy to support such studies or retooling in the wake of the Covid-19 pandemic. Yet AI safety and response is now our concern as educators.

In his media blitz of the last year, Sam Altman, CEO of OpenAI and current mouthpiece for LLM advocacy, has spoken at length about the future of AI safety, including the need for government regulation and oversight. But his (real? feigned? misguided?) advocacy for AI safety was preceded by the work of many AI researchers who have alerted us to the dangers of large language models and generative AI. Emily Bender, Timnit Gebru, Margaret Mitchell, and Angelina McMillan-Major pointed out the problems with oversized models. Janelle Shane has used humor and the uncanny to lightheartedly critique the failings of generative AI. Meredith Broussard points to failings and limitations in AI's models of the world. Altman and other corporate leaders have repeatedly hyped their own products to argue that their impressive power demands collective decisions on safety parameters for AI alignment (i.e., the extent to which AI aligns with human values). We can read his message with cynicism (“let's all look at how great OpenAI is!"), we can note that his interviews and congressional testimony suggest he is dangerously naive about how social change happens and about the extent to which AI has already been weaponized against vulnerable populations, and we can be aware of how the foundational work on AI safety and ethics by AI researchers (many of them women) has been brushed aside for a narrative that promotes existential risk as our main concern (Troy).

Despite the complexities behind the motivations of corporations who are developing this technology and the differences in opinions among AI researchers, we believe that these tools are likely to be adopted rapidly in certain sectors of the writing economy in the coming months and years, and fostering student understanding of them is important. This instructional experimentation will collectively put us in a much better position to determine, to the extent that we are able, how these tools should be adopted, and how we might resist them when necessary.

Acknowledgements

Thank you to the staff at the WAC Clearinghouse, especially Lindsey Harding, who has championed, shepherded, and edited this collection. We appreciate the incredible dedication and ingenuity of the teacher-authors in this collection. Thank you to the anonymous reviewers who diligently read and offered comments on the introduction and the assignments. We also appreciate the educators out there who are working overtime to learn about technological tools as they influence the teaching of writing (some of you by reading this collection!)—despite underfunding, the Covid-19 pandemic, and now highly accessible generative AI. You made this collection possible. A backhanded thank you to OpenAI for releasing ChatGPT and instigating an AI arms race with little understanding of how LLMs will be exploited by malicious actors or weaponized against the poor. Keep on believing in that future of techno-utopianism! You made this collection necessary.

References (APA)

Anson, C. (2022). AI-Based Text Generation and the Social Construction of Fraudulent Authorship: A Revisitation. Composition Studies, 50(1), 37–46. https://files.eric.ed.gov/fulltext/EJ1361686.pdf

Bender, E., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021). On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜. FAccT ’21: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 610–623. https://doi.org/10.1145/3442188.3445922

Baron, D. (2009). A better pencil: Readers, writers, and the digital revolution. Oxford University Press.

Brandt, D. (2007). “Who’s the President?”: Ghostwriting and Shifting Values in Literacy. College English, 69(6), 549–571. http://www.jstor.org/stable/25472239

Brown, T. B., Mann, B., Ryder, N., Subbiah, M., & Kaplan, J. (2020). Language Models are Few-Shot Learners. https://arxiv.org/abs/2005.14165

Byrd, A. (2023). Where we are: AI and writing. Composition Studies, 51(1), 135–142.

Campbell-Kelly, M. (1985). Christopher Strachey, 1916-1975: A Biographical Note. IEEE Annals of the History of Computing, 7(1), 19–42. https://doi.org/10.1109/MAHC.1985.10001

Common Crawl. (2023). https://commoncrawl.org/

Foltynek, T., Bjelobaba, S., Glendinning, I., Khan, Z. R., Pavletic, P., Kravjar, J., & Santos, R. (2023). ENAI Recommendations on the ethical use of Artificial Intelligence in Education. International Journal for Educational Integrity, 12, 1–19. https://doi.org/10.1007/s40979-023-00133-4

McKee, H., & Porter, J. (2020). Ethics for AI Writing: The Importance of Rhetorical Context. Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society (AIES ’20), 110–116. https://doi.org/10.1145/3375627.3375811

McKee, H., & Porter, J. (2018, April 25). The Impact of AI on Writing and Writing Instruction. Digital Rhetoric Collaborative; Sweetland DRC. https://www.digitalrhetoriccollaborative.org/2018/04/25/ai-on-writing/

The National Commission on Excellence in Education. (1983, April). A Nation at Risk: The Imperative for Educational Reform. United States Department of Education. http://edreform.com/wp-content/uploads/2013/02/A_Nation_At_Risk_1983.pdf

Perrigo, B. (2023, January 18). Exclusive: OpenAI Used Kenyan Workers on Less Than $2 Per Hour to Make ChatGPT Less Toxic. Time. https://time.com/6247678/openai-chatgpt-kenya-workers/

Rescher, N. (2014). Leibniz's Machina Deciphratoria: A Seventeenth-Century Proto-Enigma. Cryptologia, 38(2), 103–115. https://doi.org/10.1080/01611194.2014.885789

Rettberg, S. (2018). Electronic Literature. Polity Press.

Riskin, J. (2016). The Restless Clock: A History of the Centuries-Long Argument over What Makes Living Things Tick. University of Chicago Press.

Swift, J. (1985). Gulliver's travels. Avenel Books. (Original work published 1726)

Thorp, H. H. (2023). ChatGPT is fun, but not an author. Science, 379, 313. https://doi.org/10.1126/science.adg7879

Troy, D. (2023, May 1). The Wide Angle: Understanding TESCREAL — the Weird Ideologies Behind Silicon Valley’s Rightward Turn. The Washington Spectator. https://washingtonspectator.org/understanding-tescreal-silicon-valleys-rightward-turn/

Turing, A. (1950). Computing machinery and intelligence. Mind, 59(236), 433–460. https://academic.oup.com/mind/article/LIX/236/433/986238

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., & Polosukhin, I. (2017). Attention Is All You Need. https://arxiv.org/abs/1706.03762
