[Image: typewriter keys and hammers with letters, from an early 20th-century typewriter]

Why I’m banning AI-generated text from my indigenous rights course

Okay, so I did the research and thought about it.

This has been the leading exhortation for faculty on managing the likelihood that students will use generative AI tools to write papers in their classes: try it out, consider how it might be useful, and write a very nuanced policy.

As designed, large language models (LLMs) like ChatGPT produce intelligent-sounding responses to a wide variety of queries. To do so, they are trained on billions of pieces of writing and develop a predictive model for the relationship between words. Because of this underlying corpus, the feedback provided by those billions of complete examples, and extensive rating by paid human assessors, they generate pleasing content that often uncannily resembles comprehension.
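To make that concrete, here is a toy sketch of next-word prediction. It is my own minimal illustration, nothing like ChatGPT’s actual neural architecture, of what a “predictive model for the relationship between words” can mean in practice: simple word-pair counts over an invented one-sentence corpus.

import random
from collections import Counter, defaultdict

# A one-sentence stand-in for the billions of documents an LLM trains on.
corpus = "the last speaker of the language taught the language to her granddaughter"

# Count which word follows which: a crude "predictive model for the
# relationship between words."
follows = defaultdict(Counter)
words = corpus.split()
for prev, nxt in zip(words, words[1:]):
    follows[prev][nxt] += 1

def generate(start, length=8):
    """Repeatedly pick a statistically likely next word. The result can
    sound fluent, but no comprehension is involved at any step."""
    out = [start]
    for _ in range(length):
        options = follows.get(out[-1])
        if not options:
            break
        choices, weights = zip(*options.items())
        out.append(random.choices(choices, weights=weights)[0])
    return " ".join(out)

print(generate("the"))  # e.g. "the language taught the last speaker of the language"

Even this toy version produces grammatical-looking strings it cannot understand; scale the counts up to billions of documents and billions of parameters, and you get both the fluency and the confident errors described below.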

I’ve maintained an open session to experiment with ChatGPT, poked and prodded at its limitations, explored how it remixed and regurgitated material I’ve written, taken (most of) an online prompt engineering training by a colleague on Coursera, and entered my writing assignment prompts to see what it comes up with.

And my considered answer is basically, “No.”

No, they shouldn’t use LLMs to replace search engines, library databases, or Google Scholar. No, they shouldn’t treat LLM output as a summary of the field of human knowledge. And no, students shouldn’t be submitting large language model-generated essays to my class.

In the end, the two main things I’m looking for in class essays are self-reflection and research. And while I can get the appearance of both from a large language model, the first is a lie and the second an uncertain and fragile illusion. Allow me to illustrate…

Reflective Essays by Chatbots

I prompted ChatGPT 4 to produce a number of short essays in response to prompts I give my students. And I used prompt engineering techniques I learned on Coursera to produce better content than a simple cut-and-paste of my prompt would yield. In reviewing the results, I found two unavoidable problems.

First, every time a student uses LLM output with the word “I” on a reflection assignment, they’re telling a lie. To wit:

This perspective is one that I found somewhat unfamiliar initially, but upon exploration, it unveils a complex narrative of cultural exchange that goes beyond the conventional Eurocentric perspectives often presented in mainstream history. … Upon deeper reflection, this concept has shifted my perspective on transatlantic interactions.

ChatGPT-generated text, August 22, 2023.

These sentences are neither true of the student (who hasn’t necessarily “explored” at all in prompting the LLM) nor of the large language model (which doesn’t necessarily have a perspective on history). But my teaching goal here is no more and no less than to have the student rethink their perspective. The assignment is about the process, not the product.

Second, even the best output I extracted from ChatGPT tended to be ungrounded in specifics: long on conclusions and short on evidence. Like this:

Conversely, this sanitization harms indigenous communities themselves by perpetuating stereotypes, erasing their contributions, and diminishing the ongoing challenges they face. By presenting a one-sided narrative, the mainstream version of history perpetuates harmful misconceptions about indigenous peoples, portraying them as relics of the past rather than as vibrant and diverse communities with rich cultural traditions and contemporary relevance. This erasure reinforces the marginalization and discrimination faced by indigenous individuals and communities, contributing to the cycle of invisibility and cultural appropriation.

ChatGPT-generated text, August 22, 2023.

It’s this kind of writing that always frustrates me the most as a grader. Okay, there’s ethical alignment with conclusions I might reach, but it’s also full of zombie nouns, and it returns again and again to the abstractions raised in the prompt. It summarizes the relations between concepts rather than making the reader feel anything. By telling but never showing, it never forces the reader to confront particular facts or encounter particular people, and it never demonstrates deep engagement with a person from history, a historical tragedy, or a community’s loss. Adequate though not excellent as a conclusion, it’s exactly the wrong place for a student to start their writing process.

These are not new problems, of course. They’re literally the same problems raised by cheating (consensually borrowing someone else’s words or abilities) or plagiarism (borrowing someone else’s written thoughts). Yet both cheating and plagiarism are terms in the realm of competition, of obtaining an unfair advantage over others. My concern lies elsewhere: with a student not taking on the challenge offered by the course, not doing the work that I think offers them something valuable apart from a grade, and yet striving to give the impression that they did.

These words were already on my syllabus:

While collaboration in working out the meaning of course readings is encouraged, you should not work with other students on formulating your response posts. Invest the necessary time to be able to provide your own answers. While you are encouraged to use one another as resources for finding sources and references for the Wikipedia and essay assignments, your words and thoughts must be your own, based on your own reading of the materials. If your topics do not overlap, you are encouraged to use one another as resources for editing.

Lastly, and most importantly, every assignment in this course has as its purpose to encourage you to reach your own synthesis of the facts, to thoughtfully reflect, and to formulate and express your own opinions and perspectives. The value of your writing for this class—and for yourself—ultimately depends on cultivating this ability, for which no amount of well-cited facts and wise opinions of others can be a substitute.

ANTH 3134 Syllabus

There’s another reason for concern, however, and it has to do with content…

Chatbots Recycle Existing Narratives

One way I was pretty sure ChatGPT would go wrong was by reusing tired tropes about disappearing Indians to frame issues. It’s already part of my teaching to warn students about this, and we devote an entire day to studying how false narratives are baked into the language used to write American history.

While ChatGPT will reject requests to write about “the last Maori” or “the last Huaorani”, images conjured by Jimmy Nelson’s photo book Before They Pass Away, it was more than willing to tell me about “the last Yaghan,” a side effect of coverage of Indigenous elder Cristina Calderón’s notoriety as the purported final speaker of the language, which she passed on to her granddaughter and other members of this community of some 1,600 individuals. ChatGPT writes of this living community in the past tense:

The Yamana people, also known as the Yaghan, were an indigenous group native to the southernmost regions of South America, specifically inhabiting the Tierra del Fuego archipelago, which encompasses parts of modern-day Chile and Argentina. Their unique culture, language, and way of life were intricately tied to the challenging and harsh environment of the region. Over time, due to various factors including colonization, disease, and cultural assimilation, the Yamana population dwindled, and their language, once vibrant, was endangered. The last speakers of the Yamana language, also known as Yagan, hold a significant place in the history of linguistic and cultural preservation.

ChatGPT-generated text, August 22, 2023.

Similarly, a more neutral prompt about “the population future of the Waorani people” yielded…

Historically, the Waorani led a fairly isolated existence, but contact with outsiders, especially since the mid-20th century, has brought profound changes. This contact has come in the form of missionaries, oil companies, and even eco-tourists. The two most significant impacts of this contact are exposure to diseases, to which the Waorani had no immunity, leading to significant population decline, and the gradual encroachment on their traditional territories by external entities, particularly oil companies. … Deforestation, water contamination, and disturbance of wildlife directly impact the Waorani’s ability to live off the land, potentially leading to displacement, altered diets, and changes in their way of life. … However, it’s not just external factors that will determine the population future of the Waorani. Internally, the tribe is at a crossroads. Many Waorani youths are attracted to the conveniences and opportunities that modern cities offer. The allure of education, modern healthcare, employment, and the broader world is strong, leading many to migrate to urban areas, contributing to the potential decline of the population in traditional settlements.

ChatGPT-generated text, August 22, 2023. (emphasis mine)

As a mash-up of writing on recent Waorani history, this was adequate. But the missing piece? The Waorani population has been rising steadily, from a low point of under 300 in the 1980s to well over 2,500 today. Asked directly, ChatGPT was able to fabricate a population trajectory that was at least upward:

It’s important to note that exact population figures for the Waorani can be difficult to pinpoint due to their semi-nomadic lifestyle, the vastness and inaccessibility of their territories, and inconsistent methodologies in various surveys. But, based on my last update in 2021, here’s an approximation of their population in the last five decades:
1970s: Approximately 600-700 individuals.
1980s: Approximately 1,000 individuals.
1990s: Approximately 1,500-2,000 individuals.
2000s: Approximately 2,500-3,000 individuals.
2010s: Approximately 3,000-4,000 individuals.
Please note that these are rough estimates.

And when pestered that this was the exact opposite of its essay, it came up with:

The population trend for the Waorani people, based on the approximate numbers provided for the last five decades, indicates a consistent growth.

Of course, I was only able to lead ChatGPT out of the misconception of a declining Waorani population because I knew better. The model is equally happy to follow false suggestions, such as the notion that oil development, on the horizon but never activated, has “caused devastation” in Bolivia’s Isiboro-Sécure National Park and Indigenous Territory.

Prompt: Write an essay on how the population of TIPNIS has been affected by oil development.
ChatGPT:
TIPNIS, or the Isiboro Sécure National Park and Indigenous Territory, located in Bolivia, represents a unique confluence of pristine natural environments and indigenous cultural significance. Sprawling across more than 12,000 square kilometers, TIPNIS is home to various indigenous groups, including the Moxeño, Yuracaré, and Chimané. Over the years, the territory has found itself at the forefront of debates surrounding indigenous rights, environmental protection, and the demands of national economic growth, primarily due to oil development. The exploration and extraction of oil, and the subsequent infrastructure developed to support these activities, have had profound impacts on the population of TIPNIS: (followed by a six-point list)

(For accurate information on the looming future threat of oil extraction in TIPNIS, see this article.)

Research is the art of finding out whether and, if so, how an important, general phenomenon plays out in a specific circumstance. These responses show that what large language models generate is not research, but the merging of existing, human-written general conclusions with characteristic local details. This makes LLMs a very risky research tool, capable of reinforcing wrong hunches and leading us down the wrong track.

Here’s the warning I’m sharing on my syllabus:

A further note on large-language models and this course
One focus of this course is unlearning commonly circulated misinformation and misunderstandings about Indigenous people. The training data for large language models includes these ideas in large quantities. Their output should be approached as an object for criticism rather than a starting point for research.

Syllabus for Human Rights of Indigenous Peoples course

How It Feels to Say “No” This Time…

Large language models have shaken up public perceptions of the possibility of software doing intellectual work, and offered a quick-fix tool for meeting others’ expectations for text content. Where the form is the point, they can sometimes synthesize text that looks great, and they are likely to be important tools for technical writing when heavily supervised and error-corrected by an expert in the topic at hand. (At least that’s my impression from my parallel explorations of AI as a support for coding in R, but that’s another essay.)

But in a teaching context, chatbots’ stream of text cannot substitute for the student’s own thoughts. In this they replicate the danger posed by the over-eager, people-pleasing student who thinks they know what sophisticated writing looks like and who plays to the professor’s opinions instead of immersing themselves in the evidence and its details, telling effective stories, and being honest. I’m hoping that being attuned to the problems of generative AI will help me give better feedback to those students as well.

In a research context, I foresee a soup of truthful information, conjecture, and overgeneralization entering into writing. We are going to have to be so much more careful about tracing sources and verifying evidence. Both grading and peer review are about to get more cognitively challenging.

Tomorrow, at the front of my classroom, I’m going to be sharing what I’ve learned in looking at generative AI, and making an appeal to think carefully about how these tools can mislead. And just as importantly, I’ll be making a plea to my students to consider what can be gained by thinking through situations on our own: the value of the time that they will spend in thought, rather than save by having a machine write their texts.
