Generating Culture? How AI Affects Representation and Memory

In early 2024, my peers and I had the privilege of hearing Dr. Gita Chadha speak at Champaca Bookstore, Bengaluru, about feminist perspectives in STEM. Dr. Chadha, a professor of Sociology and the Obaid Siddiqi Chair at the National Centre for Biological Sciences, succinctly pointed out to an attentive, enraptured audience that science has been and will continue to be a product of culture: contrary to those who claim it is strictly positivist, it feeds on and is shaped by the biases and imaginations of those who study, implement, and propagate it. This idea might seem radical to some, vague and under-established to many, and commonplace to others, who are familiar with the fact that human beings, as social and cultural creatures, rely on their cultural perceptions to make sense of the world around them.


Cultural imagination can consist of humanity's collective perceptions of the past, understanding of the present, and longings and fears for the future. It refers to a shared set of ideas, symbols, values, assumptions, and narratives that shape how people in a given society or group perceive the world and relate to each other. It emerges from, and is sustained by, culture, history, language, and social interactions. Necessarily, then, the scientific and technological products of culture also cannot escape the clutches of ideology and dominant cultural views.


History is written by those who wield the pen. Dominant perceptions are predisposed to being sculpted by those in power. In the early twentieth century, the study of history as a discipline saw a shift from a singular "history" to the acknowledgment of more pluralistic and diverse "histories". Postmodernist scholars challenged grand narratives and drew attention to the silences, absences, and exclusions within dominant accounts. Increasingly, however, as generative large language models come to be used as information retrieval engines, we are seeing a return from multiplicity to singularity: a single, all-too-powerful technology, reliant on the most accessible data, being taken at face value and allowed to form and alter entities that are innately human.


Narrativising Science


To understand why cultural imaginations shaped only by dominant perspectives are problematic, one need only consider the fact that this is not the first time cultural perceptions have deferred to hegemonic narratives. Consider the accounts of the American Civil War by white historians and fiction writers, and depictions of slavery that are guilty of bolstering what is now called the Happy Slave Myth. Not only did this perception of slavery as something enslaved people were content with seek to excuse century after century of atrocious racial crimes, but it also hinged upon a racial essentialism: one that claimed that enslaved races had God-given traits of biological strength and subservience that rendered them fit to be subjugated by intellectually superior white masters. Such narratives later formed the basis of the racial purity and supremacy that the Nazi regime in Germany actively propagated during the Second World War, going so far as to fund scientific research aimed at legitimising these beliefs.


Caste in India, too, was explained in seemingly scientific terms by anthropologists like Herbert Risley. In his book The People of India, Risley argued that caste was essentially race. He used nasal index measurements to claim that upper castes had more "Aryan" features, and helped entrench the belief that the upper castes were racially superior because of their perceived association with a "superior" race. Edgar Thurston in the Madras Presidency also trained the police in the use of what he deemed scientifically valid anthropometric analysis to identify "criminal" castes. This attitude, fortified by centuries of caste oppression in the Indian subcontinent and reinvigorated by "scientific" colonial interventions, resulted in the maintenance of caste as a pervasive force in modern India. Even though these findings were later acknowledged to be pseudo-scientific, their association with scientific research helped embed them further into a society that ostensibly values truth-seeking.


Science, thus, when not guided by humanitarian principles, has often been reduced to a henchman of bigotry and prejudice. The massive technological changes of the twenty-first century are greatly promising, but it would perhaps be wise to interrogate what assumptions and exclusions these technologies are epistemologically reliant on.


The Exemplar and the Excluded


To start with, it is necessary to make one thing clear, even though it may strike some as redundant to point out: a large language model is not a search engine. One simply fetches information from the web and orders it in admittedly hierarchical ways that depend on an algorithm. The other entails a restructuring, representation, and regeneration of the data available to it in order to generate something new, a process that naturally involves exclusions, omissions, and distortions.
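To make the contrast concrete, here is a deliberately minimal sketch in Python. The corpus, the word-transition probabilities, and the skew towards "urban" are all invented for illustration; no real search engine or language model works at this scale. The point is only structural: retrieval returns stored text verbatim, while generation recombines learned statistical patterns into something new.

```python
import random

# Toy "search engine": stores documents and returns them verbatim,
# ranked by a crude relevance score (words shared with the query).
CORPUS = {
    "1991 economic liberalisation": "In 1991, India liberalised its economy.",
    "1990s telecom revolution": "The telecom revolution reached Indian cities.",
}

def search(query: str) -> list[str]:
    overlap = lambda key: len(set(query.split()) & set(key.split()))
    ranked = sorted(CORPUS, key=overlap, reverse=True)
    return [CORPUS[key] for key in ranked]  # nothing new is produced

# Toy "language model": samples the next word from learned probabilities.
# The skew towards "urban" mirrors a skew in the (invented) training data.
NEXT_WORD = {
    "life":  {"in": 1.0},
    "in":    {"urban": 0.8, "rural": 0.2},
    "urban": {"india": 1.0},
    "rural": {"india": 1.0},
}

def generate(word: str, steps: int = 3) -> str:
    out = [word]
    for _ in range(steps):
        options = NEXT_WORD.get(out[-1])
        if not options:
            break
        out.append(random.choices(list(options), list(options.values()))[0])
    return " ".join(out)  # a statistical recombination, not a stored document

print(search("1990s telecom"))  # returns the archived sentence unchanged
print(generate("life"))         # usually "life in urban india"
```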


Much has been said and written about the fact that AI image generation produces generalised, homogenised outputs that make certain assumptions about the representative prototypical entities of each category. In doing so, generative AI closely mirrors a probabilistic model of meaning-making. Eleanor Rosch's Prototype Theory of Meaning posits that meaning is formed and cognition is structured around a central, prototypical entity that best represents a category.


For example, if one were to map out the meaning of the word "bird", one would think of the features that an "ideal" bird supposedly possesses: the ability to fly, feathers, a beak, and the capacity to sing or lay eggs. The entity that has most of these attributes would be the central, prototypical exemplar of the category "bird". Other entities that share fewer of these features, such as a penguin or an ostrich, would be situated farther from the prototype; their inclusion in the category is thus probabilistically weaker.


Fig. 1 Probabilistic Mapping of “Birdiness”, from a robin as the central exemplar to an ostrich as a peripheral instance (Source: Words in the Mind by Jean Aitchison)
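Rosch's idea can be stated almost arithmetically. The following is a minimal sketch, assuming a toy, hand-picked feature set; real categorisation is of course far richer, and the feature list here is purely illustrative.

```python
# Prototype-style category membership as feature overlap with an "ideal" bird.
# The features and scores are illustrative, not a claim about real cognition.
PROTOTYPE = {"flies", "feathers", "beak", "sings", "lays eggs"}

CANDIDATES = {
    "robin":   {"flies", "feathers", "beak", "sings", "lays eggs"},
    "penguin": {"feathers", "beak", "lays eggs"},
    "ostrich": {"feathers", "beak", "lays eggs"},
}

def birdiness(features: set[str]) -> float:
    # Share of the prototype's features this entity possesses:
    # 1.0 sits at the centre of the category, lower values at the periphery.
    return len(features & PROTOTYPE) / len(PROTOTYPE)

for name, features in CANDIDATES.items():
    print(f"{name}: {birdiness(features):.1f}")
# robin: 1.0, penguin: 0.6, ostrich: 0.6 -- weaker, more peripheral members
```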

The most evident challenge to Rosch's Probabilistic Model is that prototypicality is also contextually dependent. How are we then to decide what the prototypical "bird" is? One could argue that generative models, similarly, designate a prototypical exemplar based not on context but on statistical majority or culturally dominant depictions. Thus, asking one to generate a picture of a woman will typically produce a white woman, asking it to generate an Indian name will typically produce a Hindu Savarna name, and asking it to generate an image of a doctor will typically produce a male-presenting individual, unless specified otherwise.


Fig. 2 Image of a male doctor next to a man and a woman, generated by GPT-4o given the prompt: “Outside an operation theatre, a doctor sadly walks to a patient's family.”
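A small numerical illustration of that collapse, with frequencies invented purely for the sake of the example: when a model favours the most probable depiction, minority depictions can vanish from the output entirely, even though they are present in the data.

```python
from collections import Counter

# Hypothetical distribution of "doctor" depictions in some training corpus;
# the numbers are invented for illustration.
depictions = ["male"] * 70 + ["female"] * 25 + ["nonbinary"] * 5

counts = Counter(depictions)
probabilities = {k: v / len(depictions) for k, v in counts.items()}
print(probabilities)  # {'male': 0.7, 'female': 0.25, 'nonbinary': 0.05}

# Greedy, most-probable selection: the statistical majority wins every time,
# so depictions present in 30% of the data never appear in the output.
print(counts.most_common(1)[0][0])  # 'male'
```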

It follows, then, that if used as an information retrieval model, Generative AI will draw from easily accessible, mainstream, dominant data. For the purposes of this blog and its focus on cultural memory and imagination, ChatGPT was prompted about life in the 1990s in India.


When asked what the 1990s were like in India, ChatGPT responds with the major economic, political, and pop-cultural events of that decade: geopolitical conflicts, globalisation, and the telecom revolution. When asked what life was like in 1990s India, it describes the experiences of an urban middle-class family. It touches upon rural life largely in terms of negations: how it was not like urban life, and how it was lacking. Although it acknowledges that experiences were varied and diverse, it largely focuses on the probable life of what it considers a default citizen of India: an urban, middle- or upper-middle-class, Hindi-speaking person who is presumably Hindu, upper caste, cisgender, and heterosexual, since there is no mention of how caste, religion, gender non-conformity, or sexual orientation may have factored into individual experiences. This response may be found here.


This standardisation of experience is a representation of what will be remembered, and thus of what will be archived as culturally legitimate and sedimented as the one true, singular "history". If tools like ChatGPT become widespread means of remembering, storytelling, and education, then the memory they produce and reproduce, especially about events, lives, and contexts that resist easy representation, will be skewed toward what is already visible, palatable, or statistically dominant. If the digital is increasingly where memory lives, if retention is outsourced to digital models, then what we are witnessing is the algorithmic narrowing of cultural imagination.


Sounds Drowned Out


Another article on this website, on language revitalisation using AI, discusses how the Māori language was brought back to life through sustained community efforts, only to be later co-opted by commercial actors for their own gain. The article rightly states:


“Technology can be a powerful tool, but it should always serve the community, not replace its voice.”


It explores the problem of depriving communities of agency over their language in great detail. Given that AI-assisted language revitalisation is no longer a distant hypothetical, there are some considerations that ought to be taken into account.


At present, large language models do not do a very good job of handling even institutionally recognised languages. They still hallucinate and misgenerate syntax and lexical items for languages like Gujarati, because these remain low-resource languages.


Fig. 3 An example of ChatGPT hallucinating a non-existent phrase “thodu javaal chhe” in Gujarati

Not only does GenAI hallucinate and make claims about low-resource languages, but people have increasingly started noticing that it can invent historical origins and cultural meanings for random made-up phrases even in dominant, high-resource languages, instead of acknowledging, as a search engine effectively would, that they do not exist.


Fig. 4 A Reddit post about ChatGPT's very convincing response to a question about a made-up expression

Needless to say, if more and more people start using LLMs to learn languages, then a language may be altered by these hallucinations. One might argue that technological interventions have always affected language, and that there is nothing particularly alarming about this: change is, after all, a natural part of language evolution. Notably, dynamic and living languages like English, and even Gujarati, which have vast language communities and institutional recognition, would not be shortchanged if they incorporated a few expressions or lexical-grammatical elements as a result of AI. They are in dominant enough positions for their language communities to accept or reject these changes organically, without any harm to the language. In the case of endangered languages, however, such hallucinations may dilute the already scarce data available on the language. There is also a risk of imposing dominant linguistic trends on already vulnerable languages, suppressing them further.


Languages are repositories of cultural knowledge. One documents a language to honour the multiplicity of voices– to acknowledge that the human experience is diverse, polyphonic, and carries a range of lived realities and epistemological contributions. When this documentation becomes inundated with the patterns of dominant languages, it undermines the very goal of capturing that diversity. A vulnerable language cannot challenge or resist the imposition of dominant linguistic patterns introduced by AI; it lacks agency in this dynamic. As a result, its documentation and revitalisation risk becoming subservient to the logic and structures of dominant languages, rather than reflective of its own unique worldview and cultural logic.


Who Lives? Who Dies? Who Tells Our Story?


We have established how cultural imagination is relevant to STEM as a discipline, and examined how artificial intelligence stands to alter how we engage with our collective experiences and memories. This development, like many others, is not as innocuous as it seems.


Designed to convey a snapshot of life on Earth to any extraterrestrial life that might encounter it, the Voyager Golden Record, a 12-inch gold-plated copper disc launched aboard the Voyager spacecraft in 1977, contained cultural artefacts meant to capture the emotional and artistic breadth of humanity. However, it understandably had to exclude countless traditions, subcultures, and marginalised voices in order to present a selective and often idealised version of global culture. Its contents were inevitably shaped by what those in charge believed was worthy of representing all of humanity. An act of selection is inevitably an act of exclusion. That cannot be helped, but it can certainly be something we stay wary of as a generation heralding a global technological renaissance.


In bringing up the Golden Record, the intention is to circle back to the fact that technology is cultural, and culture responds to technology. The tools we build are not neutral; the hand holding the pen, however careful it deems itself, can smudge out entire ways of seeing and being. The least we can do to make sure we don't narrow our minds in the face of singularity and digital homogenisation is to keep asking: Who gets to be represented? Who gets to be remembered? And who gets to remain?

