Using AI to ‘decolonise’ language · Global Voices
Forus

Illustration by Both Nomads/DALL·E 2, used with permission.
We need to find a new language, new narratives, a new world view and the tools to create the words to overcome the “grand humanitarian narrative.” Can technology help?
Many of us this past month have heard about the “Matrix”-like functions of the new AI-powered ChatGPT, which has provoked reactions ranging from fear and shock to enthusiasm.
The use of AI remains niche for most civil society organisations and activists. Yet, several initiatives that use emerging technologies and artificial intelligence to “decolonise” language are appearing – especially in the development sector.
The question of whether development language is colonial is a complex and debated one. The imposition of European languages and cultures on colonized peoples has been a key component of the process of colonialism. Colonial powers imposed their tongues, often leading to the erosion of indigenous languages. Development language can be seen as a continuation of this legacy, as it often excludes indigenous and minority languages, reinforcing patterns of linguistic and cultural dominance.
“Language is a critical part of a community's identity and heritage. As we lose languages, we lose a part of our collective history and culture. Technology can help us preserve and promote these languages, but we must do so in a way that is respectful and inclusive,” says Anju Mary Paul, research lead at FirstVoices, a platform that includes an AI-powered language archive, which allows users to search and access audio and video recordings of indigenous languages.
“The fact that we use French or English as development languages whereas in certain communities like in Africa we have several tribal languages is a problem. What happens when people do not understand the language we are using?” asks Dr. Moses Isooba, executive director of the Uganda National NGO Forum (UNNGOF).
Isooba is part of a team developing a language and lexicon prototype using AI to enable more inclusive communication that is less (or not) implicitly neo-colonial, sexist or racist. The idea is to explore the application of AI to redress and replace the lexicon of pejorative jargon, idioms and terminology “imposed” on civil society organisations involved in the international aid complex and the development sector.
The team is designing a lexicon to shift power in two ways: first by abolishing problematic language, while encouraging more inclusive communication using a new find-and-replace style function, and second by enabling people to use their own local language by testing their work with Luganda, the language commonly used in Uganda.
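To make the find-and-replace idea concrete, here is a minimal Python sketch of how such a function might work. It is not the team's prototype: the flagged terms are ones this article cites, while the suggested replacements and the names `LEXICON`, `build_pattern` and `suggest` are hypothetical placeholders chosen only for illustration.

```python
import re

# Hypothetical lexicon. The flagged terms appear in this article; the
# replacements are illustrative assumptions, not the team's actual choices.
LEXICON = {
    "in the field": "in the community",
    "desk officer": "programme contact",
    "country office": "national office",
}


def build_pattern(lexicon):
    """Compile one case-insensitive pattern that matches any flagged term."""
    # Longest terms first, so a longer phrase wins over a shorter overlap.
    terms = sorted(lexicon, key=len, reverse=True)
    alternatives = "|".join(re.escape(term) for term in terms)
    return re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)


def suggest(text, lexicon=LEXICON):
    """Return the rewritten text and a list of (found, replacement) pairs."""
    pattern = build_pattern(lexicon)
    changes = []

    def swap(match):
        found = match.group(0)
        replacement = lexicon[found.lower()]
        changes.append((found, replacement))
        return replacement

    return pattern.sub(swap, text), changes


if __name__ == "__main__":
    rewritten, changes = suggest("Our desk officer is in the field this week.")
    print(rewritten)
    for found, replacement in changes:
        print(f"flagged '{found}', suggested '{replacement}'")
```

A real tool would more likely propose alternatives for a human to accept or reject than substitute them silently, which is why the sketch also returns the list of flagged terms. It matches exact phrases only, so plurals and inflected forms would need their own entries.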
Terms such as “desk officer” or “in the field” are reminiscent of the colonial era. International humanitarian organisations often refer to their offices in the Global South as “country offices,” akin to colonial outposts “acting as the remaining rope tying us to the colonial ship,” Dr. Isooba says. He further explains:
“Decolonising language is critically important because language sets the terms of engagement between the north and south. In as much as possible language should be depoliticized and decolonized. Development and humanitarian assistance are political, but our language should remain nonpartisan. Political parties will come and go, but inequalities will remain if our language begins to mirror that of the people in power, especially if the intentions behind the rhetoric are disingenuous.”
“Development language has become so full of jargon that even our own families don’t understand what we are working on anymore,” explains Harsh Jaitli, of the Voluntary Action Network India (VANI).
Language is one of our oldest tools, for better and for worse. In an online universe plagued by misinformation and attacks, various activists and civil society organisations are now focusing on “new narratives” to expand rather than shrink civic space.
In the past few years, many initiatives and organizations have been working at the nexus of AI and language. Masakhane, for instance, is a grassroots, community-driven initiative working to build Natural Language Processing (NLP) tools and models for African languages, with the goal of preserving and promoting them in the digital age. Interpretation and translation are also part of the picture, and organisations are broadening the range of resources available, for instance with a free Chichewa-English translation glossary of development terminology. AI for the People focuses on advocating for more inclusive and equitable development and deployment of AI technologies. It also offers resources and training for communities to engage with AI.
Yet, according to NeuralSpace, a platform that helps bridge the massive language gap in tech, 90 percent of NLP solutions are exclusively available in European languages. AI has the potential to break down language barriers for the remaining six billion people by countering this form of “digital apartheid.” How? By bridging language barriers with complex translation systems or language learning apps, for instance. Other ways include the creation of digital archives involving Optical Character Recognition or speech recognition and by making content available in native languages rather than the omnipresent “colonial” ones.
But all this sounds too good to be (just) true. Before AI can decolonise language, AI itself must be decolonised.
AI has the potential to transform how we communicate, but it is also a perfect vehicle for amplifying existing biases. “We need to address issues of bias in language models and ensure that they are inclusive and representative of all users,” says Soujanya Poria, assistant professor at Nanyang Technological University.
The language AI is trained on might already be “distorted” and incapable of replacing biased human thinking and behaviour.
“The decolonization of language is critically important for empowering communities and challenging dominant narratives. Technology has the potential to be a powerful tool in this process, but we must be careful not to replicate the same patterns of exclusion and marginalization that we are trying to overcome,” says Dr. Isooba.
We asked ChatGPT whether or not AI could replicate colonial, racist or sexist language and this is what it answered:
As an artificial intelligence language model, I am not capable of holding or expressing personal beliefs, values, or biases. However, I acknowledge that the data used to train me may contain biases and I strive to provide information that is accurate, unbiased, and inclusive. It is important to note that it is people who program and use AI, and they can unintentionally embed their own biases in the development and use of AI systems. Therefore, it is crucial to be aware of these potential biases and strive to create and use AI systems in an ethical, inclusive, and responsible manner.
In the words of Timnit Gebru, of Black in AI, we need to move away from thinking about “clean” and “dirty” data and towards thinking about data that reflects the diversity of the real world.
“One of the best ways to do this is to build diverse teams of data scientists who are able to recognize and address their own biases,” she says.
Sabelo Mhlambi and other scholars recently drafted the Decolonial AI Manyfesto, to “move beyond Western-centric biases” and build “dignified socio-technical futures.” Civil society, as well, is increasingly involved in AI development by actively participating in public discussions and debates around the ethical, social, and political implications of AI technologies. Yet, several civil society leaders and activists, such as Mwarimū Mūtemi wa Kīama from Kenya, arrested for his tweets, argue that the purposeful exclusion of marginalized voices from tech decision-making processes is a form of structural violence.
“We need to recognize that access to technology is a human right and work to ensure that everyone has a seat at the table,” says Kīama.
Dianne Olivan, a digital rights and gender campaigner, tells us how the marginalised are “sealed off” from decision-making environments because these spaces move within a neocolonial framework:
Those who usually participate in these conversations also have the privilege to participate; they have the economic and cultural capital to contribute and thrive. There are of course attempts to be representative and inclusive, but we should ask: Where are these conversations happening? What do I need to have to participate? In what language are they sharing meaning? Whose voices are the loudest in the room? What happens after these conversations, what is the accountability framework?
After these “consultative” convenings in digital rights policymaking and internet governance, we will all go back to the same digital power structures that allow and disallow the information we access and retain, and what identities are allowed to exist in these platforms, says Olivan:
Truly inclusive digital communities and infrastructure require transformative lens from their inception – they reject a capitalist internet. They are driven by creativity, sharing, care, consent, and well-being; they challenge and circumvent infrastructures of surveillance. And we will build that by first, shifting our focus from working towards being accepted by the loudest, most articulate voices in a stingy meeting room, to holding space and looking for those who are not even able to go through the door.