AI for the world, or just the West? How researchers are tackling Big Tech's global gaps

April 12, 2025

Since the launch of OpenAI's ChatGPT in 2022, artificial intelligence (AI) has woven itself deeply into the fabric of our daily lives. However, the spotlight often shines on AI products designed with American and European audiences in mind, despite claims that they are universal tools that democratize access to technology. From the applications they serve to the languages they support, these tools are not always as global as they seem.

Across Africa, researchers and technologists are pushing back against this trend, challenging the status quo and the broader power dynamics within the AI industry. Their work seeks to shift the focus towards solutions that genuinely cater to local needs and communities.

A Global AI Power Imbalance

The Distributed AI Research Institute (DAIR) stands as a beacon of change, an international collective dedicated to "independent and community-rooted AI research free from Big Tech's pervasive influence." I had the opportunity to speak with DAIR members who are crafting AI solutions tailored specifically for African contexts, addressing societal needs rather than the interests of multinational corporations or predominantly Western users.

Nyalleng Moorosi, a senior researcher at DAIR based in Lesotho and a founding member of Deep Learning Indaba, is one such trailblazer. Her background in machine learning and her experience teaching in South African public schools have shaped her views on equity in technology. As a former educator at the University of Fort Hare, one of the few universities in South Africa that admitted Black students during apartheid, she saw firsthand how poverty affected students' educational journeys. "It was mind-boggling to imagine doing the things that I did through undergrad and post-grad burdened by so much insecurity," she reflected.

After her teaching stint, Moorosi joined Google as one of the first employees at the Google Africa AI research lab in Ghana. Her role as a software engineer allowed her to develop methodologies and technologies aimed at ensuring responsible AI development. "I joined Google because they were building an office in Africa, and I wanted to be in Africa," Moorosi explained. "I didn't want to just go to Google. I wanted to go to Google Africa."

However, a conversation with Timnit Gebru, DAIR's founder and a former co-lead on Google's ethical AI team, prompted Moorosi to question whether Google was the right platform for the kind of equity-focused work she envisioned in machine learning. This led her to join DAIR, where she and Gebru sought to empower communities historically sidelined by the tech industry by keeping and funding local experts on the ground.

DAIR's AI Study

In 2018, Moorosi, Gebru, and DAIR fellow Raesetje Sefala embarked on a project to analyze satellite imagery of South African townships—historically working-class neighborhoods populated by Black residents. Their aim was to understand how these areas had evolved since the end of apartheid. They compiled a dataset to assess whether the quality of life for township residents had improved over time.

South African townships, located on the outskirts of cities, often suffer from underdevelopment and poorer living conditions compared to wealthier suburbs. The government's census data, which tends to favor more affluent areas, has rendered township data nearly invisible, perpetuating spatial apartheid and limiting access to essential services like healthcare, education, and green spaces.

DAIR's research faced challenges due to the limitations of existing South African AI models, which struggled to differentiate townships from suburbs. To overcome this, the researchers utilized millions of satellite images and geospatial data to train machine-learning models. These models successfully categorized areas into wealthy, non-wealthy, and nonresidential building clusters, including vacant land or industrial zones.

Despite these efforts, DAIR faced resistance when attempting to publish their findings. Predominantly white Western academic institutions criticized the study as merely geographical rather than machine-learning research. Moorosi expressed frustration: "We use the same metrics, algorithms, and communication methods, including plots and everything. It's so crazy because many toy datasets were being used then, but we had this dataset about actual things, and it was too niche."

Yet, Moorosi emphasized the study's relevance: "This tracking of how historical segregation affects how we live is present in many ex-British colonies. It's in Nairobi. It's in Lagos. In the colonies, it was standard that the white people lived there and the black people lived there. And the distribution of resources was different between there and there."

She highlighted that the study's content, rather than its quality, seemed to undermine its recognition in a Western-dominated industry.

Providing for Underserved Communities

Asmelash Teka Hadgu, co-founder and CTO of Lesan AI and a research fellow at DAIR, further underscored this point. He discussed Lesan, a tool designed for translating and transcribing Indigenous African languages. Unlike US-based tech giants, Lesan AI focuses on low-resource languages like Amharic and Tigrinya. Hadgu's personal connection to these languages enabled him to build a robust dataset using repurposed local newspaper and radio content.

In the African context, popular language models from tech giants like OpenAI and Anthropic fall short in representing the continent's diverse linguistic landscape. According to Wei Rui Chen's paper, "Fumbling in Babel: An Investigation into ChatGPT's Language Identification Ability," African languages receive the least support. "OpenAI's ChatGPT is utterly broken, not slightly wrong, but creating gibberish in languages such as Amharic and Tigrinya," Hadgu noted. "Yet, they're still doubling down on that old way of thinking that centers on finding solutions for English first. And assuming other languages will catch up."

Lesan aims to bridge this gap by providing accurate translations for millions of users, opening up web content to these communities. Hadgu emphasized that these languages are not mere add-ons: "We don't spend 95% of our resources on a handful of languages and then work on what they term as long-tail languages."

Western AI companies struggle to represent low-resource languages adequately because these languages are less available for data scraping online, particularly when compared to English-dominated content. Additionally, the data used to train AI models is predominantly from Europe and North America, with only a small fraction coming from Africa, according to a study by the Data Provenance Initiative.

Hadgu criticized the approach of projects like Facebook's No Language Left Behind, which he described as relying on "convenience" data scraping and automated methods. He noted that African languages receive minimal funding compared to English-focused initiatives. Bloomberg reported that Orange SA, in collaboration with OpenAI and Meta Platforms Inc., is working to address this by training AI programs on African languages like Wolof, Pulaar, and Bambara.

However, many African languages rely on tonal systems and oral traditions, which are often overlooked by Western LLMs. Hadgu emphasized the importance of involving elders and community members to ensure accurate representation of local contexts.

Even when Big Tech companies collaborate with smaller AI startups to develop language-specific models, they often exploit open-sourced work to capture ideas and resources. Georg Zoeller of the Centre for AI Leadership in Singapore highlighted this issue: "By open-sourcing the basic tools for AI, hyperscalers have enabled startups to build products in the field and used it to replace internal teams as the primary source of product R&D."

Dr. Paul Azunre, co-founder of Ghana NLP, shared his experience of big companies poaching data without compensation. After Facebook built an open-source model on Ghana NLP's data, the company invited the group to submit a funding proposal. "Once Facebook came to us after they put out a model, which was open source and was built on our data. Then, they were doing an open call for proposals. They came to us and said, 'Why don't you put in a proposal for funding?' And we said, 'Well, you're already using our work. So what else do we need to prove to you? Just pay us,'" Azunre recounted.

Ghana NLP focuses on filling the gap in software products like Google Translate by developing voice-speech recognition, text-to-speech, and speech-to-text translation in local languages such as Twi, Ewe, Yoruba, Fante, and Ga, with plans to expand to neighboring countries. Azunre emphasized the importance of prioritizing local communities: "As a developer who tries to make self-sustaining products, I am sympathetic to why certain products or projects are prioritized in a certain way. We are going to put out Twi first because in Ghana we have 30 million Twi speakers… but the difference between what we are doing and tech giants is for us, the guiding principle is the locals are top of mind."

He stressed the necessity of keeping jobs and data control within the communities from which the knowledge is extracted, advocating for community data sovereignty and the creation of local data sources to empower African communities and preserve their linguistic and cultural identities in AI solutions.

What's Next for AI in Africa

Tech governance researcher Chinasa T. Okolo noted that several African governments are developing AI governance frameworks to counter the influence of multinational corporations. Seven African countries have drafted national AI strategies, though none have yet implemented formal AI regulation strategies. The South African government has released a National AI Policy Framework to ensure equitable access to AI technologies, particularly in underserved and rural areas. Additionally, 36 African countries have established data protection regulations, paving the way for more comprehensive AI regulatory frameworks.

Meanwhile, Western AI companies are beginning to focus on region-specific LLMs, such as Mistral's model for Arabic-speaking countries in the MENA region and Meta's expansion of Meta AI to support Arabic-speaking users. However, the parallels between colonial extraction and current AI development trends are becoming increasingly evident. MIT Tech Review's Karen Hao pointed out: "While it would diminish the depth of past traumas to say the AI industry is repeating the exact modalities of colonial violence today, it is now using other, more insidious means to enrich the wealthy and powerful at the great expense of the poor."

Comments (42)
WillieJohnson August 26, 2025 at 1:25:25 AM EDT

AI's global reach sounds grand, but it’s mostly a Western party. Cool to see researchers poking at Big Tech's blind spots—hope they dig deeper! 🌍

DavidLewis August 4, 2025 at 2:01:00 AM EDT

It's wild how AI like ChatGPT seems so universal but mostly caters to Western vibes. Kinda makes you wonder if the 'global' tag is just marketing fluff. Are we ever gonna see AI that truly gets the rest of the world? 🤔

JustinJackson April 23, 2025 at 2:47:47 PM EDT

AI for the world or just for the West? It's cool that AI is everywhere now, but why does it always seem made for Americans and Europeans? I wish they focused more on making it truly global. Still, it's a step in the right direction! 🌍👀

WilliamAllen April 22, 2025 at 3:37:38 PM EDT

AI for the world? More like AI for the West! It's cool that AI is everywhere now, but why does it always seem tailored for American and European folks? I wish there was more focus on making it truly global. Still, it's a step in the right direction! 🌍👀

CharlesWhite April 21, 2025 at 10:11:35 PM EDT

Interesting read on the global impact of AI! It's great to see researchers tackling the gaps in technology, but it's frustrating to see so much focus still on the West. We need more tools designed for everyone, not just the usual suspects. Keep pushing for truly global AI, folks! 🌍

JackPerez April 21, 2025 at 2:15:34 PM EDT

Interesting read about the global impact of AI! It's great to see researchers addressing the gaps in technology, but it's frustrating to see so much focus still on the West. We need more tools designed for everyone, not just the usual suspects. Keep pushing for a truly global AI, everyone! 🌍
