The AI Race in Africa: Building With AI vs Being Built For
Global AI companies see Africa as a training data source and an emerging market. African technologists are trying to change that equation.
In March 2024, Scale AI’s Remotasks platform — which employed Kenyan and
Nigerian data labellers — abruptly shut down operations in both countries after
workers logged over 20 hours of work earning less than $1 in total.[1] The
same month, Meta, after facing lawsuits in Nairobi, quietly relocated its
content moderation operations from Kenya to Ghana, recruiting workers to
moderate East African language content from the other side of the continent.[1]
These are not isolated incidents. They are the visible surface of a
structural condition: Africa contributes its data, its languages, its cognitive
labour, and its users to a global AI industry whose value is captured almost
entirely elsewhere. The data-labelling market alone is valued at $2.8 billion
today and projected to reach $18.3 billion by 2035.[1] African linguistic data
— in Wolof, Oromo, Igbo, Swahili, Hausa, Amharic — is harvested to train large
language models that are then sold back to African markets at prices set in
Silicon Valley.
This is what one analyst, writing in the Global Policy Journal in April
2026, has called “the digital scramble for Africa.”[1] And yet — something is
shifting. Across the continent, African engineers, startups, governments, and
researchers are building back. The question is whether they are building fast
enough, and with enough structural support, to change the fundamental terms of
Africa’s relationship with artificial intelligence.
The Extraction Side: What ‘AI Jobs’ in Africa Really Look Like
The global AI industry depends on vast quantities of human-annotated data
— images labelled, sentences translated, content moderated, voices transcribed.
This work is outsourced to the Global South through platforms like Scale AI
(Remotasks), Appen, and Amazon Mechanical Turk. Africa has become one of the
primary destinations.[2]
The conditions are, in most cases, deeply exploitative. Oxford’s Fairwork
project surveyed over 700 workers across 15 digital labour platforms and
concluded that none scored better than “bare minimum” on fair pay, conditions,
contracts, management, and representation.[3] A 2025 Equidem survey of 76
workers from Ghana, Kenya, and Colombia documented 60 independent incidents of
psychological harm — including anxiety, depression, PTSD, and substance
dependence — among content moderators exposed to graphic material.[3]
$18.3bn: data-labelling market value by 2035, up from $2.8bn today; Africa is a primary outsourcing destination (LabGov / Global Policy Journal 2026)
150–430m: data labourers worldwide per World Bank, mostly in the Global South, driving AI systems that largely exclude them (Brookings 2025)
2,000+: African languages, yet most global LLMs are trained predominantly on English and a handful of European languages
Workers in Kenya’s Data Labelers Association have been organising for
better wages and mental health support. The African Content Moderators Union
and the Global Trade Union Alliance of Content Moderators have formed
cross-border coalitions.[3] These are not just labour stories. They are
sovereignty stories. Because the data these workers annotate — the Yoruba
sentences, the Swahili conversations, the Igbo proverbs — will shape which
language models exist, which voices AI “hears,” and whether 200 million Nigerians
or 130 million Ethiopians find themselves either included or invisible in the
AI systems defining the 21st century.
The Building Side: Africa Is Not Just Waiting
The resistance to extraction is not only organisational. It is technical.
Across Africa, a wave of homegrown AI infrastructure is emerging that is
genuinely significant — even if it remains underfunded and under-reported
relative to its importance.
Nigeria’s N-ATLAS. At the 80th United Nations General Assembly in
September 2025, Nigeria unveiled N-ATLAS — the country’s first
government-backed large language model. Developed by Lagos-based startup Awarri
in partnership with Nigeria’s National Centre for Artificial Intelligence and
Robotics, N-ATLAS is built on the LLaMA architecture, fine-tuned on over 400
million tokens of multilingual data covering Yoruba, Hausa, Igbo, and
Nigerian-accented English.[4] Nigeria’s Minister of Communications Bosun
Tijani framed it precisely: “The model places Africa’s voices and diversity at
the foundation of AI.” It is positioned as a national digital public good —
open-source checkpoints, APIs, and SDKs available to any Nigerian developer.
Homegrown products and builders. 2025 saw a burst of African-built
AI products designed for African realities. YarnGPT — built by Nigerian
engineer Saheed Ayanniyi — translates videos and generates voiceovers with
Nigerian-sounding voices, trained on Nollywood scripts and local audio. Xara,
launched in June 2025, is a WhatsApp-based AI banking assistant fine-tuned for
Nigerian speech patterns and Pidgin. Intron’s voice-to-text system understands
Nigerian and Kenyan medical terminology, allowing doctors to dictate patient
records in local languages hands-free. Gebeya’s Dala, launched in October 2025
by an Ethiopian software company, is an AI app builder designed specifically
for African infrastructure constraints.[5]
Continental policy momentum. By early 2026, at least 15 African
countries had published national AI strategies, in addition to the African
Union’s 2024 Continental AI Strategy. At the 2025 Global AI Summit, African
ministers delivered a joint declaration: “AI must not simply be imported, but
built with Africa, by Africa, and for Africa.”[6] Togo pledged to train 50,000
people in AI skills annually. Nigeria’s 3MTT (3 Million Technical Talent)
programme is building the pipeline for local AI development at scale.
“Building in AI requires two major
things: computing and data. That pipeline does not exist in Nigeria at scale.”
— Babs Ogundeji, SpitchAI, TechCabal 2025
The Structural Obstacles That Must Be Addressed
The energy of Africa’s AI builders is real. The structural obstacles they
face are equally real.
Infrastructure. Africa has too few data centres, unreliable power
supply, and insufficient regional compute capacity. The World Bank’s Digital
Progress and Trends Report 2025 is direct: high-income economies dominate AI
innovation, compute capacity, and startup activity, while most African
countries “remain primarily consumers rather than creators.”[7]
Training frontier AI models requires extraordinary computing resources that are
simply not available at scale on the continent.
Data scarcity for African languages. Africa has over 2,000 living
languages across more than 15 language families.[8] Yet global LLMs are
trained overwhelmingly on English and a handful of European languages. EPIC’s
2025 ethnographic research across Ethiopia, Ghana, Kenya, Nigeria, and South
Africa found that experiments with Yoruba, isiZulu, Amharic, Afaan Oromo, Nigerian
Pidgin, and Kiswahili “revealed substantial inaccuracies in LLM performance”
with these languages.[8] Building competitive African language models
requires first building the datasets — a slow, expensive, culturally sensitive
process.
The Microsoft departure signal. In May 2024, Microsoft announced
the closure of its Africa Development Centre in Lagos, affecting local
engineers and support staff.[9] The episode was a reminder that multinational
tech investment is not a guaranteed path to local capacity building. Foreign AI
investment that does not transfer skills, localise ownership, or strengthen
domestic infrastructure is just a more sophisticated form of the same
extraction dynamic.
What the Path Forward Looks Like
The World Bank’s 2025 analysis introduces a concept that should anchor
Africa’s AI strategy: “small AI” — localised, cost-effective, compute-light
applications that do not require frontier-level infrastructure.[7] Crop
disease detection using satellite imagery. Fraud detection using alternative
credit scoring. AI-powered diagnostic tools for under-resourced clinics.
Educational tutoring in Hausa or Amharic. These are not consolation prizes.
They are the applications where African AI can be world-leading, because they
require the local data and cultural context that only African builders have.
African Business’ 2025 analysis makes the economic case plainly: “Local
data combined with local models can produce differentiated, defensible AI
products and help Africa avoid reliance on biased or inaccurate global
systems.”[10]
The $330 billion untapped credit demand in Africa, the continent’s agricultural
productivity challenges, its language diversity — these are not problems for
Western AI to solve from San Francisco. They are markets for African AI to own.
The data sovereignty dimension cannot be overstated. As Brookings’ 2025
analysis argues, African linguistic data is being harvested to build models
sold back to African consumers — a pattern that will only intensify unless
governments establish data governance frameworks that require local
benefit-sharing, fair compensation for labellers, and restrictions on the
extraction of culturally sensitive data.[11]
Africa is not absent from the AI race. It is present in the most essential way: as
the source of the data, the labour, and the languages that make global AI work.
The question is whether that presence translates into ownership, benefit, and
sovereignty, or whether it simply replicates, in digital form, the same
extractive relationship that colonialism established in the physical world.
N-ATLAS, YarnGPT, Intron, Xara, Awarri — these are not just products. They are
acts of resistance. And they need to become, with serious policy support and
genuine investment, the foundation of a continent-wide AI ecosystem built on
African terms.
REFERENCES
[1] Global Policy Journal (2026, April 1). Africa’s AI Strategies Cannot Say No [Scale AI Remotasks; Meta relocation; $2.8bn–$18.3bn data-labelling market; digital scramble]. https://www.globalpolicyjournal.com/blog/01/04/2026/africas-ai-strategies-cannot-say-no
[2] Brookings Institution (2024, November). Moving Toward Truly Responsible AI Development in the Global AI Market [BPO outsourcing; Scale AI; OpenAI; colonialist exploitation framing]. https://www.brookings.edu/articles/moving-toward-truly-responsible-ai-development-in-the-global-ai-market/
[3] Brookings Institution (2025, October). Reimagining the Future of Data and AI Labor in the Global South [Oxford Fairwork; Equidem 2025 survey; Data Labelers Association Kenya; 150–430m data labourers]. https://www.brookings.edu/articles/reimagining-the-future-of-data-and-ai-labor-in-the-global-south/
[4] TheCable (2025, September). Giving AI a Nigerian Accent: Inside First Local Language Model [N-ATLAS; Awarri; NCAR; LLaMA; 400m tokens; UNGA unveiling]. https://www.thecable.ng/giving-ai-a-nigerian-accent-inside-first-local-language-model/
[5] TechCabal (2025, December). 10 Exciting African AI Products Launched in 2025 [YarnGPT; Xara; Gebeya Dala; Chidi; African AI ecosystem]. https://techcabal.com/2025/12/05/10-exciting-african-ai-products-launched-in-2025/
[6] African Business (2025, October). From Strategy to Sovereignty: Crafting Africa’s AI Future [AU 2024 AI Strategy; 15 national strategies; 2025 Global AI Summit ministerial declaration; Togo 50,000 AI trainees]. https://african.business/2025/10/innov-africa-deals/from-strategy-to-sovereignty-crafting-africas-ai-future
[7] African Business / World Bank (2025, November). Strengthening Africa’s AI Foundations [World Bank Digital Progress 2025; ‘small AI’; compute gaps; Africa primarily consumers not creators]. https://african.business/2025/11/innov-africa-deals/strengthening-africas-ai-foundations
[8] EPIC (2025). Decolonizing LLMs: An Ethnographic Framework for AI in African Contexts [2,000+ languages; Yoruba/Amharic/Kiswahili LLM failures; multilingual field research]. https://www.epicpeople.org/decolonizing-llms-ethnographic-framework-for-ai-in-african-contexts/
[9] Hadi, M. (2025, October). AI and the Future of Work in Africa: Automation, Opportunity, or Exclusion? Medium [Microsoft Lagos closure; Sama/Samasource labour critique; ILO generative AI findings]. https://medium.com/@mukailahadi/ai-and-the-future-of-work-in-africa-automation-opportunity-or-exclusion-be41d1b9b5d7
[10] African Business (2025). Local Data + Local Models = Differentiated, Defensible African AI Products [World Bank Digital Progress 2025 insight]. https://african.business/2025/11/innov-africa-deals/strengthening-africas-ai-foundations
[11] Brookings Institution (2025). Data Sovereignty and the Need for Benefit-Sharing in African AI Data Governance. https://www.brookings.edu/articles/reimagining-the-future-of-data-and-ai-labor-in-the-global-south/
[12] TechCabal (2025, July). AI Is Learning to Speak African Languages, Thanks to These Startups [Awarri; Intron; SpitchAI; N-ATLAS; Nigeria 3MTT programme; Bosun Tijani quotes]. https://techcabal.com/2025/07/17/african-ai-startups/