The AI Race in Africa: Building With AI vs Being Built For


Global AI companies see Africa as a training data source and an emerging market. African technologists are trying to change that equation.

In March 2024, Scale AI’s Remotasks platform, which employed Kenyan and Nigerian data labellers, abruptly shut down operations in both countries after workers had logged over 20 hours of work while earning less than $1 in total.[1] The same month, Meta, after facing lawsuits in Nairobi, quietly relocated its content moderation operations from Kenya to Ghana, recruiting workers to moderate East African language content from the other side of the continent.[1]

These are not isolated incidents. They are the visible surface of a structural condition: Africa contributes its data, its languages, its cognitive labour, and its users to a global AI industry whose value is captured almost entirely elsewhere. The data-labelling market alone is valued at $2.8 billion today and projected to reach $18.3 billion by 2035.[1] African linguistic data — in Wolof, Oromo, Igbo, Swahili, Hausa, Amharic — is harvested to train large language models that are then sold back to African markets at prices set in Silicon Valley.

This is what one analyst, writing in the Global Policy Journal in April 2026, has called “the digital scramble for Africa.”[1] And yet — something is shifting. Across the continent, African engineers, startups, governments, and researchers are building back. The question is whether they are building fast enough, and with enough structural support, to change the fundamental terms of Africa’s relationship with artificial intelligence.

The Extraction Side: What ‘AI Jobs’ in Africa Really Look Like

The global AI industry depends on vast quantities of human-annotated data — images labelled, sentences translated, content moderated, voices transcribed. This work is outsourced to the Global South through platforms like Scale AI (Remotasks), Appen, and Amazon Mechanical Turk. Africa has become one of the primary destinations.[2]

The conditions are, in most cases, deeply exploitative. Oxford’s Fairwork project surveyed over 700 workers across 15 digital labour platforms and concluded that none scored better than “bare minimum” on fair pay, conditions, contracts, management, and representation.[3] A 2025 Equidem survey of 76 workers from Ghana, Kenya, and Colombia documented 60 independent incidents of psychological harm — including anxiety, depression, PTSD, and substance dependence — among content moderators exposed to graphic material.[3]

$18.3bn  data-labelling market value by 2035
 LabGov / Global Policy Journal 2026 — up from $2.8bn today; Africa is a primary outsourcing destination

150–430m  data labourers worldwide per World Bank
 Brookings 2025 — mostly in the Global South; driving AI systems that largely exclude them

2,000+  living African languages
 Yet most global LLMs are trained predominantly on English and a handful of European languages

Workers in Kenya’s Data Labelers Association have been organising for better wages and mental health support. The African Content Moderators Union and the Global Trade Union Alliance of Content Moderators have formed cross-border coalitions.[3] These are not just labour stories. They are sovereignty stories, because the data these workers annotate (the Yoruba sentences, the Swahili conversations, the Igbo proverbs) will shape which language models exist, which voices AI “hears,” and whether 200 million Nigerians or 130 million Ethiopians find themselves included or invisible in the AI systems defining the 21st century.

The Building Side: Africa Is Not Just Waiting

The resistance to extraction is not only organisational. It is technical. Across Africa, a wave of homegrown AI infrastructure is emerging that is genuinely significant — even if it remains underfunded and under-reported relative to its importance.

Nigeria’s N-ATLAS. At the 80th United Nations General Assembly in September 2025, Nigeria unveiled N-ATLAS, the country’s first government-backed large language model. Developed by Lagos-based startup Awarri in partnership with Nigeria’s National Centre for Artificial Intelligence and Robotics, N-ATLAS is built on the LLaMA architecture and fine-tuned on over 400 million tokens of multilingual data covering Yoruba, Hausa, Igbo, and Nigerian-accented English.[4] Nigeria’s Minister of Communications, Bosun Tijani, framed it precisely: “The model places Africa’s voices and diversity at the foundation of AI.” It is positioned as a national digital public good, with open-source checkpoints, APIs, and SDKs available to any Nigerian developer.

Homegrown products and builders. 2025 saw a burst of African-built AI products designed for African realities. YarnGPT — built by Nigerian engineer Saheed Ayanniyi — translates videos and generates voiceovers with Nigerian-sounding voices, trained on Nollywood scripts and local audio. Xara, launched in June 2025, is a WhatsApp-based AI banking assistant fine-tuned for Nigerian speech patterns and Pidgin. Intron’s voice-to-text system understands Nigerian and Kenyan medical terminology, allowing doctors to dictate patient records in local languages hands-free. Gebeya’s Dala, launched in October 2025 by an Ethiopian software company, is an AI app builder designed specifically for African infrastructure constraints.[5]

Continental policy momentum. By early 2026, at least 15 African countries had published national AI strategies, in addition to the African Union’s 2024 Continental AI Strategy. At the 2025 Global AI Summit, African ministers delivered a joint declaration: “AI must not simply be imported, but built with Africa, by Africa, and for Africa.”[6] Togo pledged to train 50,000 people in AI skills annually. Nigeria’s 3MTT (3 Million Technical Talent) programme is building the pipeline for local AI development at scale.

“Building in AI requires two major things: computing and data. That pipeline does not exist in Nigeria at scale.” — Babs Ogundeji, SpitchAI, TechCabal 2025

The Structural Obstacles That Must Be Addressed

The energy of Africa’s AI builders is real. The structural obstacles they face are equally real.

Infrastructure. Africa has too few data centres, unreliable power supply, and insufficient regional compute capacity. The World Bank’s Digital Progress and Trends Report 2025 is direct: high-income economies dominate AI innovation, compute capacity, and startup activity, while most African countries “remain primarily consumers rather than creators.”[7] Training frontier AI models requires extraordinary computing resources that are simply not available at scale on the continent.

Data scarcity for African languages. Africa has over 2,000 living languages across more than 15 language families.[8] Yet global LLMs are trained overwhelmingly on English and a handful of European languages. EPIC’s 2025 ethnographic research across Ethiopia, Ghana, Kenya, Nigeria, and South Africa found that experiments with Yoruba, isiZulu, Amharic, Afaan Oromo, Nigerian Pidgin, and Kiswahili “revealed substantial inaccuracies in LLM performance” with these languages.[8] Building competitive African language models requires first building the datasets — a slow, expensive, culturally sensitive process.

The Microsoft departure signal. In May 2024, Microsoft announced the closure of its Africa Development Centre in Lagos, affecting local engineers and support staff.[9] The episode was a reminder that multinational tech investment is not a guaranteed path to local capacity building. Foreign AI investment that does not transfer skills, localise ownership, or strengthen domestic infrastructure is just a more sophisticated form of the same extraction dynamic.

What the Path Forward Looks Like

The World Bank’s 2025 analysis introduces a concept that should anchor Africa’s AI strategy: “small AI” — localised, cost-effective, compute-light applications that do not require frontier-level infrastructure.[7] Crop disease detection using satellite imagery. Fraud detection using alternative credit scoring. AI-powered diagnostic tools for under-resourced clinics. Educational tutoring in Hausa or Amharic. These are not consolation prizes. They are the applications where African AI can be world-leading, because they require the local data and cultural context that only African builders have.

African Business’ 2025 analysis makes the economic case plainly: “Local data combined with local models can produce differentiated, defensible AI products and help Africa avoid reliance on biased or inaccurate global systems.”[10] The $330 billion untapped credit demand in Africa, the continent’s agricultural productivity challenges, its language diversity — these are not problems for Western AI to solve from San Francisco. They are markets for African AI to own.

The data sovereignty dimension cannot be overstated. As Brookings’ 2025 analysis argues, African linguistic data is being harvested to build models sold back to African consumers. That pattern will only intensify unless governments establish data governance frameworks that require local benefit-sharing, fair compensation for labellers, and restrictions on the extraction of culturally sensitive data.[11]

Africa is not absent from the AI race. It is present in the most essential way: as the source of the data, the labour, and the languages that make global AI work. The question is whether that presence translates into ownership, benefit, and sovereignty, or whether it simply replicates, in digital form, the same extractive relationship that colonialism established in the physical world. N-ATLAS, YarnGPT, Intron, Xara, Awarri: these are not just products and companies. They are acts of resistance. And they need to become, with serious policy support and genuine investment, the foundation of a continent-wide AI ecosystem built on African terms.

REFERENCES

[1] Global Policy Journal (2026, April 1). Africa’s AI Strategies Cannot Say No [Scale AI Remotasks; Meta relocation; $2.8bn–$18.3bn data-labelling market; digital scramble]. https://www.globalpolicyjournal.com/blog/01/04/2026/africas-ai-strategies-cannot-say-no

[2] Brookings Institution (2024, November). Moving Toward Truly Responsible AI Development in the Global AI Market [BPO outsourcing; Scale AI; OpenAI; colonialist exploitation framing]. https://www.brookings.edu/articles/moving-toward-truly-responsible-ai-development-in-the-global-ai-market/

[3] Brookings Institution (2025, October). Reimagining the Future of Data and AI Labor in the Global South [Oxford Fairwork; Equidem 2025 survey; Data Labelers Association Kenya; 150–430m data labourers]. https://www.brookings.edu/articles/reimagining-the-future-of-data-and-ai-labor-in-the-global-south/

[4] TheCable (2025, September). Giving AI a Nigerian Accent: Inside First Local Language Model [N-ATLAS; Awarri; NCAR; LLaMA; 400m tokens; UNGA unveiling]. https://www.thecable.ng/giving-ai-a-nigerian-accent-inside-first-local-language-model/

[5] TechCabal (2025, December). 10 Exciting African AI Products Launched in 2025 [YarnGPT; Xara; Gebeya Dala; Chidi; African AI ecosystem]. https://techcabal.com/2025/12/05/10-exciting-african-ai-products-launched-in-2025/

[6] African Business (2025, October). From Strategy to Sovereignty: Crafting Africa’s AI Future [AU 2024 AI Strategy; 15 national strategies; 2025 Global AI Summit ministerial declaration; Togo 50,000 AI trainees]. https://african.business/2025/10/innov-africa-deals/from-strategy-to-sovereignty-crafting-africas-ai-future

[7] African Business / World Bank (2025, November). Strengthening Africa’s AI Foundations [World Bank Digital Progress 2025; ‘small AI’; compute gaps; Africa primarily consumers not creators]. https://african.business/2025/11/innov-africa-deals/strengthening-africas-ai-foundations

[8] EPIC (2025). Decolonizing LLMs: An Ethnographic Framework for AI in African Contexts [2,000+ languages; Yoruba/Amharic/Kiswahili LLM failures; multilingual field research]. https://www.epicpeople.org/decolonizing-llms-ethnographic-framework-for-ai-in-african-contexts/

[9] Hadi, M. (2025, October). AI and the Future of Work in Africa: Automation, Opportunity, or Exclusion? Medium [Microsoft Lagos closure; Sama/Samasource labour critique; ILO generative AI findings]. https://medium.com/@mukailahadi/ai-and-the-future-of-work-in-africa-automation-opportunity-or-exclusion-be41d1b9b5d7

[10] African Business (2025). Local Data + Local Models = Differentiated, Defensible African AI Products [World Bank Digital Progress 2025 insight]. https://african.business/2025/11/innov-africa-deals/strengthening-africas-ai-foundations

[11] Brookings Institution (2025). Data Sovereignty and the Need for Benefit-Sharing in African AI Data Governance. https://www.brookings.edu/articles/reimagining-the-future-of-data-and-ai-labor-in-the-global-south/

[12] TechCabal (2025, July). AI Is Learning to Speak African Languages, Thanks to These Startups [Awarri; Intron; SpitchAI; N-ATLAS; Nigeria 3MTT programme; Bosun Tijani quotes]. https://techcabal.com/2025/07/17/african-ai-startups/

