Every
AI system that appears to operate autonomously depends on human labour that
is systematically invisible in how the technology is marketed and discussed.
The large language models, image generators, and content moderation systems
that constitute the consumer face of AI are trained, evaluated, and refined
by a global workforce of data annotators, content labellers, and task raters
whose working conditions, wages, and wellbeing are almost entirely absent
from the AI narrative. This workforce, estimated at between 100 million and
500 million workers depending on how informal and part-time participation is
counted, performs the cognitive labour that transforms raw data into the
training sets and evaluation benchmarks that make AI systems functional. They
are the hidden infrastructure of the AI economy, and the terms on which they
work are among the most significant governance failures in the
sector.
A Time
magazine investigation into content moderation work for OpenAI in
Kenya documented workers earning less than two dollars per hour to
label and rate content including graphic violence, child sexual abuse
material, and extremist ideology, with inadequate psychological support and
no effective recourse when the work produced lasting psychological harm. The
investigation was notable not because the conditions it described were
unusual but because they were documented in relation to one of the most
prominent and well-resourced AI companies in the world. The conditions it
described are the norm in the data annotation industry rather than an
exception attributable to a single bad actor.
The Structure of the Clickwork Economy
The data annotation workforce is organised through a layered
contracting structure that enables AI companies to maintain distance from the
labour conditions of the workers whose work underpins their products. Major
AI developers contract with specialised data companies, which in turn
contract with local staffing agencies or crowdsourcing platforms, which
recruit individual workers. Each layer of contracting reduces visibility into
working conditions and dilutes accountability for them. When conditions are
exposed as inadequate, the response typically travels back up the contracting
chain without materially improving the situation for workers at the end of
it.
Crowdsourcing platforms including Amazon Mechanical Turk and Remotasks, a
Scale AI subsidiary, recruit workers globally, predominantly in lower-income
countries where the wages offered, typically between one and five dollars per
hour, represent meaningful income despite falling far below the minimum wages
of the countries where the AI companies are headquartered. Guardian
reporting on AI training work in Africa found workers in Kenya,
Uganda, and Ghana describing conditions including irregular pay, unexplained
task removals that reduced earnings without notification, and psychological
distress from sustained exposure to harmful content without adequate support.
These are not marginal or unusual outcomes. They are the predictable
consequences of a labour model that externalises the human cost of AI
development onto the workers least able to contest those
costs.
What the Work Actually Involves
The range of tasks performed by data annotation workers is wider
than public discussion typically conveys. At the lower end, it includes
straightforward labelling: identifying objects in images, transcribing audio,
or categorising text by sentiment. At the more demanding end, it includes
rating the quality, accuracy, and safety of AI outputs, which requires
sustained engagement with content that may be false, harmful, or deeply
disturbing. The workers whose ratings feed reinforcement learning from human
feedback (RLHF) are making judgements about what constitutes a good or harmful AI
response, judgements that directly shape the behaviour of systems used by
hundreds of millions of people. Their expertise in making those judgements is
rarely acknowledged in how the resulting systems are
described.
The psychological burden of content moderation work deserves specific
attention. Workers tasked with rating harmful content, including graphic
violence, suicide-related material, and child sexual abuse imagery, are
performing work that the clinical literature has established as a cause of lasting
psychological harm through vicarious trauma. The data annotation industry’s
response to this risk has been inadequate: psychological support where it
exists is typically limited to hotline access rather than proactive clinical
intervention, and the volume and pace of work typically required to meet
earnings targets are incompatible with the breaks and processing time that
effective trauma management requires. As our analysis of AI
and trauma treatment found, the clinical understanding of how
sustained exposure to disturbing content produces lasting harm is
well-established. Its application to the working conditions of the people who
make AI systems safe is conspicuously absent.
The Governance Response
Regulatory frameworks addressing labour conditions in AI data
annotation are at an early stage in all major jurisdictions. The EU AI Act
addresses data governance in AI development but does not specifically address
the labour conditions of annotation workers. Several US states have
introduced or are developing legislation requiring transparency in the use of
AI-generated content, but labour protections for the workers producing
training data are not included in those frameworks. The Australian
Fair Work Ombudsman has investigated data annotation platforms
operating in Australia, finding systemic underpayment and inadequate
workplace conditions.
The most effective interventions proposed by labour researchers
and worker advocates include mandatory living wage requirements for data
annotation work contracted by AI companies regardless of where that work is
performed, psychological support standards equivalent to those required for
other occupational exposure to disturbing content, and supply chain
transparency requirements that make visible the full contracting chain
between AI companies and the workers whose labour underpins their products.
As our coverage of AI’s
hidden infrastructure and the governance gaps it creates found, the
parts of AI development that are least visible are consistently those where
governance is weakest and harm is most concentrated. The workers training AI
are the most important case of that general observation.
The Path Forward
Several developments suggest that the governance of AI data
annotation labour is moving, albeit slowly, in a more protective direction.
The Worker Information Pack published by the Responsible Tech Coalition sets
minimum transparency standards for data annotation platforms. Several major
AI companies have introduced vendor codes of conduct for data suppliers that
include psychological support requirements and minimum wage floors, though
independent verification of compliance with those codes is inconsistent. The
EU AI Act’s provisions on data governance require AI developers to document
training data provenance and quality, which will indirectly create incentives
for better documentation of the conditions under which that data was
produced.
The most durable improvement will come from recognising data
annotation as skilled cognitive work rather than as an undifferentiated
commodity. Workers who make calibrated judgements about the quality, safety,
and appropriateness of AI outputs are performing work that requires training,
consistent standards, and quality oversight. Treating that work as
minimum-wage task completion rather than as professional evaluation
undermines both the workers and the quality of the AI systems they are
training. As our coverage of how
AI labour arrangements affect worker wellbeing found, the gap
between the value AI systems generate and the terms on which the people
creating that value are compensated is one of the most significant ethical
failures in the current AI economy. The workers at the bottom of the data
annotation supply chain are not peripheral to AI development. They are
foundational to it, and they deserve governance frameworks that reflect that
fact.
About the Author
Stuart Kerr is the Technology Correspondent for LiveAIWire. He
writes about artificial intelligence, emerging technology, and the forces
reshaping work, business, and society.