When Google demonstrated its Duplex system making a restaurant reservation by
phone in 2018, the technology community reacted with a mixture of wonder and
unease. The AI conducted a natural conversation with a human receptionist,
navigated interruptions and ambiguous responses, and secured a booking
without the restaurant staff realising they were not speaking to a person.
Seven years later, that demonstration looks modest. AI agents are now booking
entire travel itineraries, managing customer service complaints across
multiple interactions, executing financial transactions, writing and sending
emails on users’ behalf, and completing multi-step workflows that span dozens
of digital services without human intervention at any stage. The age of the
AI agent is not approaching. It has arrived, and these systems are acquiring
capability and autonomy faster than the public understands what they do, and
faster than the governance frameworks designed to manage their risks can
adapt.
The technical definition of an AI agent is straightforward: a
system that perceives its environment, makes decisions, takes actions, and
learns from the outcomes of those actions toward a defined objective, without
requiring a human to approve each individual step. What makes current AI
agents distinct from earlier automation systems is the combination of
natural language understanding, the ability to interact with arbitrary
digital interfaces rather than only pre-programmed APIs, and the capacity to
handle the exceptions and ambiguities that defeat traditional rule-based
automation. An AI agent booking a flight can recognise that the original
route is unavailable, evaluate alternatives against user preferences stated
earlier in the conversation, rebook on a different airline, cancel the hotel
that was geographically optimised for the original arrival time, and find a
replacement hotel, all within a single session, without a human decision at
any of these junctures.
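The rebooking scenario above can be sketched as a small decide-and-act loop. This is a minimal illustration, not how any commercial agent is built: the `Trip` record, the `choose_flight` and `choose_hotel` helpers, and the flight and hotel data are all hypothetical, and the sketch leaves out the language understanding and learning that real agents rely on.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Trip:
    """Hypothetical trip state that the agent reads and updates."""
    flight: Optional[str] = None
    hotel: Optional[str] = None
    log: list = field(default_factory=list)


def choose_flight(available, preferred, latest_arrival):
    """Keep the preferred flight if bookable, else the earliest acceptable one."""
    if preferred in available:
        return preferred
    acceptable = [(hour, fid) for fid, hour in available.items()
                  if hour <= latest_arrival]
    return min(acceptable)[1] if acceptable else None


def choose_hotel(hotels, arrival_hour):
    """Pick the first hotel whose check-in desk is still open on arrival."""
    for name, latest_checkin in hotels.items():
        if arrival_hour <= latest_checkin:
            return name
    return None


def run_agent(available, hotels, preferred, latest_arrival):
    """Book a flight, then a hotel that fits its arrival time, with no human step."""
    trip = Trip()
    flight = choose_flight(available, preferred, latest_arrival)
    if flight is None:
        trip.log.append("no acceptable flight: hand back to the user")
        return trip
    trip.flight = flight
    trip.log.append(f"booked flight {flight}")
    trip.hotel = choose_hotel(hotels, available[flight])
    trip.log.append(f"booked hotel {trip.hotel}" if trip.hotel
                    else "no hotel fits: hand back to the user")
    return trip


# The preferred flight AF1 is unavailable, so the agent must pick an alternative.
trip = run_agent(
    available={"BA2": 21, "LH7": 18},      # flight id -> arrival hour
    hotels={"Airport Inn": 17, "City Hotel": 23},  # hotel -> latest check-in hour
    preferred="AF1",
    latest_arrival=22,
)
```

The point of the sketch is the chain of dependent decisions: the hotel choice follows from the flight the agent settled on, and the only place a human re-enters is when no option satisfies the stated preferences.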
Consumer Applications: The Visible Frontier
The consumer applications of AI agents that have attracted the
most attention are those that handle the friction-heavy administrative tasks
of daily life: travel booking and management, restaurant reservations,
customer service interactions, appointment scheduling, and the management of
recurring subscriptions and services. Products including Google’s Gemini
Advanced, Anthropic’s Claude with tool use enabled, and the Microsoft Copilot
integration across Office 365 are all moving toward agentic capabilities that
handle these tasks on users’ behalf. Apple’s enhanced Siri, announced at WWDC
2024 with the ability to take actions in and across apps, represents the most
ambitious consumer agentic deployment in terms of potential reach, putting AI
agent capabilities on hundreds of millions of devices.
The user experience benefits of effective AI agents for these
tasks are substantial. Time spent on hold with customer service, navigating
opaque airline rebooking systems, managing the administrative overhead of
complex travel logistics, and keeping track of subscriptions across dozens of
services represents a significant and largely unwelcome portion of many
people’s time. AI agents that handle these interactions reliably, without
requiring constant supervision, free that time for more valuable activities.
The efficiency case is strong, and adoption of agentic capabilities is
growing rapidly among early adopters who find that the systems work reliably
enough for routine cases.
The reliability qualifier is important. AI agents fail. They
misinterpret user preferences, take actions that make sense locally but not
in the context of broader user intentions, and occasionally cause
consequential errors that are difficult to reverse: a cancelled hotel booking
that cannot be reinstated at the original price, a changed flight that
triggers cascading itinerary problems, a customer service complaint that was
resolved in a way that created a new problem. The failure modes of AI agents
are different in character from those of human error: they are often
consistent and systematic rather than random, and they are difficult for
users to anticipate because the agent’s reasoning process is not transparent.
Managing user expectations about agent reliability is a product design
challenge that most current implementations have not fully solved.
Enterprise Applications: Where the Stakes Are Higher
In enterprise contexts, AI agents are being deployed for
higher-stakes applications where their efficiency benefits are larger and
their failure costs are greater. Salesforce’s Agentforce platform, launched
in 2024, deploys AI agents that handle customer service interactions, sales
pipeline management, and business process workflows autonomously.
ServiceNow’s AI agents automate IT service management, HR processes, and
procurement workflows. These platforms are not experimental; they are
enterprise software products with significant customer deployments and
measurable business impact.
The accountability question in enterprise AI agents is more acute
than in consumer contexts. When an AI agent makes a consequential business
decision, including approving a customer refund, escalating a service
complaint, or modifying a contract, the question of who is responsible for
that decision is not trivial. The legal and contractual frameworks governing
business processes typically assume human decision-makers at each
consequential step. AI agents that insert autonomous decision-making
into these processes create accountability gaps that employment law, contract
law, and regulatory frameworks have not yet addressed. The UK’s Department
for Science, Innovation and Technology published guidance on AI
accountability in business processes in 2024 that acknowledges this gap but
does not fully resolve it.
Security and Trust
AI agents that act on users’ behalf in digital environments create
novel security challenges. Prompt injection attacks, in which malicious
content embedded in a webpage, email, or document instructs an AI agent to
take actions contrary to the user’s interests, have been demonstrated against
multiple commercial AI agent products. An agent browsing the web on a user’s
behalf that encounters a page containing hidden instructions to forward the
user’s email credentials to a third party is vulnerable in ways that
traditional web browsing is not, because the agent is actively processing and
acting on all content it encounters rather than simply rendering it for human
review.
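One defensive pattern is to track where each proposed action came from and to refuse actions that originate in untrusted content unless the user's task already permits them. The sketch below is a simplified illustration under strong assumptions: the `ProposedAction` type, the `"user"` and `"untrusted"` source labels, and the action names are hypothetical, and real agents cannot always attribute an action to its source this cleanly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProposedAction:
    name: str
    source: str  # "user" for the user's own request, "untrusted" for web/email content


def filter_actions(proposed, allowed_for_task):
    """Split proposed actions into those to run and those to block.

    Actions requested by the user are kept. Actions that arose while
    processing untrusted content are kept only if the task allows them.
    """
    kept, blocked = [], []
    for action in proposed:
        if action.source == "user" or action.name in allowed_for_task:
            kept.append(action)
        else:
            blocked.append(action)
    return kept, blocked


# The user asked for a page summary; the page contained a hidden instruction.
proposed = [
    ProposedAction("summarise_page", "user"),
    ProposedAction("read_page", "untrusted"),
    ProposedAction("forward_credentials", "untrusted"),
]
kept, blocked = filter_actions(proposed, allowed_for_task={"read_page"})
```

A filter like this narrows what injected instructions can achieve, but it does not stop the agent from being misled by the content it reads, so it complements rather than replaces limits on permissions.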
Security bodies including the UK’s National Cyber Security Centre (NCSC) have
published guidance on AI agent security that identifies prompt injection as a
priority concern requiring both defensive design in agent systems and user
awareness about the risks of granting broad permissions to AI agents
operating in adversarial digital environments. The challenge is that the
permissions that make AI agents most useful, broad access to email,
calendars, financial accounts, and communication platforms, are also the
permissions that make them most dangerous if compromised. Calibrating the
scope of agent permissions to the minimum necessary for the intended task is
a principle that product designers and users both need to apply more
consistently than current implementations typically do.
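The minimum-necessary principle can be illustrated with a grant that is the intersection of what the agent asks for and what the task actually needs. This is a sketch only: the `TASK_SCOPES` table and the scope names such as `calendar:read` are invented for illustration and do not correspond to any product's permission model.

```python
from dataclasses import dataclass

# Hypothetical table of the smallest scope set each task requires.
TASK_SCOPES = {
    "book_restaurant": {"calendar:read"},
    "rebook_flight": {"calendar:read", "email:read", "payments:charge"},
}


@dataclass(frozen=True)
class Grant:
    task: str
    scopes: frozenset


def grant_for(task, requested):
    """Grant only scopes that are both requested and needed by the task."""
    needed = TASK_SCOPES.get(task, set())  # unknown tasks get nothing
    return Grant(task, frozenset(requested) & frozenset(needed))


def authorise(grant, scope):
    """Check a single scope against the grant before the agent acts."""
    return scope in grant.scopes


# The agent asks for broad access, but a restaurant booking needs very little.
grant = grant_for(
    "book_restaurant",
    requested={"calendar:read", "email:send", "payments:charge"},
)
```

Under this design a compromised restaurant-booking session could read the calendar but could not send email or charge a card, because those scopes were never granted in the first place.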
What This Means for You
AI agents are becoming a feature of the digital products you use
rather than a category of products you specifically choose. The AI
capabilities embedded in productivity software, customer service platforms,
and mobile operating systems are increasingly agentic, taking actions rather
than simply providing information. Understanding what actions AI systems
integrated into your tools are authorised to take, reviewing permission
settings before granting broad access, and maintaining oversight of
consequential agentic actions through confirmation requirements or audit logs
are practical adaptations to an environment where AI is increasingly acting
on your behalf. For related analysis, see our coverage of agentic AI in
manufacturing and agentic AI at the edge. The question of how much autonomy
to grant AI
agents, and how to maintain meaningful oversight without sacrificing the
efficiency benefits that motivate their adoption, is one of the most
practically important questions in AI governance that most users have not yet
seriously engaged with.
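For readers who build or configure agents, the oversight practices mentioned above (confirmation requirements and audit logs) might look something like the following sketch. The `CONSEQUENTIAL` set, the action names, and the `confirm` callback are assumptions made for illustration; a real product would define its own list of actions that need sign-off.

```python
import json
from datetime import datetime, timezone

# Hypothetical set of actions that should never run without user sign-off.
CONSEQUENTIAL = {"cancel_booking", "send_payment", "send_email"}


def execute(action, params, confirm, audit_log):
    """Approve an action if it is routine or confirmed, and log it either way.

    `confirm` is a callback that asks the user and returns True or False.
    Returns True if the action was approved to run.
    """
    needs_confirmation = action in CONSEQUENTIAL
    approved = confirm(action, params) if needs_confirmation else True
    audit_log.append(json.dumps({
        "time": datetime.now(timezone.utc).isoformat(),
        "action": action,
        "params": params,
        "needed_confirmation": needs_confirmation,
        "approved": approved,
    }))
    return approved


def decline(action, params):
    """Stand-in for a user who refuses every confirmation prompt."""
    return False


audit_log = []
routine = execute("check_price", {"route": "LHR-CDG"}, decline, audit_log)
risky = execute("cancel_booking", {"booking_id": "H1"}, decline, audit_log)
```

Routine lookups proceed without interruption, while the cancellation is refused and still leaves a record, which preserves most of the efficiency benefit while keeping a human decision at the steps that are hard to reverse.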
About the Author
Stuart Kerr is a technology correspondent at LiveAIWire, covering
artificial intelligence, digital innovation, and the social impact of
emerging technologies. Follow LiveAIWire for daily analysis at liveaiwire.com.