Meet agentic AI, chatbots that can decide on your behalf – for better or worse

AI’s improved reasoning ability marks a fundamental shift in what AI can do for users, but also presents new risks in terms of the damage it can cause.

ST ILLUSTRATION: CEL GULAPA

SINGAPORE – Imagine a future where you can instruct a chatbot to order pizza with “no mushroom, cheese only, thin crust, from a place with good reviews”. You could sit back as the chatbot trawls through the internet for pizza joints, reads reviews, interacts with other chatbots and places your customised order without you having to do the hard work.

This type of chatbot powered by agentic artificial intelligence (AI) has been billed as the third wave of AI, as it is able to make decisions on behalf of users.

Described by chipmaker Nvidia as “the next frontier of AI”, the technology marks the next wave of advancement in AI after machine learning and the emergence since 2022 of generative AI models such as ChatGPT and Midjourney that are capable of creating content.

Unlike earlier AI systems, agentic AI proactively figures out the sequence of steps needed to achieve a goal defined by its user, like ordering pizza, without any human intervention in between.

The technology has been supercharged to work more autonomously, thanks to the progress made with large language models such as ChatGPT, allowing developers to program more complex instructions for agentic AI systems in natural language.

AI’s improved reasoning ability marks a fundamental shift in what AI can do for users, but also presents new risks in terms of the damage it can cause.

Tech giants like IBM and Microsoft have been working with enterprise users to develop AI agents that help review and act on customer service requests, among other applications.

AI company Anthropic in October rolled out an update to its chatbot Claude 3.5 Sonnet, which can take over a user’s mouse and keyboard virtually to complete tasks such as ordering pizza by surfing the web.

Amazon Web Services (AWS), which has partnered with Anthropic, gave international media a look at the new agentic AI model in October, painting a picture of the new capabilities of AI – along with its risks if the AI model turns rogue.

How does agentic AI work?

Agentic AI models break down complex tasks into smaller ones and are programmed to check their own work. They can even rope in a human for help when they are unable to perform a task.
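For readers who think in code, that loop can be sketched roughly as follows. This is a minimal Python illustration and not how Claude or any AWS product is built: the names run_agent, plan, execute and check are hypothetical, and the three stand-in functions simply mimic what a language model and its tools might do.

```python
from dataclasses import dataclass, field


@dataclass
class AgentRun:
    done: list = field(default_factory=list)       # steps completed and verified
    escalated: list = field(default_factory=list)  # steps handed to a human


def run_agent(goal, plan, execute, check, max_retries=2):
    """Break a goal into steps, verify each result, and escalate on repeated failure."""
    run = AgentRun()
    for step in plan(goal):
        for _ in range(max_retries + 1):
            result = execute(step)
            if check(step, result):      # the agent checks its own work
                run.done.append(step)
                break
        else:
            # Every attempt failed the check, so a human is asked to help.
            run.escalated.append(step)
    return run


# Hypothetical stand-ins for the model's planning, tool use and self-check.
def plan(goal):
    return ["find pizzeria", "read reviews", "fill order form", "pay"]


def execute(step):
    return None if step == "pay" else f"ok: {step}"  # pretend payment always fails


def check(step, result):
    return result is not None


run = run_agent("order a thin-crust cheese pizza", plan, execute, check)
print(run.done)
print(run.escalated)
```

In this toy run, the first three steps pass the self-check, while the payment step fails every attempt and is passed to a person.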

The agents can exist in the background as an assistant that steps in to help, or as a chatbot that users instruct, AWS vice-president of technology Mai-Lan Tomsen Bukovec told The Straits Times.

Previous models have been good at completing assigned tasks, but in reality, many steps are needed to complete any task, she said.

Unlike earlier AI tools, which operated based on simplistic rules – such as turning on the heat when the temperature drops – modern AI agents can now develop strategies to complete tasks based on user instructions. The leap forward stems from advancements in natural language processing, allowing AI to comprehend human language and respond dynamically.
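The contrast can be made concrete with a small Python sketch. It rests on loud assumptions: real agents rely on a language model to plan, whereas this example swaps in a plain breadth-first search over a hand-written, hypothetical list of actions, purely to show the difference between a fixed rule and a strategy worked out from a goal.

```python
from collections import deque


def thermostat_rule(temp_c, threshold_c=20.0):
    """Rule-based tool: one fixed condition mapped to one fixed action."""
    return "heat_on" if temp_c < threshold_c else "heat_off"


# Hypothetical actions: (facts required beforehand, facts true afterwards).
ACTIONS = {
    "search_pizzerias": (set(), {"shortlist"}),
    "read_reviews": ({"shortlist"}, {"chosen_shop"}),
    "fill_order_form": ({"chosen_shop"}, {"order_ready"}),
    "ask_user_to_pay": ({"order_ready"}, {"order_placed"}),
}


def plan(goal, start=frozenset()):
    """Search for the shortest sequence of actions that makes the goal true."""
    start = frozenset(start)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        facts, steps = queue.popleft()
        if goal in facts:
            return steps
        for name, (needs, adds) in ACTIONS.items():
            if needs <= facts:               # action is possible in this state
                nxt = facts | adds
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, steps + [name]))
    return None                              # no sequence of actions reaches the goal


print(thermostat_rule(18.0))
print(plan("order_placed"))
```

The thermostat can only ever answer one question, while the planner is handed a goal and works out the order of steps for itself.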

And now that Anthropic’s agentic AI model is able to understand and use apps on a computer, it can tackle more tasks across multiple platforms, said Ms Bukovec.

Unlike traditional AI models, whose answers are limited to the data they were trained on, agentic AI is able to use tools on a computer as a human user would, said Mr Vasi Philomin, AWS vice-president of generative AI, during a demo to the media. “We’re telling the AI: ‘Here’s a computer for you – you can use the computer any way you see fit and solve a problem for me’,” he said, explaining the technology behind Claude’s agentic capabilities.

What are the applications for agentic AI?

Customer service bots would less frequently have to divert customers to live agents, as they would be able to address customer service requests, like identifying the source of issues that customers are facing and suggesting a solution.

With better reasoning capabilities, healthcare AI agents would be able to suggest more sophisticated treatments for patients, saving medical professionals’ time for more urgent matters.

Robots deployed in a hospital can also autonomously split work to complete tasks like patient monitoring or the delivery and sorting of medical supplies more efficiently.

In app development, agentic AI allows machines to execute tasks across multiple apps and web pages. It is estimated that generative AI agents can save marketers an average of three hours per project by creating content autonomously, drawing ideas and information from the internet and external apps, which earlier generative AI tools could not do.

On consumer devices, future versions of virtual assistants like Siri and Gemini could have significantly broader capabilities such as being able to autonomously interact with independent apps, to book a ride on ride-hailing apps or order food, for instance.

In January, a demonstration of the Rabbit R1 made headlines. The pocket-sized AI gadget can be trained to perform tasks on phone apps on users’ behalf, like booking a ride on Uber. The company has yet to deliver on its promise, but the concept sparked public interest in AI models that execute tasks for users as the next evolution of AI.

AWS has since leapt ahead, demonstrating how Claude 3.5 Sonnet could order pizza online.

Instead of visiting a website or an app, the user just has to type out instructions in a chat box, a la ChatGPT.

Claude then gets to work. It scrolls, clicks and types, explaining each step to the user as it navigates the computer. It opens an internet browser on the computer, visits a website for pizza, selects the pizza and fills the order page with delivery details before coming back to the user to authorise the purchase.

In the October demo for media, Claude could even analyse a website for golf products based on simple prompts entered by the user.

The AI decides on its own that the best way to recommend items is through price, product descriptions and reviews left by users.

When will we see more agentic AI tools?

Consulting firm McKinsey projected that AI could automate 30 per cent of work hours by 2030, which could significantly change the way work looks compared with today.

Besides AWS, tech giants like IBM and Slack-owner Salesforce are already using basic agentic AI tools they developed. Such systems are expected to be more widely available in 2025.

IBM on Nov 25 announced an agentic AI supervisor program that coordinates work among various AI agents within a single business, automatically assigning work based on instructions given by a user.
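The supervisor idea can be illustrated with a short Python sketch. It is not IBM’s implementation, whose details are not described here: the three agents, the keyword table ROUTES and the function supervise are all hypothetical, and simple keyword matching stands in for whatever reasoning a real supervisor would use to assign work.

```python
# Hypothetical specialist agents; each simply reports that it took the task.
def billing_agent(task):
    return f"billing handled: {task}"


def support_agent(task):
    return f"support handled: {task}"


def general_agent(task):
    return f"general handled: {task}"


# Assumed routing table: keywords that send a task to a given agent.
ROUTES = [
    (("invoice", "refund", "payment"), billing_agent),
    (("error", "login", "crash"), support_agent),
]


def supervise(tasks):
    """Assign each task to the first agent whose keywords match, else a fallback."""
    results = []
    for task in tasks:
        text = task.lower()
        agent = next(
            (a for keys, a in ROUTES if any(k in text for k in keys)),
            general_agent,
        )
        results.append(agent(task))
    return results


results = supervise(["Refund order 17", "Login error on app", "Update office address"])
for line in results:
    print(line)
```

Here the refund goes to the billing agent, the login problem to the support agent, and the unmatched request to the general-purpose fallback.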

In October, Salesforce illustrated how an agentic AI assistant built into the Slack communication platform could help businesses such as a cake shop to design an autonomous system that can manage personalised cake orders without involving bakery staff.

The AI assistant prompted the business owner for information such as customer behaviours and preferences when placing orders, and suggested processes such as human intervention if automated orders exceed the bakery’s production capacity.

In late November, OpenAI announced that it had updated ChatGPT with better reasoning capabilities, allowing it to take on more complicated tasks by breaking them down into steps. This lays the groundwork for future agentic AI models, it said.

OpenAI’s head of developer experience Romain Huet said he expects new uses for agentic models to sweep across industries in 2025.

Are jobs at risk?

As AI increasingly takes on tasks autonomously, human roles across industries are expected to evolve.

Workers will need to get used to a reality where they work with AI agents just like fellow employees who have to be trained to complete tasks, wrote talent analyst Josh Bersin in a report in September.

For example, a human resources professional must be adept at using AI tools to perform tasks such as searching the internet for potential recruitment candidates and conducting background checks on social media platforms.

Mr Laurence Liew from AI Singapore, a national programme driving the technology’s adoption, said: “We see AI transforming jobs, rather than replacing them.”

But roles involving data entry and basic levels of customer service, content creation, coding and administration are likely to be taken over by AI, he added. 

“This frees up time for the workers to focus on working on queries that require additional attention, or on more complex tasks, such as strategic planning.”

What could go wrong?  

In 2016, Microsoft had to pull the plug on an AI chatbot aimed at young people. The chatbot learnt from interactions it had with real users on Twitter but started making racist comments after users decided to teach it offensive information.

Generative AI has continued to spout incorrect information and has been flagged for feeding users’ biases. These mishaps could worsen as AI gains more autonomy. As AI independently refines its own code, developers may soon struggle to understand its internal workings.

And this question looms large: Whose fault is it when an AI program makes errors?

Guidelines at tech firms like IBM say there should be human monitoring of autonomous AI operations.

As an AI agent compares its performance against its expected standard and adjusts accordingly, human approval must be given before a machine takes any impactful action, IBM said.

As some agentic models help to fill forms and complete administrative tasks on a computer or make purchases on a user’s behalf, there is also a risk of personal data being exposed or misused.

AWS has cautioned users about this risk when experimenting with Claude 3.5 Sonnet’s computer-use capabilities on devices where important documents or personal information are also stored.

As a safeguard, Claude is programmed to bring a human into the loop when it is about to make a purchase or create an account.

“We didn’t want it to be able to create an Instagram account, for example,” Anthropic’s chief product officer Mike Krieger said, alluding to the use of AI tools to generate bot accounts. “That is something you’re going to have to do yourself.”

The onus is on AI developers to limit the freedom machines have, by coding in the need for a human’s explicit approval at critical stages.
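In code, such a checkpoint can be as simple as the Python sketch below. It is an illustration under stated assumptions, not Anthropic’s actual safeguard: the set SENSITIVE_ACTIONS, the function perform and the approve callback are hypothetical names, and the callback stands in for a real prompt shown to the user.

```python
# Assumed list of actions that must never run without a person's say-so.
SENSITIVE_ACTIONS = {"purchase", "create_account"}


def perform(action, detail, approve):
    """Run an action, pausing for explicit human approval if it is sensitive."""
    if action in SENSITIVE_ACTIONS and not approve(action, detail):
        return f"blocked: {action}"
    return f"done: {action}"


def cautious_user(action, detail):
    # Hypothetical user who approves purchases but refuses new accounts.
    return action == "purchase"


log = [
    perform("browse", "pizza sites", cautious_user),
    perform("purchase", "cheese pizza, thin crust", cautious_user),
    perform("create_account", "social media profile", cautious_user),
]
for entry in log:
    print(entry)
```

Browsing goes ahead without any prompt, the purchase proceeds only because the user agreed, and the account creation is stopped.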

“The good news,” said Mr Liew from AI Singapore, “is that the tasks that autonomous AI can perform are limited to parameters set by the developers.”

Anthropic’s chief product officer Mike Krieger speaking during a fireside chat press event at Amazon headquarters in Seattle on Oct 28.

PHOTO: AMAZON
