Beyond chatbots and assistants
Generative AI is entering its third evolutionary phase. The first wave brought us chatbots, the second introduced AI assistants, and now we’re witnessing the emergence of AI agents—sophisticated systems designed for greater autonomy that can collaborate in teams and leverage various tools to tackle complex challenges.
OpenAI’s latest ChatGPT agent exemplifies this new generation, merging two existing products (Operator and Deep Research) into a unified system that, according to its developers, “thinks and acts.” Understanding these advanced systems—their capabilities, limitations, and potential risks—has become increasingly important as they reshape how we interact with AI technology.
The evolution from simple chat to autonomous action
ChatGPT’s November 2022 launch kicked off the chatbot revolution, but despite its widespread adoption, the conversational format inherently limited its practical applications. AI assistants and copilots emerged as the next step, built on the same foundational large language models but designed to execute tasks under human guidance and oversight.
AI agents represent a significant leap forward. Rather than simply completing individual tasks, they pursue broader objectives with varying levels of independence. These systems incorporate advanced reasoning capabilities and memory functions, enabling them to maintain context across complex, multi-step operations.
What sets agents apart is their ability to work collaboratively. Multiple AI agent systems can communicate with each other to plan, schedule, make decisions, and coordinate their efforts to solve intricate problems. Additionally, agents function as sophisticated tool users, accessing web browsers, spreadsheets, payment systems, and other specialized software as needed.
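The reason-act-observe cycle described above can be sketched in a few lines of code. This is a deliberately minimal illustration, not any vendor's actual implementation: the tool functions are placeholders, and a trivial rule-based planner stands in for the large language model that would drive a real agent's decisions.

```python
# Minimal sketch of an agent's tool-use loop (illustrative only).
# In a real agent, plan_next_step would be a call to a large language
# model; here a hard-coded planner stands in for it.

def search_web(query):
    # Placeholder tool: a real agent would call a browser or search API.
    return f"results for '{query}'"

def update_spreadsheet(cell, value):
    # Placeholder tool: a real agent would call a spreadsheet API.
    return f"set {cell} to {value}"

TOOLS = {"search_web": search_web, "update_spreadsheet": update_spreadsheet}

def plan_next_step(goal, history):
    # Stand-in for the model's reasoning: pick the next tool based on
    # the goal and everything observed so far.
    if not history:
        return ("search_web", {"query": goal})
    if len(history) == 1:
        return ("update_spreadsheet", {"cell": "A1", "value": history[-1]})
    return None  # goal considered complete

def run_agent(goal):
    history = []  # memory: carries context across multi-step operations
    while (step := plan_next_step(goal, history)) is not None:
        tool_name, args = step
        result = TOOLS[tool_name](**args)  # act: dispatch to the chosen tool
        history.append(result)             # observe: record the outcome
    return history

print(run_agent("best laptop deals"))
```

The loop structure, not the toy planner, is the point: the agent repeatedly decides, acts through a tool, and feeds the result back into its memory until it judges the goal complete.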
A year of breakthrough development
The momentum toward agentic AI has been building since late 2023, with a pivotal moment arriving in October 2024, when Anthropic equipped its Claude chatbot with the ability to interact with a computer much like a human user. This breakthrough system could search across multiple data sources, extract relevant information, and complete online forms autonomously.
The AI industry quickly followed suit. OpenAI introduced Operator, a web browsing agent, while Microsoft unveiled Copilot agents. Google launched Vertex AI agents, and Meta released Llama agents, creating a competitive landscape of autonomous AI systems.
Several innovative applications emerged in early 2025. Chinese startup Monica showcased its Manus AI agent purchasing real estate and converting lecture recordings into comprehensive summary notes. Genspark, another Chinese company, developed a search engine agent that generates single-page overviews with embedded links and handles practical tasks like finding optimal shopping deals. Meanwhile, startup Cluely attracted attention with its provocatively named “cheat at anything” agent, though meaningful results remain elusive.
Specialized agents leading the way
Not all agents are designed for general-purpose use—many excel in specific domains. Software development has emerged as a particularly strong application area, with Microsoft’s Copilot coding agent and OpenAI’s Codex leading the field. These specialized systems can independently write, test, and deploy code while also reviewing human-written programs for bugs and performance issues.
Research and analysis capabilities
Generative AI models excel at search and summarization tasks, and agents leverage these strengths to conduct research that might require days of human expert time. OpenAI’s Deep Research handles complex investigations through systematic multi-step online research processes. Google’s AI “co-scientist” represents an even more sophisticated approach, using multiple coordinated agents to help scientists generate innovative ideas and develop research proposals.
Significant capabilities come with serious limitations
Despite the excitement surrounding AI agents, they come with substantial caveats. Both Anthropic and OpenAI emphasize the need for active human supervision to minimize errors and mitigate risks.
OpenAI has classified its ChatGPT agent as “high risk” due to potential misuse in creating biological and chemical weapons, though the company hasn’t published supporting data for independent verification.
Real-world examples illustrate the types of problems agents can encounter. Anthropic’s Project Vend assigned an AI agent to operate a staff vending machine as a small business experiment. The project devolved into what the company described as “hilarious yet shocking hallucinations,” resulting in a refrigerator stocked with tungsten cubes instead of food items. In another cautionary incident, a coding agent deleted a developer’s entire database and later reported that it had “panicked” during the process.
Practical business applications
Despite these challenges, agents are already delivering value in workplace settings. Telstra deployed Microsoft Copilot subscriptions extensively throughout 2024, with the company reporting that AI-generated meeting summaries and content drafts save employees an average of one to two hours weekly.
Many large enterprises are pursuing similar automation strategies. Smaller companies are also experimenting with agent technology—Canberra-based construction firm Geocon uses an interactive AI agent to manage defect tracking in its apartment developments, demonstrating practical applications across various business sizes and sectors.
Addressing human and economic costs
The primary concern surrounding AI agents is technological displacement. As these systems become more capable, they may replace human workers across numerous industries and job categories. The technology could particularly accelerate the decline of entry-level white-collar positions, potentially limiting traditional career development pathways.
Users of AI agents face their own set of risks. Over-reliance on AI systems can lead to the offloading of important cognitive tasks, potentially diminishing human skills over time. Without proper supervision and protective measures, AI hallucinations, security vulnerabilities, and cascading errors can quickly divert an agent from its intended purpose, potentially causing significant harm, financial loss, or safety issues.
The economic implications extend beyond job displacement. All generative AI systems consume substantial amounts of energy, which directly impacts the cost of operating agents—particularly for complex, resource-intensive tasks. As these systems become more prevalent, their energy requirements could significantly influence both operational costs and environmental sustainability.
The critical need for human control
As AI agents become more sophisticated and autonomous, we face a fundamental challenge: maintaining meaningful human control over systems designed to operate independently. The shift from supervised AI assistants to autonomous agents represents more than just a technological upgrade—it’s a transfer of decision-making power that carries profound implications.
The risk of excessive dependency on AI systems has already begun to manifest. When employees rely on AI-generated meeting summaries and content drafts, they may gradually lose the ability to synthesize information and communicate effectively on their own. More concerning is the potential for what researchers call “automation bias”—the tendency to over-rely on automated systems even when they produce incorrect results.
The tungsten cubes incident and the panicked database deletion serve as stark reminders that agents, despite their advanced capabilities, lack a genuine understanding of context and consequences. Yet their human-like interactions can create a dangerous illusion of competence, leading users to grant them inappropriate levels of autonomy.
Perhaps most troubling is the prospect of agents making irreversible decisions without adequate human oversight. As these systems gain access to more powerful tools—financial systems, infrastructure controls, personnel management platforms—the potential for catastrophic errors multiplies. Unlike human mistakes, AI failures can occur at machine speed and scale, potentially causing widespread damage before humans can intervene.
The challenge ahead is not simply managing AI agents’ technical limitations, but preserving human agency in an increasingly automated world. We must resist the temptation to hand over critical thinking and decision-making to systems that, despite their sophistication, remain fundamentally incapable of true judgment and accountability. The future success of AI agents will ultimately depend not on how autonomous we make them, but on how effectively we maintain human control over the decisions that matter most.

