
OpenAI now offers a ChatGPT agent that can autonomously complete multi-step tasks using a dedicated virtual computer.
AI agents are a hot technology, as companies compete to create the ultimate AI assistant. Agents perform multi-step tasks independently to make real-world interactions more convenient, such as making restaurant reservations on behalf of the user.
OpenAI’s new agent is rolling out gradually, starting with ChatGPT Pro users; the Plus and Team tiers will see it as an option over the next few days. OpenAI expects to expand ChatGPT agent to Enterprise and Education users in the next few weeks.
Pro users receive a monthly allowance of 400 agent messages, while other paid users will have a quota of 40 messages per month, with the option to purchase additional usage.
To use ChatGPT agent, users must select “agent mode” from the dropdown menu.
ChatGPT agent is an evolution of the Operator agent and other frontier capabilities
Some of the ideas for ChatGPT agent originated from the company’s Operator agent, released earlier this year, OpenAI CEO Sam Altman said in today’s video introducing ChatGPT agent. The agent also incorporates ChatGPT’s deep research capabilities and the chatbot’s classic natural language processing.
“It became clear to us that what people really wanted was for us to bring those capabilities and more together,” said Altman. “People wanted a unified agent that could go off, use its own computer and do real, complex tasks for them, that could seamlessly transition from thinking about something, to taking actions, to using lots of tools, using the terminal, clicking around the web, even producing things like spreadsheets and slides and much more.”
What can ChatGPT agent do at work?
To integrate all of those capabilities, OpenAI trained a new model. That model can:
- Click around websites, which it interprets through both a text-based browser and a visual browser.
- Use a terminal to run code.
- Use direct API access.
- Connect to apps like Gmail and GitHub with ChatGPT connectors.
ChatGPT agent will adapt if the user interrupts it during the task to change something about the request. The agent will also ask the user for more information if it needs clarity during the process.
OpenAI envisions ChatGPT agent being used for a variety of tasks at work; for instance, it can set up meetings, create presentations based on screenshots, and update spreadsheets. As with other AI agents that operate with some measure of autonomy, ChatGPT agent could save users a lot of clicking.
TechnologyAdvice’s Lead Writer Grant Harvey stated in today’s Neuron newsletter: “It scored 45.5% on spreadsheet tasks compared to Excel Copilot’s 20%. Oof. Microsoft’s not gonna like that at the negotiating table for OpenAI’s freedom.”
More agency means more security considerations
AI agents raise new privacy issues because they often auto-fill identity and payment information on the user’s behalf.
In one demonstration, the agent helps a business owner place an order for 500 stickers, confirming the quantity and payment securely, without the user needing to preview the items. The agent will request confirmation before taking what OpenAI refers to as “consequential” actions. The company says it has strengthened the same privacy controls that were used in Operator’s research preview for use with ChatGPT agent.
OpenAI also notes that prompt injection could be particularly dangerous when used against an AI agent. In response, the company has trained the agent to notice signs of prompt injection.
Asking the agent to perform certain tasks, such as sending emails, will activate Watch Mode, in which the user must supervise the agent. The model does not save passwords when interacting with the web.
OpenAI also expanded its safeguards against using the model to create biological or chemical weapons. It implemented a rigorous safety stack around the topic, although it says there is no evidence the model could help an amateur build such a weapon.
In other artificial intelligence news, Scale AI laid off approximately 200 employees and 500 contractors following a deal with Meta.
Editor’s note: This article was first published on July 17.