Jul 18, 2025 11:32:00

OpenAI releases ChatGPT agent, which can execute complex tasks in multiple steps using browsers and APIs

OpenAI has introduced ChatGPT agent, an all-in-one agent system that combines Operator's ability to interact with websites, deep research's ability to gather and synthesize information from the web, and ChatGPT's conversational skills. ChatGPT agent is rolling out in stages to users of the Pro, Plus, and Team plans, and is scheduled to reach Enterprise and Education plans in July 2025.

Introducing the ChatGPT Agent: A new bridge between research and action | OpenAI

https://openai.com/ja-JP/index/introducing-chatgpt-agent/

Introducing ChatGPT agent - YouTube

OpenAI says that ChatGPT agent will 'run its own virtual computer to perform its tasks, seamlessly switching between reasoning and action to handle complex tasks from start to finish.'

According to OpenAI, ChatGPT agent builds on two earlier tools: Operator, a research preview agent that interacts directly with websites through a remote browser, and deep research, a multi-step web research tool. Operator can scroll, click, and type on the web, but it is limited when it comes to detailed analysis and report writing. Deep research, in contrast, excels at analyzing and summarizing information, but it cannot interact with websites to narrow down results, nor can it access information that requires user authentication. By combining the two, OpenAI says, the agent can retrieve information accurately and efficiently while engaging more actively with websites, for example by clicking and applying filters.

ChatGPT agent is equipped with a full set of web tools: a visual browser that interacts with the web through a GUI, a text browser suited to simpler reasoning-based queries, and direct API access. The agent can view a website visually and also obtain various data, such as scores, through an API, and all processing is carried out on a virtual computer dedicated to ChatGPT.

For example, in the video below, a text prompt asks ChatGPT agent to research 'office properties for tech companies in Singapore' and summarize the findings in a slide deck. The man who typed the prompt leaves the room for a while, but ChatGPT keeps researching as instructed and compiles the results into slides.

ChatGPT agent Makes Slideshows - YouTube

When ChatGPT agent is asked to compile the City of San Francisco's Annual Comprehensive Financial Reports (ACFR) for 2020 through 2024 into a spreadsheet, it first searches the Internet for the reports, sorts the amounts listed in each year's ACFR PDF by item, and then builds a spreadsheet table that allows year-by-year comparison.

ChatGPT agent Makes Spreadsheets - YouTube

On 'Humanity's Last Exam,' a benchmark designed to test the limits of AI intelligence, the model powering ChatGPT agent achieved a score of 43.1%, far exceeding the scores of OpenAI o3 and deep research alone.

In DSBench, which evaluates agents on realistic data science tasks spanning data analysis and modeling, OpenAI claims that 'the ChatGPT agent significantly outperformed previous state-of-the-art models, and in particular showed results that far exceeded human performance in data analysis tasks.'

In SpreadsheetBench, which evaluates models on their ability to edit spreadsheets drawn from real-world scenarios, ChatGPT agent more than doubled the performance of GPT-4o. Furthermore, when given the ability to edit spreadsheets directly, ChatGPT agent scored 45.5%, well above the 20.0% scored by Copilot in Excel.
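To put the SpreadsheetBench comparison in concrete terms, the short Python sketch below works only from the two figures quoted above, reading 45.5% as ChatGPT agent's own score and 20.0% as the Copilot score. The variable names are illustrative and not taken from OpenAI.

```python
# Scores quoted in the article for SpreadsheetBench (percent).
agent_score = 45.5     # ChatGPT agent, editing spreadsheets directly
copilot_score = 20.0   # Copilot in Excel

# Absolute gap in percentage points and relative ratio between the two.
gap_points = agent_score - copilot_score
ratio = agent_score / copilot_score

print(f"gap: {gap_points:.1f} percentage points")
print(f"ratio: {ratio:.3f}x")
```

In other words, the gap is 25.5 percentage points, and the agent's score is a little under 2.3 times the Copilot score.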

OpenAI has also published results for BrowseComp, which measures the ability to find hard-to-locate information on the web.

Because ChatGPT agent entrusts ChatGPT with carrying out actions on the web, its safety measures have been strengthened. OpenAI said, 'From the beginning, we have placed safety at the core of our system, and we have further strengthened the control functions introduced in the Operator research preview to address new risks associated with access to a wider range of users and devices,' adding that it focuses on 'explicit user confirmation,' a 'supervision mode' that asks users for confirmation and approval on important tasks, and 'active risk response measures.' Other measures include defenses against prompt injection attacks, prevention of unauthorized use, robust privacy controls, and concealment of input contents.
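The idea of 'explicit user confirmation' can be illustrated with a minimal Python sketch. This is not OpenAI's implementation: the Action class, the consequential flag, and the run_with_confirmation function are all hypothetical names invented here, and the sketch only shows the general pattern of pausing for approval before an important step.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """A hypothetical step in an agent's plan (illustrative only)."""
    description: str
    consequential: bool  # assumed flag: True if the step needs user approval


def run_with_confirmation(
    actions: List[Action],
    confirm: Callable[[str], bool],
) -> List[str]:
    """Perform each action, asking `confirm` before consequential ones.

    Returns the descriptions of the actions that were actually performed.
    """
    performed = []
    for action in actions:
        # Skip a consequential step unless the user explicitly approves it.
        if action.consequential and not confirm(action.description):
            continue
        performed.append(action.description)
    return performed


plan = [
    Action("open the booking page", consequential=False),
    Action("submit payment for the booking", consequential=True),
]

# Here the user declines every request, so only the harmless step runs.
done = run_with_confirmation(plan, confirm=lambda description: False)
print(done)
```

In a real agent the confirm callback would prompt the user rather than return a fixed answer; it is passed in as a parameter here so the gate can be exercised without any interactive input.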

ChatGPT agent is being rolled out in stages to users of the Pro, Plus, and Team plans, and is expected to reach Enterprise and Education users in July 2025. Pro users can run an almost unlimited number of tasks, while users on other plans can run up to 50 tasks per month, with additional usage available for a fee. Once it is available, ChatGPT agent can be selected from the 'Tools' drop-down menu in the input field.

The Operator research preview site will remain available for 30 days, after which it will be discontinued.


Jul 18, 2025 11:32:00 in Software, Web Service, Web Application, Posted by log1i_yk