    OpenAI launches GPT-5.4 with native computer use mode, financial plugins for Microsoft Excel, Google Sheets
    AI News

March 6, 2026 · 9 Mins Read

The AI updates aren’t slowing down. Just two days after OpenAI launched GPT-5.3 Instant, a new underlying AI model for ChatGPT, the company has unveiled another, even bigger upgrade: GPT-5.4.

GPT-5.4 comes in two varieties: GPT-5.4 Thinking and GPT-5.4 Pro, the latter designed for the most complex tasks.

    Both will be available in OpenAI’s paid application programming interface (API) and Codex software development application, while GPT-5.4 Thinking will be available to all paid subscribers of ChatGPT (Plus, the $20-per-month plan, and up) and Pro will be reserved for ChatGPT Pro ($200 monthly) and Enterprise plan users.

    ChatGPT Free users will also get a taste of GPT-5.4, but only when their queries are auto-routed to the model, according to an OpenAI spokesperson.


The big headlines of this release are efficiency, with OpenAI reporting that GPT-5.4 uses far fewer tokens than its predecessors (47% fewer on some tasks), and, arguably even more impressive, a new “native” Computer Use mode, available through the API and Codex, that lets GPT-5.4 navigate a user’s computer like a human and work across applications.

The company is also releasing a new suite of ChatGPT integrations that plug GPT-5.4 directly into users’ Microsoft Excel and Google Sheets spreadsheets, down to the cell level, enabling granular analysis and automated task completion that should speed up work across the enterprise. It may also sharpen fears of white-collar layoffs, arriving on the heels of similar offerings from Anthropic’s Claude and its new Cowork application.

OpenAI says GPT-5.4 supports up to 1 million tokens of context in the API and Codex, enabling agents to plan, execute, and verify tasks across long horizons. However, it charges double the cost per 1 million tokens once the input exceeds 272,000 tokens.

    Native computer use: a step toward autonomous workflows

    The most consequential capability OpenAI highlights is that GPT-5.4 is its first general-purpose model released with native, state-of-the-art computer-use capabilities in Codex and the API, enabling agents to operate computers and carry out multi-step workflows across applications.

    OpenAI says the model can both write code to operate computers via libraries like Playwright and issue mouse and keyboard commands in response to screenshots. OpenAI also claims a jump in agentic web browsing.
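As described, screenshot-driven computer use amounts to an observe-decide-act loop: capture the screen, ask the model for the next mouse or keyboard action, execute it, repeat. A minimal sketch of that loop, where `Computer` is a toy stand-in for a real backend (such as Playwright's mouse/keyboard/screenshot API) and the scripted `decide` function stands in for a GPT-5.4 call; all names here are illustrative, not OpenAI's API:

```python
# Sketch of an observe-decide-act computer-use loop. Computer is a toy
# backend that records actions instead of driving a real desktop; decide
# is a stand-in for a model call mapping a screenshot to the next action.

class Computer:
    """Toy backend: logs actions rather than moving a real cursor."""
    def __init__(self):
        self.log = []
    def screenshot(self):
        return b"<pixels>"          # a real backend returns screen pixels
    def click(self, x, y):
        self.log.append(("click", x, y))
    def type(self, text):
        self.log.append(("type", text))

def run_agent(computer, decide, max_steps=10):
    """One screenshot in, one mouse/keyboard action out, until done."""
    for _ in range(max_steps):
        action = decide(computer.screenshot())
        if action["type"] == "done":
            break
        if action["type"] == "click":
            computer.click(action["x"], action["y"])
        elif action["type"] == "type":
            computer.type(action["text"])

# Scripted "model" that clicks once, types once, then stops:
script = iter([
    {"type": "click", "x": 120, "y": 48},
    {"type": "type", "text": "quarterly revenue"},
    {"type": "done"},
])
pc = Computer()
run_agent(pc, lambda shot: next(script))
# pc.log == [("click", 120, 48), ("type", "quarterly revenue")]
```

The same loop structure applies whether actions come from pixel coordinates (as on OSWorld-Verified) or from DOM interaction (as on WebArena-Verified); only the backend changes.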

    Benchmark results are presented as evidence that this is not merely a UI wrapper.

    On BrowseComp, which measures how well AI agents can persistently browse the web to find hard-to-locate information, OpenAI reports GPT-5.4 improving by 17% absolute over GPT-5.2, and GPT-5.4 Pro reaching 89.3%, described as a new state of the art.

    On OSWorld-Verified, which measures desktop navigation using screenshots plus keyboard and mouse actions, OpenAI reports GPT-5.4 at 75.0% success, compared to 47.3% for GPT-5.2, and notes reported human performance at 72.4%.

    On WebArena-Verified, GPT-5.4 reaches 67.3% success using both DOM- and screenshot-driven interaction, compared to 65.4% for GPT-5.2. On Online-Mind2Web, OpenAI reports 92.8% success using screenshot-based observations alone.

    OpenAI also links computer use to improvements in vision and document handling. On MMMU-Pro, GPT-5.4 reaches 81.2% success without tool use, compared with 79.5% for GPT-5.2, and OpenAI says it achieves that result using a fraction of the “thinking tokens.”

    On OmniDocBench, GPT-5.4’s average error is reported at 0.109, improved from 0.140 for GPT-5.2. The post also describes expanded support for high-fidelity image inputs, including an “original” detail level up to 10.24M pixels.

    OpenAI positions GPT-5.4 as built for longer, multi-step workflows—work that increasingly looks like an agent keeping state across many actions rather than a chatbot responding once.

    Tool search and improved tool orchestration

    As tool ecosystems get larger, OpenAI argues that the naive approach—dumping every tool definition into the prompt—creates a tax paid on every request: cost, latency, and context pollution.

    GPT-5.4 introduces tool search in the API as a structural fix. Instead of receiving all tool definitions upfront, the model receives a lightweight list of tools plus a search capability, and it retrieves full tool definitions only when they’re actually needed.
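A minimal sketch of how such deferred tool loading might look, assuming a simple in-memory registry; the helper names (`lightweight_listing`, `search`, `resolve`) are illustrative, not OpenAI's actual API:

```python
# Tool search sketch: the prompt carries only names and one-line blurbs;
# full JSON-schema definitions are pulled into context on demand.

TOOLS = {
    "get_stock_quote": {
        "blurb": "fetch latest price for a ticker",
        "definition": {"name": "get_stock_quote",
                       "parameters": {"ticker": "string"}},
    },
    "send_email": {
        "blurb": "send an email to a recipient",
        "definition": {"name": "send_email",
                       "parameters": {"to": "string", "body": "string"}},
    },
}

def lightweight_listing():
    """What goes in context upfront: names and short blurbs only."""
    return {name: t["blurb"] for name, t in TOOLS.items()}

def search(query):
    """Naive keyword match over names and blurbs."""
    q = query.lower()
    return [name for name, t in TOOLS.items()
            if q in name or q in t["blurb"]]

def resolve(name):
    """Full definition, loaded into context only when actually needed."""
    return TOOLS[name]["definition"]

listing = lightweight_listing()   # small, constant-size context cost
hits = search("price")            # ["get_stock_quote"]
schema = resolve(hits[0])         # full schema enters context here
```

With dozens of MCP servers attached, the upfront cost stays proportional to the blurb listing rather than to the sum of every tool's full schema, which is the saving the 47% figure below refers to.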

    OpenAI describes the efficiency win with a concrete comparison: on 250 tasks from Scale’s MCP Atlas benchmark, running with 36 MCP servers enabled, the tool-search configuration reduced total token usage by 47% while achieving the same accuracy as a configuration that exposed all MCP functions directly in context.

    That 47% figure is specifically about the tool-search setup in that evaluation—not a blanket claim that GPT-5.4 uses 47% fewer tokens for every kind of task.

    Improvements for developers and coding workflows

    OpenAI’s coding pitch is that GPT-5.4 combines the coding strengths of GPT-5.3-Codex with stronger tool and computer-use capabilities that matter when tasks aren’t single-shot.

GPT-5.4 matches or outperforms GPT-5.3-Codex on SWE-Bench Pro while delivering lower latency across reasoning-effort levels.

    Codex also gets workflow-level knobs. OpenAI says /fast mode delivers up to 1.5× faster performance across supported models, including GPT-5.4, describing it as the same model and intelligence “just faster.”

    And it describes releasing an experimental Codex skill, “Playwright (Interactive)”, meant to demonstrate how coding and computer use can work in tandem—visually debugging web and Electron apps and testing an app as it’s being built.

    OpenAI for Microsoft Excel and Google Sheets

    Alongside GPT-5.4, OpenAI is announcing a suite of secure AI products in ChatGPT built for enterprises and financial institutions, powered by GPT-5.4 for advanced financial reasoning and Excel-based modeling.

    The centerpiece is ChatGPT for Excel and Google Sheets (beta), which OpenAI describes as ChatGPT embedded directly in spreadsheets to build, analyze, and update complex financial models using the formulas and structures teams already rely on.

    The suite also includes new ChatGPT app integrations intended to unify market, company, and internal data into a single workflow, naming FactSet, MSCI, Third Bridge, and Moody’s.

    And it introduces reusable “Skills” for recurring finance work such as earnings previews, comparables analysis, DCF analysis, and investment memo drafting.

    OpenAI anchors the finance push with an internal benchmark claim: model performance increased from 43.7% with GPT-5 to 88.0% with GPT-5.4 Thinking on an OpenAI internal investment banking benchmark.

    Measuring AI performance against professional work

    OpenAI leans on benchmarks intended to resemble real office deliverables, not just puzzle-solving. On GDPval, an evaluation spanning “well-specified knowledge work” across 44 occupations, OpenAI reports that GPT-5.4 matches or exceeds industry professionals in 83.0% of comparisons, compared to 71.0% for GPT-5.2.

    The company also highlights specific improvements in the kinds of artifacts that tend to expose model weaknesses: structured tables, formulas, narrative coherence, and design quality.

    In an internal benchmark of spreadsheet modeling tasks modeled after what a junior investment banking analyst might do, GPT-5.4 reaches a mean score of 87.5%, compared to 68.4% for GPT-5.2.

    And on a set of presentation evaluation prompts, OpenAI says human raters preferred GPT-5.4’s presentations 68.0% of the time over GPT-5.2’s, citing stronger aesthetics, greater visual variety, and more effective use of image generation.

    Improving reliability and reducing hallucinations

    OpenAI describes GPT-5.4 as its most factual model yet and connects that claim to a practical dataset: de-identified prompts where users previously flagged factual errors. On that set, OpenAI reports GPT-5.4’s individual claims are 33% less likely to be false and its full responses are 18% less likely to contain any errors compared to GPT-5.2.

In statements provided to VentureBeat by OpenAI and attributed to early GPT-5.4 testers, Daniel Swiecki of Walleye Capital says that on internal finance and Excel evaluations, GPT-5.4 improved accuracy by 30 percentage points, which he links to expanded automation for model updates and scenario analysis.

    Brendan Foody, CEO of Mercor, calls GPT-5.4 the best model the company has tried and says it’s now top of Mercor’s APEX-Agents benchmark for professional services work, emphasizing long-horizon deliverables like slide decks, financial models, and legal analysis.

    Pricing and availability

    In the API, OpenAI says GPT-5.4 Thinking is available as gpt-5.4 and GPT-5.4 Pro as gpt-5.4-pro. Pricing is as follows:

    • GPT-5.4: $2.50 / 1M input tokens; $15 / 1M output tokens

    • GPT-5.4 Pro: $30 / 1M input tokens; $180 / 1M output tokens

    • Batch + Flex: half-rate; Priority processing: 2× rate
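Under those list prices, per-request cost is straightforward to estimate. A small sketch applying the quoted rates and tier multipliers; the helper name `estimate_cost` is hypothetical, not part of any OpenAI SDK:

```python
# Cost estimator using the list prices quoted above (USD per 1M tokens)
# and the stated tier multipliers: Batch/Flex at half rate, Priority at 2x.

RATES = {
    "gpt-5.4":     {"input": 2.50,  "output": 15.00},
    "gpt-5.4-pro": {"input": 30.00, "output": 180.00},
}

MULTIPLIERS = {"standard": 1.0, "batch": 0.5, "flex": 0.5, "priority": 2.0}

def estimate_cost(model, input_tokens, output_tokens, tier="standard"):
    """Estimated request cost in USD under the quoted base rates."""
    r = RATES[model]
    base = (input_tokens / 1_000_000 * r["input"]
            + output_tokens / 1_000_000 * r["output"])
    return base * MULTIPLIERS[tier]

# 100K input + 10K output on gpt-5.4, standard tier:
# 0.1 * 2.50 + 0.01 * 15.00 = $0.40
```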

This places GPT-5.4 among the more expensive models to run over the API, as seen in the table below (all prices in USD per 1 million tokens).

| Model | Input | Output | Total Cost | Source |
| --- | --- | --- | --- | --- |
| Qwen 3 Turbo | $0.05 | $0.20 | $0.25 | Alibaba Cloud |
| Qwen3.5-Flash | $0.10 | $0.40 | $0.50 | Alibaba Cloud |
| deepseek-chat (V3.2-Exp) | $0.28 | $0.42 | $0.70 | DeepSeek |
| deepseek-reasoner (V3.2-Exp) | $0.28 | $0.42 | $0.70 | DeepSeek |
| Grok 4.1 Fast (reasoning) | $0.20 | $0.50 | $0.70 | xAI |
| Grok 4.1 Fast (non-reasoning) | $0.20 | $0.50 | $0.70 | xAI |
| MiniMax M2.5 | $0.15 | $1.20 | $1.35 | MiniMax |
| Gemini 3.1 Flash-Lite | $0.25 | $1.50 | $1.75 | Google |
| MiniMax M2.5-Lightning | $0.30 | $2.40 | $2.70 | MiniMax |
| Gemini 3 Flash Preview | $0.50 | $3.00 | $3.50 | Google |
| Kimi-k2.5 | $0.60 | $3.00 | $3.60 | Moonshot |
| GLM-5 | $1.00 | $3.20 | $4.20 | Z.ai |
| ERNIE 5.0 | $0.85 | $3.40 | $4.25 | Baidu |
| Claude Haiku 4.5 | $1.00 | $5.00 | $6.00 | Anthropic |
| Qwen3-Max (2026-01-23) | $1.20 | $6.00 | $7.20 | Alibaba Cloud |
| Gemini 3 Pro (≤200K) | $2.00 | $12.00 | $14.00 | Google |
| GPT-5.2 | $1.75 | $14.00 | $15.75 | OpenAI |
| Claude Sonnet 4.6 | $3.00 | $15.00 | $18.00 | Anthropic |
| GPT-5.4 | $2.50 | $15.00 | $17.50 | OpenAI |
| Gemini 3 Pro (>200K) | $4.00 | $18.00 | $22.00 | Google |
| Claude Opus 4.6 | $5.00 | $25.00 | $30.00 | Anthropic |
| GPT-5.2 Pro | $21.00 | $168.00 | $189.00 | OpenAI |
| GPT-5.4 Pro | $30.00 | $180.00 | $210.00 | OpenAI |

Another important note: with GPT-5.4, requests that exceed 272,000 input tokens are billed at 2× the normal rate, reflecting the ability to send prompts larger than earlier models supported.

In Codex, compaction defaults to 272K tokens, and the higher long-context pricing applies only when the input exceeds 272K. Developers can keep sending prompts at or under that size without triggering the higher rate, or opt into larger prompts by raising the compaction limit; only those larger requests are billed differently.
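Taking the stated rule at face value (a request whose input exceeds 272K tokens is billed at 2× the normal rate; whether the multiplier applies to the whole request or only the marginal tokens is an assumption here), the input-side cost works out as:

```python
# Long-context surcharge sketch: inputs over 272K tokens double the rate.
# Assumption: the 2x multiplier applies to the whole request, per the
# article's phrasing; actual billing granularity may differ.

LONG_CONTEXT_THRESHOLD = 272_000

def input_cost(input_tokens, rate_per_million=2.50):
    """Input-token cost in USD at GPT-5.4's quoted base input rate."""
    multiplier = 2.0 if input_tokens > LONG_CONTEXT_THRESHOLD else 1.0
    return input_tokens / 1_000_000 * rate_per_million * multiplier

# 272,000 tokens: 0.272 * 2.50       = $0.68 (no surcharge)
# 300,000 tokens: 0.300 * 2.50 * 2.0 = $1.50
```

Note the discontinuity at the threshold: crossing 272K makes every input token in the request twice as expensive under this reading, so staying just under the default compaction limit is materially cheaper.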

    An OpenAI spokesperson said that in the API the maximum output is 128,000 tokens, the same as previous models.

Finally, on why GPT-5.4 is priced higher at baseline, the spokesperson attributed it to three factors: higher capability on complex tasks (including coding, computer use, deep research, advanced document generation, and tool use), major research improvements from OpenAI’s roadmap, and more efficient reasoning that uses fewer reasoning tokens for comparable tasks. The spokesperson added that OpenAI believes GPT-5.4 remains below comparable frontier models on pricing even with the increase.

    The broader shift

    Across the release and the follow-up clarifications, GPT-5.4 is positioned as a model meant to move beyond “answer generation” and into sustained professional workflows—ones that require tool orchestration, computer interaction, long context, and outputs that look like the artifacts people actually use at work.

    OpenAI’s emphasis on token efficiency, tool search, native computer use, and reduced user-flagged factual errors all point in the same direction: making agentic systems more viable in production by lowering the cost of retries—whether that retry is a human re-prompting, an agent calling another tool, or a workflow re-running because the first pass didn’t stick.

Fintech Fetch Editorial Team
