GPT-5.4 Family — March 2026

What Changed

The GPT-5.4 family (Thinking, Pro, mini, nano) brought native computer use, up to 1,050,000 token context, and a tiered pricing structure down to $0.20/M tokens for nano. This established the pattern of model families: a flagship for quality, a mini for cost, and a nano for edge/volume use cases — all from the same training run via distillation.

Key Technical Details

Native computer use: the model generates structured action sequences (click, type, scroll, screenshot-then-reason) in a loop — not just text. The action space is formalized:

\[ a_t = \pi_\theta(s_t), \quad s_{t+1} = \text{env}(s_t, a_t) \]

where \(s_t\) is the current screenshot/DOM state and \(a_t\) is a structured action.

In Plain English

The model is now an agent operating in a GUI. Each step: observe state (screenshot or structured UI tree), reason, emit a structured action (click, type), then observe the next state. Training and safety analysis map naturally to an MDP: policy \(\pi_\theta\) maps observations to actions; the environment advances the state.
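The structured action space described above can be sketched as plain types. This is an illustrative sketch only; the class and field names are assumptions, not the vendor's actual schema:

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical structured actions for a GUI agent.
# Field names are illustrative, not a real API.

@dataclass
class Click:
    x: int  # screen coordinates
    y: int

@dataclass
class Type:
    text: str  # text to enter at the focused element

@dataclass
class Scroll:
    dy: int  # vertical scroll delta in pixels

@dataclass
class Screenshot:
    pass  # request a fresh observation before reasoning

# The policy emits one of these per step instead of free-form text.
Action = Union[Click, Type, Scroll, Screenshot]
```

Typed actions make the policy's output machine-checkable: an emitted action either parses into one of these variants or is rejected before it touches the environment.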

Technical Details
  • GPT-5.4 standard: $2.50 in / $15.00 out per million tokens.
  • GPT-5.4 Pro: $30.00 in / $180.00 out per million tokens — extended reasoning.
  • GPT-5.4 mini: $0.75 in / $4.50 out per million tokens.
  • GPT-5.4 nano: $0.20 in / $1.25 out per million tokens — optimized for high-volume, latency-sensitive calls.
  • 1M token context: full document, codebase, or conversation history in a single prompt.
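The per-million-token prices above translate directly into a per-request cost estimate. A minimal sketch, using the tier prices listed here (verify current vendor pricing; model keys are illustrative):

```python
# Illustrative per-million-token (input, output) prices from the tiers above.
PRICES = {
    "gpt-5.4":      (2.50, 15.00),
    "gpt-5.4-pro":  (30.00, 180.00),
    "gpt-5.4-mini": (0.75, 4.50),
    "gpt-5.4-nano": (0.20, 1.25),
}

def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated request cost: (tokens / 1e6) * per-million price, summed."""
    in_price, out_price = PRICES[model]
    return input_tokens / 1e6 * in_price + output_tokens / 1e6 * out_price
```

For example, a nano call with 1,000,000 input tokens and 100,000 output tokens costs $0.20 + $0.125 = $0.325 under these numbers, which is why long prompts shift spend toward prefill.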

Practical Implications

Pricing tiers let you route traffic: nano/mini for classification and extraction, standard/Pro for reasoning-heavy or computer-use tasks. For computer use, enforce allowlisted domains, read-only filesystems, and human confirmation for irreversible actions. Long 1M context shifts cost to prefill — profile prompt size and cache reuse.
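The routing idea above can be sketched as a simple policy. This is a hedged sketch under assumptions: the task labels and model keys are hypothetical, and a production router would use measured quality/cost data rather than hard-coded rules:

```python
# Hypothetical routing policy matching the tiering described above:
# cheap tiers for classification/extraction, flagship/Pro for
# reasoning-heavy or computer-use work. Labels are illustrative.

def route(task: str, needs_computer_use: bool = False) -> str:
    if needs_computer_use:
        # Computer use gets the strongest tier; irreversible actions
        # should still be gated by human confirmation elsewhere.
        return "gpt-5.4-pro"
    if task in {"classification", "extraction"}:
        return "gpt-5.4-nano"
    if task in {"summarization", "drafting"}:
        return "gpt-5.4-mini"
    return "gpt-5.4"  # default: reasoning-heavy work
```

The safety guardrails (allowlists, read-only filesystems, confirmation gates) sit outside the router: routing decides cost/quality, not permissions.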

Interview Questions

  1. How do you distill a frontier model into mini/nano variants without losing critical capabilities? What training signals are preserved vs. compressed away?
  2. What new safety challenges arise from native computer use that don't exist for text-only models?

Code Example

Pricing tiers (per million tokens, illustrative — verify current vendor pricing):

Variant        Input    Output
GPT-5.4        $2.50    $15.00
GPT-5.4 Pro    $30.00   $180.00
GPT-5.4 mini   $0.75    $4.50
GPT-5.4 nano   $0.20    $1.25

MDP-style action sketch:

s_t = encode(screenshot, DOM, goal)
a_t ~ π_θ(· | s_t)   # structured action: click(x,y), type(text), ...
s_{t+1} = Environment.step(s_t, a_t)
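The sketch above can be made runnable with toy stand-ins: here state is just a counter and actions are strings, so the policy and environment are placeholders for a real screenshot/DOM encoder and GUI executor. All names are illustrative, not the vendor's API:

```python
def policy(state: int) -> str:
    # Stand-in for pi_theta: a real policy would condition on a
    # screenshot/DOM encoding and emit a structured action.
    return "click" if state % 2 == 0 else "type"

def env_step(state: int, action: str) -> int:
    # Stand-in for env(s_t, a_t): a real environment would execute
    # the action in a GUI and return the next observation.
    return state + 1

def rollout(s0: int, max_steps: int = 5) -> list[str]:
    """Run the observe -> act -> advance loop and return the action trace."""
    s, trace = s0, []
    for _ in range(max_steps):
        a = policy(s)        # a_t = pi_theta(s_t)
        trace.append(a)
        s = env_step(s, a)   # s_{t+1} = env(s_t, a_t)
    return trace

# rollout(0) alternates actions as the toy state increments:
# ["click", "type", "click", "type", "click"]
```

The loop structure, not the toy internals, is the point: every real computer-use trajectory is this same observe/act/advance cycle, which is what makes MDP-style safety analysis applicable.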