GPT-5: Another (Sparse) Giant AI From The Biggest Name In The West

AI-First Platforms & Applications
Blog
11 Aug, 2025

From a modest four-storey office building rising above Bryant Street, San Francisco, the world's most famous AI lab, OpenAI, announced its long-anticipated version 5.0 generative pre-trained transformer (GPT) model. The world paid close attention – this is the first major base-model release since March 2023, when GPT-4’s debut caused a short-lived global AGI panic.

Mainstream sentiment is so far positive. For its popular consumer- and enterprise-facing product ChatGPT, it is a pragmatic step forward: a unified default that routes to a deeper ‘thinking’ configuration when needed, with better instruction following, more reliable tool use and a noticeable speed improvement over o1 and o3. This carries through to the countless third-party applications, such as Cursor and Windsurf, that consume OpenAI LLMs via API – particularly in coding, where GPT-5 smooths the daily grind of drafting and debugging larger repos and finally joins the club long owned by Anthropic’s Claude Sonnet/Opus and Google’s Gemini 2.5 Pro.

Despite some heinous chart crimes, both ChatGPT and software developer users are pleased with the update. GPT‑5 exhibits some of the ‘big model smell’ of its gargantuan sunset predecessor, GPT-4.5, but at Gemini 2.5 Pro speed. While details are scant, OpenAI’s much-derided ‘gpt-oss-120b’ open-weight model released earlier this week activates only around 4% of its parameters during each forward pass, using a sparse mixture of experts (MoE) architecture – and similarly high sparsity is likely employed by OpenAI’s latest closed model. Such techniques were popularized with GPT-4 and made cost-efficient with DeepSeek’s V3/R1; they are critical for keeping costs down when serving intelligent models to large numbers of concurrent users.
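For the curious, here is a minimal sketch of the mechanism in Python, with made-up sizes – it illustrates top-k expert routing generally, not OpenAI’s actual implementation:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical sizes: 64 experts, each token routed to the top 4, so only
# ~6% of expert weights run per token. gpt-oss-120b is reported at ~4%;
# GPT-5's real configuration is unpublished.
D_MODEL, N_EXPERTS, TOP_K = 512, 64, 4

rng = np.random.default_rng(0)
router_w = rng.standard_normal((D_MODEL, N_EXPERTS)) * 0.02
experts = [rng.standard_normal((D_MODEL, D_MODEL)) * 0.02 for _ in range(N_EXPERTS)]

def moe_forward(token: np.ndarray) -> np.ndarray:
    """Route one token through its top-k experts and gate-mix their outputs."""
    logits = token @ router_w                 # router score per expert
    top = np.argsort(logits)[-TOP_K:]         # indices of the k best experts
    gates = softmax(logits[top])              # renormalized gate weights
    # Only TOP_K of N_EXPERTS matmuls execute – this is the sparsity that
    # keeps serving costs down at high concurrency.
    return sum(g * (token @ experts[i]) for g, i in zip(gates, top))

print(moe_forward(rng.standard_normal(D_MODEL)).shape)  # (512,)
```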

Nevertheless, GPT-5 is nowhere near OpenAI’s stated goal of artificial general intelligence (AGI). While the model’s intrinsic knowledge and ability to connect in-context dots exceed those of some knowledge workers reading this blog today, it is still ‘jagged intelligence’. The model churns out well-trodden management consulting, brainstorming, software development and even some healthcare tasks at lightspeed in delightfully user-oriented language. Step outside the shadowy realm of its training distribution, however, and long-horizon tasks quickly turn to spaghetti.

Plus: the real bottleneck is context. GPT-5’s 272,000-token input context limit suggests that it can simultaneously attend to hundreds of pages of enterprise-specific facts, rules and task descriptions. But the reality is that today’s knowledge workers hold far more up-to-date information in working memory – and in rich multimodality. The solution, therefore, is to build context management systems around these models.

Enter the connectors, data models, ontologies, knowledge graphs, retrieval and verification systems. Naïve RAG is good for demos. Production value comes when sources are curated and knowledge discretized – for example, via a knowledge graph – so that agents fetch context that is both relevant and self‑explaining. The EU Parliament’s Archibot work showed what happens when curated corpora meet serious retrieval. In heavy industry, the platforms that win are those that pair a semantic layer with data orchestration and observability. GPT‑5 fits into that stack; it doesn’t replace it.
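To make ‘self-explaining’ concrete, here is a toy graph-backed retriever in Python – hypothetical plant data and a hand-rolled structure, not any particular product’s API. Where naïve RAG returns isolated chunks, this returns the matched entity’s facts together with its labelled relations:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    facts: list[str] = field(default_factory=list)
    edges: dict[str, str] = field(default_factory=dict)  # relation -> target node

GRAPH = {
    "Pump-7": Node("Pump-7",
                   facts=["Rated flow: 120 m3/h", "Last serviced: 2025-06-02"],
                   edges={"feeds": "Tank-3", "monitored_by": "Sensor-12"}),
    "Tank-3": Node("Tank-3", facts=["Capacity: 500 m3"]),
    "Sensor-12": Node("Sensor-12", facts=["Type: vibration", "Alert threshold: 8 mm/s"]),
}

def retrieve(entity: str, hops: int = 1) -> list[str]:
    """Return an entity's facts plus relation-labelled facts from its neighbours."""
    node = GRAPH[entity]
    context = [f"{node.name}: {fact}" for fact in node.facts]
    if hops > 0:
        for relation, target in node.edges.items():
            context.append(f"{node.name} --{relation}--> {target}")  # the 'why'
            context.extend(retrieve(target, hops - 1))
    return context

print("\n".join(retrieve("Pump-7")))
```

An agent asking about Pump-7 now sees not just its facts but that it feeds Tank-3 and is watched by Sensor-12 – context that explains itself.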

Enterprise AI platform WRITER provides an example of system‑first thinking. Its Palmyra X5 is a hybrid long‑context model that blends dense and sparse compute, pairs optimized attention with MoE for efficiency and backs its own take on graph-based retrieval-augmented generation (RAG). WRITER’s knowledge graph is hydrated by a purpose‑built LLM and persisted in a database, giving users relational context, incremental updates and a cleaner path for agent planning. The firm’s ‘self‑evolving’ stack combines memory, uncertainty‑driven learning and reflection/retry reinforcement to improve failure recovery for tool‑using agents. It’s not just ‘a bigger model’ – this is an example of a context-engineered system.
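WRITER’s pipeline is proprietary, but the general hydration pattern is straightforward to sketch: an extraction model turns documents into (subject, relation, object) triples, which are persisted and queried relationally. The stubbed extractor and schema below are placeholders, not WRITER’s actual design:

```python
import sqlite3

def extract_triples(text: str) -> list[tuple[str, str, str]]:
    """Placeholder for a purpose-built extraction LLM: a real system would
    prompt the model to emit JSON triples and validate them before insert."""
    return [("Palmyra X5", "developed_by", "WRITER")]  # stubbed output

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE triples (subject TEXT, relation TEXT, object TEXT)")

for doc in ["WRITER's Palmyra X5 is a hybrid long-context model..."]:
    db.executemany("INSERT INTO triples VALUES (?, ?, ?)", extract_triples(doc))

# Incremental updates are just further INSERTs; agents get relational context:
print(db.execute("SELECT * FROM triples WHERE subject = 'Palmyra X5'").fetchall())
```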

Flow Software shows a similar philosophy at an industrial plant‑ops level. Its model context protocol (MCP) gateway exposes agent‑ready tools, enforces deterministic execution plans at session start and injects context so that agents pull only what they need. It targets data tasks where domain models matter more than raw tokens. Other IDM providers, such as HighByte, are publishing similar demos. This is what useful looks like: agent plumbing, not hype.
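For a flavour of what ‘agent-ready tools’ means, here is a minimal MCP server using the reference Python SDK (the ‘mcp’ package). The tool is a hypothetical stand-in – Flow’s gateway internals are not public – but the shape is the point: a narrow, typed interface rather than raw data access:

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("plant-ops-gateway")

@mcp.tool()
def get_daily_average(tag: str, date: str) -> float:
    """Return the daily average for a process tag (e.g. 'line3.flow_rate').

    The agent receives one aggregated, domain-modelled value instead of
    thousands of tokens of raw time-series rows.
    """
    # Placeholder: a real gateway would query the historian/data model here.
    return 118.4

if __name__ == "__main__":
    mcp.run()  # serves over stdio by default
```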

Buyers should use GPT‑5 where it wins – and open models where they win. Qwen (Alibaba Cloud), Mistral, GLM (Z.ai), Kimi‑K2 (Moonshot) and DeepSeek (High-Flyer Capital Management) are shipping both giant and small open-weight models that are useful today, with excellent instruction following, tool use, coding and rapidly improving reasoning. Plus, they’re easy to spin up on private hardware, respect data boundaries and crush the economics of large‑scale knowledge graph hydration. Contrast that with ‘open’ models that are refusal‑heavy and developer‑hostile: open in name, closed in practice.

Buyers should treat model choice as a workload decision, not a brand choice. GPT-5 does not change our view that LLMs are a commodity. Enterprises should:

  • Standardize on a multi‑model router (see the sketch after this list), mix private and externally hosted models, pin evaluation harnesses to business KPIs and make swap‑outs quick.
  • Invest in the semantic layer first – ontological data models to template knowledge graphs, with multi-step verification. High-quality information is the moat.
  • Adopt agent‑ready interfaces (MCP where it fits), with deterministic execution plans and observability.
  • Use small open models for curation/reranking at scale; reserve GPT‑5 and Claude Opus 4.1 for long-horizon, tool‑heavy steps that deliver value.
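A multi-model router need not be exotic. This sketch (placeholder endpoints and costs; model names for illustration only) shows the idea from the first bullet: workloads resolve to models through config, so a swap-out is an edit plus an eval run, not a re-platform:

```python
from dataclasses import dataclass

@dataclass
class Route:
    model: str
    endpoint: str            # private cluster or external API
    max_cost_per_1k: float   # pinned to business KPIs, not leaderboards

ROUTES = {
    "curation":     Route("qwen3-8b", "https://llm.internal/v1", 0.002),
    "coding":       Route("gpt-5", "https://api.openai.com/v1", 0.05),
    "long_horizon": Route("claude-opus-4-1", "https://api.anthropic.com/v1", 0.08),
}

def route(workload: str) -> Route:
    """Resolve a workload class to a model; unknown workloads fail loudly."""
    try:
        return ROUTES[workload]
    except KeyError:
        raise ValueError(f"No route for workload '{workload}'") from None

print(route("curation"))  # swapping providers = edit ROUTES, re-run the evals
```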

GPT‑5 is exciting. But if, over the course of the next week, DeepSeek, Mistral, Moonshot or Qwen push an update that tips the scales slightly in their favour, the switchover from OpenAI will happen in the blink of an eye. Enterprises should buy outcomes, not logos – and think of LLMs as tools for curating their own organizational intelligence.

About The Author

Joe Lamming
Senior Analyst