
Enterprise AI Strategy: The 5% That Generate Value


Key Takeaways

  1. Only 5% of enterprises achieve AI value at scale—full foundation model ownership represents a multi-year, £80–400 million commitment viable only for platform-scale organisations
  2. Open source models now achieve near-parity with proprietary alternatives at 50–75% lower inference costs, fundamentally changing the build vs. buy calculus
  3. Competitive advantage emerges from proprietary data and workflow transformation, not model ownership

This is Part 1 of a three-part series on Enterprise AI Strategy. Part 2 covers risk, security, and regulatory considerations. Part 3 addresses implementation, portability, and decision frameworks.


Executive Summary

The question “Should we build our own LLM?” is increasingly being asked at board and executive committee level. In most cases, it is the wrong question.

For global enterprises—including fashion and retail groups as well as enterprise software providers—the strategic decision is not whether to own a Large Language Model, but which layers of the LLM value chain warrant ownership, control, or influence given risk exposure, regulatory obligations, and competitive differentiation. According to BCG’s 2025 research across 1,250 firms worldwide, only 5% of companies are achieving AI value at scale, whilst 60% report minimal revenue and cost gains despite substantial investment.

Evidence from 2024–2025 enterprise deployments reveals a clear pattern: full ownership of foundation models is economically and operationally viable only for a very small class of organisations. McKinsey’s 2025 State of AI research found that organisations achieving meaningful enterprise-wide impact from AI—those attributing 5% or more EBIT impact—represent just 6% of respondents. These high performers are investing more than 20% of their digital budgets in AI technologies and are scaling across the business, but crucially, they focus on transformative workflows rather than model ownership.

However, the landscape has shifted materially in 2025. Open source models have achieved near-parity with proprietary alternatives, and regional providers offer dramatically different cost structures. CTOs and CISOs must now evaluate a broader spectrum of options whilst navigating complex trade-offs between capability, cost, data sovereignty, and regulatory compliance.

This series reframes the debate in terms relevant to CTOs and CISOs: control vs. ownership, resilience vs. dependency, and differentiation vs. distraction.


The Current State: Reframing the Ownership Question

The Critical Distinction Executives Must Make

“Building an LLM” is often used imprecisely to describe several materially different activities. Understanding these distinctions is essential for strategic decision-making:

| Layer | What It Actually Means | Strategic Implication |
| --- | --- | --- |
| Foundation model | Training a large, general-purpose model from scratch | Multi-year, £80–400 million commitment; limited relevance outside platform vendors |
| Fine-tuned model | Adapting an existing model to domain data | Targeted advantage with ongoing operational complexity |
| RAG systems | Injecting proprietary knowledge at inference time | High leverage, lower risk, faster time to value (see the minimal sketch below this table) |
| Private inference | Hosting models in controlled environments | Shifts—not removes—security and compliance risk |
| Self-hosted open source | Deploying open-weight models on own infrastructure | Data sovereignty with operational complexity; 50–75% cost reduction potential |
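
To make the RAG layer concrete, here is a minimal retrieval-augmented generation sketch. The embed() function is a toy stand-in for a real embedding model, and llm_complete represents whichever completion endpoint the organisation uses (hosted API or self-hosted); both are illustrative assumptions, not a production pipeline.

```python
import numpy as np

def embed(text: str, dim: int = 256) -> np.ndarray:
    """Toy stand-in for a real embedding model (API-based or self-hosted)."""
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    return vec

def retrieve(query: str, documents: list[str], k: int = 3) -> list[str]:
    """Return the k documents most similar to the query by cosine similarity."""
    doc_vecs = np.stack([embed(d) for d in documents])
    q = embed(query)
    scores = doc_vecs @ q / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q) + 1e-9)
    return [documents[i] for i in np.argsort(scores)[::-1][:k]]

def answer(query: str, documents: list[str], llm_complete) -> str:
    """Inject proprietary knowledge into the prompt at inference time; no retraining."""
    context = "\n\n".join(retrieve(query, documents))
    prompt = (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )
    return llm_complete(prompt)  # any model: hosted API or self-hosted open weights
```

The point of the sketch is strategic rather than technical: the proprietary data lives in the retrieval layer, so the underlying model remains swappable.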

Research from Epoch AI indicates that training costs for frontier models—the most advanced foundation models representing the current state of the art, such as GPT-5, Gemini, and Claude—have grown at a rate of 2.4x per year since 2016. For earlier frontier models with published cost data, Epoch AI estimates GPT-4’s final training run cost approximately £32 million ($40 million) using amortised hardware and energy costs, whilst Google’s Gemini Ultra required approximately £24 million ($30 million). Stanford’s AI Index 2024, using cloud compute pricing methodology, reports higher figures of £62 million ($78 million) for GPT-4 and £152 million ($191 million) for Gemini Ultra. Current frontier models are substantially more expensive: Anthropic’s CEO has indicated that models costing over £800 million ($1 billion) are now in development, with projections suggesting training runs exceeding £8 billion ($10 billion) by 2027.

Key Insight: Most enterprises do not need to own models to control outcomes. The organisations capturing the most value from AI are those redesigning workflows and accelerating innovation—not those building infrastructure. However, the economics of hosting versus consuming models via API have shifted substantially in favour of self-hosted options for high-volume workloads.

Why “In-House” Does Not Automatically Mean “Secure”

From a CISO perspective, the assumption that internal hosting equals lower risk is increasingly flawed. Deloitte’s 2024 State of Generative AI research found that nearly 75% of respondents cited “concerns around security, privacy and governance” as the primary challenge in scaling generative AI.

Persistent AI-specific risks regardless of hosting model include:

  • Shadow AI proliferation: Workforce adoption of AI tools jumped from 22% to 75% between 2023 and 2024, often without governance frameworks in place. Deloitte research indicates that employee data flowing into generative AI services grew more than 30x from 2024 to 2025.

  • Prompt injection and data leakage: Among organisations using generative AI, 47% have experienced problems ranging from hallucinated outputs to cyber security issues, privacy exposure, and intellectual property leakage.

  • Training data exposure: Nearly half (46%) of all data-policy violations involve developers pasting proprietary source code into generative AI tools.

  • Uncontrolled model drift: Without mature MLOps pipelines, model behaviour can degrade over time, creating compliance and operational risks. Enterprise-grade MLOps platforms—including UK-based providers such as Seldon for deployment monitoring and governance—offer control and visibility layers that mitigate these operational risks.

Private infrastructure reduces third-party exposure but increases internal attack surface unless accompanied by mature MLOps pipelines, continuous red-teaming of model behaviour, fine-grained access control at the prompt and retrieval layer, and auditability aligned to regulatory expectations (including the EU AI Act).

Technology Impact: For many enterprises, hyperscaler-hosted models with strong contractual controls may present lower operational risk than poorly governed internal deployments. The key differentiator is governance maturity, not infrastructure ownership.

Business Impact: Organisations with comprehensive technology and cyber security governance in place report tighter alignment between boards, executives, and security teams, as well as greater confidence in their ability to protect AI deployments.

The Governance Reality: Only 48% of AI initiatives progress from prototype to production, with an average journey of approximately eight months. The bottleneck is rarely model availability—it is integration, security reviews, compliance checks, and organisational change management.


Open Source AI: A Credible Alternative

The Performance Gap Has Closed

The competitive landscape for foundation models shifted dramatically during 2025. Open source and open-weight models now achieve performance comparable to—and in some cases exceeding—proprietary alternatives from major technology vendors.

Meta’s Llama 4 family (released April 2025) demonstrates this convergence. The Llama 4 Scout variant (109 billion total parameters, 17 billion active) supports context windows up to 10 million tokens and delivers benchmark performance competitive with GPT-4o across reasoning, coding, and multilingual tasks. Llama models have now surpassed 1.2 billion downloads, up from 650 million in December 2024. Major enterprises including AT&T, Block, and IBM have deployed Llama in production environments for applications ranging from customer support to mission planning.

DeepSeek-R1 demonstrates that frontier-level reasoning capabilities are no longer exclusive to established Western providers. The model achieves performance comparable to OpenAI’s o1 on mathematical and coding benchmarks, with its Mixture-of-Experts architecture (671 billion total parameters, 37 billion active per token) enabling inference at dramatically lower cost structures. The strategic implications of regional providers—including considerations around data sovereignty, regulatory alignment, and security—are examined in Part 2 of this series.

Alibaba’s Qwen family and Mistral AI’s models (France) further expand the viable options for enterprises seeking alternatives to hyperscaler dependency.

The Economic Case for Self-Hosted Open Source

For enterprises with substantial AI inference volumes, self-hosted open source models present compelling economics:

Cost structure comparison: Generating one million tokens via GPT-4o API costs approximately £24–96 ($30–120) depending on context length. The same volume on a self-hosted Llama model running on cloud GPU infrastructure costs approximately £1.60–2.40 ($2–3) in compute time. Industry analysis indicates enterprises can achieve 50–80% inference cost reductions by migrating high-volume workloads to self-hosted open source models.
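
As a rough worked example of that comparison, the sketch below estimates the monthly cost crossover between per-token API pricing and a modest leased GPU footprint. The per-token figures are mid-points of the ranges quoted above; the fixed monthly infrastructure figure is an illustrative assumption, not a vendor quote.

```python
# Indicative monthly cost comparison: per-token API pricing vs. self-hosted GPUs.
# All inputs are illustrative assumptions drawn from the ranges quoted above.

API_COST_PER_M_TOKENS = 60.0     # USD per million tokens (mid-point of the $30-120 range)
SELF_HOSTED_PER_M_TOKENS = 2.5   # USD of GPU compute per million tokens ($2-3 range)
GPU_FIXED_MONTHLY = 20_000.0     # assumed monthly cost of leased GPUs, ops and monitoring

def monthly_costs(tokens_millions: float) -> tuple[float, float]:
    api = tokens_millions * API_COST_PER_M_TOKENS
    self_hosted = GPU_FIXED_MONTHLY + tokens_millions * SELF_HOSTED_PER_M_TOKENS
    return api, self_hosted

for volume in (50, 200, 500, 2_000):  # millions of tokens per month
    api, hosted = monthly_costs(volume)
    cheaper = "self-hosted" if hosted < api else "API"
    print(f"{volume:>5}M tokens/month: API ${api:>9,.0f} vs self-hosted ${hosted:>9,.0f} -> {cheaper}")
```

On these assumptions the crossover sits at roughly 350 million tokens per month; a smaller fixed footprint lowers it, while a frontier-class deployment with the run-rates discussed below raises it considerably.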

Total cost of ownership considerations: Self-hosting introduces infrastructure, operations, and talent costs that offset some API savings. A 2024 peer-reviewed analysis found that chips and staff typically comprise 70–80% of total LLM deployment costs. For a 1,000-person organisation deploying a frontier-class open source model (such as Llama 3.1 405B), first-year investment ranges from £130,000–280,000 ($165,000–350,000), with ongoing annual costs of £600,000–1,600,000 ($760,000–2,000,000). These economics favour self-hosting primarily for organisations with high inference volumes where API costs would otherwise exceed infrastructure investment.

Practical deployment pathway: Smaller model variants significantly reduce infrastructure requirements. Llama 4 Scout can operate on a single Nvidia H100 GPU, while smaller 8-billion-parameter models such as Llama 3.1 8B can run on 16GB consumer GPUs. Quantisation techniques (reducing model precision from FP16 to INT4) can cut the GPU memory needed for model weights by up to 75% with minimal quality degradation. Inference-optimised frameworks such as vLLM and Text Generation Inference enable 2–3x throughput improvements on equivalent hardware.
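
To illustrate that pathway, here is a minimal sketch using vLLM's offline Python API to serve an open-weight model; the checkpoint name, memory settings, and prompts are assumptions for illustration rather than recommendations.

```python
# Minimal self-hosted inference sketch using vLLM's offline Python API.
# Checkpoint and settings are illustrative assumptions, not recommendations.
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",  # assumed open-weight checkpoint
    dtype="auto",
    gpu_memory_utilization=0.90,  # leave headroom for the KV cache
    max_model_len=8192,           # cap context length to control memory use
)

sampling = SamplingParams(temperature=0.2, max_tokens=512)

prompts = [
    "Summarise the operational risks of self-hosted LLM inference.",
    "List three controls that reduce prompt-level data leakage.",
]

for output in llm.generate(prompts, sampling):
    print(output.outputs[0].text)
```

Serving the same checkpoint behind vLLM's OpenAI-compatible HTTP server keeps application code largely portable between hosted APIs and internal endpoints.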

McKinsey’s April 2025 research found that 40% of enterprise leaders prefer hosting AI models on their own infrastructure, citing data privacy, reduced breach risk through vendor elimination, and customised security protocols as primary motivations. Three-quarters of respondents expect to increase their use of open source AI technologies over the next several years.

Strategic Positioning for Global Enterprises

Open source AI addresses several strategic concerns for organisations operating across multiple jurisdictions:

Reduced single-vendor dependency: Self-hosted models enable architectural portability without vendor lock-in. Organisations can switch between model providers or hosting environments with minimal disruption, provided RAG and fine-tuning workflows isolate proprietary data from any single vendor.
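
In practice that portability is usually preserved with a thin internal abstraction over completion providers, so RAG and fine-tuning pipelines never import a vendor SDK directly. The sketch below shows the general shape; all class and endpoint names are illustrative.

```python
# Sketch of a provider-agnostic completion interface; names are illustrative.
from typing import Protocol

class CompletionProvider(Protocol):
    def complete(self, prompt: str, max_tokens: int = 512) -> str: ...

class SelfHostedProvider:
    """Wraps an internal inference endpoint (for example, a vLLM or TGI deployment)."""
    def __init__(self, base_url: str) -> None:
        self.base_url = base_url
    def complete(self, prompt: str, max_tokens: int = 512) -> str:
        ...  # POST to the internal endpoint

class HostedApiProvider:
    """Wraps a commercial API behind the same interface."""
    def __init__(self, api_key: str) -> None:
        self.api_key = api_key
    def complete(self, prompt: str, max_tokens: int = 512) -> str:
        ...  # call the vendor SDK here, and nowhere else in the codebase

def summarise(provider: CompletionProvider, document: str) -> str:
    # Application code depends only on the interface, so swapping providers
    # becomes a configuration change rather than a rewrite.
    return provider.complete(f"Summarise for an executive audience:\n\n{document}")
```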

Data sovereignty compliance: Self-hosted models on local infrastructure ensure data never leaves jurisdictional boundaries—removing the international data transfer concerns that complicate GDPR compliance when using externally hosted API services.

Regulatory alignment: The EU AI Act applies equally to proprietary and open-source models deployed in EU markets. However, self-hosted deployments provide greater control over documentation, audit trails, and compliance evidence required for high-risk system certification. Regulatory timelines and requirements are covered in Part 2.

European alternatives: Mistral AI (France) offers commercially competitive foundation models developed within EU jurisdiction. The Mixtral-8x22B model achieves approximately 75% computational savings compared with equivalent dense models through a Mixture-of-Experts architecture, while supporting native function calling and 64,000-token context windows. Specialists such as Mind Foundry (UK) provide interpretable AI solutions for high-stakes applications in regulated industries.


Competitive Advantage: Where It Actually Emerges

Fashion and Retail Enterprises

Competitive advantage does not come from owning a general-purpose LLM. It comes from proprietary data (customer behaviour, SKU dynamics, supplier relationships), domain-specific decision contexts (seasonality, markdown optimisation, sustainability reporting), and speed of experimentation rather than architectural purity.

McKinsey research indicates that generative AI could add £120–220 billion ($150–275 billion) to the fashion industry’s profits through applications in marketing copy generation, visual content creation, and personalised customer engagement. However, capturing this value requires focusing on high-impact use cases rather than infrastructure ownership.

Several Tier 1 fashion retailers have demonstrated this approach. Abercrombie & Fitch reduced inventory by 30% whilst boosting operating margin by nearly 10% through AI-driven assortment optimisation—using third-party AI capabilities rather than proprietary models. Research indicates that AI-powered merchandise financial planning implementations have delivered direct improvements in key performance indicators, including inventory reductions of up to 30% and increased profitability.

Enterprise Software Providers

For platform-scale software companies (Oracle, SAP, Salesforce), the calculus shifts. These organisations justify model investment through customer leverage (hundreds of thousands of enterprise customers demanding AI capabilities), distribution advantage (embedded models with high switching costs), and R&D amortisation across substantial user bases.

Critical Question for Software CTOs: Does your customer base create the volume and leverage to justify bespoke models, or does partnership with foundation model providers deliver equivalent functionality at lower risk?

Oracle’s approach illustrates strategic positioning—embedding AI capabilities across its enterprise software stack whilst maintaining partnerships with foundation model providers. Toshiba has similarly adopted hybrid strategies, using AI to reinvent traditional business offerings whilst leveraging external capabilities.

The Domain-Specific LLM Opportunity

For enterprises with genuinely differentiated data assets, domain-specific LLMs represent a middle path. Evidence indicates these models can outperform general-purpose alternatives on specialised tasks by 15–25% whilst requiring substantially less investment than foundation model development.

Fashion-specific examples include visual search optimisation using proprietary product imagery, trend prediction models leveraging first-party demand signals, and sustainability scoring models drawing on supply chain data not available to general-purpose LLMs.

European enterprises pursuing domain-specific models can consider partnerships with regional AI developers—such as Mistral AI (France) for foundation model capabilities or Mind Foundry (UK) for high-stakes applications requiring interpretability—as part of a diversified model portfolio that reduces single-vendor dependency.

Key Discipline: The differentiation must be provable and defensible. If the same outcome can be achieved through RAG or fine-tuning, the case for domain-specific models weakens considerably.


The Economic Reality: Understanding the Margin Constraints

Fashion and retail enterprises operate on fundamentally different economics than software companies considering AI investment.

Fashion retail margins are structurally thin: Net margins typically range from 3–10%, with luxury brands at the higher end and high-volume operators at the lower end. Pure-play online retailers such as ASOS reported pre-tax margins of approximately 3.7% in recent periods, compared to 13.4% reported by companies like Next plc that operate omnichannel models. This compares unfavourably to software industry margins that can exceed 20–30% net.

AI investment must demonstrate rapid payback: Unlike software companies that can treat AI development as a multi-year R&D bet, fashion retailers require demonstrable return within much shorter planning cycles. The 49% of leaders who cite “demonstrating clear business value” as the hardest part of scaling AI reflect this pressure acutely.

Open source cost structures favour high-volume workloads: The economics of self-hosted open source models become attractive primarily for enterprises with substantial inference volumes. An organisation running millions of daily chatbot interactions can save 50–70% by switching from per-token cloud pricing to GPU-leased instances. For lower-volume use cases, API-based consumption typically remains more cost-effective when accounting for infrastructure and operational overhead.

Implications for CTOs: AI strategies that prioritise operational efficiency (inventory optimisation, demand forecasting, markdown management) will generate faster payback than customer-facing applications that require longer adoption curves. Additionally, organisations should approach AI investments with the same rigour applied to other capital expenditure, including sensitivity analysis and clear decision gates. Robust FinOps practices are essential for AI cost analysis and control.
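
In that spirit, a simple sensitivity sweep of the kind a FinOps review might ask for is sketched below: it varies assumed monthly token volume and annual infrastructure run-rate (using run-rates in the range quoted earlier for a frontier-class deployment) to see where self-hosting pays back its setup cost within a planning cycle. Every figure is an illustrative assumption rather than a benchmark.

```python
# Illustrative sensitivity sweep: does self-hosting pay back within a planning cycle?
# All figures are assumptions for the sweep, not quotes or benchmarks.

API_COST_PER_M_TOKENS = 60.0   # USD per million tokens via a hosted API
SELF_HOSTED_MARGINAL = 2.5     # USD of GPU compute per million tokens

def months_to_payback(tokens_m_per_month: float, annual_infra_cost: float,
                      setup_cost: float = 250_000.0) -> float | None:
    """Months to recover setup cost from avoided API spend, or None if never."""
    monthly_saving = (tokens_m_per_month * (API_COST_PER_M_TOKENS - SELF_HOSTED_MARGINAL)
                      - annual_infra_cost / 12)
    return None if monthly_saving <= 0 else setup_cost / monthly_saving

for volume in (100, 500, 2_000):                   # millions of tokens per month
    for infra in (800_000, 1_500_000, 2_000_000):  # assumed annual run-rate (USD)
        payback = months_to_payback(volume, infra)
        verdict = f"{payback:4.1f} months" if payback else "never (stay on API)"
        print(f"{volume:>5}M tok/mo, ${infra:>9,} infra/yr -> payback {verdict}")
```

A sweep like this makes the decision gate explicit: below a given volume, the scenario never pays back and the API route remains the disciplined choice.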


What’s Next

This article has established the strategic and economic foundations for enterprise AI decision-making. In Part 2, we examine the risk landscape—including regional provider evaluation, AI supply chain security, and regulatory considerations that CTOs and CISOs must navigate.


References

[1] BCG, “The Widening AI Value Gap: Build for the Future 2025” (September 2025).

[2] McKinsey, “The State of AI: How Organizations are Rewiring to Capture Value” (March 2025).

[3] McKinsey, “The State of AI in 2025: Agents, Innovation, and Transformation” (November 2025).

[4] McKinsey, Mozilla Foundation, and Patrick J. McGovern Foundation, “Open Source Technology in the Age of AI” (April 2025).

[5] Deloitte, “State of Generative AI in the Enterprise” (2024).

[6] Epoch AI, “The Rising Costs of Training Frontier AI Models” (February 2025).

[7] Stanford AI Index, “Artificial Intelligence Index Report 2024”.

[8] Meta AI, “The Future of AI: Built with Llama” (December 2024).

