GPT-4 Changes the Enterprise AI Calculation — What Technology Leaders Need to Do in the Next 90 Days

Why GPT-4 Is Different From What Came Before

OpenAI released GPT-4 on 14 March 2023. The gap between GPT-3.5 and GPT-4 in professional knowledge tasks is significant enough that technology leaders who formed their view of LLM enterprise applicability based on earlier models need to update it.

The performance improvement on professional benchmarks is the signal that matters most for enterprise applicability. OpenAI reports that GPT-4 scored around the 90th percentile on a simulated bar exam, where GPT-3.5 scored around the bottom 10%, and that GPT-4 improved on GPT-3.5 across a range of other professional and academic exams, including a medical knowledge self-assessment. This is not a marginal improvement in a system that was already capable. It is a qualitative shift in what the technology can produce on the kinds of complex reasoning tasks that have professional and business value.

For enterprise technology leaders, this shift changes the framing from “how do we pilot AI in a few low-stakes use cases” to “how do we manage AI as a strategic capability question that will affect our competitive position?” The two framings lead to different investment decisions, different governance requirements, and different board conversations.

The next ninety days are the period in which the leaders who act on this shift will begin separating from the ones who are still treating AI as an innovation budget experiment.

Assessing Business Process Exposure

The first ninety-day priority is a structured assessment of which business processes are most exposed to AI-driven change, by which I mean both most threatened by AI-enabled competition and most amenable to AI augmentation for competitive advantage.

The assessment is not a technology feasibility exercise. It is a competitive analysis that starts with the business processes that create the most significant value in the organisation, and asks whether those processes could be performed better, faster, or at lower cost with AI capability now available at the quality that GPT-4 represents. The answers that emerge from this analysis tend to be more consequential than the use cases that data science teams propose from a technology-first perspective, because they are anchored to business value rather than technical possibility.

The processes most likely to surface as high-priority are those that involve synthesis of large volumes of text or structured information into decisions or recommendations: financial analysis, legal document review, contract negotiation preparation, complex customer support resolution, technical documentation creation, and regulatory compliance assessment. Each of these involves professional judgment applied to information; GPT-4-level reasoning can support that judgment in ways that earlier models could not.

The assessment should also identify processes where competitors’ adoption of AI capability would create a meaningful disadvantage if the organisation does not respond. In professional services and knowledge-intensive industries, the competitive impact of AI-augmented productivity in analytical and writing-intensive processes is not a future risk. It is a current one.

The Build-vs-Buy Decision for AI Capabilities

The second ninety-day priority is clarifying the organisation’s position on the build-vs-buy decision for AI capabilities, because that decision shapes the investment, skills, and governance requirements for everything that follows.

Building AI capabilities on models like GPT-4 through API access, with the organisation’s data and workflows integrated through prompt engineering and, where the provider offers it for the model in question, fine-tuning, is the accessible path for most enterprises. It does not require training a large language model from scratch, which is beyond the resource capacity of all but a handful of technology organisations globally. It does require building the application infrastructure around the model: the retrieval systems that provide the model with relevant context, the output handling that integrates the model’s responses into operational workflows, and the evaluation and monitoring infrastructure that maintains quality in production.
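The three pieces of application infrastructure named above can be sketched in a few lines. This is a minimal illustration, not a reference architecture: the keyword-overlap retrieval stands in for a real search or embedding index, and `call_model` is a hypothetical placeholder for whichever model API the organisation uses. The names `Document`, `retrieve`, `build_prompt`, and `answer` are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Document:
    doc_id: str
    text: str


def tokenize(text: str) -> set:
    """Lowercase words with surrounding punctuation stripped."""
    words = (w.strip(".,;:!?()").lower() for w in text.split())
    return {w for w in words if w}


def retrieve(query: str, docs: List[Document], k: int = 2) -> List[Document]:
    """Retrieval: rank documents by word overlap with the query.

    A toy stand-in for a production search or embedding index.
    """
    q = tokenize(query)
    scored = [(len(q & tokenize(d.text)), d) for d in docs]
    scored = [s for s in scored if s[0] > 0]
    scored.sort(key=lambda s: s[0], reverse=True)
    return [d for _, d in scored[:k]]


def build_prompt(query: str, context: List[Document]) -> str:
    """Put the retrieved context in front of the question."""
    sources = "\n".join(f"[{d.doc_id}] {d.text}" for d in context)
    return (
        "Answer using only the sources below and cite their ids.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {query}"
    )


def answer(query: str, docs: List[Document],
           call_model: Callable[[str], str]) -> dict:
    """Output handling plus a basic quality check.

    call_model is a hypothetical wrapper around the model API in use.
    A reply that cites none of the retrieved sources is flagged for review.
    """
    context = retrieve(query, docs)
    reply = call_model(build_prompt(query, context))
    cited = [d.doc_id for d in context if f"[{d.doc_id}]" in reply]
    return {
        "reply": reply,
        "sources": [d.doc_id for d in context],
        "cited": cited,
        "needs_review": not cited,
    }
```

The point of the sketch is where the engineering effort sits: the model call is one line, while retrieval, prompt construction, and output checks make up the rest, and each of those needs an owner in production.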

Buying AI capabilities through commercial applications that have already built the model integration is the faster path to deployment. The trade-off is less customisation to the organisation’s specific context and processes, and a commercial relationship with an AI application vendor rather than direct access to the underlying model capabilities.

For most enterprises, the right answer is a combination: buying AI capabilities for use cases where commercial applications are available and adequate, and building for use cases that are sufficiently specific to the organisation’s context and sufficiently important to its competitive position that commercial applications cannot serve them.

The clarity required in the next ninety days is which use cases fall in each category, because that clarity determines the skills investment, the platform decisions, and the vendor relationships the organisation needs to pursue.

Data Privacy and Governance Before Deployment

The third ninety-day priority is establishing the data privacy and governance framework that GPT-4-powered applications require before deploying them in contexts that involve sensitive data.

GPT-4’s capabilities raise data privacy considerations with more urgency than earlier, less capable models did. A model that can synthesise and summarise complex documents at professional quality can also extract sensitive information from those documents in ways that may not be apparent to the person reviewing what was sent to it. An AI application that is given access to customer records, financial data, or strategic information for legitimate business purposes therefore requires the same governance attention as any other system with access to that data class.

The governance questions are specific. Can the AI service being used offer a data processing agreement that is compatible with the organisation’s regulatory obligations, particularly under GDPR and sector-specific requirements? Is the data sent to the AI service retained, and if so under what terms? Who in the organisation is authorised to send what categories of data to AI services? What review process applies to AI-generated outputs before they are used in regulated contexts?

These questions do not have universal answers. They require organisation-specific decisions that take into account the organisation’s regulatory context, the AI services it is using, and the use cases it is deploying. Organisations that establish these governance foundations before deployment avoid having to establish them under pressure after an incident. Those that deploy first and govern second end up making the same decisions in worse circumstances.
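Two of the governance questions above, who may send which categories of data to an AI service and when outputs need review, can be written down as an explicit policy rather than left as guidance. The sketch below is illustrative only: the roles, data categories, and review rule are hypothetical and would need to come from the organisation’s own regulatory analysis.

```python
from enum import Enum
from typing import Iterable


class DataCategory(Enum):
    PUBLIC = "public"
    INTERNAL = "internal"
    CUSTOMER_PERSONAL = "customer_personal"
    FINANCIAL = "financial"


# Hypothetical policy: which roles may send which data categories
# to an external AI service. Roles not listed may send nothing.
POLICY = {
    "analyst": {DataCategory.PUBLIC, DataCategory.INTERNAL},
    "compliance_officer": {
        DataCategory.PUBLIC,
        DataCategory.INTERNAL,
        DataCategory.FINANCIAL,
    },
}

# Hypothetical rule: outputs derived from these categories
# must be reviewed by a person before use.
REVIEW_REQUIRED = {DataCategory.CUSTOMER_PERSONAL, DataCategory.FINANCIAL}


def may_send(role: str, category: DataCategory) -> bool:
    """True if the policy authorises this role to send this category."""
    return category in POLICY.get(role, set())


def check_request(role: str, categories: Iterable[DataCategory]) -> dict:
    """Evaluate one request against the policy before any data leaves."""
    cats = list(categories)
    blocked = sorted(c.name for c in cats if not may_send(role, c))
    return {
        "allowed": not blocked,
        "blocked": blocked,
        "needs_output_review": any(c in REVIEW_REQUIRED for c in cats),
    }
```

Encoding the policy this way makes the default explicit: a role or category that nobody has decided on is blocked, which is the behaviour a governance-first deployment would want.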

Structuring the Board Communication

The fourth ninety-day priority is preparing the board communication that frames GPT-4’s release as a strategic capability question rather than a technology development to be monitored.

The board communication that serves this moment well has three components. The first is an honest assessment of where the organisation stands relative to its industry peers on AI capability, including the pilots that are in flight, the use cases that are in production, and the investment gaps that the current state reveals. The second is a structured view of the AI capability priorities that the assessment identifies, framed as business outcomes rather than technology deployments. The third is a governance framework that addresses the board’s risk questions: what the data privacy and regulatory implications are, how AI-assisted decisions will be governed, and what the escalation path is when AI systems produce incorrect or harmful outputs.

The board that understands this framing is positioned to govern AI investment as a strategic capability decision. The board that receives a technology briefing about a new AI model is not equipped to make the governance decisions that AI deployment at scale requires.

The ninety-day window is the period during which the AI strategy narrative forms. The technology leaders who invest in the strategic framing during this period will be having qualitatively different conversations with their boards and their peers six months from now than the ones who are still treating GPT-4 as another model update.
