Rethinking operations in an agentic AI world

Thursday, October 16, 2025, 11:00 AM, from InfoWorld

IT operations have long been defined by one overriding goal: keep applications running at all costs. Success was measured in uptime.

Agentic AI systems break that assumption. Unlike traditional applications, agents are ephemeral, spinning up in response to a prompt or another agent’s request, performing a task, and then disappearing once the job is complete. Think less about managing a full-time workforce, and more about coordinating a revolving door of temporary specialists. It’s Spirit Halloween, not Macy’s.

That ephemerality changes everything. It forces us to ask a fundamental question: what does operations even look like when your software vanishes as quickly as it appears?

Why old playbooks don’t fit

Most of today’s operational models were built for stability and predictability. Kubernetes, for example, was a breakthrough for orchestrating long-running containerized applications. It provided a clean interface between infrastructure and the software it hosted, but always with the assumption that workloads were meant to stay up.

Agentic AI doesn’t play by those rules. An agent might live for seconds, triggered by a chat request or another agent’s output. It might spawn other agents dynamically, creating a cascade of transient processes. Patterns emerge, shift, and dissolve without warning. What looks like a neat stack diagram on paper behaves more like a living ecosystem in practice.

The challenge is obvious. You can’t ask a system designed for permanence to manage constant disappearance. Ops teams that try to extend yesterday’s playbooks into this new world quickly run into puzzles. How do you ensure ephemeral agents get the right data at the right time? How do you monitor or secure workloads that might not exist long enough to show up in a dashboard? How do you prevent every new agent from requiring its own brittle slice of infrastructure?

It isn’t that Kubernetes or other orchestration systems are obsolete—they’re simply optimized for a different problem. The operational assumptions of persistence and central control no longer hold when the very units of execution are designed to appear, act, and vanish at will.

Capacity vs. consumption: A new mental model

Enterprise IT has long been organized around a division of labor. One group delivers capacity—compute, storage, networking—while another consumes that capacity by building and running applications and services. In the agentic world, this distinction matters more than ever.

“Capacity” is the infrastructure and data environment. “Consumption” is the agents and models that use that capacity. “Inference” is the mechanism (the interface) by which AI agents draw on capacity.

Framing IT operations this way allows us to rethink what agents need to succeed. They don’t care where they run. They don’t need to know which network they’re on or which storage system holds the data. What matters is that when they spin up—however briefly—they can reliably and efficiently access the right resources with the right permissions.

Separating capacity from consumption creates agility. It prevents brittle one-to-one bindings between agents and specific infrastructure. It enables enterprises to adopt new models without ripping apart their stack. And it allows swarms of agents to emerge, work, and dissolve without demanding a new playbook for each one.

Just as Kubernetes abstracted away the complexity of container orchestration, the agentic AI era requires an interface layer that abstracts ephemeral inference from the underlying systems that support it.
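One way to picture such an interface layer is a small broker that agents call by logical resource name rather than by infrastructure address. The sketch below is illustrative only and makes several assumptions: `CapacityBroker`, `AgentIdentity`, and the resource name `customer_orders` are hypothetical, and the in-memory lambda stands in for whatever storage system actually holds the data.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, FrozenSet


@dataclass(frozen=True)
class AgentIdentity:
    """Hypothetical credential an ephemeral agent presents when it spins up."""
    agent_id: str
    scopes: FrozenSet[str]  # logical resource names this agent may read


class CapacityBroker:
    """Hypothetical interface layer: maps logical names to whatever backend holds the data."""

    def __init__(self) -> None:
        self._providers: Dict[str, Callable[[], Any]] = {}

    def bind(self, name: str, provider: Callable[[], Any]) -> None:
        # The capacity side can rebind a name to a new backend at any time.
        self._providers[name] = provider

    def fetch(self, identity: AgentIdentity, name: str) -> Any:
        # The permission check lives at the interface, not inside each agent.
        if name not in identity.scopes:
            raise PermissionError(f"{identity.agent_id} may not read {name}")
        if name not in self._providers:
            raise LookupError(f"no capacity bound to {name}")
        return self._providers[name]()


broker = CapacityBroker()
broker.bind("customer_orders", lambda: [{"order_id": 1, "status": "shipped"}])

agent = AgentIdentity(agent_id="agent-001", scopes=frozenset({"customer_orders"}))
orders = broker.fetch(agent, "customer_orders")
```

Because the agent only knows the name `customer_orders`, the team that owns capacity could call `bind` again with a different backend without touching any agent code, which is the loose coupling the capacity and consumption split is meant to provide.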

Early experiments and enterprise scenarios

We’re already seeing glimpses of this future in experimental work. In one project, Reuven Cohen of the Agentics Foundation demonstrated how outcome-driven prompting could orchestrate swarms of agents to handle research, design, coding, and testing, all without a fixed workflow defined in advance. The system self-organized, spinning up and shutting down agents as needed.

It wasn’t perfect. Workflows and tools were still needed to handle deployment decisions and data access, and they often took several attempts to get working correctly, if they worked at all. But it illustrated what becomes possible when agents can self-organize around outcomes rather than follow rigid workflows.

For enterprises, the stakes are higher. Imagine customer service agents spinning up around the world to handle localized requests. Each must access the right customer data in compliance with regional regulations and retire cleanly once the interaction ends. Without an abstraction between capacity and consumption, every new agent risks becoming its own operational headache. With it, ephemeral swarms can become manageable.
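The customer service scenario can be sketched as a scoped lifecycle: access is granted when the agent appears and revoked when it retires, whatever happens in between. This is a minimal sketch under stated assumptions. The region table, the store names, and the `service_agent` helper are all hypothetical, and a plain list stands in for a real access-grant system.

```python
from contextlib import contextmanager
from typing import Dict, Iterator, List

# Hypothetical policy table: which data store each region's agents may use.
REGION_DATA_STORES: Dict[str, str] = {
    "eu": "customer-db-eu",
    "us": "customer-db-us",
}

active_grants: List[str] = []  # stand-in for a real access-grant system


@contextmanager
def service_agent(region: str, request_id: str) -> Iterator[Dict[str, str]]:
    """Grant region-scoped access for one interaction, then revoke it."""
    if region not in REGION_DATA_STORES:
        raise ValueError(f"no data policy defined for region {region!r}")
    data_store = REGION_DATA_STORES[region]
    grant = f"{request_id}:{data_store}"
    active_grants.append(grant)
    try:
        yield {"request_id": request_id, "data_store": data_store}
    finally:
        # Runs even if the agent fails, so no access outlives the interaction.
        active_grants.remove(grant)


with service_agent("eu", "req-42") as agent:
    store_used = agent["data_store"]
    grants_during = list(active_grants)
```

Once the `with` block exits, `active_grants` is empty again, so the agent leaves nothing behind for operations to clean up by hand.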

The underestimated challenges

If separating capacity from consumption sounds straightforward, the reality is less so. One of the biggest hurdles is memory and context. How do you ensure that agents working on a shared outcome all see the same data and context? How do you pass results across ephemeral processes so that one agent isn’t guessing about what another has already discovered?
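To make the context problem concrete, here is a minimal sketch in which findings outlive the agents that produced them. It assumes a shared store keyed by the outcome being pursued. `SharedContext`, the agent functions, and the `api_version` key are hypothetical, and a real deployment would need a durable, access-controlled store rather than an in-process dictionary.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple


class SharedContext:
    """Hypothetical append-only log of findings, keyed by outcome."""

    def __init__(self) -> None:
        self._log: Dict[str, List[Tuple[str, str, Any]]] = defaultdict(list)

    def publish(self, outcome_id: str, agent_id: str, key: str, value: Any) -> None:
        # Record who found what, so results survive the agent that produced them.
        self._log[outcome_id].append((agent_id, key, value))

    def snapshot(self, outcome_id: str) -> Dict[str, Any]:
        # Later entries override earlier ones for the same key.
        view: Dict[str, Any] = {}
        for _agent_id, key, value in self._log.get(outcome_id, []):
            view[key] = value
        return view


def research_agent(ctx: SharedContext, outcome_id: str) -> None:
    # Publishes a finding and then exits; nothing is kept in the agent itself.
    ctx.publish(outcome_id, "research-1", "api_version", "v2")


def coding_agent(ctx: SharedContext, outcome_id: str) -> str:
    # Starts cold and reads what earlier agents discovered instead of guessing.
    known = ctx.snapshot(outcome_id)
    return f"generate client for API {known['api_version']}"


ctx = SharedContext()
research_agent(ctx, "outcome-7")
plan = coding_agent(ctx, "outcome-7")
```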

Monitoring is another challenge. Traditional dashboards and logging tools assume persistence; they expect processes to be around long enough to leave a trace. Ephemeral agents may not. New methods will be required to observe and validate behavior in real time.
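One plausible approach is to have each agent push its own lifecycle events to a sink that outlives it, instead of waiting to be scraped by a dashboard. The sketch below assumes exactly that. `run_observed` and `event_sink` are hypothetical names, and the list stands in for a durable event pipeline.

```python
import time
from typing import Any, Callable, Dict, List

event_sink: List[Dict[str, Any]] = []  # stand-in for a durable event pipeline


def run_observed(agent_id: str, task: Callable[[], Any]) -> Any:
    """Run a short-lived task, pushing start and end events before it disappears."""
    started = time.monotonic()
    event_sink.append({"agent_id": agent_id, "event": "start"})
    status = "error"  # assume failure until the task returns normally
    try:
        result = task()
        status = "ok"
        return result
    finally:
        # Emitted on success and on failure, so even a crashed agent leaves a trace.
        event_sink.append({
            "agent_id": agent_id,
            "event": "end",
            "status": status,
            "duration_s": time.monotonic() - started,
        })


answer = run_observed("agent-123", lambda: 2 + 2)
```

The design choice here is push over pull: the record is written by the agent as part of its own lifecycle, so it does not matter whether the agent lived long enough for a collector to notice it.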

Then there’s regulation. Enterprises are already grappling with jurisdiction-specific rules about AI and data use. In an agentic world, composability is key: the ability to swap agents, models, or data sources without breaking the system. Without that, adapting to new compliance requirements becomes nearly impossible.

Looking ahead

The shift from applications to agents represents more than just a new architecture. It’s a fundamental change in what IT operations means. Instead of keeping software up and running, ops in the agentic era is about ensuring the right software can appear, act, and disappear as needed—securely, reliably, and at scale.

There are still unanswered questions. What standards will emerge to govern ephemeral ops? How will enterprises build trust in systems where the actors vanish as quickly as they arrive? How do we guarantee repeatable, reliable behavior in a world defined by constant change? And how will organizations deliver the right data to the right agent at the right moment, without binding every agent to its own custom infrastructure?

We don’t yet have all the answers. But one thing is clear: operations in the agentic era won’t look like the playbooks of the past. The learning has only just begun.

—

New Tech Forum provides a venue for technology leaders—including vendors and other outside contributors—to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to doug_dineley@foundryco.com.