Kubernetes in the Age of AI – O’Reilly

When Kubernetes first came onto the scene, it was a major turning point, a revision of the infrastructure and operations space that transformed the way developers and ops personnel build, deploy, and maintain applications in the cloud. It has since become the clear standard for how modern applications are built and operated. As the CNCF noted in its latest Annual Cloud Native Survey report, “Among container users, 82% are using Kubernetes in production in 2025, up from 66% in 2023. This represents near-universal adoption within the container ecosystem.”

Over the last few years, another revision in the space has occurred with Kubernetes’s evolution from a container orchestrator to an AI infrastructure platform. According to the CNCF survey, “The rise of Kubernetes as the de facto AI platform represents a fundamental shift in how organizations approach machine learning operations. . .[with Kubernetes] providing a unified orchestration layer that handles both traditional application workloads and compute-intensive AI tasks.” The emergence of seismic technologies like generative AI and agentic AI has only accelerated this transformation.

The intersection of AI with Kubernetes is undoubtedly one of the most impactful developments in the operations space. As Jonathan Johnson, software architect at Dijure, observes, “AI on K8s is very, very important, and there is not enough [resources] out there.” Raju Gandhi, senior technical architect at Edward Jones, echoes this assessment, noting that “operationalizing AI/ML on K8s is a big issue, [and it’s only] getting bigger. This is a topic that needs attention.” But what are some of the things that you should know about this trend to keep abreast and stay ahead in the game?

Generative AI

Anyone with access to a computer or a smartphone has likely used some iteration of generative AI, a stunning fact when you consider that GenAI was on the outer edges of mainstream discourse and consumption a scant five years ago. But at the end of 2022, the debut of ChatGPT marked the beginning of a technological revolution, one that would impact and reshape nearly every aspect of our working and personal lives. Unsurprisingly, there are now thousands of generative AI models, a proliferation that naturally has its own set of complexities. Selecting a model is simple, but if you’re an application developer or MLOps engineer, how do you go about operating that model in a production system? Not only do you have to be cognizant of factors like resilience, scalability, security, and operational costs, but there’s the fact that bringing a model from experimentation into production can be arduous if not done properly. That’s where Kubernetes comes into play.

As Roland Huß and Daniele Zonca, distinguished engineers at Red Hat, note, “GenAI/LLM models are resource intensive, requiring substantial computational power and large datasets. Given its scalability and extensibility, Kubernetes is uniquely suited to function as an efficient platform for AI and LLM model pretraining, fine-tuning, deployment, and prompt engineering.” They further elaborate that “this integration with Kubernetes not only simplifies the adoption of cutting-edge AI technologies but also ensures a seamless and efficient operational flow. Kubernetes, with its robust scalability and management capabilities, stands as an ideal platform for generative AI projects, aligning DevOps and MLOps practices in a cohesive ecosystem.”

This sentiment is already shared by a wide swath of the industry. According to the CNCF survey above, as of 2025, 66% of organizations run generative AI workloads on Kubernetes. These organizations include OpenAI, which uses Kubernetes for its AI/LLM application experimenting and testing; Tesla, which utilizes KServe to manage production-grade LLM inference; and Adobe, which uses Kubernetes to power its suite of generative creative models. Other companies taking this approach include Uber, Intuit, and Google. With more companies adopting this practice for their generative AI and LLMs operations, it’d be prudent for any organization to leverage Kubernetes for their own GenAI and LLM workflows.

Agentic AI

Nearly coinciding with the rise of GenAI has been the steady growth of agentic AI. Unlike GenAI, agentic AI goes beyond answering simple prompts and generating text in its ability to operate autonomously to perform complex, multistep actions, utilize tools, and make independent decisions. With its ability to support both traditional ML processes and GenAI and LLM operations, it should come as no surprise that Kubernetes has a role in the agentic AI ecosystem as well.

According to Ronald Petty, principal consultant at RX-M, “Kubernetes has been leveraged to host machine learning pipelines, including AI model training and inference. As inference options have become plentiful and affordable, on and off-premise, we have seen the rise of agents. Coupling cloud native technologies and popular protocols, we now see agents moving from ad hoc demos to complex fleets of agents on systems like Kubernetes.” So what are some examples of the integration between these two technologies?

One notable offering is Kagent, an OS programming framework that runs AI agents in Kubernetes and “helps engineers build powerful internal platforms by tackling cloud native tasks such as configuration, troubleshooting, complex deployment scenarios, observability pipelines and dashboards, and safely enabling network security.” Operating along similar lines is K8sGPT, an AI-powered tool that leverages intelligent insights and automated troubleshooting to analyze Kubernetes clusters for configuration problems and security issues, as well as generates solutions to problems discovered in analysis.

A more recent entry in the field is Sympozium, a Kubernetes-native coordination layer for multi-agent AI systems that “solves the same problem Kubernetes solved for containers, but for agents that need to share context, hand off tasks, and maintain shared situational awareness.” Another newer offering is Agent Sandbox, which allows you to run AI agents as isolated, stateful workloads with a native API on Kubernetes.

The fundamentals

While it’s important to be aware of the latest developments and trends affecting your domain, that shouldn’t come at the expense of foundational knowledge and skills. As basketball great Michael Jordan once said, “Get the fundamentals down and the level of everything you do will rise.” One of the most fundamental skills for working with Kubernetes is networking, and frustratingly enough, it’s one of the more difficult ones to master. As Cisco senior staff engineer Nico Vibert observes, “Platform engineers tend to be comfortable with Linux networking but less so with protocols like BGP and IPv6; network administrators know those protocols well but find Kubernetes abstractions unfamiliar. Both personas struggle to navigate the dozens of networking tools seemingly required to meet connectivity and security requirements.” Yet as organizations move mission-critical workloads, AI training pipelines, and regulated financial services onto Kubernetes, the engineers who can design, secure, and troubleshoot the network layer have become some of the most sought-after professionals in the industry.

In recognition of both the importance and difficult nature of the Kubernetes networking skill, the CNCF recently announced a new certification focused on the Kubernetes network engineer role. The certification is designed to validate hands-on networking expertise across all of the aforementioned layers, filling a gap that the Kubernetes community has long recognized.

For organizations that use Kubernetes to develop and deliver applications, leaders and decision-makers need to be aware that utilizing Kubernetes in conjunction with the latest AI tools is no longer a luxury but a necessary practice that will allow their companies to thrive. A similar onus should be placed on the basics. When hiring your next DevOps, network, or site reliability engineer, ensure that their ability to design, secure, and troubleshoot the Kubernetes network layer is second to none.

If you want to dive deeper, check out Roland Huß and Daniele Zonca’s Generative AI on Kubernetes, Jonathan Johnson’s GPU Kubernetes Homelab live course, Alex Corvin, Taneem Ibrahim, and Kyle Stratis’s Scalable Kubernetes Infrastructure for AI Platforms, Ashok Srirama and Sukirti Gupta’s Kubernetes for Generative AI Solutions, and Yogesh Raheja’s K8sGPT Essentials on-demand course. They’re all on O’Reilly. If you’re not a member, you can get started with a free trial.

Source link

What's Hot

Kubernetes in the Age of AI – O’Reilly

The Download: a new hunt for dark matter and Kenya’s case for going solar

AI-assisted data development with Kiro and SageMaker Unified Studio

Kubernetes in the Age of AI – O’Reilly

Generative AI Music Attribution Rethinks Royalties

Around the World, These Building Solutions Keep Things Local

Dementia is rising, but your personal risk is falling

Honolulu gambling raid in Waimakua Place nets machines

The Download: “reprogramming” aging, and the hidden sense of interoception

SpaceX IPO: Everything you need to know

Understanding U-Net Architecture in Deep Learning

Hard-braking events as indicators of road segment crash risk

Redefining AI efficiency with extreme compression