- Vishakha Sadhwani
The DevOps Skills Radar: 2025 vs 2026
What's changed, what's growing, and where to focus next ~
Hi Inner Circle!!
This is a weekend edition, so I am not going to throw a long list of tech topics at you today.
Instead, let’s talk about the bigger trend:
What has changed for backend, DevOps, cloud, and AI infrastructure engineers?
The radar image shows a lot, but let me summarize what these tools actually do in our day-to-day work across startups, cloud-native companies, AI-native companies, and big tech.

Before we dive in ~ this isn't an exhaustive list. These trends are gathered from sources like CNCF, industry surveys, and what's actually showing up in job descriptions and production systems right now.
Let's get into it ~
What the Radar is Telling Us
The two biggest movers? AI-Assisted Engineering (55% → 90%) and AI Infrastructure (50% → 85%). Both are labeled "massive growth" on the radar.
That's not a coincidence. Companies aren't just using AI products anymore.. they're building the infrastructure that runs them.
LLMOps, GPUs, vLLM, KServe ~ these went from niche to expected almost overnight.
Observability and Platform Engineering both jumped into the high growth category too. Makes sense ~ the more complex your stack gets, the more you need to see inside it and standardize how teams ship.
The essentials ~ Linux, Docker, Git ~ didn't move. Still at 85–90%. They're not trends anymore.. they're table stakes. If you don't have these, nothing else matters.
What Changed, And Why?
The shift is clear: AI workloads are now a backend concern, not just an ML team concern.
Kubernetes hit 95% because so much runs on it now ~ backend services, internal platforms, data workloads, and even AI inference workloads.
Cloud hit 92% because companies are building across AWS, Azure, GCP, and OCI instead of managing their own physical infrastructure from scratch.
Terraform/OpenTofu grew because infrastructure-as-code is no longer optional when you are managing cloud environments, GPU clusters, networking, security policies, and multi-cloud deployments.
GitHub Actions / CI-CD grew because AI systems need pipelines too ~ model versioning, evaluation gates, security checks, deployment automation, and rollback workflows.
AI-assisted engineering jumped because engineers are now using AI to write code, debug logs, generate configs, review PRs, and move faster through repetitive backend and DevOps work.
Observability grew because modern systems are too distributed to debug blindly ~ teams need metrics, logs, traces, dashboards, and alerts to understand what is happening in production.
Linux stayed essential because it is still the operating system underneath servers, containers, cloud VMs, Kubernetes nodes, and most production infrastructure.
Docker stayed essential because containers are still the standard way to package applications, services, jobs, and development environments.
Git/GitHub stayed essential because every serious engineering workflow still depends on version control, pull requests, reviews, branching, and collaboration.
Platform Engineering grew because companies want internal developer platforms that let teams deploy, monitor, secure, and manage services without rebuilding DevOps workflows from scratch.
Security / DevSecOps grew because security has moved into the delivery pipeline ~ secrets scanning, dependency checks, container security, cloud permissions, and runtime protection all need to happen earlier.
AI Infrastructure jumped because companies are moving from AI demos to production AI systems ~ LLM serving, GPUs, vLLM, KServe, vector databases, inference latency, and cost optimization are becoming backend problems too.
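To make the "evaluation gates" from the CI/CD point above a bit more concrete, here is a minimal sketch of one in Python. It assumes an earlier pipeline step has already produced a dict of metrics for the candidate model. The metric names (`accuracy`, `p95_latency_ms`) and the thresholds are hypothetical placeholders, not recommendations.

```python
# Hypothetical thresholds: names and numbers are assumptions for illustration.
MIN_METRICS = {"accuracy": 0.90}          # candidate must be at or above these
MAX_METRICS = {"p95_latency_ms": 800.0}   # candidate must be at or below these


def evaluation_gate(metrics):
    """Return a list of failure messages; an empty list means the gate passes."""
    failures = []
    for name, floor in MIN_METRICS.items():
        value = metrics.get(name)
        # A missing metric counts as a failure rather than a silent pass.
        if value is None or value < floor:
            failures.append(f"{name}={value} is below the minimum {floor}")
    for name, ceiling in MAX_METRICS.items():
        value = metrics.get(name)
        if value is None or value > ceiling:
            failures.append(f"{name}={value} is above the maximum {ceiling}")
    return failures


# Example: accurate enough, but too slow at the tail, so the gate reports one failure.
for problem in evaluation_gate({"accuracy": 0.93, "p95_latency_ms": 950.0}):
    print("GATE FAILED:", problem)
```

In a real pipeline, the step would exit with a non-zero status whenever the returned list is non-empty, so the deploy job never runs for a model that regressed.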
How to Work on This Stack Now
~ If you're early ~ lock in Linux, Git, Docker, and one cloud provider. These don't expire.
~ If you're mid-level ~ Kubernetes and Terraform are your next investment. Non-negotiable in most cloud-native and AI-native orgs.
~ If you're going deep on AI infrastructure ~ start with observability first (OTel, Grafana, Prometheus), then layer in vLLM or KServe. You can't manage what you can't measure.
~ AI-Assisted Engineering at 90% doesn't mean "use Copilot." It means understanding how AI tooling fits into your engineering workflow ~ from code generation to incident response.
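On the "observability first" point ~ here is a rough sketch of the kind of number you would want in hand before tuning any serving stack: latency percentiles. It uses only Python's standard library and assumes you already have per-request latencies in milliseconds. In a real setup, a metrics backend such as Prometheus would typically compute these from histograms instead. The function name and the sample values are made up for illustration.

```python
import statistics


def latency_percentiles(latencies_ms):
    """Return approximate p50, p95, and p99 for a list of request latencies."""
    if len(latencies_ms) < 2:
        raise ValueError("need at least two samples to compute percentiles")
    # quantiles(n=100) returns 99 cut points; index k-1 is the k-th percentile.
    cuts = statistics.quantiles(latencies_ms, n=100)
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


# Hypothetical sample: mostly fast requests with a slow tail.
samples = [120, 135, 110, 140, 125, 130, 115, 900, 128, 1500]
print(latency_percentiles(samples))
```

An average would hide the two slow requests in that sample; the p95 and p99 values are where they show up, which is why tail percentiles are the usual starting point for inference latency.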
The point of this post is to make you aware of what’s happening in the ecosystem, not to scare you.
So create a 30/60/90-day plan and act on it.
The engineers who will matter in the next two years are the ones who can operate at the intersection of cloud-native infrastructure and AI systems.
That’s exactly what this newsletter is here to help you do.
What topics or roadmaps are you looking for? Let me know by replying!
See you next week.
-V