7 Kubernetes Layers Every Engineer Should Know

From external traffic to running containers!!

Hi Inner Circle!

Welcome to this week's K8s edition.

This week I thought I’d send something you can skim quickly ~ the layers of a Kubernetes application.

Something practical you can read and digest before your next interview, production issue, or team call.

When something breaks in a Kubernetes cluster, the question is never "is it broken." It's "which layer is it broken on."

Traffic not reaching your app could be a Gateway misconfig, a kube-proxy issue, a service mesh policy, or a pod that's not Ready.

Same symptom, four different layers, four different fixes.

These 7 layers cover the full path from external request to running container. Knowing them changes how you debug, how you design, and how you read someone else's cluster.

Let's get into it ~

The Gateway is the entry point for external traffic into your cluster. It terminates TLS, looks at the request (host, path, headers, method), and decides which service inside the cluster handles it.

The Gateway API is the modern successor to Ingress ~ think of it as a more expressive version of Ingress. It also gives a cleaner separation of roles ~ platform teams own the GatewayClass and Gateway, app teams own the HTTPRoute.

Three resources to know ~

GatewayClass ~ defines which Gateway controller implements the Gateway. The controller is the software/operator responsible for managing Gateway resources and configuring the underlying data plane (like Envoy, NGINX, or a cloud load balancer).

Gateway ~ the actual listener handling incoming traffic. Defines hostname, port, and TLS configuration. In many cloud environments, creating a Gateway also provisions the external Load Balancer automatically.

HTTPRoute ~ where the routing logic lives. Path matches, header matches, traffic splits, and backend service mappings.
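
To make the HTTPRoute idea concrete, here is a minimal Python sketch of how a route table might pick a backend. It is a simplified model, not how any particular controller is implemented ~ the Rule class, the pick_backend function, and the service names are all made up for illustration, and the precedence rule (longest path prefix, then most header matches) is a cut-down version of what the Gateway API describes.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Rule:
    path_prefix: str  # behaves like an HTTPRoute PathPrefix match
    backend: str      # hypothetical Service name the rule forwards to
    headers: dict = field(default_factory=dict)  # exact matches, all must hold


def matches(rule: Rule, path: str, headers: dict) -> bool:
    # Match on whole path segments: "/api" matches "/api" and "/api/v1",
    # but not "/apiary".
    prefix = rule.path_prefix.rstrip("/")
    path_ok = prefix == "" or path == prefix or path.startswith(prefix + "/")
    # Simplification: header names are compared case-sensitively here.
    headers_ok = all(headers.get(k) == v for k, v in rule.headers.items())
    return path_ok and headers_ok


def pick_backend(rules: list, path: str, headers: dict) -> Optional[str]:
    candidates = [r for r in rules if matches(r, path, headers)]
    if not candidates:
        return None  # a real Gateway would typically answer 404 here
    # Simplified precedence: longest prefix first, then most header matches.
    best = max(candidates, key=lambda r: (len(r.path_prefix), len(r.headers)))
    return best.backend


rules = [
    Rule("/", "web"),
    Rule("/api", "orders"),
    Rule("/api", "orders-canary", {"x-canary": "true"}),
]

print(pick_backend(rules, "/api/v1/orders", {}))                    # orders
print(pick_backend(rules, "/api/v1/orders", {"x-canary": "true"}))  # orders-canary
print(pick_backend(rules, "/apiary", {}))                           # web
```

The point to take away ~ the Gateway decides where the listener lives, and rules like these decide which Service a given request lands on.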

Common implementations ~

Envoy-based ~ Envoy Gateway. High performance, deep observability, hot config reloads.

Istio Gateway ~ if you're already running Istio for service mesh, the same Envoy proxies handle north-south traffic too.

NGINX, HAProxy, Kong ~ mature, widely deployed, lots of plugins. Ingress-NGINX (an Ingress controller, not a Gateway API implementation) is still the most common in the wild even with Gateway API gaining ground.

Cloud-native ~ GKE Gateway, AWS Gateway API Controller (built on VPC Lattice), AWS Load Balancer Controller (ALB/NLB), Azure Application Gateway for Containers. The cloud provisions and manages the load balancer behind the scenes.

And then there’s another interesting extension emerging around AI infrastructure ~ the Gateway API Inference Extension project (often called the Inference Gateway), which extends Kubernetes Gateway API concepts specifically for LLM and model-serving workloads.

Things like:

  • intelligent model routing

  • inference-aware traffic policies

  • GPU-aware request handling

  • and AI workload balancing

We’ll discuss that in an upcoming newsletter :)

Why it matters ~ This is the only layer your users touch directly. Misconfigure it and traffic never reaches your app ~ not because your app is broken, but because nothing is telling the cluster where to send the request.

kube-proxy runs on every node in your cluster. Its job ~ when something inside the cluster says "send this to the orders service", kube-proxy figures out which actual pod gets the traffic.

It does this by programming network rules on each node.

Three options worth knowing ~

iptables ~ the default for years. Works fine at small scale. At thousands of services, rule evaluation becomes a bottleneck because iptables walks rules linearly.

IPVS ~ uses kernel-level hash tables instead of linear rules. Scales much better for clusters with lots of services. Supports more load balancing algorithms (round-robin, least-conn, source-hash).

eBPF (via Cilium) ~ skips kube-proxy entirely. Service routing happens in the kernel with eBPF programs. Faster, more observable, and the direction many large clusters are moving in.
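
The scaling difference between iptables and IPVS is easier to see with a toy model. The Python sketch below is an analogy only ~ it does not touch real iptables or IPVS, and the IP ranges and the 5,000-service count are invented for illustration. It just contrasts walking an ordered rule list with doing a single hash lookup.

```python
def linear_lookup(rules, service_ip):
    """Walk an ordered rule list, as iptables conceptually does."""
    checked = 0
    for ip, backends in rules:
        checked += 1
        if ip == service_ip:
            return backends, checked
    return None, checked


def hash_lookup(table, service_ip):
    """One dictionary lookup, analogous to an IPVS hash table."""
    return table.get(service_ip), 1


# Hypothetical cluster with 5,000 Services, one rule per ClusterIP.
rules = [
    (f"10.96.{i // 256}.{i % 256}", [f"10.244.0.{i % 250 + 1}"])
    for i in range(5000)
]
table = dict(rules)

last_ip = rules[-1][0]
print(linear_lookup(rules, last_ip)[1])  # 5000 comparisons for the last rule
print(hash_lookup(table, last_ip)[1])    # 1 lookup, however many Services exist
```

Same answer either way. The cost of getting to it is what changes as the cluster grows.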

Why it matters ~ Services in Kubernetes are abstractions. They don't actually receive traffic. kube-proxy is what makes the abstraction work. No kube-proxy (or eBPF replacement), no service networking.

Scenario ~ Pod A can reach Pod B by IP, but not by service name. DNS resolves the service name correctly. The service exists and has endpoints. What's broken?

A service mesh adds a programmable network layer on top of your services. In the classic model, every pod gets a proxy (sidecar) attached to it, and all east-west traffic between services flows through these proxies.

That gives you things vanilla Kubernetes doesn't ~

  • mTLS between every service, automatically.

  • Retries, timeouts, and circuit breaking without touching application code.

  • Traffic splitting for canaries and blue-green ~ "send 5% to v2."

  • Distributed tracing of every call between services.

  • Fine-grained authorization policies ~ which service can call which.
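
As a rough picture of what "send 5% to v2" means, here is a small Python sketch of weighted traffic splitting. It is only a model under stated assumptions ~ the split_traffic function and the orders-v1 / orders-v2 names are hypothetical, and real mesh proxies make this decision per request inside the data plane rather than in a batch like this.

```python
import random
from collections import Counter


def split_traffic(weights, n_requests, seed=0):
    """Assign each request to a backend in proportion to its weight."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    names = list(weights)
    picks = rng.choices(names, weights=[weights[n] for n in names], k=n_requests)
    return Counter(picks)


# Hypothetical canary: 95% of requests stay on v1, 5% go to v2.
counts = split_traffic({"orders-v1": 95, "orders-v2": 5}, n_requests=10_000)
print(counts)
```

Over enough requests the share going to v2 settles close to 5%, and the application code never knows a canary is happening.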

Three implementations dominate ~

Istio ~ most feature-complete, Envoy-based data plane. Recent versions support ambient mode, which removes the per-pod sidecar in favor of a per-node proxy (ztunnel) for L4, with optional waypoint proxies for L7. Less overhead, simpler ops.

Linkerd ~ lighter, opinionated, uses its own Rust-based proxy instead of Envoy. Easier to operate, fewer knobs.

Cilium Service Mesh ~ eBPF-based. No sidecars at all. The mesh logic runs in the kernel and in a per-node Envoy proxy for L7. Fastest path if you're already using Cilium as your CNI.

Why it matters ~ Once you have more than a handful of services talking to each other, you need observability and control at the network layer. Otherwise every team reimplements retries, timeouts, and TLS in their own application code ~ inconsistently, and badly.

A sidecar is a second container that runs alongside your main application container inside the same pod. Same network namespace. Access to the same pod volumes. Same pod lifecycle.

The sidecar handles concerns your app shouldn't have to ~ log shipping, metric collection, secret rotation, or the service mesh proxy itself (like the Istio sidecar discussed above).

Two container types to keep straight ~

Init containers ~ run to completion before the main container starts. Used for setup work ~ fetching config, running migrations, loading model weights, waiting on a dependency. Sequential. The main container doesn't start until all init containers exit successfully.

Sidecar containers ~ run alongside the main container for the full lifetime of the pod. Native sidecars have been enabled by default since Kubernetes 1.29 and became stable in 1.33 (technically, a sidecar is an init container with restartPolicy: Always).

That sounds like a small change but it fixes real problems ~

  • Sidecars now start before the main container and are guaranteed to have started (including passing any startup probe) before the main container starts. No more race conditions where the app starts before the Envoy proxy is up.

  • Sidecars terminate after the main container exits. No more lost logs because the log shipper died before the app finished writing.

  • Sidecars don't block Job completion. Old pattern ~ your batch job finishes but the Istio sidecar keeps running and the Job never completes. Fixed.
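
The ordering guarantees above can be sketched as a small Python model. This is a simplification, not kubelet code ~ the function names and the container names (fetch-config, envoy, log-shipper, orders) are made up, and the model ignores probes, failures, and restarts.

```python
def startup_order(init_containers, app_containers):
    """Return (start order, sidecar names) for a pod.

    init_containers: list of (name, restart_policy) in spec order.
    restart_policy "Always" marks a native sidecar; None is a classic
    init container that must exit before the next container starts.
    """
    order = []
    sidecars = []
    for name, restart_policy in init_containers:
        order.append(name)
        if restart_policy == "Always":
            sidecars.append(name)  # keeps running while later containers start
    order.extend(app_containers)  # app containers start last
    return order, sidecars


def shutdown_order(sidecars, app_containers):
    """App containers stop first, then sidecars in reverse start order."""
    return list(app_containers) + list(reversed(sidecars))


init = [("fetch-config", None), ("envoy", "Always"), ("log-shipper", "Always")]
order, sidecars = startup_order(init, ["orders"])

print(order)                               # ['fetch-config', 'envoy', 'log-shipper', 'orders']
print(shutdown_order(sidecars, ["orders"]))  # ['orders', 'log-shipper', 'envoy']
```

The proxy is up before the app starts, and the log shipper outlives the app on the way down.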

Why it matters ~ Sidecars let you add capabilities to every workload without changing application code. One pattern, applied across the cluster, instead of every team building their own logging or TLS layer.

A pod is the smallest deployable unit in Kubernetes where your application actually runs. One or more containers, sharing a network namespace and storage volumes, scheduled together onto a single node.

You rarely create pods directly. You create something higher up (Deployment, StatefulSet, Job) and it creates pods for you.

Inside the pod ~

A shared network namespace ~ all containers in the pod share an IP and can talk to each other over localhost.

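A shared IPC namespace ~ for processes that need to communicate via System V IPC or POSIX shared memory. (Sending signals across containers needs a shared process namespace, which is off by default.)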

Shared volumes ~ mounted into multiple containers if needed (the main pattern behind sidecars reading app logs).

A namespace is a logical boundary inside the cluster. It scopes resource names, applies ResourceQuotas and LimitRanges, and scopes RBAC through Roles and RoleBindings. Pods in different namespaces can still talk to each other across the pod network unless NetworkPolicies say otherwise.

Interview point ~ namespaces are not a security boundary by default. They're an organizational boundary. Real isolation requires NetworkPolicies, RBAC, and often PodSecurity admission controls layered on top.
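
The default-allow behavior is worth seeing spelled out. The Python sketch below is a deliberately coarse model ~ real NetworkPolicies select pods by label rather than whole namespaces, and the is_allowed function and the namespace names are hypothetical. It only illustrates the rule that traffic is allowed until a policy says otherwise.

```python
def is_allowed(src_ns, dst_ns, policies):
    """Decide whether traffic from src_ns may reach dst_ns.

    policies maps a destination namespace to the set of source
    namespaces it accepts. A namespace with no entry has no policy,
    so it accepts traffic from anywhere (the Kubernetes default).
    """
    if dst_ns not in policies:
        return True
    return src_ns in policies[dst_ns]


# Hypothetical setup: only "payments" has an ingress policy.
policies = {"payments": {"payments", "checkout"}}

print(is_allowed("dev", "staging", policies))        # True, no policy on staging
print(is_allowed("dev", "payments", policies))       # False, dev is not listed
print(is_allowed("checkout", "payments", policies))  # True, explicitly allowed
```

Until someone writes that first policy, the namespace line is just a label on a flat network.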

Why it matters ~ Almost every Kubernetes problem eventually traces back to a specific pod in a specific namespace. Knowing how to find it, describe it, and read its events is the single most useful debugging skill in Kubernetes.

A Deployment manages stateless workloads. You declare "5 replicas of this image" and the Deployment makes it true ~ creating pods, watching them, replacing failed ones, rolling out new versions one batch at a time with health checks between each step.

A StatefulSet manages stateful workloads. Databases. Message brokers. Distributed systems where the pods are not interchangeable.

The differences you should know ~

Pod names ~ Deployment pods get random suffixes (api-7d4f9c-x2k8p). StatefulSet pods get stable, ordered names (kafka-0, kafka-1, kafka-2).

Network identity ~ StatefulSets get a stable DNS name per pod via a headless service. “kafka-0.kafka.default.svc.cluster.local” always points to the same pod across restarts.

Scaling order ~ Deployments scale in parallel. StatefulSets, by default, scale one at a time, in order. Pod “kafka-2” doesn't start until “kafka-1” is Ready.

Storage ~ Deployments share volumes or use ephemeral storage. StatefulSets use “volumeClaimTemplates” to give each pod its own persistent volume that follows it across restarts and reschedules.

Update strategy ~ Deployments do rolling updates by ReplicaSet. StatefulSets update pods in reverse ordinal order (pod-2, then pod-1, then pod-0) and support partitioned rollouts for canary upgrades of stateful systems.
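
Here is a short Python sketch of the StatefulSet naming and ordering rules above. The helper names are invented for illustration and nothing here talks to a cluster ~ it only reproduces the patterns, assuming the default cluster domain of cluster.local.

```python
def pod_names(statefulset, replicas):
    """Stable, ordered pod names: <statefulset>-<ordinal>."""
    return [f"{statefulset}-{i}" for i in range(replicas)]


def pod_dns(statefulset, ordinal, service, namespace="default",
            cluster_domain="cluster.local"):
    """Per-pod DNS name provided through the headless service."""
    return f"{statefulset}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"


def update_order(statefulset, replicas, partition=0):
    """Pods with ordinal >= partition are updated, highest ordinal first."""
    return [f"{statefulset}-{i}" for i in range(replicas - 1, partition - 1, -1)]


print(pod_names("kafka", 3))                 # ['kafka-0', 'kafka-1', 'kafka-2']
print(pod_dns("kafka", 0, "kafka"))          # kafka-0.kafka.default.svc.cluster.local
print(update_order("kafka", 3))              # ['kafka-2', 'kafka-1', 'kafka-0']
print(update_order("kafka", 3, partition=2)) # ['kafka-2']
```

The last line is the canary case ~ with a partition of 2, only kafka-2 picks up the new version while kafka-0 and kafka-1 stay put.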

Why it matters ~ Picking the wrong one breaks things in subtle ways. Running a database as a Deployment "works" until you scale or restart and lose data identity. The same goes for model-serving workloads that depend on stable identity or per-pod storage.

A ReplicaSet sits between the Deployment and the actual pods. Its only job ~ ensure the right number of pod copies are running at all times.

You almost never create one directly.

The Deployment creates a ReplicaSet for you. When you update the Deployment's pod spec, it creates a new ReplicaSet and shifts pods from the old one to the new one according to the rollout strategy (maxSurge, maxUnavailable).

That's how rolling updates and rollbacks actually work ~ the old ReplicaSet stays around, scaled to zero. Roll back, and Kubernetes scales the old ReplicaSet back up and the new one down. No rebuild, no new manifest ~ the pod template is already there, and the image is often still cached on the nodes.
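
The shift from the old ReplicaSet to the new one can be modeled in a few lines of Python. Treat this as a sketch under generous assumptions ~ rolling_update is a made-up function, maxSurge and maxUnavailable are plain integers here rather than percentages, and every new pod is assumed to become Ready instantly, which a real rollout never gets to assume.

```python
def rolling_update(desired, max_surge, max_unavailable):
    """Return the (old, new) replica counts after each rollout step."""
    if max_surge == 0 and max_unavailable == 0:
        raise ValueError("maxSurge and maxUnavailable cannot both be 0")
    old, new = desired, 0
    steps = [(old, new)]
    while old > 0 or new < desired:
        # Scale up the new ReplicaSet without exceeding desired + max_surge pods.
        room = desired + max_surge - (old + new)
        new += min(room, desired - new)
        # Scale down the old ReplicaSet while keeping at least
        # desired - max_unavailable pods running.
        removable = (old + new) - (desired - max_unavailable)
        old -= min(old, removable)
        steps.append((old, new))
    return steps


print(rolling_update(5, max_surge=1, max_unavailable=0))
# [(5, 0), (4, 1), (3, 2), (2, 3), (1, 4), (0, 5)]
```

A rollback is the same walk in the other direction, with the old ReplicaSet playing the role of the new one.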

Worth knowing ~ “revisionHistoryLimit” controls how many old ReplicaSets are kept around. Default is 10. Too low and rollbacks lose history. Too high and your cluster fills up with empty ReplicaSet objects.

Why it matters ~ Understanding ReplicaSets is how you understand what's actually happening during a deployment. Why a rollout is stuck. How an old ReplicaSet gets scaled down. Why old pods aren't being cleaned up. Why a rollback completes in seconds. It all lives at this layer.

That’s it for this week!!

If you want a detailed breakdown of how AI models are deployed inside production Kubernetes clusters, make sure to comment/reply back ~ I’ll put together a deep dive in an upcoming edition.

These layers are essentially the practical application of Kubernetes in real production systems, so it’s worth becoming very comfortable with them.

See you in the next one.

-V