Deploying Microservices on Kubernetes: A Production-Ready Guide for 2026
Learn how to deploy microservices on Kubernetes with proper namespace strategy, resource limits, and production patterns that prevent common failures.
Our team inherited a Kubernetes cluster where everything was jammed into the default namespace with no resource limits. A junior engineer fat-fingered a kubectl delete command. Gone: 40 services, 6 hours of recovery, one very apologetic engineer. That day we learned why namespace strategy exists.
The Problem
Deploying microservices on Kubernetes without structure is chaos. Everything lands in the default namespace. Services have no resource constraints. A misbehaving pod consuming 80GB of memory crashes its neighbors. Deletion is risky because there is no blast radius control. Environment variables are hardcoded. Secrets are floating around git repositories. The system works until it does not, and when it fails, failure is catastrophic.
We have seen teams deploy 8 microservices on Kubernetes successfully. Then deploy the 9th. Suddenly their entire monitoring stack crashes because the new service consumed all the memory on the node. The old developers knew better. The new developers did not. Nobody documented the rules.
Why This Happens
Kubernetes tutorials show a single service deployed to a single namespace. That works for tutorials. It does not work for production systems with dozens of services across dev, staging, and prod environments. Teams copy the tutorial pattern and carry it forward. By the time they realize it is broken, they have 20 services built on top of the broken foundation. Refactoring is expensive.
The tutorial never mentions namespaces because they seem optional. They are not. Namespaces are the difference between a controlled environment and a disaster zone.
The Solution
Step 1: Namespace Strategy
Option A (Recommended for most teams): One namespace per environment.
# Create the namespaces
kubectl create namespace development
kubectl create namespace staging
kubectl create namespace production
# Verify
kubectl get namespaces
All dev services in development namespace. All staging services in staging namespace. All prod services in production namespace. Clear blast radius. A kubectl delete in dev does not affect staging.
Option B (For very large teams): One namespace per environment per service cluster (e.g., payments-dev, payments-prod, shipping-dev, shipping-prod). This adds complexity but gives maximum isolation.
We recommend Option A. It is simple and solves 99% of problems.
Step 2: Create a Namespace with Labels
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    environment: prod
    owner: platform-team
    cost-center: engineering
Labels matter. They enable cost tracking, RBAC policies, and resource quotas scoped to entire namespaces. Always label your namespaces.
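As an example of what namespace scoping enables, a ResourceQuota caps the total resources all workloads in a namespace may request. This is a minimal sketch; the figures are illustrative placeholders, not sizing recommendations:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: production-quota
  namespace: production
spec:
  hard:
    requests.cpu: "20"      # total CPU requests across all pods in the namespace
    requests.memory: 40Gi   # total memory requests across all pods
    limits.cpu: "40"
    limits.memory: 80Gi
    pods: "100"             # cap on pod count in the namespace
```

Once applied, any pod created without resource requests in this namespace is rejected, which also forces teams to declare requests and limits.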
Step 3: Deploy a Production-Ready Microservice
apiVersion: v1
kind: Namespace
metadata:
  name: production
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: api-service-config
  namespace: production
data:
  LOG_LEVEL: "info"
  PORT: "3000"
  ENVIRONMENT: "production"
---
apiVersion: v1
kind: Secret
metadata:
  name: api-service-secrets
  namespace: production
type: Opaque
stringData:
  DATABASE_PASSWORD: "change-me-to-vault"
  API_KEY: "change-me-to-vault"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-service
  namespace: production
  labels:
    app: api-service
    version: v1.2.3
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: api-service
  template:
    metadata:
      labels:
        app: api-service
    spec:
      serviceAccountName: api-service
      containers:
        - name: api
          image: myregistry.azurecr.io/api-service:v1.2.3
          imagePullPolicy: IfNotPresent
          ports:
            - name: http
              containerPort: 3000
          env:
            - name: ENVIRONMENT
              valueFrom:
                configMapKeyRef:
                  name: api-service-config
                  key: ENVIRONMENT
            - name: LOG_LEVEL
              valueFrom:
                configMapKeyRef:
                  name: api-service-config
                  key: LOG_LEVEL
            - name: DATABASE_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: api-service-secrets
                  key: DATABASE_PASSWORD
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"
          readinessProbe:
            httpGet:
              path: /health/ready
              port: http
            initialDelaySeconds: 10
            periodSeconds: 5
            timeoutSeconds: 3
            failureThreshold: 3
          livenessProbe:
            httpGet:
              path: /health/live
              port: http
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 3
            failureThreshold: 3
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "sleep 15"]
---
apiVersion: v1
kind: Service
metadata:
  name: api-service
  namespace: production
  labels:
    app: api-service
spec:
  type: ClusterIP
  selector:
    app: api-service
  ports:
    - name: http
      port: 80
      targetPort: http
      protocol: TCP
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: api-service
  namespace: production
This manifest creates everything needed for a production deployment: a ConfigMap for non-sensitive config, a Secret for credentials (in production, use External Secrets Operator or Vault), a Deployment with 3 replicas, health probes, resource requests and limits, a graceful shutdown hook, a Service, and a ServiceAccount.
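To make the "use External Secrets Operator or Vault" advice concrete, here is one hedged sketch of an ExternalSecret, assuming the External Secrets Operator is installed and a SecretStore named vault-backend (hypothetical) points at your Vault instance. The Vault path and property names are placeholders:

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: api-service-secrets
  namespace: production
spec:
  refreshInterval: 1h          # re-sync from Vault every hour
  secretStoreRef:
    name: vault-backend        # hypothetical SecretStore pointing at Vault
    kind: SecretStore
  target:
    name: api-service-secrets  # the Kubernetes Secret the operator creates
  data:
    - secretKey: DATABASE_PASSWORD
      remoteRef:
        key: prod/api-service        # hypothetical Vault path
        property: database_password  # hypothetical field name
```

The operator materializes a regular Kubernetes Secret named api-service-secrets, so the Deployment above consumes it unchanged, while the actual credential never appears in git.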
Understanding Rolling Updates
RollingUpdate with maxSurge: 1 and maxUnavailable: 0 means:
- Kubernetes can create 1 new pod above the desired replica count during an update.
- Kubernetes will never take down pods unless a new one is healthy.
- Result: zero-downtime deployments even if the new version has bugs. (The old version keeps handling traffic until the new version proves healthy.)
If the new version fails its readiness probe 3 times in a row, Kubernetes stops the rollout. The old version keeps running. This prevents a bad deploy from causing an outage.
Deploy and Verify
kubectl apply -f deployment.yaml
# Watch the rollout
kubectl rollout status deployment/api-service -n production
# Check pod health
kubectl get pods -n production -l app=api-service
# Describe a pod for detailed status
kubectl describe pod api-service-xyz -n production
Resource Requests vs Limits — The Most Misunderstood Kubernetes Concept
Two terms confuse every Kubernetes learner:
Requests: The amount of resources the scheduler uses to decide where to place a pod. "This pod needs 256Mi of memory." If the node has 512Mi available, the scheduler places the pod there. If the node has 200Mi available, the scheduler does not place it there.
Limits: The maximum resources a pod is allowed to consume. "This pod cannot exceed 512Mi of memory." If the pod tries to consume 600Mi, Kubernetes kills it (OOMKilled).
Without requests: The scheduler has no sizing information and may pack several memory-hungry pods onto the same node. All of them crash under memory contention.
Without limits: One pod consumes all available memory and crashes its neighbors (the "noisy neighbor" problem).
With both: The scheduler spreads pods evenly. Each pod is constrained. No crashes.
resources:
  requests:
    memory: "256Mi"   # Scheduler places the pod based on this
    cpu: "250m"
  limits:
    memory: "512Mi"   # Pod cannot exceed this
    cpu: "500m"
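When individual teams forget to declare requests and limits, a LimitRange in the namespace can apply defaults automatically so no container runs unconstrained. A minimal sketch, with placeholder values:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: production
spec:
  limits:
    - type: Container
      defaultRequest:     # applied when a container omits requests
        memory: 256Mi
        cpu: 250m
      default:            # applied when a container omits limits
        memory: 512Mi
        cpu: 500m
```

Defaults are a safety net, not a substitute for per-service tuning: explicit requests based on observed usage always beat namespace-wide guesses.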
Common Mistakes to Avoid
- Using imagePullPolicy: Always in production. Every pod restart contacts the registry again, slowing startup and adding a registry dependency. Use IfNotPresent, which is already the default for immutable, non-latest tags.
- No horizontal pod autoscaler on any service. Traffic spikes and pods are CPU-maxed. Manual scaling adds many minutes of human delay; an HPA reacts automatically within a minute or two of metrics crossing the threshold.
- Hardcoding secrets in deployment YAML. Secrets end up in git history forever. Use External Secrets Operator, AWS Secrets Manager, or HashiCorp Vault.
- No PodDisruptionBudget during cluster upgrades. When nodes are drained for maintenance, the eviction API can remove every replica of a service at once. Without a PDB, downtime is almost guaranteed.
- All services in default namespace. A delete command in default deletes everything. Namespace isolation is the cheapest insurance policy.
- No network policies between services. Every pod talks to every pod by default. In a multi-tenant cluster, this is a security disaster.
- No preStop hook for graceful shutdown. Kubernetes kills a pod abruptly. In-flight requests error. preStop hook gives the pod time to drain connections before termination.
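To make the autoscaler, disruption-budget, and network-policy points above concrete, here is one hedged sketch of each, wired to the api-service Deployment from earlier. Thresholds, replica counts, and the api-gateway label are illustrative assumptions:

```yaml
# HorizontalPodAutoscaler: scale between 3 and 10 replicas on CPU utilization
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-service
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-service
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # target average CPU across pods
---
# PodDisruptionBudget: keep at least 2 replicas up during voluntary evictions
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-service
  namespace: production
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: api-service
---
# NetworkPolicy: only pods labeled as the gateway may reach the service port
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: api-service-ingress
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: api-service
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: api-gateway   # hypothetical label on your ingress/gateway pods
      ports:
        - protocol: TCP
          port: 3000
```

Note that the HPA and a fixed replicas: field fight each other; once an HPA manages the Deployment, remove replicas from the Deployment spec and let the autoscaler own the count. NetworkPolicy also requires a CNI plugin that enforces it (e.g. Calico or Cilium); on clusters without one, the policy is silently ignored.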
Key Takeaways
- Namespace strategy is foundational: One namespace per environment is the simplest approach that scales to hundreds of services.
- Resource requests and limits are mandatory: They prevent scheduling disasters and resource contention.
- Rolling updates with zero unavailability: maxSurge: 1, maxUnavailable: 0 protects against bad deployments.
- Health probes are essential: Readiness and liveness probes detect failures and trigger automatic recovery.
- Secrets belong in Vault, not YAML: Hardcoded secrets are a permanent security liability.
Struggling with managing microservices or configuring production Kubernetes deployments? The Skillzmist team has solved this exact problem for engineering teams across the US, UK, and Europe. Reach out for a free technical consultation — we respond within 24 hours.
Related: Why Kubernetes? The Case for Container Orchestration | Internal Service Communication in Kubernetes