Kubernetes: Complete Guide to Container Orchestration

Kubernetes (K8s) is the industry-standard platform for running containerized applications at scale. It automates deployment, scaling, and management of containers — replacing the manual work of spinning up servers, configuring load balancers, and responding to crashes.
This guide covers the core Kubernetes objects (Pods, Deployments, Services, Ingress, ConfigMaps, Secrets), how the control plane works, horizontal pod autoscaling, health checks, rolling updates, and when Kubernetes is the right choice (and when it is overkill).
The Problem Kubernetes Solves
Without orchestration, running containers in production means:
- Manually starting containers on servers
- Manually restarting crashed containers
- Manually adding servers when traffic spikes
- Manually configuring which server has which container
- Manually balancing traffic across instances
Kubernetes automates all of this. You declare the desired state ("run 10 replicas of my API server") and Kubernetes maintains it continuously.
Kubernetes Architecture
A Kubernetes cluster has two types of machines:
Control Plane (the brain):
- API Server: The only entry point for all cluster communication
- etcd: Distributed key-value store holding all cluster state
- Scheduler: Decides which worker node to run each Pod on
- Controller Manager: Runs controllers that maintain desired state
Worker Nodes (where your containers run):
- kubelet: Agent on each node that runs Pods
- kube-proxy: Network proxy on each node
- Container Runtime: containerd or CRI-O (built-in Docker Engine support was removed in Kubernetes 1.24)
```
User (kubectl) → API Server → etcd (cluster state)
                     ↓
             Controller Manager
                     ↓
                 Scheduler
                     ↓
           Worker Node (kubelet)
                     ↓
            Container Runtime
                     ↓
                    Pod
```
Core Kubernetes Objects
Pod
A Pod is the smallest deployable unit in Kubernetes. It contains one or more containers that share:
- The same network namespace (they communicate via localhost)
- The same storage volumes
- The same lifecycle (they start and stop together)
Most Pods have a single container. Multi-container Pods are used for sidecar patterns (log collector alongside the app, Envoy proxy alongside the service).
```yaml
# pod.yaml — simple single-container Pod
apiVersion: v1
kind: Pod
metadata:
  name: my-api
  labels:
    app: my-api
spec:
  containers:
    - name: api
      image: my-org/my-api:v1.2.3
      ports:
        - containerPort: 3000
      env:
        - name: NODE_ENV
          value: production
      resources:
        requests:
          memory: "128Mi"
          cpu: "250m"
        limits:
          memory: "256Mi"
          cpu: "500m"
```
Never create bare Pods directly in production: if the node fails or the Pod is deleted, nothing recreates it. Always use a Deployment.
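The sidecar pattern mentioned above puts a second container in the same Pod, sharing a volume with the app. A sketch (the Pod name, image tags, and mount paths are illustrative, not from the original):

```yaml
# Sidecar sketch: app container plus a log-collector container
# sharing a volume (names and images are illustrative)
apiVersion: v1
kind: Pod
metadata:
  name: api-with-log-sidecar
spec:
  containers:
    - name: api
      image: my-org/my-api:v1.2.3
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
    - name: log-collector
      image: fluent/fluent-bit:2.2
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
          readOnly: true
  volumes:
    - name: logs
      emptyDir: {}
```

The app writes logs to the shared emptyDir volume and the sidecar reads them from the same path; both containers start and stop together.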
Deployment
A Deployment manages a set of identical Pods. It ensures the desired number of replicas is always running and handles rolling updates.
```yaml
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-api
  namespace: production
spec:
  replicas: 3                # Run 3 identical Pods
  selector:
    matchLabels:
      app: my-api
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1            # At most 1 extra Pod during update
      maxUnavailable: 0      # Zero downtime: never kill a Pod before the new one is ready
  template:
    metadata:
      labels:
        app: my-api
    spec:
      containers:
        - name: api
          image: my-org/my-api:v1.2.3
          ports:
            - containerPort: 3000
          resources:
            requests:
              memory: "256Mi"
              cpu: "500m"
            limits:
              memory: "512Mi"
              cpu: "1000m"
          readinessProbe:    # Only send traffic when this passes
            httpGet:
              path: /ready
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:     # Restart the container if this fails
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 30
            periodSeconds: 30
```
Rolling Update Process:
- Create one new Pod with the new image
- Wait for its readiness probe to pass
- Remove one old Pod
- Repeat until all Pods are on the new version
- If any new Pod fails its readiness probe, stop the update and preserve old Pods
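This is not controller source code, but the counting logic these steps imply (with maxSurge: 1 and maxUnavailable: 0) can be sketched and checked:

```javascript
// Toy simulation of RollingUpdate with maxSurge=1, maxUnavailable=0:
// a new Pod must pass its readiness probe before an old Pod is terminated.
function rollingUpdate(replicas) {
  let oldPods = replicas, newPods = 0;
  const states = [[oldPods, newPods]];
  while (oldPods > 0) {
    newPods += 1;                    // surge: start one new Pod, wait for readiness
    states.push([oldPods, newPods]);
    oldPods -= 1;                    // only then terminate one old Pod
    states.push([oldPods, newPods]);
  }
  return states;
}

const history = rollingUpdate(3);
// Capacity never exceeds replicas + maxSurge and never drops below replicas
console.log(history.every(([o, n]) => o + n <= 4 && o + n >= 3)); // true
```

With 3 replicas the cluster briefly runs 4 Pods during each step, but never fewer than 3 ready Pods, which is what makes the update zero-downtime.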
Service
A Service provides a stable network endpoint for a set of Pods. Pod IPs change every time a Pod restarts; the Service IP stays constant.
```yaml
# service.yaml
apiVersion: v1
kind: Service
metadata:
  name: my-api-service
  namespace: production
spec:
  selector:
    app: my-api        # Routes to all Pods with this label
  ports:
    - name: http
      protocol: TCP
      port: 80         # Port the Service listens on
      targetPort: 3000 # Port on the Pod
  type: ClusterIP      # Internal only (use LoadBalancer for external)
```
Service types:
- ClusterIP (default): Internal only, accessible within the cluster
- NodePort: Exposes on a port of every node — for development only
- LoadBalancer: Creates a cloud load balancer (AWS ELB, GCP LB) — for production external access
- ExternalName: DNS alias for an external service
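Inside the cluster, a Service is also reachable by a stable DNS name of the form `<service>.<namespace>.svc.cluster.local`, so client Pods can reference it through plain configuration. A hypothetical client-side fragment (the env var name is illustrative):

```yaml
# Another Deployment's container, consuming my-api-service by its
# in-cluster DNS name (standard kube-dns/CoreDNS naming convention)
env:
  - name: API_BASE_URL
    value: "http://my-api-service.production.svc.cluster.local:80"
```

Within the same namespace, the short name `my-api-service` resolves as well.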
Ingress
An Ingress routes HTTP/HTTPS traffic from outside the cluster to internal Services based on hostname or URL path.
```yaml
# ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app-ingress
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod  # Automatic TLS certificates
spec:
  ingressClassName: nginx  # Replaces the deprecated kubernetes.io/ingress.class annotation
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-api-service
                port:
                  number: 80
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: frontend-service
                port:
                  number: 80
  tls:
    - hosts:
        - api.example.com
        - app.example.com
      secretName: example-com-tls
```
ConfigMap and Secret
ConfigMaps store non-sensitive configuration. Secrets store sensitive data (passwords, API keys). Note that Secret values are only base64-encoded by default; they are not encrypted in etcd unless you enable encryption at rest.
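Base64 is a reversible encoding, not encryption; anyone who can read a Secret (e.g. via kubectl get secret -o yaml) can decode its values:

```shell
# Base64 round-trip: encoding a value and decoding it back
encoded=$(printf 'super-secret-key' | base64)
echo "$encoded"                            # c3VwZXItc2VjcmV0LWtleQ==
printf '%s' "$encoded" | base64 --decode   # super-secret-key
```

Treat etcd access and Secret read permissions (RBAC) as equivalent to holding the plaintext.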
```yaml
# configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  NODE_ENV: production
  LOG_LEVEL: info
  MAX_CONNECTIONS: "100"
---
# secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: app-secrets
type: Opaque
stringData:
  DATABASE_URL: postgresql://user:pass@db-service:5432/myapp
  JWT_SECRET: super-secret-key-32-chars-minimum
  STRIPE_SECRET_KEY: sk_live_...
```
Inject into a Deployment:
```yaml
spec:
  containers:
    - name: api
      envFrom:
        - configMapRef:
            name: app-config
        - secretRef:
            name: app-secrets
      # Or inject individual values:
      env:
        - name: PORT
          value: "3000"
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: app-secrets
              key: DATABASE_URL
```
Health Checks
Kubernetes uses two probes to manage Pod health:
Liveness Probe: Is the container alive? If this fails, the kubelet restarts the container.
Readiness Probe: Is the container ready to receive traffic? If this fails, the Pod is removed from Service endpoints (no traffic sent) but not restarted.
```yaml
livenessProbe:
  httpGet:
    path: /health
    port: 3000
  initialDelaySeconds: 30  # Wait 30s before the first check (startup time)
  periodSeconds: 30        # Check every 30s
  failureThreshold: 3      # Restart after 3 consecutive failures
readinessProbe:
  httpGet:
    path: /ready
    port: 3000
  initialDelaySeconds: 5   # Check sooner than liveness
  periodSeconds: 10
  failureThreshold: 3      # Remove from load balancing after 3 failures
```
Your application must expose these endpoints:
```javascript
// health.js — Kubernetes health check endpoints
app.get('/health', (req, res) => {
  // Liveness: just confirm the process is responsive
  res.status(200).json({ status: 'alive' });
});

app.get('/ready', async (req, res) => {
  try {
    await db.query('SELECT 1'); // Readiness: check database connectivity
    res.status(200).json({ status: 'ready' });
  } catch (err) {
    res.status(503).json({ status: 'not ready', error: err.message });
  }
});
```
Horizontal Pod Autoscaler (HPA)
HPA automatically scales the number of Pod replicas based on CPU, memory, or custom metrics.
```yaml
# hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-api
  minReplicas: 3   # Never scale below 3
  maxReplicas: 50  # Never scale above 50
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70  # Scale up when average CPU > 70%
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```
HPA scales up when the average CPU utilization of existing Pods exceeds the target. It scales down when utilization drops, with a stabilization window (cooldown) to prevent flapping.
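The scaling decision follows the formula in the Kubernetes documentation: desiredReplicas = ceil(currentReplicas × currentMetricValue / targetMetricValue), clamped to the min/max bounds. A small sketch of that arithmetic (not HPA source code; function and parameter names are my own):

```javascript
// HPA core formula: desired = ceil(current * currentMetric / targetMetric),
// clamped to [minReplicas, maxReplicas].
function desiredReplicas(current, utilization, target, minR = 3, maxR = 50) {
  const desired = Math.ceil((current * utilization) / target);
  return Math.max(minR, Math.min(maxR, desired));
}

// 3 Pods averaging 140% CPU against a 70% target: scale to 6
console.log(desiredReplicas(3, 140, 70)); // 6
// Utilization exactly at target: no change
console.log(desiredReplicas(6, 70, 70));  // 6
```

Because the ratio is multiplicative, a cluster at double its target utilization roughly doubles its replica count in one step rather than scaling up one Pod at a time.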
Namespace Organization
Namespaces provide logical separation within a cluster:
```shell
# Create namespaces for different environments
kubectl create namespace production
kubectl create namespace staging
kubectl create namespace development

# Deploy to a specific namespace
kubectl apply -f deployment.yaml -n production

# List all Pods in a namespace
kubectl get pods -n production
```
Resource quotas per namespace:
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: production-quota
  namespace: production
spec:
  hard:
    requests.cpu: "20"
    requests.memory: 40Gi
    limits.cpu: "40"
    limits.memory: 80Gi
    pods: "100"
```
Essential kubectl Commands
```shell
# Cluster info
kubectl cluster-info
kubectl get nodes

# Pod management
kubectl get pods -n production
kubectl describe pod my-api-abc123 -n production
kubectl logs my-api-abc123 -n production --follow
kubectl exec -it my-api-abc123 -n production -- /bin/sh

# Deployment management
kubectl get deployments -n production
kubectl rollout status deployment/my-api -n production
kubectl rollout history deployment/my-api -n production
kubectl rollout undo deployment/my-api -n production  # Roll back

# Apply/delete resources
kubectl apply -f deployment.yaml
kubectl delete -f deployment.yaml

# Scale manually
kubectl scale deployment/my-api --replicas=5 -n production

# Port forwarding for debugging
kubectl port-forward service/my-api-service 8080:80 -n production
```
When to Use Kubernetes (and When Not To)
Use Kubernetes when:
- Running 10+ microservices that need independent scaling
- Team has dedicated DevOps/platform engineers
- Running on multiple cloud providers or on-premises
- Need sophisticated deployment strategies (canary, blue-green)
- Running stateful workloads (databases) with persistent volumes
Do not use Kubernetes when:
- Running fewer than 5-7 services
- Small team (< 5 engineers)
- First version of a product still finding product-market fit
- Budget is constrained: a managed control plane (EKS, GKE) alone runs roughly $70-75/month, before any worker-node or workload costs
Alternatives to Kubernetes for smaller scale:
- AWS App Runner: Container deployment without cluster management
- Railway / Render: Git push to deploy containers
- Fly.io: Global container deployment with a simpler model than K8s
- AWS ECS with Fargate: Serverless containers, no nodes to manage
Frequently Asked Questions
Q: What is the difference between a Pod and a container?
A container is a process packaged with its dependencies. A Pod is a Kubernetes construct that wraps one or more containers and provides them with a shared network, shared storage volumes, and a shared lifecycle. Most Pods contain exactly one container. Multi-container Pods are used for sidecar patterns (a log forwarder running alongside the app, an Envoy proxy running alongside each service).
Q: How does Kubernetes handle zero-downtime deployments?
The RollingUpdate strategy (the default) starts new Pods before stopping old ones. With maxUnavailable: 0 and maxSurge: 1, Kubernetes starts one new Pod, waits for its readiness probe to pass, then terminates one old Pod. This continues until all Pods are on the new version. If the new Pod's readiness probe never passes, the rollout pauses and the old version keeps serving traffic.
Q: What is Helm and do I need it?
Helm is a package manager for Kubernetes. A Helm chart is a reusable template for Kubernetes manifests with variables for customization. Instead of writing 200 lines of YAML to install PostgreSQL, you run helm install my-postgres bitnami/postgresql. For your own applications, Helm adds value when you need to deploy the same application to multiple environments (dev, staging, prod) with different configurations.
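In practice, the per-environment differences live in a values file that overrides the chart's defaults. A sketch, assuming a conventional chart layout (the file name and keys below are hypothetical):

```yaml
# values-prod.yaml: hypothetical production overrides for an app chart
replicaCount: 3
image:
  repository: my-org/my-api
  tag: v1.2.3
ingress:
  host: api.example.com
```

You would deploy with `helm install my-api ./chart -f values-prod.yaml`; a `values-staging.yaml` differs only in the values, not the templates.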
Q: Is Kubernetes difficult to learn?
The core concepts (Pods, Deployments, Services) can be learned in a day. The operational complexity (networking, storage, security, multi-cluster management) takes months to master. For most developers, the goal is not to become a Kubernetes administrator but to understand enough to write Kubernetes YAML for their application and debug common issues.
Key Takeaway
Kubernetes is the standard platform for running microservices and containerized workloads at scale. Its core value is declarative management: you specify the desired state (10 replicas of this image) and Kubernetes maintains it continuously — starting new Pods when traffic increases, restarting crashed containers, and routing traffic only to healthy Pods. The operational complexity is real, but for teams running 10+ services, the automation benefits far outweigh the learning curve. Start with a managed Kubernetes service (EKS, GKE, AKS) to eliminate the burden of managing the control plane yourself.
Read next: Software Architecture Roadmap: Becoming a Senior Architect →
Part of the Software Architecture Hub — engineering the fleet.
