
Kubernetes: Complete Guide to Container Orchestration

TopicTrick Team

Kubernetes (K8s) is the industry-standard platform for running containerized applications at scale. It automates deployment, scaling, and management of containers — replacing the manual work of spinning up servers, configuring load balancers, and responding to crashes.

This guide covers the core Kubernetes objects (Pods, Deployments, Services, Ingress, ConfigMaps, Secrets), how the control plane works, horizontal pod autoscaling, health checks, rolling updates, and when Kubernetes is the right choice (and when it is overkill).


The Problem Kubernetes Solves

Without orchestration, running containers in production means:

  • Manually starting containers on servers
  • Manually restarting crashed containers
  • Manually adding servers when traffic spikes
  • Manually configuring which server has which container
  • Manually balancing traffic across instances

Kubernetes automates all of this. You declare the desired state (I want 10 instances of my API server) and Kubernetes maintains it continuously.


Kubernetes Architecture

A Kubernetes cluster has two types of machines:

Control Plane (the brain):

  • API Server: The only entry point for all cluster communication
  • etcd: Distributed key-value store holding all cluster state
  • Scheduler: Decides which worker node to run each Pod on
  • Controller Manager: Runs controllers that maintain desired state

Worker Nodes (where your containers run):

  • kubelet: Agent on each node that runs Pods
  • kube-proxy: Network proxy on each node
  • Container Runtime: containerd or CRI-O (Docker Engine support via dockershim was removed in Kubernetes 1.24)

text
User (kubectl) → API Server → etcd (cluster state)
                     ↓
              Controller Manager
                     ↓
               Scheduler
                     ↓
            Worker Node (kubelet)
                     ↓
              Container Runtime
                     ↓
                   Pod

Core Kubernetes Objects

Pod

A Pod is the smallest deployable unit in Kubernetes. It contains one or more containers that share:

  • The same network namespace (they communicate via localhost)
  • The same storage volumes
  • The same lifecycle (they start and stop together)

Most Pods have a single container. Multi-container Pods are used for sidecar patterns (log collector alongside the app, Envoy proxy alongside the service).

yaml
# pod.yaml — simple single-container Pod
apiVersion: v1
kind: Pod
metadata:
  name: my-api
  labels:
    app: my-api
spec:
  containers:
    - name: api
      image: my-org/my-api:v1.2.3
      ports:
        - containerPort: 3000
      env:
        - name: NODE_ENV
          value: production
      resources:
        requests:
          memory: "128Mi"
          cpu: "250m"
        limits:
          memory: "256Mi"
          cpu: "500m"

Never create bare Pods in production: if the node running them fails, nothing reschedules them elsewhere. Always use a Deployment.

Deployment

A Deployment manages a set of identical Pods. It ensures the desired number of replicas is always running and handles rolling updates.

yaml
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-api
  namespace: production
spec:
  replicas: 3                           # Run 3 identical Pods
  selector:
    matchLabels:
      app: my-api
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1                       # At most 1 extra Pod during update
      maxUnavailable: 0                 # Zero downtime — never kill a Pod before new one is ready
  template:
    metadata:
      labels:
        app: my-api
    spec:
      containers:
        - name: api
          image: my-org/my-api:v1.2.3
          ports:
            - containerPort: 3000
          resources:
            requests:
              memory: "256Mi"
              cpu: "500m"
            limits:
              memory: "512Mi"
              cpu: "1000m"
          readinessProbe:               # Only send traffic when this passes
            httpGet:
              path: /ready
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:                # Restart Pod if this fails
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 30
            periodSeconds: 30

Rolling Update Process:

  1. Create one new Pod with the new image
  2. Wait for its readiness probe to pass
  3. Remove one old Pod
  4. Repeat until all Pods are on the new version
  5. If any new Pod fails its readiness probe, stop the update and preserve old Pods
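
In practice, triggering a rolling update just means changing the Pod template (most commonly the image tag) and re-applying the manifest. A sketch showing only the changed field; v1.2.4 is an illustrative next version:

```yaml
# deployment.yaml: bump the image tag, then run `kubectl apply -f deployment.yaml`
spec:
  template:
    spec:
      containers:
        - name: api
          image: my-org/my-api:v1.2.4   # was v1.2.3; any change to the template starts a rollout
```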

Service

A Service provides a stable network endpoint for a set of Pods. Pod IPs change every time a Pod restarts; the Service IP stays constant.

yaml
# service.yaml
apiVersion: v1
kind: Service
metadata:
  name: my-api-service
  namespace: production
spec:
  selector:
    app: my-api              # Routes to all Pods with this label
  ports:
    - name: http
      protocol: TCP
      port: 80               # Port the Service listens on
      targetPort: 3000       # Port on the Pod
  type: ClusterIP            # Internal only (use LoadBalancer for external)

Service types:

  • ClusterIP (default): Internal only, accessible within the cluster
  • NodePort: Exposes on a port of every node — for development only
  • LoadBalancer: Creates a cloud load balancer (AWS ELB, GCP LB) — for production external access
  • ExternalName: DNS alias for an external service
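
For external access, only the `type` field changes. A sketch reusing the selector and ports from the example above (the Service name is illustrative):

```yaml
# service-public.yaml: the cloud provider provisions an external load balancer
apiVersion: v1
kind: Service
metadata:
  name: my-api-public
  namespace: production
spec:
  selector:
    app: my-api
  ports:
    - name: http
      protocol: TCP
      port: 80
      targetPort: 3000
  type: LoadBalancer    # external IP assigned by the cloud (AWS ELB, GCP LB, ...)
```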

Ingress

An Ingress routes HTTP/HTTPS traffic from outside the cluster to internal Services based on hostname or URL path.

yaml
# ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app-ingress
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod   # Auto TLS certificates
spec:
  ingressClassName: nginx   # replaces the deprecated kubernetes.io/ingress.class annotation
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-api-service
                port:
                  number: 80
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: frontend-service
                port:
                  number: 80
  tls:
    - hosts:
        - api.example.com
        - app.example.com
      secretName: example-com-tls

ConfigMap and Secret

ConfigMaps store non-sensitive configuration. Secrets store sensitive data (passwords, API keys). By default, Secrets are only base64-encoded in etcd, not encrypted; enable encryption at rest (or use an external secret manager) for real protection.

yaml
# configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  NODE_ENV: production
  LOG_LEVEL: info
  MAX_CONNECTIONS: "100"
---
# secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: app-secrets
type: Opaque
stringData:
  DATABASE_URL: postgresql://user:pass@db-service:5432/myapp
  JWT_SECRET: super-secret-key-32-chars-minimum
  STRIPE_SECRET_KEY: sk_live_...

Inject into a Deployment:

yaml
spec:
  containers:
    - name: api
      envFrom:
        - configMapRef:
            name: app-config
        - secretRef:
            name: app-secrets
      # Or inject individual values:
      env:
        - name: PORT
          value: "3000"
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: app-secrets
              key: DATABASE_URL
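
Besides environment variables, a ConfigMap can be mounted as files, one file per key. A sketch (the mount path is illustrative):

```yaml
spec:
  containers:
    - name: api
      volumeMounts:
        - name: config
          mountPath: /etc/app      # e.g. /etc/app/LOG_LEVEL contains "info"
          readOnly: true
  volumes:
    - name: config
      configMap:
        name: app-config
```

File mounts have one advantage over envFrom: when the ConfigMap changes, mounted files are updated in place after a short sync delay, while environment variables require a Pod restart.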

Health Checks

Kubernetes uses two probes to manage Pod health:

Liveness Probe: Is the container alive? If this fails, the kubelet restarts the container (not the whole Pod).

Readiness Probe: Is the container ready to receive traffic? If this fails, the Pod is removed from Service endpoints (no traffic sent) but not restarted.

yaml
livenessProbe:
  httpGet:
    path: /health
    port: 3000
  initialDelaySeconds: 30    # Wait 30s before first check (startup time)
  periodSeconds: 30          # Check every 30s
  failureThreshold: 3        # Restart after 3 consecutive failures

readinessProbe:
  httpGet:
    path: /ready
    port: 3000
  initialDelaySeconds: 5     # Check sooner than liveness
  periodSeconds: 10
  failureThreshold: 3        # Remove from load balancer after 3 failures

Your application must expose these endpoints:

javascript
// health.js — Kubernetes health check endpoints
app.get('/health', (req, res) => {
  // Just check the process is alive
  res.status(200).json({ status: 'alive' });
});

app.get('/ready', async (req, res) => {
  try {
    await db.query('SELECT 1');    // Check database connectivity
    res.status(200).json({ status: 'ready' });
  } catch (err) {
    res.status(503).json({ status: 'not ready', error: err.message });
  }
});

Horizontal Pod Autoscaler (HPA)

HPA automatically scales the number of Pod replicas based on CPU, memory, or custom metrics.

yaml
# hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-api
  minReplicas: 3              # Never scale below 3
  maxReplicas: 50             # Never scale above 50
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70    # Scale up when CPU > 70%
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80

HPA scales up when the average CPU utilization of existing Pods exceeds the target. It scales down when utilization drops, with a cooldown period to prevent flapping.
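
The cooldown is configurable through the `behavior` field of autoscaling/v2. A sketch of a conservative scale-down policy, added under the HPA spec above:

```yaml
behavior:
  scaleDown:
    stabilizationWindowSeconds: 300   # act on the highest recommendation of the last 5 min
    policies:
      - type: Pods
        value: 2                      # remove at most 2 Pods per minute
        periodSeconds: 60
```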


Namespace Organization

Namespaces provide logical separation within a cluster:

bash
# Create namespaces for different environments
kubectl create namespace production
kubectl create namespace staging
kubectl create namespace development

# Deploy to a specific namespace
kubectl apply -f deployment.yaml -n production

# List all Pods in a namespace
kubectl get pods -n production

Resource quotas per namespace:

yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: production-quota
  namespace: production
spec:
  hard:
    requests.cpu: "20"
    requests.memory: 40Gi
    limits.cpu: "40"
    limits.memory: 80Gi
    pods: "100"

Essential kubectl Commands

bash
# Cluster info
kubectl cluster-info
kubectl get nodes

# Pod management
kubectl get pods -n production
kubectl describe pod my-api-abc123 -n production
kubectl logs my-api-abc123 -n production --follow
kubectl exec -it my-api-abc123 -n production -- /bin/sh

# Deployment management
kubectl get deployments -n production
kubectl rollout status deployment/my-api -n production
kubectl rollout history deployment/my-api -n production
kubectl rollout undo deployment/my-api -n production   # Rollback

# Apply/delete resources
kubectl apply -f deployment.yaml
kubectl delete -f deployment.yaml

# Scale manually
kubectl scale deployment/my-api --replicas=5 -n production

# Port forwarding for debugging
kubectl port-forward service/my-api-service 8080:80 -n production

When to Use Kubernetes (and When Not To)

Use Kubernetes when:

  • Running 10+ microservices that need independent scaling
  • Team has dedicated DevOps/platform engineers
  • Running on multiple cloud providers or on-premises
  • Need sophisticated deployment strategies (canary, blue-green)
  • Running stateful workloads (databases) with persistent volumes

Do not use Kubernetes when:

  • Running fewer than 5-7 services
  • Small team (< 5 engineers)
  • First version of a product still finding product-market fit
  • Budget is constrained: managed control planes (EKS, GKE) charge roughly $70-75/month per cluster before any node or workload costs

Alternatives to Kubernetes for smaller scale:

  • AWS App Runner: Container deployment without cluster management
  • Railway / Render: Git push to deploy containers
  • Fly.io: Global container deployment with a simpler model than K8s
  • AWS ECS with Fargate: Serverless containers, no nodes to manage

Frequently Asked Questions

Q: What is the difference between a Pod and a container?

A container is a process packaged with its dependencies. A Pod is a Kubernetes construct that wraps one or more containers and provides them with a shared network, shared storage volumes, and a shared lifecycle. Most Pods contain exactly one container. Multi-container Pods are used for sidecar patterns (a log forwarder running alongside the app, an Envoy proxy running alongside each service).
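
A minimal sketch of such a sidecar Pod (image tags and paths are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-api-with-logs
spec:
  containers:
    - name: api
      image: my-org/my-api:v1.2.3
      volumeMounts:
        - name: logs
          mountPath: /var/log/app    # the app writes its log files here
    - name: log-forwarder
      image: fluent/fluent-bit:2.2   # reads the same files over the shared volume
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
          readOnly: true
  volumes:
    - name: logs
      emptyDir: {}                   # shared scratch volume, same lifecycle as the Pod
```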

Q: How does Kubernetes handle zero-downtime deployments?

The RollingUpdate strategy (the default) starts new Pods before stopping old ones. With maxUnavailable: 0 and maxSurge: 1, Kubernetes starts one new Pod, waits for its readiness probe to pass, then terminates one old Pod. This continues until all Pods are on the new version. If the new Pod's readiness probe never passes, the rollout pauses and the old version keeps serving traffic.

Q: What is Helm and do I need it?

Helm is a package manager for Kubernetes. A Helm chart is a reusable template for Kubernetes manifests with variables for customization. Instead of writing 200 lines of YAML to install PostgreSQL, you run helm install my-postgres bitnami/postgresql. For your own applications, Helm adds value when you need to deploy the same application to multiple environments (dev, staging, prod) with different configurations.
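
Helm's per-environment story is just a values file per environment. A hypothetical sketch (the file name and keys are illustrative, not part of any real chart):

```yaml
# values-prod.yaml: hypothetical overrides applied on top of the chart defaults
replicaCount: 10
image:
  repository: my-org/my-api
  tag: v1.2.3
resources:
  requests:
    memory: "256Mi"
    cpu: "500m"
```

Deploy with `helm upgrade --install my-api ./chart -f values-prod.yaml`, swapping in a `values-staging.yaml` for staging.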

Q: Is Kubernetes difficult to learn?

The core concepts (Pods, Deployments, Services) can be learned in a day. The operational complexity (networking, storage, security, multi-cluster management) takes months to master. For most developers, the goal is not to become a Kubernetes administrator but to understand enough to write Kubernetes YAML for their application and debug common issues.


Key Takeaway

Kubernetes is the standard platform for running microservices and containerized workloads at scale. Its core value is declarative management: you specify the desired state (10 replicas of this image) and Kubernetes maintains it continuously — starting new Pods when traffic increases, restarting crashed containers, and routing traffic only to healthy Pods. The operational complexity is real, but for teams running 10+ services, the automation benefits far outweigh the learning curve. Start with a managed Kubernetes service (EKS, GKE, AKS) to eliminate the burden of managing the control plane yourself.

Read next: Software Architecture Roadmap: Becoming a Senior Architect →


Part of the Software Architecture Hub — engineering the fleet.