Infrastructure as Code with Terraform: Beginner to Pro Guide

Infrastructure as Code (IaC) means managing cloud infrastructure — servers, databases, networks, DNS records — through code files instead of through a web console. When your infrastructure is code, it is versionable, reviewable, repeatable, and testable. Terraform is the most widely used IaC tool, supporting all major cloud providers through a declarative, provider-based architecture.
This guide covers everything from your first Terraform file to production-grade module patterns and GitOps pipelines.
Why Infrastructure as Code?
The manual configuration problem:
- An engineer logs into the AWS console and manually configures a server
- Six months later, no one remembers exactly how it was configured
- The server fails and needs to be rebuilt — but the configuration is lost
- A slightly different manual rebuild causes mysterious bugs that take days to diagnose
With Terraform:
```hcl
# This file IS the server configuration
resource "aws_instance" "web" {
  ami                    = "ami-0c02fb55956c7d316"
  instance_type          = "t3.medium"
  vpc_security_group_ids = [aws_security_group.web.id]

  tags = { Name = "web-server" }
}
```

The configuration is in git. Anyone can review it. If the server fails, terraform apply recreates it identically in minutes. If someone changes it manually, the drift is visible in terraform plan.
Terraform Core Concepts
Provider
A provider is a plugin that lets Terraform interact with an API (AWS, GCP, Azure, Cloudflare, GitHub, Kubernetes, etc.).
```hcl
# versions.tf
terraform {
  required_version = ">= 1.7.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    cloudflare = {
      source  = "cloudflare/cloudflare"
      version = "~> 4.0"
    }
  }
}

provider "aws" {
  region = "us-east-1"
}

provider "cloudflare" {
  api_token = var.cloudflare_api_token
}
```

Resource
A resource is a piece of infrastructure you want to create or manage.
```hcl
# A VPC
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"

  tags = {
    Name        = "main-vpc"
    Environment = "production"
  }
}

# A subnet inside the VPC
resource "aws_subnet" "public_a" {
  vpc_id                  = aws_vpc.main.id # Reference to the VPC above
  cidr_block              = "10.0.1.0/24"
  availability_zone       = "us-east-1a"
  map_public_ip_on_launch = true

  tags = { Name = "public-subnet-a" }
}
```

Resources reference each other using resource_type.resource_name.attribute syntax. Terraform automatically infers dependencies from these references.
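Occasionally a resource depends on another without referencing any of its attributes — for example, an instance whose startup script needs an IAM policy to already exist. In that case, declare the ordering explicitly with depends_on (a sketch; aws_iam_role_policy.app is an illustrative name, not from this guide's example):

```hcl
resource "aws_instance" "app" {
  ami           = "ami-0c02fb55956c7d316"
  instance_type = "t3.medium"

  # No attribute of the policy is referenced, so Terraform cannot
  # infer the ordering — state the dependency explicitly.
  depends_on = [aws_iam_role_policy.app]
}
```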
Variable
Variables make configurations reusable across environments.
```hcl
# variables.tf
variable "environment" {
  type        = string
  description = "Deployment environment (production, staging, development)"

  validation {
    condition     = contains(["production", "staging", "development"], var.environment)
    error_message = "Environment must be production, staging, or development."
  }
}

variable "instance_type" {
  type    = string
  default = "t3.medium"
}

variable "db_password" {
  type      = string
  sensitive = true # Never show in logs or plan output
}
```

Set variable values in a terraform.tfvars file (never commit secrets here) or via environment variables:
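For non-secret values, a terraform.tfvars file might look like this (values are illustrative):

```hcl
# terraform.tfvars — safe to commit; secrets stay out
environment   = "staging"
instance_type = "t3.small"
```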
```bash
# Environment variables prefixed with TF_VAR_
export TF_VAR_db_password="my-secret-password"
export TF_VAR_environment="production"
```

Output
Outputs expose values from your Terraform configuration — useful for passing data between modules or displaying important information after apply.
```hcl
# outputs.tf
output "vpc_id" {
  value       = aws_vpc.main.id
  description = "ID of the main VPC"
}

output "load_balancer_dns" {
  value       = aws_lb.main.dns_name
  description = "DNS name of the application load balancer"
}

output "rds_endpoint" {
  value     = aws_db_instance.main.endpoint
  sensitive = true # Don't show in console output
}
```

Complete AWS Infrastructure Example
A production-ready setup with VPC, load balancer, EC2 instances, and RDS:
```hcl
# main.tf — Complete production infrastructure

# --- Networking ---
data "aws_availability_zones" "available" {
  state = "available"
}

resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true

  tags = { Name = "${var.environment}-vpc" }
}

resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id
}

resource "aws_subnet" "public" {
  count                   = 2
  vpc_id                  = aws_vpc.main.id
  cidr_block              = "10.0.${count.index}.0/24"
  availability_zone       = data.aws_availability_zones.available.names[count.index]
  map_public_ip_on_launch = true

  tags = { Name = "${var.environment}-public-${count.index}" }
}

resource "aws_subnet" "private" {
  count             = 2
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.0.${count.index + 10}.0/24"
  availability_zone = data.aws_availability_zones.available.names[count.index]

  tags = { Name = "${var.environment}-private-${count.index}" }
}

# --- Security Groups ---
resource "aws_security_group" "alb" {
  name   = "${var.environment}-alb-sg"
  vpc_id = aws_vpc.main.id

  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_security_group" "app" {
  name   = "${var.environment}-app-sg"
  vpc_id = aws_vpc.main.id

  ingress {
    from_port       = 3000
    to_port         = 3000
    protocol        = "tcp"
    security_groups = [aws_security_group.alb.id] # Only accept from ALB
  }

  # Terraform removes the default allow-all egress, so re-add it
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_security_group" "db" {
  name   = "${var.environment}-db-sg"
  vpc_id = aws_vpc.main.id

  ingress {
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [aws_security_group.app.id] # Only accept from app tier
  }
}

# --- Application Load Balancer ---
resource "aws_lb" "main" {
  name               = "${var.environment}-alb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb.id]
  subnets            = aws_subnet.public[*].id
}

resource "aws_lb_target_group" "app" {
  name     = "${var.environment}-app-tg"
  port     = 3000
  protocol = "HTTP"
  vpc_id   = aws_vpc.main.id

  health_check {
    path                = "/health"
    healthy_threshold   = 2
    unhealthy_threshold = 3
    interval            = 30
  }
}

resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.main.arn
  port              = 443
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-TLS13-1-2-2021-06"
  # ACM certificate and DNS validation resources omitted for brevity
  certificate_arn = aws_acm_certificate_validation.main.certificate_arn

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.app.arn
  }
}

# --- RDS Database ---
resource "aws_db_subnet_group" "main" {
  name       = "${var.environment}-db-subnet-group"
  subnet_ids = aws_subnet.private[*].id
}

resource "aws_db_instance" "main" {
  identifier              = "${var.environment}-postgres"
  engine                  = "postgres"
  engine_version          = "16.1"
  instance_class          = "db.t3.medium"
  allocated_storage       = 100
  storage_encrypted       = true
  db_name                 = "myapp"
  username                = "myapp_admin"
  password                = var.db_password
  db_subnet_group_name    = aws_db_subnet_group.main.name
  vpc_security_group_ids  = [aws_security_group.db.id]
  backup_retention_period = 7
  deletion_protection     = true # Prevent accidental deletion
  skip_final_snapshot     = false

  tags = { Environment = var.environment }
}
```

Terraform State Management
Terraform tracks the real-world state of your infrastructure in a state file. The state tells Terraform what resources it created and their current attributes.
Remote State (Required for Teams)
Never store state locally when working in a team. Use remote state with locking:
```hcl
# backend.tf
terraform {
  backend "s3" {
    bucket         = "my-company-terraform-state"
    key            = "production/main.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-state-lock" # Prevents concurrent applies
  }
}
```

Create the S3 bucket and DynamoDB table once (manually or with a bootstrap Terraform config) before using it as a backend.
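Other Terraform configurations can read this state's outputs through the terraform_remote_state data source — useful when, say, an application stack needs the VPC ID from a networking stack. A sketch (bucket and key are illustrative):

```hcl
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "my-company-terraform-state"
    key    = "production/network.tfstate"
    region = "us-east-1"
  }
}

# Reference an output exported by the networking stack
resource "aws_security_group" "app" {
  vpc_id = data.terraform_remote_state.network.outputs.vpc_id
}
```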
State Commands
```bash
# List all resources in state
terraform state list

# Show details of a specific resource
terraform state show aws_db_instance.main

# Remove a resource from state (doesn't destroy it — just stops tracking)
terraform state rm aws_db_instance.old

# Import an existing resource into state
terraform import aws_instance.web i-0123456789abcdef0

# Move a resource to a different name (for refactoring)
terraform state mv aws_instance.app aws_instance.web
```

Terraform Modules
Modules are reusable packages of Terraform configuration. They promote DRY principles and let you parameterize common infrastructure patterns.
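Each module exposes inputs through variables and returns values through outputs. A minimal interface for a vpc module might look like this (a sketch; aws_vpc.this is an assumed internal resource name):

```hcl
# modules/vpc/variables.tf — the module's inputs
variable "environment" { type = string }
variable "cidr_block" { type = string }
variable "azs" { type = list(string) }
variable "public_subnets" { type = list(string) }
variable "private_subnets" { type = list(string) }

# modules/vpc/outputs.tf — the module's return values
output "vpc_id" {
  value = aws_vpc.this.id
}

output "private_subnet_ids" {
  value = aws_subnet.private[*].id
}
```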
```
modules/
├── vpc/
│   ├── main.tf
│   ├── variables.tf
│   └── outputs.tf
├── rds/
│   ├── main.tf
│   ├── variables.tf
│   └── outputs.tf
└── ecs-service/
    ├── main.tf
    ├── variables.tf
    └── outputs.tf
```

Use a module:
```hcl
# environments/production/main.tf
module "vpc" {
  source = "../../modules/vpc"

  environment     = "production"
  cidr_block      = "10.0.0.0/16"
  azs             = ["us-east-1a", "us-east-1b"]
  public_subnets  = ["10.0.1.0/24", "10.0.2.0/24"]
  private_subnets = ["10.0.11.0/24", "10.0.12.0/24"]
}

module "database" {
  source = "../../modules/rds"

  environment    = "production"
  vpc_id         = module.vpc.vpc_id
  subnet_ids     = module.vpc.private_subnet_ids
  instance_class = "db.r6g.large"
  db_password    = var.db_password
}
```

Use public modules from the Terraform Registry:
```hcl
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 20.0"

  cluster_name    = "my-eks-cluster"
  cluster_version = "1.29"
  vpc_id          = module.vpc.vpc_id
  subnet_ids      = module.vpc.private_subnet_ids
}
```

GitOps Workflow with GitHub Actions
The modern approach: terraform apply runs in CI, triggered by a merge to main.
```yaml
# .github/workflows/terraform.yml
name: Terraform

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

permissions:
  id-token: write
  contents: read
  pull-requests: write

jobs:
  terraform:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: environments/production
    steps:
      - uses: actions/checkout@v4

      # Use OIDC — no stored AWS credentials
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789:role/terraform-role
          aws-region: us-east-1

      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: 1.7.0

      - name: Terraform Init
        run: terraform init

      - name: Terraform Format Check
        run: terraform fmt -check

      - name: Terraform Validate
        run: terraform validate

      - name: Terraform Plan
        id: plan
        run: terraform plan -out=tfplan -no-color
        env:
          TF_VAR_db_password: ${{ secrets.DB_PASSWORD }}

      # Post plan as PR comment (pass via env so multi-line output
      # doesn't break the inline script)
      - uses: actions/github-script@v7
        if: github.event_name == 'pull_request'
        env:
          PLAN: ${{ steps.plan.outputs.stdout }}
        with:
          script: |
            const plan = process.env.PLAN;
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: `## Terraform Plan\n\`\`\`\n${plan}\n\`\`\``
            });

      # Apply only on push to main (not on PRs)
      - name: Terraform Apply
        if: github.ref == 'refs/heads/main' && github.event_name == 'push'
        run: terraform apply -auto-approve tfplan
```

This workflow:
- Plans on every PR and posts the diff as a PR comment
- Applies automatically when a PR is merged to main
- Uses OIDC (no static AWS credentials stored in GitHub secrets)
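The role-to-assume in the workflow must trust GitHub's OIDC identity provider. That trust policy can itself be managed in Terraform — a sketch, where the account's OIDC provider resource, role name, and my-org/my-repo are placeholders:

```hcl
# Trust policy allowing GitHub Actions to assume the role via OIDC
data "aws_iam_policy_document" "github_oidc" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [aws_iam_openid_connect_provider.github.arn]
    }

    condition {
      test     = "StringEquals"
      variable = "token.actions.githubusercontent.com:aud"
      values   = ["sts.amazonaws.com"]
    }

    # Restrict the role to workflows from one repository
    condition {
      test     = "StringLike"
      variable = "token.actions.githubusercontent.com:sub"
      values   = ["repo:my-org/my-repo:*"]
    }
  }
}

resource "aws_iam_role" "terraform" {
  name               = "terraform-role"
  assume_role_policy = data.aws_iam_policy_document.github_oidc.json
}
```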
Frequently Asked Questions
Q: What is the Terraform state file and what happens if I lose it?
The state file (terraform.tfstate) records every resource Terraform created and its current attributes. If you lose it, Terraform thinks it has created nothing and will try to create everything again — potentially creating duplicate resources or conflicting with existing ones. Always use remote state (S3 + DynamoDB for AWS) and enable versioning on the S3 bucket so state is never permanently lost.
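Enabling versioning on the state bucket is a one-time bootstrap step. In Terraform it looks roughly like this (bucket name is illustrative):

```hcl
resource "aws_s3_bucket" "state" {
  bucket = "my-company-terraform-state"
}

# Versioned objects let you recover earlier state revisions
resource "aws_s3_bucket_versioning" "state" {
  bucket = aws_s3_bucket.state.id

  versioning_configuration {
    status = "Enabled"
  }
}
```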
Q: What is the difference between terraform plan and terraform apply?
terraform plan shows what changes will be made without making them. It compares your HCL configuration against the current state and shows additions (green +), modifications (yellow ~), and deletions (red -). terraform apply executes those changes. Always review the plan output before applying, especially for deletions.
Q: Terraform vs. Pulumi vs. AWS CDK — which should I choose?
Terraform (HCL) is the industry standard with the largest ecosystem and widest cloud provider support. Pulumi and AWS CDK let you write infrastructure in TypeScript, Python, or Go — better for teams who prefer programming languages over DSLs. For most teams starting IaC in 2026, Terraform is the safest choice due to community size, module availability, and hiring pool.
Q: How do I manage different environments (dev, staging, production)?
Use separate state files per environment. Options: separate directories (environments/dev/, environments/staging/), Terraform workspaces (simpler, but every environment shares one configuration and backend), or separate repositories per environment. For most teams, separate directories with shared modules give the best balance of isolation and code reuse.
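If you do use workspaces, the built-in terraform.workspace value lets one configuration vary by environment — a sketch (var.ami_id is an illustrative variable):

```hcl
# Select per-environment settings from a map keyed by workspace name
locals {
  instance_type = {
    production = "t3.large"
    staging    = "t3.medium"
    default    = "t3.small"
  }
}

resource "aws_instance" "app" {
  ami           = var.ami_id
  instance_type = lookup(local.instance_type, terraform.workspace, local.instance_type["default"])
}
```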
Key Takeaway
Terraform transforms cloud infrastructure from a fragile manual process into a version-controlled, peer-reviewed, repeatable engineering practice. Your entire production environment — VPCs, load balancers, databases, DNS records, IAM policies — lives in git as HCL files. Changes go through PR review. The plan shows exactly what will change before it changes. GitOps completes the picture by making terraform apply a CI step rather than a manual command. Start by codifying one existing environment, store state remotely, and build the GitOps workflow before you scale to multiple environments.
