Infrastructure as Code: A Beginner's Guide to IaC Fundamentals
TLDR
Infrastructure as Code (IaC) is the practice of managing and provisioning infrastructure through code instead of manual processes. You write configuration files that describe your servers, networks, databases, and other resources, then use automation tools to create and manage them. This approach brings version control, reproducibility, and automation to infrastructure management - making deployments faster, more reliable, and easier to scale.
What is Infrastructure as Code?
Infrastructure as Code (IaC) treats your infrastructure the same way developers treat application code - as text files that can be versioned, reviewed, tested, and automated.
The Traditional Way: Manual Infrastructure
Imagine you need to set up a web application. The traditional approach looks like this:
Traditional Infrastructure Setup:
1. Log into cloud console
2. Click through 20+ screens to create a server
3. Manually configure networking
4. Install software by hand
5. Screenshot settings "just in case"
6. Repeat for staging and production environments
7. Hope you remembered all the steps
8. Realize 6 months later you can't remember what you did
Problems with this approach:
- No documentation - Steps live in someone's head (or nowhere)
- Inconsistent - Each environment is slightly different
- Slow - Creating a new environment takes days or weeks
- Error-prone - Humans make mistakes in repetitive tasks
- Not reproducible - Good luck recreating that setup exactly
The IaC Way: Infrastructure from Code
With Infrastructure as Code, you write configuration files that describe what you want:
# infrastructure.tf
resource "aws_instance" "web_server" {
ami = "ami-12345678" # Example AMI ID (use aws_ami data source in production)
instance_type = "t3.medium"
tags = {
Name = "production-web-server"
Environment = "production"
}
}
resource "aws_security_group" "web_sg" {
name = "web-sg"
ingress {
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
}
Then run a command:
terraform apply
And boom - your entire infrastructure is created automatically, exactly as specified.
Why Infrastructure as Code Matters
1. Version Control = Time Machine for Infrastructure
Your infrastructure is stored in Git, just like your application code:
git log --oneline
abc123 Add load balancer for web tier
def456 Increase database storage to 500GB
789ghi Create staging environment
012jkl Initial production infrastructure
Benefits:
- See who changed what and when
- Rollback bad changes instantly
- Review infrastructure changes before applying them
- Track infrastructure evolution over time
2. Reproducibility = Copy-Paste for Environments
Need to create a staging environment identical to production?
# Traditional way: 3 days of clicking and hoping
# IaC way: 5 minutes
cp production.tf staging.tf
# Edit a few values
terraform apply
3. Automation = Speed and Consistency
Manual Setup (2-5 days):
████████████████████████████████████ (100%)
IaC Setup (5-10 minutes):
██ (5%)
4. Documentation = Self-Documenting Infrastructure
The code IS the documentation:
- No outdated wiki pages
- No "tribal knowledge"
- Anyone can read the code to understand your infrastructure
- Onboarding new team members is faster
Declarative vs Imperative: Two Approaches
Declarative: "Here's what I want"
You describe the desired end state. The tool figures out how to get there.
# Terraform (Declarative)
resource "aws_instance" "web" {
count = 3 # I want 3 servers
}
What happens:
- Currently 0 servers exist
- Terraform: "I need to create 3 servers"
- Run again with 5 servers:
- Terraform: "I need to create 2 more servers"
Analogy: Telling a taxi driver "Take me to 123 Main Street" - you don't tell them every turn.
Imperative: "Here's how to do it"
You write explicit steps to execute.
# Ansible (Imperative)
- name: Create web servers
tasks:
- name: Launch EC2 instance
ec2:
count: 3
Analogy: Giving turn-by-turn directions: "Turn left, go 2 miles, turn right..."
Which is Better?
Declarative (Terraform, CloudFormation):
- ✅ Easier to understand final state
- ✅ Handles drift better
- ✅ Idempotent by default
- ❌ Less flexible for complex logic
Imperative (Ansible, Scripts):
- ✅ More flexible and powerful
- ✅ Better for configuration management
- ❌ Harder to reason about state
- ❌ Requires careful idempotency
Best practice: Use declarative for infrastructure (Terraform) and imperative for configuration (Ansible).
Popular IaC Tools
Terraform (Multi-Cloud)
Terraform HCL Code
│
┌──────────┼──────────┐
│ │ │
▼ ▼ ▼
┏━━━┓ ┏━━━┓ ┏━━━━━┓
┃AWS┃ ┃GCP┃ ┃Azure┃
┗━━━┛ ┗━━━┛ ┗━━━━━┛
Best for: Multi-cloud infrastructure, large-scale deployments
Pros:
- Works with 1000+ providers (AWS, Azure, GCP, GitHub, etc.)
- Huge community and module ecosystem
- State management built-in
- Free and open source
Cons:
- Learning curve for HCL language
- State file can be tricky to manage
Example:
provider "aws" {
region = "us-east-1"
}
resource "aws_s3_bucket" "data" {
bucket = "my-data-bucket"
}
AWS CloudFormation (AWS-Only)
Best for: AWS-only environments, AWS-native features
Pros:
- Deeply integrated with AWS
- No external dependencies
- Free
- First-class support from AWS
Cons:
- AWS-only (vendor lock-in)
- YAML/JSON can be verbose
- Limited abstraction capabilities
Example:
Resources:
MyBucket:
Type: AWS::S3::Bucket
Properties:
BucketName: my-data-bucket
Pulumi (Real Programming Languages)
Best for: Developers who prefer Python/TypeScript/Go over DSLs
Pros:
- Use familiar programming languages
- Full power of conditionals, loops, functions
- Great IDE support
- Unit testing your infrastructure code
Cons:
- Requires programming knowledge
- Smaller community than Terraform
- SaaS backend (or self-host)
Example:
import pulumi_aws as aws
bucket = aws.s3.Bucket("my-data-bucket")
Ansible (Configuration Management)
Best for: Configuring servers, orchestration, app deployment
Pros:
- Agentless (uses SSH)
- Great for configuration management
- Human-readable YAML
Cons:
- Not purpose-built for infrastructure provisioning
- Can be slow for large deployments
Essential IaC Concepts
1. State Management
The Problem: How does Terraform know what exists?
Terraform State File:
┌──────────────────────────────┐
│ Current Infrastructure │
│ ─────────────────────────────│
│ • 3 EC2 instances │
│ • 1 Load balancer │
│ • 2 Security groups │
│ • 1 RDS database │
└──────────────────────────────┘
^
│ Compare to desired state
│
┌──────────────────────────────┐
│ Your Code (desired state) │
│ ─────────────────────────────│
│ • 5 EC2 instances ← DIFF! │
│ • 1 Load balancer │
│ • 2 Security groups │
│ • 1 RDS database │
└──────────────────────────────┘
Terraform knows to create 2 more EC2 instances.
Best Practice: Store state remotely (S3, Terraform Cloud) not locally.
terraform {
backend "s3" {
bucket = "my-terraform-state"
key = "production/terraform.tfstate"
region = "us-east-1"
}
}
2. Modules = Reusable Infrastructure Components
Don't repeat yourself. Create reusable modules:
# modules/web-server/main.tf
variable "environment" {}
variable "instance_type" {}
resource "aws_instance" "web" {
ami = "ami-12345"
instance_type = var.instance_type
tags = {
Environment = var.environment
}
}
Use it everywhere:
module "prod_web" {
source = "./modules/web-server"
environment = "production"
instance_type = "t3.large"
}
module "staging_web" {
source = "./modules/web-server"
environment = "staging"
instance_type = "t3.small"
}
3. Idempotency = Run It Twice, Same Result
terraform apply # Creates 3 servers
terraform apply # Does nothing (already exists)
terraform apply # Still does nothing
Your infrastructure code should be safe to run multiple times without side effects.
4. Drift Detection
The Problem: Someone manually changed production in the console.
terraform plan
# Output:
# ~ aws_instance.web
# instance_type: "t3.medium" => "t3.large" (changed outside Terraform)
IaC tools detect when real infrastructure doesn't match your code.
Best Practices for IaC
1. Everything in Version Control
✅ DO: Store all infrastructure code in Git
❌ DON'T: Keep local copies or "quick fixes" not in Git
2. Code Review Infrastructure Changes
Pull Request:
"Increase database size from 100GB to 500GB"
+ allocated_storage = 500 # Was 100GB
Reviews:
☑️ Database admin: Approved - confirmed with capacity planning
☑️ Finance: Approved - budget allocated
3. Separate Environments
terraform/
├── production/
│ ├── main.tf
│ └── variables.tf
├── staging/
│ ├── main.tf
│ └── variables.tf
└── development/
├── main.tf
└── variables.tf
Never share state files between environments.
4. Use Remote State with Locking
terraform {
backend "s3" {
bucket = "terraform-state"
key = "production/terraform.tfstate"
region = "us-east-1"
dynamodb_table = "terraform-locks" # Prevents concurrent runs
encrypt = true
}
}
5. Manage Secrets Properly
# ❌ BAD: Hardcoded secrets
resource "aws_db_instance" "db" {
password = "super_secret_123" # NEVER DO THIS
}
# ✅ GOOD: Use secret management
data "aws_secretsmanager_secret_version" "db_password" {
secret_id = "production-db-password"
}
resource "aws_db_instance" "db" {
password = data.aws_secretsmanager_secret_version.db_password.secret_string
}
6. Tag Everything
locals {
common_tags = {
Environment = "production"
Project = "web-app"
ManagedBy = "terraform"
CostCenter = "engineering"
Owner = "platform-team"
}
}
resource "aws_instance" "web" {
tags = local.common_tags
}
Why tags matter:
- Cost allocation
- Resource discovery
- Compliance tracking
- Automated cleanup
7. Always Run Plan Before Apply
# See what will change
terraform plan
# Review the output carefully
# Then apply
terraform apply
Pro tip: Save plan output for review:
terraform plan -out=tfplan
# Review the plan
terraform apply tfplan
8. Use Linting and Security Scanning
# Check for syntax errors
terraform fmt -check
terraform validate
# Security scanning
tfsec . # Find security issues
checkov -d . # Policy compliance
terrascan scan # Vulnerability scanning
Getting Started: Your First IaC Project
Let's create a simple web server on AWS using Terraform.
Step 1: Install Terraform
# macOS
brew install terraform
# Linux
wget https://releases.hashicorp.com/terraform/1.10.5/terraform_1.10.5_linux_amd64.zip
unzip terraform_1.10.5_linux_amd64.zip
sudo mv terraform /usr/local/bin/
# Verify
terraform version
Step 2: Create Your First Configuration
Create main.tf:
# Data source to get latest Amazon Linux 2 AMI
data "aws_ami" "amazon_linux_2" {
most_recent = true
owners = ["amazon"]
filter {
name = "name"
values = ["amzn2-ami-hvm-*-x86_64-gp2"]
}
}
# Configure AWS provider
provider "aws" {
region = "us-east-1"
}
# Create a VPC
resource "aws_vpc" "main" {
cidr_block = "10.0.0.0/16"
tags = {
Name = "my-first-vpc"
}
}
# Create a subnet
resource "aws_subnet" "public" {
vpc_id = aws_vpc.main.id
cidr_block = "10.0.1.0/24"
tags = {
Name = "public-subnet"
}
}
# Create security group
resource "aws_security_group" "web" {
name = "web-sg"
vpc_id = aws_vpc.main.id
ingress {
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
}
# Launch an EC2 instance
resource "aws_instance" "web" {
ami = data.aws_ami.amazon_linux_2.id
instance_type = "t3.micro"
subnet_id = aws_subnet.public.id
vpc_security_group_ids = [aws_security_group.web.id]
user_data = <<-EOF
#!/bin/bash
yum update -y
yum install -y httpd
systemctl start httpd
systemctl enable httpd
echo "<h1>Hello from Terraform!</h1>" > /var/www/html/index.html
EOF
tags = {
Name = "web-server"
}
}
# Output the public IP
output "public_ip" {
value = aws_instance.web.public_ip
}
Step 3: Initialize and Apply
# Initialize Terraform (download providers)
terraform init
# See what will be created
terraform plan
# Create the infrastructure
terraform apply
# Type 'yes' when prompted
# Wait a few minutes, then visit the output IP in your browser
Step 4: Make Changes
Update the instance type:
resource "aws_instance" "web" {
instance_type = "t3.small" # Changed from t3.micro
# ... rest of config
}
Apply the change:
terraform apply
# Terraform will show it needs to recreate the instance
Step 5: Clean Up
# Destroy everything
terraform destroy
# Type 'yes' to confirm
Common Pitfalls to Avoid
1. Hardcoding Values
# ❌ BAD
resource "aws_instance" "web" {
ami = "ami-12345" # Will break in other regions
}
# ✅ GOOD
data "aws_ami" "amazon_linux" {
most_recent = true
owners = ["amazon"]
filter {
name = "name"
values = ["amzn2-ami-hvm-*-x86_64-gp2"]
}
}
resource "aws_instance" "web" {
ami = data.aws_ami.amazon_linux.id
}
2. Not Using Variables
# variables.tf
variable "environment" {
description = "Environment name"
type = string
validation {
condition = contains(["dev", "staging", "prod"], var.environment)
error_message = "Environment must be dev, staging, or prod."
}
}
variable "instance_count" {
description = "Number of instances"
type = number
default = 1
}
3. Creating Resources with Same Name
# ❌ BAD: Creates naming conflicts
resource "aws_s3_bucket" "data" {
bucket = "my-data-bucket" # Name must be globally unique!
}
# ✅ GOOD: Use variables and randomness
resource "random_id" "bucket_suffix" {
byte_length = 4
}
resource "aws_s3_bucket" "data" {
bucket = "my-data-bucket-${var.environment}-${random_id.bucket_suffix.hex}"
}
4. Ignoring State File Management
# ❌ NEVER do this
git add terraform.tfstate # State files contain secrets!
# ✅ Add to .gitignore
echo "*.tfstate*" >> .gitignore
echo ".terraform/" >> .gitignore
5. Making Changes Outside IaC
Team Member: "I'll just quickly change this in the console..."
Later:
terraform apply
# Oops, Terraform overwrites the manual change
Result: Confusion, conflicts, drift
Rule: If it's managed by IaC, ONLY change it through IaC.
Next Steps: Your IaC Journey
Level 1: Beginner
- ✅ Complete the getting started tutorial above
- Create a simple static website on S3
- Deploy a single EC2 instance
- Practice with
terraform plan,apply, anddestroy
Level 2: Intermediate
- Create reusable modules
- Set up multiple environments (dev, staging, prod)
- Implement remote state storage
- Add automated backups
- Use Terraform workspaces
Level 3: Advanced
- Implement CI/CD for infrastructure changes
- Write automated tests for your infrastructure
- Use policy as code (Sentinel, OPA)
- Implement drift detection automation
- Multi-region deployments
Practice Projects
Project 1: Personal Website Stack
- S3 bucket for hosting
- CloudFront for CDN
- Route53 for DNS
- ACM certificate for HTTPS
Project 2: Three-Tier Web Application
- Load balancer
- Auto Scaling Group
- RDS database
- ElastiCache for caching
Project 3: CI/CD Pipeline
- GitHub Actions workflow
- Terraform Cloud integration
- Automated testing
- Slack notifications
Resources to Continue Learning
Official Documentation:
Tutorials & Courses:
- HashiCorp Learn (free Terraform tutorials)
- AWS Well-Architected Labs
- Linux Academy / A Cloud Guru
Community:
- Terraform Registry - Pre-built modules
- r/Terraform subreddit
- HashiCorp Discuss forum
Books:
- "Terraform: Up & Running" by Yevgeniy Brikman
- "Infrastructure as Code" by Kief Morris
Conclusion
Infrastructure as Code transforms infrastructure management from a manual, error-prone process into automated, reliable, and reproducible operations. By treating infrastructure as code:
✅ You gain version control and history
✅ Changes are reviewable and auditable
✅ Environments are consistent and reproducible
✅ Deployments become fast and automated
✅ Documentation stays up-to-date automatically
Start small, practice often, and gradually increase complexity. The investment in learning IaC pays dividends in reduced errors, faster deployments, and more reliable systems.
Remember: The best infrastructure code is the simplest code that meets your needs. Don't over-engineer - start with basics and evolve as you learn.
Related Resources
Ready to put IaC into practice? Check out these hands-on exercises:
- Deploy a DigitalOcean Droplet with Terraform - Beginner-friendly tutorial
- Kubernetes Cluster Setup - Advanced infrastructure patterns
- CI/CD Pipeline Design - Automate your infrastructure deployments
Have questions about Infrastructure as Code? Join the discussion in our community or reach out to the DevOps Daily team.
We earn commissions when you shop through the links below.
DigitalOcean
Cloud infrastructure for developers
Simple, reliable cloud computing designed for developers
DevDojo
Developer community & tools
Join a community of developers sharing knowledge and tools
Acronis
The most secure backup
Acronis: the most secure backup solution for your data
Pluralsight
Technology skills platform
Expert-led courses in software development, IT ops, data, and cybersecurity
Want to support DevOps Daily and reach thousands of developers?
Become a SponsorFound an issue?