
Working with Configuration as Code: Infrastructure as Code Principles

Introduction

Infrastructure as Code (IaC) revolutionizes how we manage infrastructure by treating it the same way as application code. Instead of manually provisioning servers through web consoles or clicking through wizards, you define your infrastructure in declarative configuration files that can be version-controlled, tested, reviewed, and automated.

This comprehensive guide covers IaC principles, practical implementation with Terraform and Ansible, state management strategies, module design patterns, and building reliable infrastructure pipelines.

Why Infrastructure as Code Matters

Traditional vs IaC Approach

| Aspect | Traditional | IaC |
|---|---|---|
| Provisioning | Manual, click-based | Declarative, automated |
| Reproducibility | Difficult, error-prone | Easy, consistent |
| Versioning | None | Full Git history |
| Review Process | None | Pull request reviews |
| Rollback | Manual, risky | Automatic, versioned |
| Documentation | Often outdated | Self-documenting |

Benefits of IaC

  1. Consistency - Same configuration every time
  2. Auditability - Full history of changes
  3. Speed - Rapid provisioning and teardown
  4. Collaboration - Code review for infrastructure
  5. Disaster Recovery - Recreate infrastructure quickly

Terraform Deep Dive

Project Structure

A well-organized Terraform project improves maintainability:

terraform/
├── environments/
│   ├── dev/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   ├── outputs.tf
│   │   └── terraform.tfvars
│   ├── staging/
│   └── prod/
├── modules/
│   ├── vpc/
│   ├── ec2/
│   ├── rds/
│   └── ecs/
├── global/
│   └── s3/
├── backend.tf
├── provider.tf
└── versions.tf

Provider Configuration

# versions.tf
terraform {
  required_version = ">= 1.6.0"
  
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

# backend.tf - Remote state with locking
terraform {
  backend "s3" {
    bucket         = "my-terraform-state"
    key            = "prod/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-locks"
  }
}

# provider.tf
provider "aws" {
  region = var.aws_region
  
  default_tags {
    tags = {
      Environment = var.environment
      ManagedBy   = "Terraform"
      Project     = "MyProject"
    }
  }
}

Variables and Outputs

# variables.tf
variable "environment" {
  description = "Environment name"
  type        = string
  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "Environment must be one of: dev, staging, prod."
  }
}

variable "aws_region" {
  description = "AWS region"
  type        = string
  default     = "us-east-1"
}

variable "vpc_cidr" {
  description = "VPC CIDR block"
  type        = string
  default     = "10.0.0.0/16"
}

variable "instance_type" {
  description = "EC2 instance type"
  type        = string
  default     = "t3.micro"
  
  validation {
    condition     = can(regex("^t3\\.", var.instance_type))
    error_message = "Instance type must be in the t3 family."
  }
}

variable "tags" {
  description = "Tags to apply to resources"
  type        = map(string)
  default     = {}
}

# outputs.tf
output "vpc_id" {
  description = "ID of the VPC"
  value       = module.vpc.vpc_id
}

output "web_server_ip" {
  description = "Public IP of web server"
  value       = aws_instance.web.public_ip
}

output "database_connection_string" {
  description = "Database connection string"
  value       = module.database.connection_string
  sensitive   = true
}
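
Outputs marked `sensitive` are redacted in the `plan`/`apply` summary, but they can still be read on demand from the CLI. These are standard `terraform output` flags:

```bash
# Redacted in apply summaries, shown when requested explicitly
terraform output web_server_ip

# Raw value without quotes, handy for scripting
terraform output -raw database_connection_string

# All outputs as JSON, including sensitive values
terraform output -json
```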

Using Modules

# environments/prod/main.tf
module "vpc" {
  source = "../../modules/vpc"
  
  environment = "prod"
  cidr_block  = "10.1.0.0/16"
  
  availability_zones = ["us-east-1a", "us-east-1b", "us-east-1c"]
  
  tags = {
    Environment = "prod"
  }
}

module "ecs_cluster" {
  source = "../../modules/ecs"
  
  cluster_name = "prod-app"
  
  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnets
  
  desired_capacity = 3
  max_size         = 10
  min_size         = 2
  
  tags = {
    Environment = "prod"
  }
}

module "rds" {
  source = "../../modules/rds"
  
  identifier     = "prod-postgres"
  engine_version = "15.4"
  
  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.database_subnets
  
  instance_class    = "db.t3.medium"
  allocated_storage = 100
  
  backup_retention_period = 30
  skip_final_snapshot     = false
  
  tags = {
    Environment = "prod"
  }
}
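
When environments or layers are split into separate state files, one stack can read another's outputs through the `terraform_remote_state` data source. A sketch, with bucket and key names chosen to illustrate (they must match the producing stack's backend):

```hcl
# Read the VPC stack's outputs from its remote state
data "terraform_remote_state" "vpc" {
  backend = "s3"
  config = {
    bucket = "my-terraform-state"          # must match the VPC stack's backend
    key    = "prod/vpc/terraform.tfstate"
    region = "us-east-1"
  }
}

# Reference any output that stack exposes
resource "aws_security_group" "app" {
  vpc_id = data.terraform_remote_state.vpc.outputs.vpc_id
}
```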

VPC Module Example

# modules/vpc/main.tf
resource "aws_vpc" "main" {
  cidr_block           = var.cidr_block
  enable_dns_hostnames = true
  enable_dns_support   = true
  
  tags = merge(var.tags, {
    Name = "${var.environment}-vpc"
  })
}

resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id
  
  tags = merge(var.tags, {
    Name = "${var.environment}-igw"
  })
}

resource "aws_subnet" "public" {
  count = length(var.availability_zones)
  
  vpc_id                  = aws_vpc.main.id
  cidr_block              = cidrsubnet(var.cidr_block, 8, count.index)
  availability_zone       = var.availability_zones[count.index]
  map_public_ip_on_launch = true
  
  tags = merge(var.tags, {
    Name = "${var.environment}-public-${count.index + 1}"
  })
}

resource "aws_subnet" "private" {
  count = length(var.availability_zones)
  
  vpc_id            = aws_vpc.main.id
  cidr_block        = cidrsubnet(var.cidr_block, 8, count.index + length(var.availability_zones))
  availability_zone = var.availability_zones[count.index]
  
  tags = merge(var.tags, {
    Name = "${var.environment}-private-${count.index + 1}"
  })
}

# modules/vpc/outputs.tf
output "vpc_id" {
  value = aws_vpc.main.id
}

output "public_subnets" {
  value = aws_subnet.public[*].id
}

output "private_subnets" {
  value = aws_subnet.private[*].id
}
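
The module references `var.environment`, `var.cidr_block`, `var.availability_zones`, and `var.tags`, so it also needs matching declarations. A minimal `modules/vpc/variables.tf`:

```hcl
# modules/vpc/variables.tf
variable "environment" {
  description = "Environment name used in resource Name tags"
  type        = string
}

variable "cidr_block" {
  description = "CIDR block for the VPC"
  type        = string
}

variable "availability_zones" {
  description = "Availability zones to spread subnets across"
  type        = list(string)
}

variable "tags" {
  description = "Extra tags merged onto every resource"
  type        = map(string)
  default     = {}
}
```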

Ansible for Configuration Management

Playbook Structure

# site.yml - Main entry point
---
- import_playbook: base.yml
- import_playbook: webserver.yml
- import_playbook: database.yml
- import_playbook: monitoring.yml
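
The `hosts:` groups these playbooks target come from an inventory. A minimal INI inventory might look like this (hostnames are placeholders):

```ini
# inventory/production.ini
[webservers]
web1.example.com
web2.example.com

[database]
db1.example.com

[all:vars]
ansible_user=deploy
```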

Web Server Playbook

# webserver.yml
---
- name: Configure web servers
  hosts: webservers
  become: yes
  vars:
    nginx_version: stable
    app_user: webapp
    
  tasks:
    - name: Update apt cache
      apt:
        update_cache: yes
        cache_valid_time: 3600
      when: ansible_os_family == "Debian"
    
    - name: Install nginx
      apt:
        name: nginx
        state: present
    
    - name: Install Python for web framework
      apt:
        name:
          - python3
          - python3-pip
        state: present
    
    - name: Create app user
      user:
        name: "{{ app_user }}"
        system: yes
        shell: /bin/bash
        create_home: yes
    
    - name: Copy nginx configuration
      template:
        src: nginx.conf.j2
        dest: /etc/nginx/nginx.conf
        mode: '0644'
      notify: Restart nginx
    
    - name: Copy application configuration
      template:
        src: app_config.yml.j2
        dest: /opt/webapp/config.yml
        owner: "{{ app_user }}"
        group: "{{ app_user }}"
        mode: '0600'
      notify: Reload application
    
    - name: Ensure nginx is running
      service:
        name: nginx
        state: started
        enabled: yes
    
    - name: Configure firewall
      ufw:
        state: enabled
        policy: deny
    
    - name: Allow SSH
      ufw:
        rule: allow
        port: '22'
        proto: tcp
    
    - name: Allow HTTP/HTTPS
      ufw:
        rule: allow
        port: '{{ item }}'
        proto: tcp
      loop:
        - 80
        - 443
  
  handlers:
    - name: Restart nginx
      service:
        name: nginx
        state: restarted
    
    - name: Reload application
      systemd:
        name: webapp
        state: reloaded
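
Before applying a playbook like this for real, it is worth a dry run. These are standard `ansible-playbook` flags (inventory path assumes the layout sketched earlier):

```bash
# Syntax check only
ansible-playbook webserver.yml --syntax-check

# Dry run showing what would change, with diffs for templated files
ansible-playbook -i inventory/production.ini webserver.yml --check --diff

# Apply, limited to a single host first
ansible-playbook -i inventory/production.ini webserver.yml --limit web1.example.com
```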

Nginx Configuration Template

# templates/nginx.conf.j2
user {{ nginx_user }};
worker_processes {{ nginx_worker_processes }};
error_log {{ nginx_error_log }};
pid {{ nginx_pid_file }};

events {
    worker_connections {{ nginx_worker_connections }};
}

http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;
    
    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';
    
    access_log {{ nginx_access_log }};
    
    sendfile        on;
    tcp_nopush      on;
    tcp_nodelay     on;
    keepalive_timeout {{ nginx_keepalive_timeout }};
    types_hash_max_size 2048;
    
    gzip on;
    gzip_vary on;
    gzip_proxied any;
    gzip_comp_level 6;
    gzip_types text/plain text/css text/xml text/javascript application/json application/javascript application/xml+rss;
    
    server {
        listen {{ nginx_listen_port }};
        server_name {{ nginx_server_name }};
        
        location / {
            proxy_pass http://127.0.0.1:{{ app_port }};
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
        }
        
        location /health {
            access_log off;
            return 200 "healthy\n";
            add_header Content-Type text/plain;
        }
    }
}
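
The template references several `nginx_*` variables that are not set in the playbook's `vars:`. One common place to define them is `group_vars`; the values below are typical defaults, adjust to taste:

```yaml
# group_vars/webservers.yml
nginx_user: www-data
nginx_worker_processes: auto
nginx_worker_connections: 1024
nginx_error_log: /var/log/nginx/error.log
nginx_access_log: /var/log/nginx/access.log main
nginx_pid_file: /run/nginx.pid
nginx_keepalive_timeout: 65
nginx_listen_port: 80
nginx_server_name: example.com
app_port: 8000
```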

Roles for Reusability

roles/
├── common/
│   ├── tasks/
│   │   └── main.yml
│   ├── handlers/
│   │   └── main.yml
│   └── templates/
│       └── ntp.conf.j2
├── nginx/
│   ├── tasks/
│   │   └── main.yml
│   ├── handlers/
│   │   └── main.yml
│   ├── templates/
│   │   └── nginx.conf.j2
│   └── defaults/
│       └── main.yml
└── postgres/
    ├── tasks/
    │   └── main.yml
    ├── handlers/
    │   └── main.yml
    └── defaults/
        └── main.yml

# Using roles
- name: Configure database server
  hosts: database
  become: yes
  
  roles:
    - role: common
    - role: postgres
      vars:
        postgres_version: 15
        postgres_max_connections: 200

State Management

Remote State with Locking

# backend.tf
terraform {
  backend "s3" {
    bucket         = "company-terraform-state"
    key            = "environments/prod/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-state-locks"
  }
}
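
The state bucket and lock table must exist before `terraform init` can use this backend, so they are usually created once in a small bootstrap stack. A sketch, with names matching the backend above:

```hcl
# bootstrap/main.tf - applied once, before any stack uses the backend
resource "aws_s3_bucket" "state" {
  bucket = "company-terraform-state"
}

# Versioning lets you recover earlier state files after a bad write
resource "aws_s3_bucket_versioning" "state" {
  bucket = aws_s3_bucket.state.id
  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_dynamodb_table" "locks" {
  name         = "terraform-state-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID" # attribute name the S3 backend expects

  attribute {
    name = "LockID"
    type = "S"
  }
}
```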

State Isolation

# Separate state files for each environment
terraform init -backend-config="key=dev/terraform.tfstate"
terraform init -backend-config="key=prod/terraform.tfstate"
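
Terraform workspaces are an alternative way to isolate state within a single backend configuration, though separate keys (as above) keep environments more strictly apart:

```bash
terraform workspace new dev      # create and switch to a "dev" state
terraform workspace select prod  # switch back to prod
terraform workspace list         # show all workspaces
```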

Importing Existing Resources

# Import an existing AWS resource into Terraform
terraform import aws_instance.web i-1234567890abcdef0
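
Since Terraform 1.5, the same import can be declared in configuration instead, which makes it visible in `terraform plan` and reviewable in a pull request:

```hcl
# Config-driven import (Terraform 1.5+); the block can be removed after apply
import {
  to = aws_instance.web
  id = "i-1234567890abcdef0"
}
```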

Testing Infrastructure

Terraform Validation

# Format check
terraform fmt -check

# Validate syntax
terraform validate

# Lint for unused declarations and other issues
tflint

# Plan for review
terraform plan -out=tfplan
terraform show tfplan

Terratest Integration

// infrastructure_test.go
package test

import (
    "testing"
    "github.com/gruntwork-io/terratest/modules/terraform"
    "github.com/stretchr/testify/assert"
)

func TestTerraformWebServer(t *testing.T) {
    terraformOptions := &terraform.Options{
        TerraformDir: "../examples/webserver",
        Vars: map[string]interface{}{
            "environment": "test",
        },
    }
    
    defer terraform.Destroy(t, terraformOptions)
    terraform.InitAndApply(t, terraformOptions)
    
    instanceID := terraform.Output(t, terraformOptions, "instance_id")
    assert.NotEmpty(t, instanceID)
}

Checkov for Security Scanning

# Scan Terraform files
checkov -d ./terraform --framework terraform

# Scan specific file
checkov -f main.tf

# Skip certain checks (check IDs use the CKV_ prefix)
checkov -d ./terraform --skip-check CKV_AWS_20
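
Individual findings can also be suppressed inline, next to the resource they apply to, with a justification. A sketch (the check ID and bucket name are illustrative):

```hcl
resource "aws_s3_bucket" "logs" {
  # checkov:skip=CKV_AWS_18: access logging not required for this internal bucket
  bucket = "my-log-bucket"
}
```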

CI/CD Integration

GitHub Actions Workflow

name: Terraform CI

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  terraform:
    runs-on: ubuntu-latest
    
    steps:
      - uses: actions/checkout@v4
      
      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: 1.6.0
      
      - name: Terraform Format Check
        run: terraform fmt -check -recursive
      
      - name: Terraform Init
        run: terraform init
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
      
      - name: Terraform Validate
        run: terraform validate
      
      - name: Terraform Plan
        run: |
          terraform plan -no-color -out=tfplan
          terraform show -no-color tfplan > plan.txt
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
      
      - name: Post Plan Comment
        if: github.event_name == 'pull_request'
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            // tfplan is a binary file; post the human-readable rendering instead
            const plan = fs.readFileSync('plan.txt', 'utf8');
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: '```terraform\n' + plan + '\n```'
            })

Best Practices

1. Use Remote State

# Always use remote state for team collaboration
terraform {
  backend "s3" {
    # ... configuration
  }
}

2. Enable State Locking

# Prevents concurrent modifications
dynamodb_table = "terraform-locks"

3. Use Modules for Reusability

# Instead of repeating code
module "vpc" {
  source = "./modules/vpc"
  # ... configuration
}

4. Never Store Secrets in Code

# Use environment variables or secret management
export TF_VAR_db_password=$(aws secretsmanager get-secret-value --secret-id prod/db-password --query SecretString --output text)
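
Secrets can also be read at plan time through a data source, so the value never appears in configuration. Note the value still lands in state, which is one more reason to keep state encrypted. A sketch (the secret name is illustrative):

```hcl
data "aws_secretsmanager_secret_version" "db_password" {
  secret_id = "prod/db-password"
}

resource "aws_db_instance" "main" {
  # ... other arguments ...
  password = data.aws_secretsmanager_secret_version.db_password.secret_string
}
```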

5. Implement Policy as Code

# sentinel/policies/restrict_instance_type.sentinel
import "tfplan/v2" as tfplan

main = rule {
    all tfplan.resource_changes as _, rc {
        rc.type is "aws_instance" implies
        rc.change.after.instance_type in ["t3.micro", "t3.small"]
    }
}

Conclusion

Infrastructure as Code transforms how teams provision and manage infrastructure. By treating infrastructure with the same care as application codeโ€”version control, testing, code review, and automationโ€”you achieve consistency, auditability, and speed that manual processes cannot match.

Key takeaways:

  • Start with Terraform for provisioning, Ansible for configuration
  • Use modules and roles to create reusable components
  • Always use remote state with locking for team workflows
  • Integrate testing and security scanning into CI/CD
  • Never store secrets in version control
