Table of Contents
- Testing & Validation
- 📦 Code Examples
- Why Testing Infrastructure Actually Matters
- Static Analysis: Catch Issues Before Deployment
- Policy-as-Code: Enforce Your Rules
- Integration Testing with Terratest
- Setup Terratest
- Example 1: Test EC2 Instance Deployment
- Example 2: Test S3 Bucket Compliance
- Example 3: Test Web Application Works
- CI/CD Integration
- Testing Best Practices
- 1. Follow the Testing Pyramid
- 2. Use Dedicated Test Accounts
- 3. Parallel Testing Speeds Things Up
- 4. Test Idempotency
- 5. Clean Up Test Resources Automatically
- 6. Use Unique Names to Avoid Conflicts
- 7. Mock External APIs in Unit Tests
- Common Mistakes to Avoid
- What You've Learned
- What's Next
- References
Testing & Validation
Your Terraform code works perfectly in your dev account. You run terraform apply, watch the green success messages scroll by, and think you're done.
Then production happens.
Your S3 buckets are wide open to the internet. EC2 instances are running the wrong AMI. That database you thought was encrypted? It's not. And someone just spun up 47 c5.24xlarge instances in Tokyo because there were no guardrails.
The $18,000 AWS bill arrives. Your security team is having a bad day. And you're learning an expensive lesson: testing deployment isn't the same as testing correctness.
Here's the thing nobody tells you: Infrastructure code breaks in ways application code doesn't. A typo in Python gives you a stack trace. A typo in Terraform gives you a publicly exposed database that passes every health check.
This tutorial shows you how to catch disasters before they cost you money, compliance violations, or your job.
📦 Code Examples
Repository: terraform-hcl-tutorial-series
This Part: Part 10 - Testing Examples
Get the working example:
git clone https://github.com/khuongdo/terraform-hcl-tutorial-series.git
cd terraform-hcl-tutorial-series
git checkout part-10
cd examples/part-10-testing/
# Run security scans and tests
tfsec .
terraform init
terraform plan
Why Testing Infrastructure Actually Matters
Let me guess: You're thinking "it's just config files, how bad can it be?"
I'll tell you exactly how bad.
Security breach, take one: Developer copies an S3 bucket config from StackOverflow. Forgets to change acl = "public-read". Customer PII leaks. Company makes headlines. GDPR fine: 4% of annual revenue.
Cost explosion, take two: Junior engineer sets desired_capacity = 50 instead of 5 for an autoscaling group. Nobody notices until Monday morning, by which point AWS has helpfully billed you for a weekend of 45 extra instances running around the clock.
Compliance failure, take three: Your SOC 2 audit fails because 30% of your RDS instances aren't encrypted. You swear they were. Turns out someone copy-pasted old Terraform code that didn't enforce encryption.
See the pattern? These aren't bugs. They're successful deployments of bad decisions.
You need three layers of defense:
Layer 1: Static Analysis - Scan your .tf files without deploying anything. Catch obvious mistakes like unencrypted buckets, missing backups, overly permissive security groups. Fast, free, catches 80% of problems.
Layer 2: Policy-as-Code - Enforce your company's rules. "All resources must have a CostCenter tag." "RDS instances can't use db.t2.micro in production." "S3 buckets can't exist in Singapore region." Automated compliance checks.
Layer 3: Integration Tests - Actually deploy infrastructure to a test account and verify it works. Does the EC2 instance boot? Can it reach the database? Is the load balancer health check passing? Expensive but catches what static analysis misses.
You need all three. Here's how to set them up.
Static Analysis: Catch Issues Before Deployment
Static analysis is your first line of defense. It scans Terraform files for known problems without deploying anything. No AWS credentials needed. No costs incurred. Just fast feedback on what's broken.
tfsec: Your Security Scanner
tfsec knows every common Terraform security mistake. Unencrypted S3 buckets. Security groups open to 0.0.0.0/0. Missing CloudTrail logging. IAM policies with wildcard permissions. It's seen it all.
Install it:
# macOS
brew install tfsec
# Linux
curl -s https://raw.githubusercontent.com/aquasecurity/tfsec/master/scripts/install_linux.sh | bash
# Or use Docker if you prefer containers
docker pull aquasec/tfsec
Run it:
tfsec .
That's it. It scans your current directory and tells you everything that's wrong.
Here's what actual output looks like:
───────────────────────────────────────────────────────────────
Result #1 HIGH Bucket does not have encryption enabled
───────────────────────────────────────────────────────────────
main.tf:12-16
───────────────────────────────────────────────────────────────
12 | resource "aws_s3_bucket" "data_bucket" {
13 | bucket = "my-data-bucket"
14 | acl = "private"
15 | }
───────────────────────────────────────────────────────────────
Impact: The bucket objects could be read if compromised
Resolution: Configure bucket encryption
More info: https://tfsec.dev/docs/aws/s3/enable-bucket-encryption
───────────────────────────────────────────────────────────────
See that? You thought setting acl = "private" was enough. tfsec knows better. Private ACL means "restrict access," not "encrypt data." If someone gets your credentials, they can read everything. Encryption protects data at rest even if access controls fail.
Fix it:
resource "aws_s3_bucket" "data_bucket" {
bucket = "my-data-bucket"
}
resource "aws_s3_bucket_server_side_encryption_configuration" "data_bucket" {
bucket = aws_s3_bucket.data_bucket.id
rule {
apply_server_side_encryption_by_default {
sse_algorithm = "AES256"
}
}
}
Run tfsec . again. Green output. Problem solved.
What tfsec catches:
- Unencrypted resources (S3, EBS volumes, RDS databases, EFS file systems)
- Public exposure (security groups allowing 0.0.0.0/0, S3 buckets with public ACLs)
- Missing logging (no CloudTrail, VPC flow logs disabled, S3 access logging off)
- Weak IAM policies (actions set to "*", resources set to "*")
- Outdated protocols (TLS 1.0 or 1.1 instead of 1.2+)
Sometimes tfsec is wrong:
You're building a public website. The load balancer needs to accept traffic from anywhere. tfsec flags it anyway.
Tell it to back off:
resource "aws_security_group" "allow_http" {
name = "allow-http"
#tfsec:ignore:aws-ec2-no-public-ingress-sgr
ingress {
description = "Public HTTP - required for ALB serving public website"
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
}
The comment tells tfsec "yes, I know this looks dangerous, I did it on purpose." Always add a description explaining why you're ignoring the check. Future you will thank present you.
Pro move for CI/CD:
tfsec . --format json > tfsec-report.json
JSON output means you can parse results, fail builds on HIGH severity issues, and track metrics over time.
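For example, a minimal CI gate sketch (assuming jq is installed; tfsec's JSON report lists findings under results, each with a severity field):
high=$(jq '[.results[]? | select(.severity == "HIGH" or .severity == "CRITICAL")] | length' tfsec-report.json)
if [ "$high" -gt 0 ]; then
  echo "Found $high HIGH/CRITICAL findings - failing the build"
  exit 1
fi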
Trivy: The Multi-Tool Scanner
Trivy scans everything: Terraform, Docker images, Kubernetes manifests, even application dependencies. It's slower than tfsec but more comprehensive.
Install it:
# macOS
brew install trivy
# Linux
wget -qO - https://aquasecurity.github.io/trivy-repo/deb/public.key | sudo apt-key add -
echo "deb https://aquasecurity.github.io/trivy-repo/deb $(lsb_release -sc) main" | sudo tee /etc/apt/sources.list.d/trivy.list
sudo apt-get update && sudo apt-get install trivy
Scan Terraform:
trivy config .
Output looks similar to tfsec but includes additional checks:
main.tf (terraform)
LOW: No bucket versioning enabled
════════════════════════════════════════
S3 bucket does not have versioning enabled.
Without versioning, accidental deletes are permanent.
────────────────────────────────────────
main.tf:12-16
────────────────────────────────────────
12 ┌ resource "aws_s3_bucket" "data_bucket" {
13 │ bucket = "my-data-bucket"
14 │ acl = "private"
15 └ }
────────────────────────────────────────
When to use which:
- Working only with Terraform? Use tfsec. It's faster and Terraform-specific.
- Scanning containers + Terraform in the same pipeline? Use Trivy. One tool, multiple scan types.
Checkov: The Compliance Police
Checkov (from Bridgecrew, now part of Palo Alto Networks) has over 1,000 built-in policies mapped to compliance frameworks. HIPAA, PCI-DSS, CIS benchmarks, SOC 2, GDPR - if there's a compliance standard, Checkov checks it.
pip install checkov
checkov -d .
Run a specific check (here, the encryption check from the output below):
checkov -d . --framework terraform --check CKV_AWS_18
Checkov's documentation maps each check to compliance frameworks like HIPAA and PCI-DSS, so you can see exactly which resources violate which controls. No more guessing what auditors will flag.
Example output:
Check: CKV_AWS_18: "Ensure the S3 bucket has server-side encryption enabled"
FAILED for resource: aws_s3_bucket.data_bucket
File: /main.tf:12-16
Guide: https://docs.bridgecrew.io/docs/s3_14-data-encrypted-at-rest
Same encryption issue, but now you know it's a HIPAA violation, not just a security best practice.
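Like tfsec, Checkov supports inline suppressions for intentional exceptions. Add a checkov:skip comment inside the resource, with the check ID and a justification:
resource "aws_s3_bucket" "public_assets" {
  # checkov:skip=CKV_AWS_18:Public static assets only - nothing sensitive stored here
  bucket = "my-public-assets"
}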
Policy-as-Code: Enforce Your Rules
Static scanners catch known problems. But they don't know your company's specific policies:
- "All production EC2 instances must have tag
CostCenter" - "S3 buckets can't be created in
ap-southeast-1(too expensive)" - "RDS instances in production can't be smaller than
db.t3.medium" - "Security groups can't have descriptions containing the word 'temporary'"
You need Policy-as-Code. Enter Open Policy Agent.
OPA and Conftest
Open Policy Agent (OPA) is a general-purpose policy engine. You write rules in Rego language. Conftest applies those rules to Terraform plans.
Install Conftest:
# macOS
brew install conftest
# Linux
wget https://github.com/open-policy-agent/conftest/releases/download/v0.45.0/conftest_0.45.0_Linux_x86_64.tar.gz
tar xzf conftest_0.45.0_Linux_x86_64.tar.gz
sudo mv conftest /usr/local/bin/
Create a policy: Require CostCenter tags on everything
Create file policy/tagging.rego:
package main

# Conftest receives the JSON plan from `terraform show -json`; planned
# resources live under resource_changes, with attributes under change.after.

# Deny EC2 instances without a CostCenter tag
deny[msg] {
	rc := input.resource_changes[_]
	rc.type == "aws_instance"
	not rc.change.after.tags.CostCenter
	msg := sprintf("EC2 instance '%s' missing required tag 'CostCenter'", [rc.name])
}

# Deny S3 buckets without an Environment tag
deny[msg] {
	rc := input.resource_changes[_]
	rc.type == "aws_s3_bucket"
	not rc.change.after.tags.Environment
	msg := sprintf("S3 bucket '%s' missing required tag 'Environment'", [rc.name])
}
Rego looks weird at first. Read it like this: "For each resource change in the plan, if it's an EC2 instance and its planned tags have no CostCenter, emit a denial message."
Test the policy:
# Create Terraform plan in JSON format
terraform init
terraform plan -out=tfplan.binary
terraform show -json tfplan.binary > tfplan.json
# Run policy check
conftest test tfplan.json
Output if you violate the policy:
FAIL - tfplan.json - main - EC2 instance 'web_server' missing required tag 'CostCenter'
FAIL - tfplan.json - main - S3 bucket 'data_bucket' missing required tag 'Environment'
2 tests, 0 passed, 0 warnings, 2 failures, 0 exceptions
Pipeline fails. No deployment until you fix it.
Fix the violation:
resource "aws_instance" "web_server" {
ami = "ami-0c55b159cbfafe1f0"
instance_type = "t3.micro"
tags = {
Name = "WebServer"
CostCenter = "Engineering" # Added
}
}
resource "aws_s3_bucket" "data_bucket" {
bucket = "my-data-bucket"
tags = {
Environment = "Production" # Added
}
}
Run conftest test tfplan.json again. All tests pass. Deploy approved.
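Policies deserve tests too. Conftest runs Rego unit tests with conftest verify - here's a minimal sketch (policy/tagging_test.rego) that feeds the deny rule a fake plan fragment:
package main

# Expect a denial when the planned tags lack CostCenter
test_ec2_without_costcenter_is_denied {
	deny[_] with input as {"resource_changes": [{
		"type": "aws_instance",
		"name": "web_server",
		"change": {"after": {"tags": {"Name": "WebServer"}}}
	}]}
}

# Expect no denials when the tag is present
test_ec2_with_costcenter_passes {
	count(deny) == 0 with input as {"resource_changes": [{
		"type": "aws_instance",
		"name": "web_server",
		"change": {"after": {"tags": {"CostCenter": "Engineering"}}}
	}]}
}
Run them with conftest verify -p policy/ - no Terraform plan needed.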
Advanced Policy: Prevent Cost Explosions
Here's a real policy that prevents expensive instance types in non-production environments:
policy/cost.rego:
package main

# List of expensive instance types
expensive_instances := [
	"c5.24xlarge",
	"m5.24xlarge",
	"r5.24xlarge",
	"p3.16xlarge",
	"p4d.24xlarge"
]

# Deny expensive instances in dev/staging
deny[msg] {
	rc := input.resource_changes[_]
	rc.type == "aws_instance"
	rc.change.after.instance_type == expensive_instances[_]
	rc.change.after.tags.Environment != "Production"
	msg := sprintf(
		"Instance '%s' uses expensive type '%s' in non-production (Environment=%s). Use smaller instances for dev/staging.",
		[rc.name, rc.change.after.instance_type, rc.change.after.tags.Environment]
	)
}
Someone tries to launch a c5.24xlarge in staging? Blocked. No exceptions.
Regional Restrictions Policy
Some AWS regions cost more. Prevent accidental deployments:
policy/regions.rego:
package main

allowed_regions := [
	"us-east-1",
	"us-west-2",
	"eu-west-1"
]

# In the JSON plan, provider settings live under configuration.provider_config;
# a region set as a literal shows up as expressions.region.constant_value
deny[msg] {
	region := input.configuration.provider_config.aws.expressions.region.constant_value
	not region_allowed(region)
	msg := sprintf(
		"AWS region '%s' not allowed. Use one of: %v",
		[region, allowed_regions]
	)
}

region_allowed(region) {
	region == allowed_regions[_]
}
Try to deploy to ap-southeast-1? Policy says no. Stay in the cheap regions.
Run multiple policies at once:
conftest test tfplan.json -p policy/
Conftest checks every .rego file in the directory. One command, all policies enforced.
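Not every rule should block a deploy. Conftest also understands warn rules, which report without failing the run - a sketch that flags larger instance types for review:
package main

# Warn (don't fail) when a plan creates a 2xlarge-class instance
warn[msg] {
	rc := input.resource_changes[_]
	rc.type == "aws_instance"
	endswith(rc.change.after.instance_type, ".2xlarge")
	msg := sprintf("Instance '%s' uses %s - confirm this size is intentional", [rc.name, rc.change.after.instance_type])
}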
Integration Testing with Terratest
Static analysis catches configuration mistakes. But it can't tell you if your infrastructure actually works.
Questions static analysis can't answer:
- Does the EC2 instance boot successfully?
- Can the application connect to RDS?
- Is the load balancer health check passing?
- Do autoscaling policies trigger correctly under load?
You need integration tests. Real deployments to real AWS accounts.
Terratest (by Gruntwork) deploys your Terraform code, validates it works, then cleans up. All automated. All in Go.
Setup Terratest
Install Go:
# macOS
brew install go
# Linux: download from https://go.dev/dl/
# Verify
go version # Should be 1.19 or later
Project structure:
your-terraform-project/
├── examples/
│ └── aws-instance/
│ ├── main.tf
│ ├── variables.tf
│ └── outputs.tf
└── test/
└── aws_instance_test.go
The examples/ directory contains minimal working Terraform configs. The test/ directory contains Go tests that deploy those examples and validate them.
Example 1: Test EC2 Instance Deployment
examples/aws-instance/main.tf:
provider "aws" {
region = var.region
}
resource "aws_instance" "test_instance" {
ami = var.ami_id
instance_type = var.instance_type
tags = {
Name = var.instance_name
Environment = "Test"
ManagedBy = "Terratest"
}
}
examples/aws-instance/variables.tf:
variable "region" {
type = string
default = "us-east-1"
}
variable "ami_id" {
type = string
}
variable "instance_type" {
type = string
default = "t3.micro"
}
variable "instance_name" {
type = string
}
examples/aws-instance/outputs.tf:
output "instance_id" {
value = aws_instance.test_instance.id
}
output "public_ip" {
value = aws_instance.test_instance.public_ip
}
test/aws_instance_test.go:
package test
import (
"testing"
"github.com/gruntwork-io/terratest/modules/aws"
"github.com/gruntwork-io/terratest/modules/terraform"
"github.com/stretchr/testify/assert"
)
func TestAwsInstance(t *testing.T) {
t.Parallel()
	// Pinned to us-east-1: the hardcoded AMI ID below is region-specific,
	// so add regions here only alongside a matching AMI lookup
	awsRegion := aws.GetRandomStableRegion(t, []string{"us-east-1"}, nil)
terraformOptions := &terraform.Options{
TerraformDir: "../examples/aws-instance",
Vars: map[string]interface{}{
"region": awsRegion,
"ami_id": "ami-0c55b159cbfafe1f0", // Amazon Linux 2
"instance_name": "terratest-example",
},
}
// Clean up after test (runs even if test fails)
defer terraform.Destroy(t, terraformOptions)
// Deploy infrastructure
terraform.InitAndApply(t, terraformOptions)
// Validate outputs
instanceID := terraform.Output(t, terraformOptions, "instance_id")
assert.NotEmpty(t, instanceID, "Instance ID should not be empty")
// Validate instance exists in AWS
instanceIDs := aws.GetEc2InstanceIdsByTag(t, awsRegion, "Name", "terratest-example")
assert.Contains(t, instanceIDs, instanceID)
}
Initialize Go modules:
cd test/
go mod init github.com/yourusername/terraform-tests
go get github.com/gruntwork-io/terratest/modules/terraform
go get github.com/gruntwork-io/terratest/modules/aws
go get github.com/stretchr/testify/assert
Run the test:
export AWS_ACCESS_KEY_ID="your-key"
export AWS_SECRET_ACCESS_KEY="your-secret"
go test -v -timeout 30m
What happens:
- Terratest picks a region from your approved list (pass several to randomize and avoid conflicts)
- Runs terraform init and terraform apply
- Captures outputs (instance ID, public IP)
- Validates the instance actually exists in AWS
- Runs terraform destroy to clean up
- Reports pass/fail
Actual output:
=== RUN TestAwsInstance
TestAwsInstance 2025-12-30T14:23:10Z Running command terraform with args [init -upgrade=false]
TestAwsInstance 2025-12-30T14:23:12Z Running command terraform with args [apply -auto-approve]
TestAwsInstance 2025-12-30T14:24:45Z Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
TestAwsInstance 2025-12-30T14:24:46Z Running command terraform with args [output -no-color instance_id]
TestAwsInstance 2025-12-30T14:24:47Z i-0abcd1234efgh5678
TestAwsInstance 2025-12-30T14:24:51Z Running command terraform with args [destroy -auto-approve]
TestAwsInstance 2025-12-30T14:25:15Z Destroy complete! Resources: 1 destroyed.
--- PASS: TestAwsInstance (125.43s)
PASS
Your infrastructure deployed successfully, passed validation, and cleaned up. All automated.
Example 2: Test S3 Bucket Compliance
Don't just test that the bucket exists. Test that it's configured correctly.
test/s3_bucket_test.go:
package test
import (
	"strings"
	"testing"

	"github.com/gruntwork-io/terratest/modules/aws"
	"github.com/gruntwork-io/terratest/modules/random"
	"github.com/gruntwork-io/terratest/modules/terraform"
	"github.com/stretchr/testify/assert"
)
func TestS3Bucket(t *testing.T) {
t.Parallel()
awsRegion := "us-east-1"
	// Lowercased because S3 bucket names can't contain uppercase letters,
	// and random.UniqueId() returns a mixed-case base-62 string
	uniqueID := strings.ToLower(random.UniqueId())
	bucketName := "terratest-bucket-" + uniqueID
terraformOptions := &terraform.Options{
TerraformDir: "../examples/s3-bucket",
Vars: map[string]interface{}{
"bucket_name": bucketName,
"region": awsRegion,
},
}
defer terraform.Destroy(t, terraformOptions)
terraform.InitAndApply(t, terraformOptions)
// Validate bucket exists
aws.AssertS3BucketExists(t, awsRegion, bucketName)
// Validate versioning enabled (compliance requirement)
versioningStatus := aws.GetS3BucketVersioning(t, awsRegion, bucketName)
assert.Equal(t, "Enabled", versioningStatus)
// Validate encryption enabled (security requirement)
bucketEncryption := aws.GetS3BucketEncryption(t, awsRegion, bucketName)
assert.NotNil(t, bucketEncryption, "Bucket must have encryption enabled")
}
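For reference, the test only passes if examples/s3-bucket actually enables both features. A minimal sketch of its main.tf (variable names match the test's Vars):
provider "aws" {
  region = var.region
}

resource "aws_s3_bucket" "test" {
  bucket = var.bucket_name
}

resource "aws_s3_bucket_versioning" "test" {
  bucket = aws_s3_bucket.test.id
  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "test" {
  bucket = aws_s3_bucket.test.id
  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}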
This test fails if:
- Bucket doesn't get created
- Versioning is disabled
- Encryption is missing
Catches drift between policy and reality.
Example 3: Test Web Application Works
Deploy a complete web stack (EC2 + ALB) and verify the app actually responds:
test/web_app_test.go:
package test
import (
"fmt"
"testing"
"time"
http_helper "github.com/gruntwork-io/terratest/modules/http-helper"
"github.com/gruntwork-io/terratest/modules/terraform"
)
func TestWebApp(t *testing.T) {
t.Parallel()
terraformOptions := &terraform.Options{
TerraformDir: "../examples/web-app",
}
defer terraform.Destroy(t, terraformOptions)
terraform.InitAndApply(t, terraformOptions)
// Get load balancer DNS name from output
albURL := terraform.Output(t, terraformOptions, "alb_dns_name")
url := fmt.Sprintf("http://%s", albURL)
// Validate HTTP 200 response with retries (ALB takes time to warm up)
http_helper.HttpGetWithRetry(
t,
url,
nil,
200,
"Hello, World!", // Expected response body
30, // Max retries
10*time.Second, // Wait between retries
)
}
This catches:
- Load balancer misconfiguration
- Wrong target group health checks
- Security group blocking traffic
- Application deployment failures
If terraform apply succeeds but the app doesn't respond, the test fails. That's the whole point.
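The whole test hinges on the example exposing that one output. A sketch of what examples/web-app/outputs.tf would declare (aws_lb.web is a hypothetical resource name):
output "alb_dns_name" {
  value = aws_lb.web.dns_name # the test builds its URL from this
}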
CI/CD Integration
Manual test runs don't scale. You need automated pipelines.
GitHub Actions Pipeline
.github/workflows/terraform-test.yml:
name: Terraform Testing Pipeline

on:
  pull_request:
    paths:
      - '**.tf'
      - '.github/workflows/terraform-test.yml'
  push:
    branches:
      - main

jobs:
  static-analysis:
    name: Static Security Scan
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v3

      - name: Run tfsec
        uses: aquasecurity/tfsec-action@v1.0.0
        with:
          soft_fail: false

      - name: Run Trivy
        uses: aquasecurity/trivy-action@master
        with:
          scan-type: 'config'
          scan-ref: '.'
          format: 'sarif'
          output: 'trivy-results.sarif'

      - name: Upload results to GitHub Security
        uses: github/codeql-action/upload-sarif@v2
        with:
          sarif_file: 'trivy-results.sarif'

  policy-check:
    name: Policy Validation
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v3

      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v2

      - name: Configure AWS # terraform plan needs credentials to query the provider
        uses: aws-actions/configure-aws-credentials@v2
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1

      - name: Generate plan
        run: |
          terraform init
          terraform plan -out=tfplan.binary
          terraform show -json tfplan.binary > tfplan.json

      - name: Install Conftest
        run: |
          wget https://github.com/open-policy-agent/conftest/releases/download/v0.45.0/conftest_0.45.0_Linux_x86_64.tar.gz
          tar xzf conftest_0.45.0_Linux_x86_64.tar.gz
          sudo mv conftest /usr/local/bin/

      - name: Run policies
        run: conftest test tfplan.json -p policy/

  terratest:
    name: Integration Tests
    runs-on: ubuntu-latest
    needs: [static-analysis, policy-check]
    steps:
      - name: Checkout
        uses: actions/checkout@v3

      - name: Setup Go
        uses: actions/setup-go@v4
        with:
          go-version: '1.20'

      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v2

      - name: Configure AWS
        uses: aws-actions/configure-aws-credentials@v2
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1

      - name: Run Terratest
        working-directory: test
        run: |
          go mod download
          go test -v -timeout 30m
This pipeline:
- Runs tfsec and Trivy (fast, catches obvious issues)
- Runs OPA policies (enforces your rules)
- Only runs Terratest if static checks pass (saves time and money)
- Uploads security findings to GitHub Security tab
Every pull request gets tested. No manual intervention required.
Testing Best Practices
1. Follow the Testing Pyramid
Don't over-rely on Terratest. It's slow and costs real money.
Recommended distribution:
- 70% static analysis (tfsec, Trivy, Checkov) - Fast, free, catches most issues
- 20% policy validation (OPA/Conftest) - Medium speed, no infrastructure costs
- 10% integration tests (Terratest) - Slow, expensive, catches what others miss
2. Use Dedicated Test Accounts
Never run Terratest in production accounts. Ever.
Set up AWS Organizations structure:
root-account/
├── production/ # Real customer traffic
├── staging/ # Pre-prod testing
└── testing/ # Terratest sandbox (costs don't matter)
Set billing alerts on the testing account. If someone accidentally spawns 100 EC2 instances, you want to know immediately.
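Those alerts can be Terraform too. A minimal sketch using aws_budgets_budget (the limit and subscriber email are placeholders - adjust to your team):
resource "aws_budgets_budget" "terratest_sandbox" {
  name         = "terratest-sandbox-monthly"
  budget_type  = "COST"
  limit_amount = "200"
  limit_unit   = "USD"
  time_unit    = "MONTHLY"

  notification {
    comparison_operator        = "GREATER_THAN"
    threshold                  = 80
    threshold_type             = "PERCENTAGE"
    notification_type          = "ACTUAL"
    subscriber_email_addresses = ["platform-team@example.com"]
  }
}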
3. Parallel Testing Speeds Things Up
Terratest supports parallel execution:
func TestMultipleRegions(t *testing.T) {
t.Parallel() // Runs alongside other t.Parallel() tests
// Test code...
}
But watch out for AWS service limits. Don't run 50 parallel tests that each create a VPC. You'll hit the VPC limit (5 per region by default) and everything fails.
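Go's test runner can cap concurrency for you - the -parallel flag limits how many t.Parallel() tests run at once:
# Allow at most 4 tests to run concurrently, however many call t.Parallel()
go test -v -timeout 30m -parallel 4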
4. Test Idempotency
Run terraform apply twice. The second run should show zero changes:
func TestIdempotency(t *testing.T) {
terraformOptions := &terraform.Options{
TerraformDir: "../examples/vpc",
}
defer terraform.Destroy(t, terraformOptions)
// First apply
terraform.InitAndApply(t, terraformOptions)
	// Second apply - should be a no-op; apply output reports
	// "Resources: 0 added, 0 changed, 0 destroyed." when nothing drifts
	output := terraform.Apply(t, terraformOptions)
	assert.Contains(t, output, "0 added, 0 changed, 0 destroyed")
}
If the second apply changes things, your Terraform code isn't idempotent. That causes drift and unpredictable behavior.
5. Clean Up Test Resources Automatically
Always use defer terraform.Destroy():
defer terraform.Destroy(t, terraformOptions) // Runs even if test fails
This prevents orphaned resources. If a test crashes halfway through, Terraform still destroys what it created.
But cleanup can fail. Resource dependencies, manual deletions, rate limits - all cause destroy failures. Monitor with AWS Config or write a cleanup Lambda that runs daily.
6. Use Unique Names to Avoid Conflicts
Multiple developers running tests at the same time? You need unique resource names:
bucketName := "test-bucket-" + strings.ToLower(random.UniqueId())
Terratest's random.UniqueId() generates short unique strings (lowercased here because S3 bucket names can't contain uppercase letters). No more "bucket name already exists" errors when someone else is testing.
7. Mock External APIs in Unit Tests
Don't hit real AWS APIs for every test. Use LocalStack for local AWS emulation:
# docker-compose.yml
services:
  localstack:
    image: localstack/localstack
    environment:
      - SERVICES=s3,dynamodb,sqs,lambda
      - DEFAULT_REGION=us-east-1
    ports:
      - "4566:4566"
Point Terraform at LocalStack:
provider "aws" {
region = "us-east-1"
access_key = "test"
secret_key = "test"
skip_credentials_validation = true
skip_metadata_api_check = true
endpoints {
s3 = "http://localhost:4566"
dynamodb = "http://localhost:4566"
sqs = "http://localhost:4566"
}
}
Tests run locally, no AWS costs, much faster feedback.
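A quick smoke test of the setup (assumes Docker Compose and the AWS CLI are installed):
# Start LocalStack, apply against it, then list buckets via the emulated endpoint
docker compose up -d
terraform init && terraform apply -auto-approve
aws --endpoint-url=http://localhost:4566 s3 ls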
Common Mistakes to Avoid
Mistake 1: Testing in production
I shouldn't have to say this, but: never run destructive tests in production accounts. Use separate AWS accounts for testing.
Mistake 2: Ignoring cleanup failures
Terratest's defer terraform.Destroy() can fail. Resources get orphaned. AWS keeps billing you.
Set up automated cleanup:
# Tag all test resources
tags = {
ManagedBy = "Terratest"
TTL = "24h"
}
# Daily Lambda deletes resources older than 24 hours with ManagedBy=Terratest tag
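A hypothetical version of that cleanup as a shell script (GNU date syntax; schedule it with cron or a Lambda):
# Terminate running EC2 instances tagged ManagedBy=Terratest launched over 24h ago
cutoff=$(date -u -d '24 hours ago' +%Y-%m-%dT%H:%M:%S)
aws ec2 describe-instances \
  --filters "Name=tag:ManagedBy,Values=Terratest" "Name=instance-state-name,Values=running" \
  --query 'Reservations[].Instances[].[InstanceId,LaunchTime]' \
  --output text |
while read -r id launched; do
  # Lexicographic compare works because both strings are ISO-8601 timestamps
  if [[ "$launched" < "$cutoff" ]]; then
    aws ec2 terminate-instances --instance-ids "$id"
  fi
done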
Mistake 3: Hardcoding AWS credentials
Never commit credentials to Git. Use environment variables or IAM role assumption:
# Environment variables (local development)
export AWS_ACCESS_KEY_ID="..."
export AWS_SECRET_ACCESS_KEY="..."
# IAM roles (CI/CD)
aws sts assume-role --role-arn arn:aws:iam::123456789012:role/TerratestRole
Mistake 4: Slow test feedback loops
Terratest tests can take 15+ minutes. Optimize:
- Use smallest possible instance types (t3.nano instead of t3.xlarge)
- Reduce retry timeouts in HTTP checks
- Run expensive tests only on the main branch, not every PR
- Cache Terraform providers in CI/CD
Mistake 5: No visibility into test costs
Terratest deploys real infrastructure. That costs money. Track it:
- Tag all test resources with Environment=Testing
- Use AWS Cost Explorer to filter by tag
- Set billing alerts ($50/day is reasonable for active testing)
What You've Learned
You now know how to:
- Catch security issues before deployment with tfsec, Trivy, and Checkov
- Enforce company-specific policies using OPA and Conftest
- Test real infrastructure behavior with Terratest
- Integrate all three testing layers into CI/CD pipelines
- Balance test coverage vs cost and speed
Before moving on, do this:
- Run tfsec . on your Terraform code. Fix anything marked HIGH or CRITICAL.
- Write one OPA policy that enforces your company's tagging standards.
- Create a basic Terratest test that deploys an S3 bucket and validates encryption.
- Add tfsec to your CI/CD pipeline so it runs on every PR.
Why this matters: One unencrypted S3 bucket can expose customer data. One missing policy check can cost thousands in wasted EC2 spend. Testing prevents disasters before they reach production.
What's Next
In Part 11: Terraform Cloud & Remote State, you'll learn:
- Moving from local state to Terraform Cloud
- Team collaboration workflows (RBAC, policy enforcement, drift detection)
- Sentinel policies for enterprise governance
- Cost estimation before you apply changes
- Private module registry for sharing code across teams
You've mastered testing. Next up: scaling Terraform for teams.
References
- tfsec Documentation
- Trivy Terraform Scanning
- Open Policy Agent
- Conftest
- Terratest Documentation
- Gruntwork Testing Examples
Series navigation:
- Part 1: Why Infrastructure as Code?
- Part 2: Setting Up Terraform
- Part 3: Your First Cloud Resource
- Part 4: HCL Fundamentals
- Part 5: Variables, Outputs & State
- Part 6: Core Terraform Workflow
- Part 7: Modules for Organization
- Part 8: Multi-Cloud Patterns
- Part 9: State Management & Team Workflows
- Part 10: Testing & Validation (You are here)
- Part 11: Security & Secrets Management
- Part 12: Production Patterns & DevSecOps
This post is part of the "Terraform from Fundamentals to Production" series.