DNS and TLS certificates are foundational infrastructure. Manual management doesn’t scale and creates security gaps. This guide covers DNS and certificate automation using modern tools and infrastructure-as-code patterns.
DNS Architecture
DNS Record Types
| Type | Purpose | Example |
|---|---|---|
| A | IPv4 address | example.com → 1.2.3.4 |
| AAAA | IPv6 address | example.com → 2001:db8::1 |
| CNAME | Canonical name | www → @ |
| MX | Mail exchange | @ → mail.example.com |
| TXT | Text records | @ → "v=spf1 include:_spf.example.com ~all" |
| NS | Name servers | @ → ns1.example.com |
| SOA | Start of Authority | Administrative info |
| CAA | Certificate Authority Authorization | example.com → letsencrypt.org |
Route53 DNS Configuration
# Terraform - Route53 hosted zone and records
resource "aws_route53_zone" "main" {
  name = "example.com"

  tags = {
    Environment = "production"
    ManagedBy   = "terraform"
  }
}

resource "aws_route53_record" "api" {
  zone_id = aws_route53_zone.main.zone_id
  name    = "api.example.com"
  type    = "A"

  alias {
    name                   = "alb-example-123456789.us-east-1.elb.amazonaws.com"
    zone_id                = "Z35SXDOWRQ4VI"
    evaluate_target_health = true
  }
}

resource "aws_route53_record" "www" {
  zone_id = aws_route53_zone.main.zone_id
  name    = "www.example.com"
  type    = "CNAME"
  ttl     = 300
  records = ["example.com"]
}

resource "aws_route53_record" "mx" {
  zone_id = aws_route53_zone.main.zone_id
  name    = "example.com"
  type    = "MX"
  ttl     = 3600
  records = [
    "10 mail1.example.com",
    "20 mail2.example.com"
  ]
}

resource "aws_route53_record" "spf" {
  zone_id = aws_route53_zone.main.zone_id
  name    = "example.com"
  type    = "TXT"
  ttl     = 3600
  records = ["v=spf1 include:_spf.example.com ~all"]
}
Cloudflare DNS
# Cloudflare Terraform provider
# (cloudflare_record with "value" is the v4 provider syntax)
provider "cloudflare" {
  api_token = var.cloudflare_api_token
}

resource "cloudflare_record" "api" {
  zone_id = cloudflare_zone.example.id
  name    = "api"
  value   = "1.2.3.4"
  type    = "A"
  proxied = true # Cloudflare proxy
}

resource "cloudflare_record" "cdn" {
  zone_id = cloudflare_zone.example.id
  name    = "cdn"
  # Hypothetical origin host: a CNAME for cdn.example.com must not point back at itself
  value   = "origin.example.net"
  type    = "CNAME"
  proxied = true
}
Certificate Management with cert-manager
Installation
# Install cert-manager
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.14.0/cert-manager.yaml
# Verify installation
kubectl get pods -n cert-manager
Let’s Encrypt Issuer
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    # Placeholder contact address; replace with your own
    email: admin@example.com
    # Secret that stores the ACME account private key (required for ACME issuers)
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    # Use Route53 for DNS challenge
    solvers:
      - dns01:
          route53:
            region: us-east-1
            hostedZoneID: Z1234567890ABC
      # Or use Cloudflare
      # - dns01:
      #     cloudflare:
      #       apiTokenSecretRef:
      #         name: cloudflare-api-token
      #         key: api-token
Certificate Resource
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: example-com
  namespace: production
spec:
  secretName: example-com-tls
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
    group: cert-manager.io
  dnsNames:
    - example.com
    - www.example.com
    - api.example.com
  duration: 2160h # 90 days
  renewBefore: 360h # 15 days before expiry
  # Annotations and labels to copy onto the generated Secret
  secretTemplate:
    annotations:
      example.com/owner: "platform-team" # illustrative annotation
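To make the `duration` and `renewBefore` values above concrete, here is a minimal sketch of the arithmetic. The `renewal_time` helper and the issue date are illustrative assumptions, not part of cert-manager; in practice the controller works from the issued certificate's actual expiry, which the CA may set independently of the requested duration.

```python
from datetime import datetime, timedelta, timezone


def renewal_time(not_before: datetime, duration_hours: int, renew_before_hours: int) -> datetime:
    """Return when renewal should start: expiry minus the renewBefore window."""
    if renew_before_hours >= duration_hours:
        raise ValueError("renewBefore must be shorter than duration")
    not_after = not_before + timedelta(hours=duration_hours)
    return not_after - timedelta(hours=renew_before_hours)


# Hypothetical issue date, chosen only for illustration.
issued = datetime(2024, 1, 1, tzinfo=timezone.utc)
renew_at = renewal_time(issued, duration_hours=2160, renew_before_hours=360)
print(renew_at.date())           # 2024-03-16
print((renew_at - issued).days)  # 75
```

With a 90-day certificate and a 15-day window, renewal starts on day 75, which leaves two weeks to notice and fix a failed attempt.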
Ingress with TLS
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: production-ingress
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - example.com
        - www.example.com
      secretName: example-com-tls
  rules:
    - host: example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-service
                port:
                  number: 80
DNS-01 Challenge
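The DNS-01 challenge proves you control a domain: the ACME server issues a token, and you publish a value derived from it in a TXT record at _acme-challenge.<domain>. Because validation happens entirely through DNS, it works for wildcard certificates and for hosts that are not reachable over HTTP.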
# Manual DNS-01 challenge with Cloudflare
# (uses the legacy python-cloudflare 2.x client: pip install "cloudflare<3")
import time

import CloudFlare


def create_dns_challenge(domain, api_token, validation):
    """Create the DNS TXT record for a Let's Encrypt challenge; return its record ID."""
    client = CloudFlare.CloudFlare(token=api_token)
    # Get zone ID
    zone_id = client.zones.get(params={'name': domain})[0]['id']
    # Create TXT record holding the ACME validation value (not the API token)
    record = client.zones.dns_records.post(
        zone_id,
        data={
            'type': 'TXT',
            'name': f'_acme-challenge.{domain}',
            'content': validation,
            'ttl': 60
        }
    )
    # Wait for propagation (a fixed sleep is crude; polling the nameservers is more reliable)
    time.sleep(30)
    return record['id']


def cleanup_dns_challenge(domain, api_token, record_id):
    """Clean up DNS TXT record"""
    client = CloudFlare.CloudFlare(token=api_token)
    zone_id = client.zones.get(params={'name': domain})[0]['id']
    client.zones.dns_records.delete(zone_id, record_id)
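The TXT record content for a DNS-01 challenge is not the raw token. Under RFC 8555 it is the base64url-encoded SHA-256 digest of the key authorization, which joins the token to the thumbprint of the ACME account key. The sketch below shows that derivation using only the standard library; the function names and the sample JWK values are hypothetical, and a real ACME client performs this step for you.

```python
import base64
import hashlib
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as ACME requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def jwk_thumbprint(jwk: dict) -> str:
    """RFC 7638 thumbprint: SHA-256 over the required JWK members, sorted, no whitespace."""
    required = {"RSA": ("e", "kty", "n"), "EC": ("crv", "kty", "x", "y")}[jwk["kty"]]
    canonical = json.dumps({k: jwk[k] for k in required}, separators=(",", ":"), sort_keys=True)
    return b64url(hashlib.sha256(canonical.encode("utf-8")).digest())


def dns01_txt_value(token: str, account_jwk: dict) -> str:
    """Return the value to publish at _acme-challenge.<domain>."""
    key_authorization = f"{token}.{jwk_thumbprint(account_jwk)}"
    return b64url(hashlib.sha256(key_authorization.encode("ascii")).digest())


# Made-up key material and token, for illustration only.
sample_jwk = {"kty": "EC", "crv": "P-256", "x": "abc", "y": "def"}
print(dns01_txt_value("example-token", sample_jwk))
```

The result is always 43 characters long, because a 32-byte digest encodes to 44 base64 characters including one padding character that is stripped.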
Multiple DNS Providers
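apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: multi-dns-issuer
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: multi-dns-issuer-account-key
    solvers:
      # Route53 handles example.com and all of its subdomains
      - dns01:
          route53:
            region: us-east-1
            hostedZoneID: Z1234567890ABC
        selector:
          dnsZones:
            - "example.com"
      # Cloudflare has no selector, so it handles every other domain.
      # Solvers are chosen by selector match, not tried in order.
      - dns01:
          cloudflare:
            apiTokenSecretRef:
              name: cloudflare-api-token
              key: api-token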
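When an issuer lists several solvers, the choice is made by selector matching rather than by trying them in order. The following is a simplified sketch of the `dnsZones` part of that rule, where the most specific matching zone wins and a solver without a selector acts as a catch-all. The `pick_solver` function and the dictionary layout are assumptions for illustration; cert-manager's real logic also weighs `dnsNames` and `matchLabels`.

```python
from typing import Optional


def pick_solver(dns_name: str, solvers: list[dict]) -> Optional[str]:
    """Return the name of the solver whose dnsZones entry most specifically matches."""
    name = dns_name.removeprefix("*.").rstrip(".").lower()
    best_name, best_score = None, -1
    for solver in solvers:
        zones = solver.get("dnsZones")
        if not zones:
            score = 0  # no selector: matches anything, with the lowest priority
        else:
            matches = [z for z in zones if name == z or name.endswith("." + z)]
            if not matches:
                continue
            # More labels means a more specific zone
            score = max(len(z.split(".")) for z in matches)
        if score > best_score:
            best_name, best_score = solver["name"], score
    return best_name


solvers = [
    {"name": "route53", "dnsZones": ["example.com"]},
    {"name": "cloudflare"},  # catch-all
]
print(pick_solver("api.example.com", solvers))  # route53
print(pick_solver("*.example.com", solvers))    # route53
print(pick_solver("example.org", solvers))      # cloudflare
```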
ACM (AWS Certificate Manager)
Request Certificate
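# Terraform - ACM certificate
resource "aws_acm_certificate" "main" {
  domain_name               = "example.com"
  subject_alternative_names = ["*.example.com"]
  validation_method         = "DNS"

  lifecycle {
    create_before_destroy = true
  }
}

# Auto-validate with Route53
resource "aws_route53_record" "cert_validation" {
  for_each = {
    for val in aws_acm_certificate.main.domain_validation_options :
    val.domain_name => val
  }

  # example.com and *.example.com share the same validation CNAME
  allow_overwrite = true
  zone_id         = aws_route53_zone.main.zone_id
  name            = each.value.resource_record_name
  type            = each.value.resource_record_type
  ttl             = 60
  records         = [each.value.resource_record_value]
}

# Wait until ACM has validated the domain and issued the certificate
resource "aws_acm_certificate_validation" "main" {
  certificate_arn         = aws_acm_certificate.main.arn
  validation_record_fqdns = [for record in aws_route53_record.cert_validation : record.fqdn]
}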
ALB with HTTPS
# Terraform - ALB with HTTPS
resource "aws_lb" "main" {
  name               = "main-alb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb.id]
  subnets            = aws_subnet.public[*].id
}

resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.main.arn
  port              = 443
  protocol          = "HTTPS"
  # TLS 1.2/1.3 only; the older ELBSecurityPolicy-2016-08 still accepts TLS 1.0
  ssl_policy        = "ELBSecurityPolicy-TLS13-1-2-2021-06"
  certificate_arn   = aws_acm_certificate.main.arn

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.main.arn
  }
}
Automated Renewal Scripts
Renew Certificates Script
#!/usr/bin/env python3
"""Certificate expiry monitoring and alerting for ACM."""
import datetime
import logging
from dataclasses import dataclass

import boto3

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


@dataclass
class Certificate:
    arn: str
    domain: str
    expires: datetime.datetime


def get_expiring_certificates(days=30):
    """Find certificates expiring within specified days"""
    client = boto3.client('acm')
    now = datetime.datetime.now(datetime.timezone.utc)
    expiring = []
    # Paginate so accounts with many certificates are fully covered
    paginator = client.get_paginator('list_certificates')
    for page in paginator.paginate():
        for cert in page['CertificateSummaryList']:
            detail = client.describe_certificate(
                CertificateArn=cert['CertificateArn']
            )['Certificate']
            if detail['Status'] != 'ISSUED':
                continue
            not_after = detail['NotAfter']
            days_until_expiry = (not_after - now).days
            if days_until_expiry <= days:
                expiring.append(Certificate(
                    arn=cert['CertificateArn'],
                    domain=cert['DomainName'],
                    expires=not_after
                ))
    return expiring


def send_alert(certs):
    """Send alert about expiring certificates"""
    if not certs:
        return
    message = "Expiring certificates:\n"
    for cert in certs:
        message += f"- {cert.domain} expires {cert.expires}\n"
    # Send via SNS (placeholder topic ARN; replace with your own)
    sns = boto3.client('sns')
    sns.publish(
        TopicArn='arn:aws:sns:us-east-1:123456789012:alerts',
        Subject='Certificate Expiry Alert',
        Message=message
    )


def main():
    expiring = get_expiring_certificates(days=30)
    if expiring:
        logger.warning(f"Found {len(expiring)} expiring certificates")
        send_alert(expiring)
    else:
        logger.info("No certificates expiring soon")


if __name__ == "__main__":
    main()
cert-manager Renewal Monitor
# Prometheus alerts for cert-manager
groups:
  - name: certificate-expiry
    rules:
      - alert: CertManagerCertificateExpiry
        expr: |
          certmanager_certificate_expiration_timestamp_seconds - time() < 604800
        for: 1h
        labels:
          severity: warning
        annotations:
          summary: "Certificate expiring in less than 7 days"
      - alert: CertManagerCertificateExpired
        expr: |
          certmanager_certificate_expiration_timestamp_seconds - time() < 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Certificate has expired"
DNSSEC
Enable DNSSEC on Route53
# Terraform - DNSSEC configuration
resource "aws_route53_zone" "main" {
  name = "example.com"
}

# The KMS key must be an asymmetric ECC_NIST_P256 signing key in us-east-1
resource "aws_route53_key_signing_key" "main" {
  name                       = "example-com-key"
  hosted_zone_id             = aws_route53_zone.main.id
  key_management_service_arn = aws_kms_key.dnssec.arn
}

resource "aws_route53_hosted_zone_dnssec" "main" {
  depends_on     = [aws_route53_key_signing_key.main]
  hosted_zone_id = aws_route53_key_signing_key.main.hosted_zone_id
}
Cloudflare DNSSEC
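# Enable DNSSEC via Cloudflare API
# (uses the legacy python-cloudflare 2.x client: pip install "cloudflare<3")
import CloudFlare


def enable_dnssec(zone_id, zone_name):
    # Credentials are read from environment variables or the client's config file
    client = CloudFlare.CloudFlare()
    # Activate DNSSEC; the response includes the DS record for the zone
    dnssec = client.zones.dnssec.patch(zone_id, data={
        'status': 'active'
    })
    # Create DS record in parent zone
    # (This must be done at your registrar)
    print(f"DS record for {zone_name}: {dnssec['ds']}")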
Traffic Routing Patterns
Weighted Routing
# Terraform - Weighted routing
resource "aws_route53_record" "api-v1" {
  zone_id         = aws_route53_zone.main.zone_id
  name            = "api.example.com"
  type            = "A"
  set_identifier  = "v1"
  health_check_id = aws_route53_health_check.v1.id

  weighted_routing_policy {
    weight = 90
  }

  # Alias records take no ttl or records arguments
  alias {
    name                   = "v1-alb.example.com"
    zone_id                = "ZONE_ID"
    evaluate_target_health = true
  }
}

resource "aws_route53_record" "api-v2" {
  zone_id         = aws_route53_zone.main.zone_id
  name            = "api.example.com"
  type            = "A"
  set_identifier  = "v2"
  health_check_id = aws_route53_health_check.v2.id

  weighted_routing_policy {
    weight = 10
  }

  alias {
    name                   = "v2-alb.example.com"
    zone_id                = "ZONE_ID"
    evaluate_target_health = true
  }
}
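The weights above are relative: each record receives its weight divided by the sum of all weights in the set. Here is a small sketch of that calculation with a seeded simulation. The `traffic_shares` helper is illustrative and not an AWS API, and real traffic will deviate from these ratios because resolvers cache answers for the record's TTL.

```python
import random
from collections import Counter


def traffic_shares(weights: dict[str, int]) -> dict[str, float]:
    """Return each record's expected share of DNS responses."""
    total = sum(weights.values())
    if total == 0:
        # Route 53 treats an all-zero set as equal probability
        return {name: 1 / len(weights) for name in weights}
    return {name: weight / total for name, weight in weights.items()}


weights = {"v1": 90, "v2": 10}
print(traffic_shares(weights))  # {'v1': 0.9, 'v2': 0.1}

# Simulate 10,000 DNS answers with a fixed seed for repeatability
rng = random.Random(42)
answers = rng.choices(list(weights), weights=list(weights.values()), k=10_000)
print(Counter(answers))
```

Shifting a canary from 10% to 50% is then a matter of changing both weights to the same value; the absolute numbers do not matter, only their ratio.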
Latency-Based Routing
resource "aws_route53_record" "us-east" {
  zone_id        = aws_route53_zone.main.zone_id
  name           = "api.example.com"
  type           = "A"
  set_identifier = "us-east"

  latency_routing_policy {
    region = "us-east-1"
  }

  alias {
    name                   = "alb-us-east.example.com"
    zone_id                = "ZONE_ID"
    evaluate_target_health = true
  }
}

resource "aws_route53_record" "eu-west" {
  zone_id        = aws_route53_zone.main.zone_id
  name           = "api.example.com"
  type           = "A"
  set_identifier = "eu-west"

  latency_routing_policy {
    region = "eu-west-1"
  }

  alias {
    name                   = "alb-eu-west.example.com"
    zone_id                = "ZONE_ID"
    evaluate_target_health = true
  }
}
Best Practices
DNS Best Practices
- Use multiple nameservers for redundancy
- Enable DNSSEC for security
- Set appropriate TTLs (shorter for dynamic records)
- Use ALIAS records instead of CNAMEs at the zone apex
- Monitor DNS resolution latency
- Enable response rate limiting or provider-level DDoS protection on authoritative servers
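The apex rule in the list above exists because a CNAME cannot share a name with other record types, and the apex always carries SOA and NS records. A check like the following could run in CI against a zone definition before it is applied. It is a minimal sketch: `find_cname_conflicts` and the tuple format are assumptions, and it ignores DNSSEC record types, which are permitted next to a CNAME.

```python
from collections import defaultdict


def find_cname_conflicts(records: list[tuple[str, str]]) -> list[str]:
    """Return names where a CNAME coexists with any other record type.

    records is a list of (name, type) pairs, e.g. ("www.example.com", "CNAME").
    """
    types_by_name = defaultdict(set)
    for name, rtype in records:
        types_by_name[name.rstrip(".").lower()].add(rtype.upper())
    return sorted(
        name for name, types in types_by_name.items()
        if "CNAME" in types and len(types) > 1
    )


zone = [
    ("example.com", "SOA"),
    ("example.com", "NS"),
    ("example.com", "CNAME"),      # invalid: apex already has SOA and NS
    ("www.example.com", "CNAME"),  # fine: no other records at this name
    ("api.example.com", "A"),
]
print(find_cname_conflicts(zone))  # ['example.com']
```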
Certificate Best Practices
- Use short certificate validity (90 days for Let’s Encrypt)
- Automate renewal (at least 14 days before expiry)
- Use DNS-01 challenge for wildcard certificates
- Monitor certificate expiration proactively
- Store certificates in secrets, not configmaps
- Use dedicated certificates per service
Security
# CAA record - restrict certificate authorities
resource "aws_route53_record" "caa" {
  zone_id = aws_route53_zone.main.zone_id
  name    = "example.com"
  type    = "CAA"
  ttl     = 3600
  # "issuewild ;" forbids all wildcard issuance. If you request wildcard
  # certificates, or use another CA such as ACM (amazon.com), list it here.
  # The iodef address is a placeholder.
  records = [
    "0 issue \"letsencrypt.org\"",
    "0 issuewild \";\"",
    "0 iodef \"mailto:security@example.com\""
  ]
}

# DMARC policy record (the rua address is a placeholder)
resource "cloudflare_record" "dmarc" {
  zone_id = cloudflare_zone.example.id
  name    = "_dmarc"
  value   = "v=DMARC1; p=quarantine; rua=mailto:dmarc@example.com"
  type    = "TXT"
}
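To see what the CAA records above allow, the sketch below applies a simplified version of the RFC 8659 rules: `issuewild` takes precedence over `issue` for wildcard names, and a value of `;` permits no CA. The `ca_permitted` function is an illustrative assumption, the iodef address is a placeholder, and the sketch skips critical flags and the climb to parent domains that a real CA performs.

```python
import shlex


def ca_permitted(caa_records: list[str], ca_domain: str, wildcard: bool = False) -> bool:
    """Decide whether ca_domain may issue, given CAA records in zone-file form."""
    parsed = []
    for record in caa_records:
        _flags, tag, value = shlex.split(record)
        # Drop any parameters after the issuer domain, e.g. "ca.example; account=1"
        parsed.append((tag.lower(), value.split(";")[0].strip().lower()))

    issue = [value for tag, value in parsed if tag == "issue"]
    issuewild = [value for tag, value in parsed if tag == "issuewild"]
    relevant = issuewild if (wildcard and issuewild) else issue
    if not relevant:
        return True  # no applicable property: issuance is unrestricted
    return ca_domain.lower() in relevant


records = [
    '0 issue "letsencrypt.org"',
    '0 issuewild ";"',
    '0 iodef "mailto:security@example.com"',
]
print(ca_permitted(records, "letsencrypt.org"))                 # True
print(ca_permitted(records, "letsencrypt.org", wildcard=True))  # False
print(ca_permitted(records, "amazon.com"))                      # False
```

The second and third results matter for this guide: with these records in place, neither a Let's Encrypt wildcard certificate nor an ACM certificate for the same domain can be issued until the record set is widened.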
Monitoring
# DNS monitoring
# Metric names are placeholders; substitute the ones your DNS exporter exposes
groups:
  - name: dns
    rules:
      - alert: HighDNSLatency
        expr: histogram_quantile(0.95, sum by (le) (rate(dns_query_duration_seconds_bucket[5m]))) > 0.5
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High DNS latency"
      - alert: DNSErrors
        expr: rate(dns_query_errors_total[5m]) > 10
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High DNS error rate"
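The latency alert above fires on the 95th percentile of query duration. If no exporter is available yet, a rough client-side measurement can be taken with the standard library, as in this sketch. The `time_resolution` and `p95` helpers are assumptions for illustration, and results from the system resolver include local caching, so they understate the latency of the authoritative servers.

```python
import socket
import time
from typing import Callable


def time_resolution(hostname: str, resolve: Callable = socket.getaddrinfo) -> float:
    """Return the seconds taken to resolve hostname with the given resolver."""
    start = time.perf_counter()
    resolve(hostname, None)
    return time.perf_counter() - start


def p95(samples: list[float]) -> float:
    """Nearest-rank 95th percentile of a non-empty sample list."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) using integers
    return ordered[rank - 1]


if __name__ == "__main__":
    # Performs real lookups through the system resolver
    samples = [time_resolution("example.com") for _ in range(20)]
    print(f"p95 resolution time: {p95(samples):.3f}s (alert threshold: 0.5s)")
```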
Conclusion
Automated DNS and certificate management is essential:
- Use cert-manager for Kubernetes certificate automation
- Use Route53 or Cloudflare for DNS management as code
- Enable DNS-01 challenges for wildcard certificates
- Implement DNSSEC for domain security
- Monitor certificate expiration proactively
- Use traffic routing features for blue-green and canary deployments
Start with automated certificates, then add DNS automation.