Automating SSL Certificate Renewal


SSL/TLS certificates are the backbone of secure web communications, ensuring encrypted data transfer and user trust. However, managing them manually—especially in dynamic environments like Kubernetes (K8s) clusters running on AWS EC2—can be a nightmare. Renewals every 90 days, generating CSRs, dealing with certificate authorities (CAs), and updating services across load balancers? It’s error-prone, time-consuming, and risky.

In this post, we’ll walk through automating this process using Cert-Manager on Kubernetes, integrated with Let’s Encrypt for free, automated certificates. This setup works whether you’re running a self-managed K8s cluster on EC2 instances or using Amazon EKS (where AWS manages the control plane and your worker nodes run on EC2). The result: zero-touch renewals that start about 30 days before expiry (around day 60 of a 90-day certificate), scalable to dozens of certificates, and massive time savings.

Why Automate? The Hidden Costs of Manual Renewal

Before diving in, let’s quantify the pain. A typical manual renewal cycle might look like this:

| Step                  | Time Estimate | Pain Points            |
|-----------------------|---------------|------------------------|
| Generate CSR          | 15 min        | Key management errors  |
| Submit to CA          | 30 min        | Approval delays        |
| Wait for approval     | Variable      | Vendor dependencies    |
| Update load balancers | 45 min        | Downtime risk          |
| Test in production    | 30 min        | Rollback complexity    |
| Total per cert        | 2 hours       | Human error, reminders |

For 47 certificates renewed quarterly, that’s ~376 hours annually—equivalent to nearly 10 full work weeks for one engineer. Scale that across teams, and it’s a productivity black hole.
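The arithmetic behind that figure can be reproduced with a few lines of shell. This is only a sketch using the numbers from the table; the variable names are illustrative, and the "Variable" approval wait is left out because it has no fixed estimate.

```shell
#!/usr/bin/env bash
# Annual manual-renewal effort, using the per-step estimates from the table.
certs=47                                     # certificates under management
renewals_per_year=4                          # quarterly renewal
minutes_per_renewal=$((15 + 30 + 45 + 30))   # CSR + submit + LB update + test

annual_hours=$((certs * renewals_per_year * minutes_per_renewal / 60))
echo "Annual effort: ${annual_hours} hours"
```

Swapping in your own certificate count and step estimates gives a quick first-order figure for your environment.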

How Severe Is It If You Ignore This?

Ignoring automation isn’t just inefficient; it’s a ticking time bomb. Here’s why it’s severe:

  • Security Vulnerabilities: Expired certificates trigger browser warnings (e.g., “Your connection is not private”), eroding user trust. Worse, users and API clients that get used to clicking through warnings or disabling certificate checks become easy targets for man-in-the-middle attacks, which can lead to data breaches. In regulated industries (e.g., finance, healthcare), lapsed certificates can also put you out of compliance with standards like PCI-DSS or HIPAA, inviting fines that can reach $50,000 per violation.

  • Operational Downtime: Services behind invalid certs fail outright—APIs reject requests, websites go dark. A single expired cert on a load-balanced EC2 fleet can cascade to cluster-wide outages. In cloud-native setups, this hits harder: Pods restart, ingresses misroute, and autoscaling panics.

  • Scalability Nightmares: As your K8s cluster grows (e.g., more EC2 node groups), manual processes don’t scale. One forgotten renewal? Zero uptime for that service. Historical data shows 20-30% of websites run expired certs at any time, correlating with 15-20% revenue dips from lost traffic.

  • Opportunity Cost: Engineers waste hours on rote tasks instead of innovation. Annually, that’s 30+ hours per team member—time better spent on features or optimizations.

Bottom line: In a world of zero-downtime expectations, expired certs aren’t “embarrassing”; they’re catastrophic. Automation isn’t optional—it’s table stakes for reliability.

Prerequisites: Setting Up Your EC2-Based K8s Environment

Assume you’re running a self-managed K8s cluster on EC2 (e.g., via kubeadm) or EKS. Key requirements:

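  • K8s Version: One supported by your Cert-Manager release. The v1.15 line installed below targets roughly Kubernetes 1.25 and newer; check Cert-Manager’s supported-releases page for the exact bounds.
  • EC2 Setup: t3.medium+ instances with outbound HTTPS access to Let’s Encrypt ACME servers (security group egress plus a NAT or internet gateway route). Security groups must also allow inbound port 80 for HTTP-01 challenges and port 443 for TLS traffic.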
  • Ingress Controller: NGINX Ingress or ALB Ingress for exposing services (Cert-Manager integrates here for auto-TLS).
  • Helm: For easy Cert-Manager install (v3+).
  • Domain Control: A registered domain with DNS pointing to your EC2 load balancer (e.g., ELB/ALB).

Install kubectl and Helm on your local machine or bastion EC2 instance:

curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl

curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash

Point kubectl to your cluster:

kubectl config use-context your-ec2-k8s-context

Step-by-Step: Automating with Cert-Manager and Let’s Encrypt

Cert-Manager is a K8s-native controller that automates the certificate lifecycle using the ACME protocol (the standard Let’s Encrypt implements). It provisions, renews, and stores certs as K8s secrets.

Step 1: Install Cert-Manager

Add the Jetstack Helm repo and install:

helm repo add jetstack https://charts.jetstack.io
helm repo update
helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --version v1.15.3 \
  --set crds.enabled=true

Verify:

kubectl get pods --namespace cert-manager

Expect cert-manager-* pods running.
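Rather than polling the pod list by eye, the readiness check can be scripted. The following is a minimal sketch, assuming the release was installed into the cert-manager namespace as above; the function name wait_for_cert_manager and the 180-second timeout are illustrative choices, not something Cert-Manager prescribes.

```shell
#!/usr/bin/env bash
# Block until cert-manager is ready to accept Certificate resources.
# Assumes the "cert-manager" namespace unless another is passed as $1.
wait_for_cert_manager() {
  local ns="${1:-cert-manager}"
  # Every Deployment in the namespace must report the Available condition.
  kubectl wait deployment --all --namespace "$ns" \
    --for=condition=Available --timeout=180s || return 1
  # The Certificate CRD must exist before any Certificate can be applied.
  kubectl get crd certificates.cert-manager.io >/dev/null
}

# Usage (needs cluster access):
#   wait_for_cert_manager && echo "cert-manager is ready"
```

Running this in a provisioning script before applying issuers avoids racing the webhook while it is still starting.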

Step 2: Configure a ClusterIssuer for Let’s Encrypt

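A ClusterIssuer defines how to request certs from Let’s Encrypt. Use the HTTP-01 challenge, which Cert-Manager solves by serving a token from a temporary web server on port 80. It fits EC2 setups where an internet-facing load balancer forwards port 80 to your ingress controller.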

Create letsencrypt-prod.yaml:

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory  # Production
    email: your-admin@yourdomain.com  # ACME account contact address
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
    - http01:
        ingress:
          class: nginx  # Or 'alb' for AWS ALB Ingress

Apply:

kubectl apply -f letsencrypt-prod.yaml

For staging (test first, avoids rate limits): Replace server with https://acme-staging-v02.api.letsencrypt.org/directory.
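A staging issuer might look like the sketch below, which writes the manifest to a file so it can be reviewed before applying. The issuer name letsencrypt-staging and the file name are assumptions made for illustration; only the server URL differs in substance from the production issuer.

```shell
#!/usr/bin/env bash
# Write a staging ClusterIssuer manifest next to the production one.
cat > letsencrypt-staging.yaml <<'EOF'
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging        # illustrative name
spec:
  acme:
    server: https://acme-staging-v02.api.letsencrypt.org/directory  # Staging
    email: your-admin@yourdomain.com
    privateKeySecretRef:
      name: letsencrypt-staging    # keep the staging account key separate
    solvers:
    - http01:
        ingress:
          class: nginx
EOF
echo "wrote letsencrypt-staging.yaml"
# Then: kubectl apply -f letsencrypt-staging.yaml
```

Staging certificates are not trusted by browsers, so switch issuerRef back to letsencrypt-prod once issuance works end to end.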

Step 3: Request Your First Certificate

Define a Certificate resource. This tells Cert-Manager to issue/renew a cert for your domain.

Create example-cert.yaml:

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: example-tls
  namespace: default  # Your app namespace
spec:
  secretName: example-tls-secret  # K8s secret to store cert
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  commonName: yourdomain.com
  dnsNames:
  - yourdomain.com
  - www.yourdomain.com
  duration: 2160h  # 90 days
  renewBefore: 720h  # Renew 30 days before expiry (around day 60)

Apply:

kubectl apply -f example-cert.yaml

Cert-Manager will:

  • Generate a private key.
  • Submit CSR to Let’s Encrypt via ACME.
  • Solve challenge by creating a temporary ingress/pod.
  • Store PEM cert/key in the named secret.
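Renewal timing follows from two fields in the Certificate spec: Cert-Manager starts renewing renewBefore ahead of the certificate’s expiry, so the renewal day is duration minus renewBefore. The helper below is a small sketch of that arithmetic; renewal_day is a hypothetical name, and both arguments are in hours to match the spec.

```shell
#!/usr/bin/env bash
# Day of the certificate's life on which renewal starts:
# (duration - renewBefore), converted from hours to whole days.
renewal_day() {
  local duration_h=$1 renew_before_h=$2
  echo $(( (duration_h - renew_before_h) / 24 ))
}

renewal_day 2160 720   # 90-day cert, renewBefore 720h -> prints 60
renewal_day 2160 360   # 90-day cert, renewBefore 360h -> prints 75
```

In other words, a 360h value renews on day 75, while renewing on day 60 needs 720h.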

Check status:

kubectl describe certificate example-tls -n default
kubectl get secret example-tls-secret -n default -o yaml
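To go one step beyond describing the resource, the stored certificate itself can be checked for upcoming expiry. The sketch below assumes openssl is installed; expires_within_days is a hypothetical helper that reads a PEM certificate on stdin. Note that unparseable input is also reported as "expiring", which errs on the side of raising an alarm.

```shell
#!/usr/bin/env bash
# Exit 0 when the PEM certificate on stdin expires within N days, else exit 1.
expires_within_days() {
  local days=$1
  # openssl's -checkend exits 0 if the cert is still valid N seconds from now.
  if openssl x509 -noout -checkend "$(( days * 86400 ))" >/dev/null 2>&1; then
    return 1   # still valid after the window
  fi
  return 0     # expires (or could not be parsed) within the window
}

# Usage (needs cluster access):
#   kubectl get secret example-tls-secret -n default -o jsonpath='{.data.tls\.crt}' \
#     | base64 -d | expires_within_days 30 && echo "renewal looks overdue"
```

With a 30-day renewal window, a healthy certificate should rarely have fewer than about 30 days left, so a hit here suggests renewal is failing.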

Step 4: Integrate with Ingress for Auto-TLS

Update your Ingress to use the secret:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress
  namespace: default
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod  # Lets Cert-Manager create the Certificate from this Ingress; drop it if Step 3 already manages the same secret
spec:
  ingressClassName: nginx  # Or alb
  tls:
  - hosts:
    - yourdomain.com
    secretName: example-tls-secret
  rules:
  - host: yourdomain.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: your-service
            port:
              number: 80

Apply and verify: curl -v https://yourdomain.com (without -k, which would skip certificate verification) should complete the TLS handshake against a valid cert.
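To see exactly which certificate the ingress is serving, openssl can print its subject, issuer, and validity dates. This is a sketch that assumes openssl is installed and port 443 is reachable; show_served_cert is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print subject, issuer, and validity dates of the cert served for a host.
show_served_cert() {
  local host=$1
  # -servername sends SNI so the ingress selects the matching certificate.
  openssl s_client -connect "${host}:443" -servername "$host" </dev/null 2>/dev/null \
    | openssl x509 -noout -subject -issuer -dates
}

# Usage (needs network access to the ingress):
#   show_served_cert yourdomain.com
```

With the production issuer, the issuer line should name Let’s Encrypt and notAfter should sit roughly 90 days after issuance.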

For EC2-specific tweaks:

  • If using ALB Ingress, make sure the load balancer is internet-facing so Let’s Encrypt can reach port 80 for HTTP-01 challenges.
  • Scale to multiple certs: Repeat Step 3 per domain/namespace. Cert-Manager handles 100+ effortlessly.

Step 5: Handle EC2 Load Balancer Updates

In a self-managed K8s on EC2:

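  • Use AWS Load Balancer Controller (install via Helm: helm repo add eks https://aws.github.io/eks-charts, then helm install aws-load-balancer-controller eks/aws-load-balancer-controller -n kube-system --set clusterName=<your-cluster>).
  • If mixing with AWS ACM, reference the ACM certificate by ARN: alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:... on ALB ingresses, or service.beta.kubernetes.io/aws-load-balancer-ssl-cert: arn:aws:acm:... on LoadBalancer Services. For pure automation, stick to K8s secrets.
  • For renewals: Cert-Manager updates the secret; NGINX Ingress reloads TLS automatically, with no manual LB config.

In EKS: Nodes run on EC2, but the control plane is managed. Note that ALBs terminate TLS with ACM certificates and do not read K8s TLS secrets, so to serve Cert-Manager certificates, terminate TLS at NGINX Ingress (for example, behind an NLB that passes TCP through).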

Testing and Monitoring

  • Test Renewal: Trigger a manual renewal with Cert-Manager’s cmctl CLI: cmctl renew example-tls -n default. Monitor logs: kubectl logs -n cert-manager -l app=cert-manager.
  • Production Dry Run: Use staging issuer first.
  • Observability: Integrate Prometheus (scrape Cert-Manager metrics) for alerts on failed renewals. Watch the Ready condition on each Certificate.
  • Edge Cases: Wildcard certs (*.yourdomain.com) require a DNS-01 solver, for example with Route53. Multi-cluster? Consider Gateway API.
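As a starting point for the alerting mentioned above, the sketch below writes a plain Prometheus rules file built on two metrics Cert-Manager exports. The file name, alert names, thresholds, and severity labels are assumptions to adapt, and the rules only fire if Prometheus is already scraping Cert-Manager.

```shell
#!/usr/bin/env bash
# Write a Prometheus rules file for certificate expiry and readiness.
cat > cert-manager-alerts.yaml <<'EOF'
groups:
- name: cert-manager
  rules:
  - alert: CertificateExpiringSoon      # illustrative name and threshold
    expr: certmanager_certificate_expiration_timestamp_seconds - time() < 14 * 24 * 3600
    for: 1h
    labels:
      severity: warning
    annotations:
      summary: "Certificate {{ $labels.namespace }}/{{ $labels.name }} expires in under 14 days"
  - alert: CertificateNotReady
    expr: certmanager_certificate_ready_status{condition="False"} == 1
    for: 15m
    labels:
      severity: critical
    annotations:
      summary: "Certificate {{ $labels.namespace }}/{{ $labels.name }} is not Ready"
EOF
echo "wrote cert-manager-alerts.yaml"
```

The 14-day threshold sits well inside a 30-day renewal window, so the first alert should only fire after renewal has been failing for a couple of weeks.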

Real-World Impact: From Chaos to Calm

Post-automation:

  • Renewals: Trigger around day 60 of the 90-day lifetime, complete in minutes.
  • Scale: Manages 47+ certs across namespaces.
  • Savings: ~188 engineer hours/year (2 hrs x 4 renewals x 47 certs / 2 for efficiency gains).
  • Reliability: Zero expiries, 99.99% uptime.

This setup transformed our EC2 K8s fleet from a renewal roulette to a set-it-and-forget-it powerhouse. Start small—one cert—and scale. Questions? Drop a comment below.

Resources: Cert-Manager Docs, Let’s Encrypt.

Cheers,

Sim