Dishant Sethi · Apr 5, 2026 · 13 min read

How to Deploy on Air-Gapped AWS EKS for Regulated Financial Services

A practical engineering guide to deploying containerized applications on AWS EKS inside a fully air-gapped VPC — covering network isolation, private registries, CI/CD pipelines, and secrets management for regulated financial services environments.

Financial services data breaches cost an average of $6.08 million per incident, 22% above the global average across all industries (IBM Cost of a Data Breach Report 2024). For regulated institutions, the answer isn't just better firewalls. It's network architecture that shrinks the attack surface at the infrastructure level.

Air-gapped AWS EKS deployments — where private subnets have zero internet egress and all traffic routes through VPC endpoints — are becoming the standard for regulated financial services workloads. This guide walks through the full architecture, from VPC design to CI/CD pipeline, based on a real deployment we executed for a fintech platform at Prodinit.

Key Takeaways
Air-gapped EKS requires VPC interface and gateway endpoints for every AWS service your workloads touch — there's no fallback to the public internet
Your CI/CD pipeline must be redesigned from scratch: images push to private ECR via VPC endpoint, and deployment runs through Systems Manager or a bastion inside the VPC
Kubernetes External Secrets Operator + AWS Secrets Manager is the cleanest pattern for pod-level secret injection without exposing credentials in manifests
RDS and ElastiCache must live in private subnets, accessed via security group rules; DynamoDB has no subnet placement and is reached through its gateway endpoint. Workloads never use a public endpoint

What Does "Air-Gapped" Mean in AWS Context?

An air-gapped VPC means your private subnets have no route to the internet: their route tables contain no default route to a NAT gateway or an internet gateway. All communication between your workloads and AWS services (S3, ECR, Secrets Manager, CloudWatch, Bedrock) must route through VPC endpoints.

AWS supports two endpoint types (AWS VPC Endpoints documentation):

Gateway endpoints — for S3 and DynamoDB only; free, added as route table entries
Interface endpoints — for all other AWS services via AWS PrivateLink; billed per hour per AZ

Endpoint Type	Services	Cost	Configuration
Gateway	S3, DynamoDB only	Free	Route table entry — no ENI, no AZ cost
Interface (PrivateLink)	ECR, Secrets Manager, SSM, CloudWatch, STS, ELB, Bedrock, Transcribe, and more	$0.01/hr per AZ + $0.01/GB processed	Elastic network interface in each private subnet

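For a typical EKS deployment, you'll need endpoints for: ECR API, ECR Docker, S3 (gateway or interface; ECR stores image layers in S3, so image pulls depend on it), EC2 (the VPC CNI plugin calls it to attach network interfaces), Secrets Manager, Systems Manager, CloudWatch Logs, STS, ELB, Auto Scaling (if running cluster-autoscaler), Bedrock (if using AI services), and Transcribe (if using speech-to-text).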
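One way to keep this list honest is to compare it against what is actually provisioned. The sketch below is a minimal, offline example: `REQUIRED_SERVICES` is an illustrative starting list (not exhaustive), and `provisioned` is assumed to be the set of service names you collected from your own endpoint inventory.

```python
REGION = "us-east-1"  # region used throughout this guide

# Illustrative starting list of short service names (an assumption, not
# exhaustive). Extend it with every service your workloads call.
REQUIRED_SERVICES = [
    "ecr.api", "ecr.dkr", "s3", "secretsmanager", "ssm",
    "logs", "sts", "elasticloadbalancing",
]


def endpoint_service_name(service: str, region: str = REGION) -> str:
    """Return the full endpoint service name, e.g. com.amazonaws.us-east-1.sts."""
    return f"com.amazonaws.{region}.{service}"


def missing_endpoints(provisioned, required=REQUIRED_SERVICES, region=REGION):
    """List required endpoint service names that are absent from `provisioned`."""
    return [
        endpoint_service_name(service, region)
        for service in required
        if endpoint_service_name(service, region) not in provisioned
    ]


if __name__ == "__main__":
    have = {
        "com.amazonaws.us-east-1.ecr.api",
        "com.amazonaws.us-east-1.ecr.dkr",
        "com.amazonaws.us-east-1.s3",
    }
    for name in missing_endpoints(have):
        print("missing:", name)
```

Running a check like this before the first deployment catches the case where pods start fine and then fail on their first API call.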

Why This Matters for Regulated Workloads

Network isolation is a hard requirement under FFIEC guidelines and SEC cybersecurity rules for financial institutions. An air-gapped VPC enforces this at the infrastructure layer — there's no misconfigured security group that can accidentally allow outbound internet access, because the route simply doesn't exist.

On the client deployment, we discovered mid-project that several Helm charts we'd planned to use for in-cluster controllers (AWS Load Balancer Controller, cluster-autoscaler) default to images hosted in public registries, which nodes in the private subnets cannot reach. We had to mirror every controller image into private ECR before the cluster could bootstrap. This is a class of problem that only surfaces when you actually try to deploy, not in planning.

How Do You Design a Zero-Egress VPC for AWS EKS?

Regulated financial services environments under FFIEC and SEC cybersecurity guidelines require network isolation enforced at the infrastructure level — not as a policy overlay but as a structural property of the network. A multi-AZ VPC with private subnets carrying no internet route, combined with VPC interface endpoints for every AWS service, eliminates the outbound internet path entirely rather than restricting it.

Start with a multi-AZ VPC with distinct public and private subnet tiers.

Subnet Design

VPC: 10.0.0.0/16
├── Public Subnets (one per AZ)
│   ├── 10.0.1.0/24 (us-east-1a)
│   ├── 10.0.2.0/24 (us-east-1b)
│   └── NAT Gateways (for public-subnet resources only)
└── Private Subnets (one per AZ)
    ├── 10.0.10.0/24 (us-east-1a)
    └── 10.0.11.0/24 (us-east-1b)
        — No internet route in route table
        — VPC endpoints attached

Private subnet route tables should contain only two kinds of entries: the local route for the VPC CIDR, and gateway endpoint routes (prefix-list destinations) for S3 and DynamoDB. Nothing else.
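The rule is easy to check mechanically. The sketch below assumes each route table is a dict shaped like one entry of the `RouteTables` list returned by `aws ec2 describe-route-tables`; the sample IDs are hypothetical.

```python
def internet_routes(route_table: dict) -> list:
    """Return the routes that would give a subnet a path to the internet."""
    offending = []
    for route in route_table.get("Routes", []):
        gateway_id = route.get("GatewayId", "")
        if (
            gateway_id.startswith("igw-")               # internet gateway
            or route.get("NatGatewayId")                # NAT gateway
            or route.get("EgressOnlyInternetGatewayId") # IPv6 egress-only gateway
        ):
            offending.append(route)
    return offending


if __name__ == "__main__":
    private_rt = {
        "Routes": [
            {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"},
            {"DestinationPrefixListId": "pl-63a5400a", "GatewayId": "vpce-0abc"},
        ]
    }
    print("offending routes:", internet_routes(private_rt))
```

A check like this fits well in a scheduled job or a pipeline gate, so a route added by mistake is flagged rather than discovered in an audit.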

VPC Endpoints to Provision

Create interface endpoints in your private subnets for each required AWS service:

# ECR — required for image pulls
aws ec2 create-vpc-endpoint --vpc-id vpc-xxx \
  --service-name com.amazonaws.us-east-1.ecr.api \
  --vpc-endpoint-type Interface \
  --subnet-ids subnet-xxx subnet-yyy \
  --security-group-ids sg-endpoints

aws ec2 create-vpc-endpoint --vpc-id vpc-xxx \
  --service-name com.amazonaws.us-east-1.ecr.dkr \
  --vpc-endpoint-type Interface \
  --subnet-ids subnet-xxx subnet-yyy \
  --security-group-ids sg-endpoints

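# The remaining endpoints use the same command with these service names:
# Secrets Manager: com.amazonaws.us-east-1.secretsmanager
# Systems Manager (for bastion-free access): com.amazonaws.us-east-1.ssm,
#   plus com.amazonaws.us-east-1.ssmmessages for Session Manager
# CloudWatch Logs: com.amazonaws.us-east-1.logs
# STS (required for IRSA): com.amazonaws.us-east-1.sts
# ELB (required for AWS Load Balancer Controller): com.amazonaws.us-east-1.elasticloadbalancing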

Create a dedicated security group for endpoints that allows HTTPS (443) inbound from your VPC CIDR. Don't open it wider than needed.
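To verify that the endpoint security group has not been opened wider than intended, a small check can be run over its ingress rules. This is a sketch, assuming each rule is a dict shaped like one `IpPermissions` entry from `aws ec2 describe-security-groups`; rules that reference other security groups instead of CIDR ranges are treated as not matching and would need their own review.

```python
import ipaddress


def rule_is_tight(rule: dict, vpc_cidr: str) -> bool:
    """True if an ingress rule allows only TCP 443 from inside the VPC CIDR."""
    vpc = ipaddress.ip_network(vpc_cidr)
    if rule.get("IpProtocol") != "tcp":
        return False
    if rule.get("FromPort") != 443 or rule.get("ToPort") != 443:
        return False
    ranges = rule.get("IpRanges", [])
    if not ranges:
        return False
    # Every allowed source range must sit inside the VPC CIDR
    return all(
        ipaddress.ip_network(r["CidrIp"]).subnet_of(vpc) for r in ranges
    )


if __name__ == "__main__":
    rule = {
        "IpProtocol": "tcp",
        "FromPort": 443,
        "ToPort": 443,
        "IpRanges": [{"CidrIp": "10.0.0.0/16"}],
    }
    print(rule_is_tight(rule, "10.0.0.0/16"))
```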

IAM Roles

Set up three distinct IAM role categories before touching EKS:

Developer access role — scoped to read operations, no production deploy permissions
CI/CD role — ECR push, EKS kubectl apply, Secrets Manager read
Node instance role — ECR pull, CloudWatch logging, S3 read for application buckets

Use IAM Roles for Service Accounts (IRSA) for in-cluster components. This ties Kubernetes service accounts to IAM roles without storing credentials anywhere.
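For illustration, the trust policy behind an IRSA role can be generated as below. This is a sketch: the account ID, OIDC provider host, namespace, and service account name in the usage section are hypothetical placeholders, and the OIDC provider value must match the issuer of your own cluster.

```python
import json


def irsa_trust_policy(account_id: str, oidc_provider: str,
                      namespace: str, service_account: str) -> dict:
    """Build a trust policy that lets one Kubernetes service account assume a role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Pin the role to a single namespace/service account pair
                        f"{oidc_provider}:sub": f"system:serviceaccount:{namespace}:{service_account}",
                        f"{oidc_provider}:aud": "sts.amazonaws.com",
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    policy = irsa_trust_policy(
        account_id="123456789012",  # placeholder account ID
        oidc_provider="oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE",  # placeholder issuer
        namespace="kube-system",
        service_account="cluster-autoscaler",
    )
    print(json.dumps(policy, indent=2))
```

Pinning the `sub` condition to one service account is what keeps a compromised pod in another namespace from assuming the same role.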

How Do You Run an EKS Cluster with No Public Internet Path?

As of 2024, 80% of organizations run Kubernetes in production (CNCF Annual Survey 2024). The hard part isn't Kubernetes — it's running it without any public internet path.

Cluster Creation

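When creating the EKS cluster, set the API server endpoint access to private only. In eksctl, endpoint access is configured through the vpc.clusterEndpoints block of a ClusterConfig file or updated after creation, rather than through create-cluster flags:

eksctl create cluster \
  --name fintech-prod \
  --region us-east-1 \
  --vpc-private-subnets subnet-xxx,subnet-yyy \
  --node-private-networking

# Switch the API endpoint to private-only once the cluster exists
# (older eksctl releases name this subcommand update-cluster-endpoints)
eksctl utils update-cluster-vpc-config \
  --cluster fintech-prod \
  --private-access=true \
  --public-access=false \
  --approve

With public access disabled, kubectl only works from inside the VPC. This is intentional. Access the cluster via a bastion host or AWS Systems Manager Session Manager.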

Node Groups

Place node groups in private subnets with autoscaling enabled:

# nodegroup.yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: fintech-prod
  region: us-east-1
managedNodeGroups:
  - name: workers
    instanceTypes: ["m6i.xlarge", "m6i.2xlarge"]
    minSize: 2
    maxSize: 10
    desiredCapacity: 3
    privateNetworking: true
    iam:
      withAddonPolicies:
        autoScaler: true
        cloudWatch: true

Bootstrapping In-Cluster Controllers

This is where most air-gapped deployments stall. cluster-autoscaler and the AWS Load Balancer Controller both try to pull images from public registries during helm install. You must mirror them to ECR first:

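# Create the target repository and log in to private ECR
# (run from a host that can reach both the public registry and ECR)
aws ecr create-repository --repository-name cluster-autoscaler --region us-east-1
aws ecr get-login-password --region us-east-1 | docker login \
  --username AWS --password-stdin 123456789.dkr.ecr.us-east-1.amazonaws.com

# Pull, retag, and push to private ECR
docker pull registry.k8s.io/autoscaling/cluster-autoscaler:v1.29.0
docker tag registry.k8s.io/autoscaling/cluster-autoscaler:v1.29.0 \
  123456789.dkr.ecr.us-east-1.amazonaws.com/cluster-autoscaler:v1.29.0
docker push 123456789.dkr.ecr.us-east-1.amazonaws.com/cluster-autoscaler:v1.29.0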

Override the image in Helm values to point to your private ECR before installing.
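When many images need mirroring, the mapping from public reference to private reference is worth scripting. The sketch below follows the retag pattern shown above (keep the final path component and tag). The `image.repository` / `image.tag` keys are an assumption that holds for many charts but should be checked against each chart's values file, and digest-pinned references (`@sha256:...`) are not handled.

```python
def to_private_ecr(image: str, account_id: str, region: str = "us-east-1") -> str:
    """Map a public image reference to its mirrored private ECR reference."""
    registry = f"{account_id}.dkr.ecr.{region}.amazonaws.com"
    # Keep only the last path component plus tag, e.g. cluster-autoscaler:v1.29.0
    name_and_tag = image.rsplit("/", 1)[-1]
    return f"{registry}/{name_and_tag}"


def helm_image_values(private_ref: str) -> dict:
    """Split a tagged reference into repository and tag for a Helm values override."""
    repository, _, tag = private_ref.rpartition(":")
    return {"image": {"repository": repository, "tag": tag}}


if __name__ == "__main__":
    ref = to_private_ecr(
        "registry.k8s.io/autoscaling/cluster-autoscaler:v1.29.0", "123456789"
    )
    print(ref)
    print(helm_image_values(ref))
```

Generating the override from the same function that drives the mirroring step keeps the two from drifting apart when a version is bumped.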

How Do You Build a CI/CD Pipeline Without Internet Access?

Standard GitHub Actions hosted runners and most CI/CD platforms assume outbound internet access for image pulls and API calls — assumptions that silently break the moment you remove internet egress. The working architecture requires a self-hosted runner deployed inside the VPC, all image pushes to private ECR via VPC endpoint, and all cluster deployments executed through AWS Systems Manager Session Manager with zero inbound ports.

GitHub Actions' hosted runners can't reach a private EKS API endpoint, and build jobs attached to the isolated subnets can't pull base images from Docker Hub. You need a fundamentally different pipeline architecture.

The pattern that works:

Developer pushes code
        ↓
GitHub Actions (or CodePipeline)
        ↓
Build image in CI environment (with internet access)
        ↓
Push image to Private ECR via VPC endpoint
        ↓
Trigger deployment (CodePipeline or self-hosted runner in VPC)
        ↓
kubectl/Helm apply via Systems Manager Session Manager
        ↓
EKS pulls image from private ECR (no internet needed)
        ↓
ALB routes traffic

Self-Hosted Runner in VPC

If using GitHub Actions, deploy a self-hosted runner inside the VPC. It can reach the private EKS API endpoint and ECR via VPC endpoints:

# .github/workflows/deploy.yml
name: deploy
on:
  push:
    branches: [main]  # assumed trigger; adjust to your branching model

permissions:
  id-token: write  # needed when the role is assumed through GitHub OIDC
  contents: read

jobs:
  deploy:
    runs-on: self-hosted  # runner inside VPC
    steps:
      - name: Check out the Helm chart
        uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789:role/cicd-deploy-role
          aws-region: us-east-1

      - name: Login to ECR
        run: |
          aws ecr get-login-password | docker login \
            --username AWS \
            --password-stdin 123456789.dkr.ecr.us-east-1.amazonaws.com

      - name: Deploy to EKS
        run: |
          aws eks update-kubeconfig --name fintech-prod --region us-east-1
          helm upgrade --install my-app ./helm/my-app \
            --set image.tag=${{ github.sha }}

First Deployment Validation

Don't call the pipeline "working" until you've traced the full path end to end: image push → ECR → EKS pod pull → running pod → ALB health check → live traffic. Each hop can fail independently in an air-gapped setup.
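A validation run can be structured as an ordered list of hop checks that stops at the first failure, which tells you where in the path to look. This is a minimal sketch: the hop names come from the path above, and the check functions are hypothetical stand-ins for whatever probes you use (an ECR image lookup, a pod status query, an ALB target health call).

```python
from typing import Callable, List, Optional, Tuple


def first_failing_hop(checks: List[Tuple[str, Callable[[], bool]]]) -> Optional[str]:
    """Run hop checks in order and return the name of the first that fails."""
    for name, check in checks:
        if not check():
            return name
    return None  # every hop passed


if __name__ == "__main__":
    # Placeholder checks; replace each lambda with a real probe.
    hops = [
        ("image pushed to ECR", lambda: True),
        ("pod pulled image", lambda: True),
        ("pod running", lambda: True),
        ("ALB health check passing", lambda: False),
        ("live traffic served", lambda: True),
    ]
    failed = first_failing_hop(hops)
    print("all hops OK" if failed is None else f"first failure: {failed}")
```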

How Should Data Stores Be Configured in an Air-Gapped VPC?

Data stores in a regulated EKS deployment should never be reachable from the internet. RDS PostgreSQL and ElastiCache Redis are provisioned in private subnets, and the RDS instance sets publicly_accessible = false explicitly at the resource level rather than relying on security group rules alone. Security groups can be modified; with the flag off, the instance gets no public IP address, so the exposure stays closed regardless of future policy drift. ElastiCache nodes are only reachable from inside the VPC, and DynamoDB, which is not placed in subnets at all, is reached through its gateway endpoint.

RDS PostgreSQL with pgvector

For AI-augmented fintech applications, pgvector enables vector similarity search inside Postgres — useful for semantic search over transaction data, document embeddings, or fraud pattern matching.

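# Terraform
resource "aws_db_instance" "postgres" {
  identifier                  = "fintech-postgres"
  engine                      = "postgres"
  engine_version              = "16.1"
  instance_class              = "db.r6g.large"
  allocated_storage           = 100         # GiB; assumed size
  username                    = "app_admin" # assumed name
  manage_master_user_password = true        # RDS keeps the master password in Secrets Manager
  multi_az                    = true
  db_subnet_group_name        = aws_db_subnet_group.private.name
  vpc_security_group_ids      = [aws_security_group.rds.id]
  publicly_accessible         = false
  storage_encrypted           = true

  # Custom parameter group for tuning; pgvector itself needs no parameter
  # and is enabled with CREATE EXTENSION after provisioning
  parameter_group_name = aws_db_parameter_group.postgres_pgvector.name
}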

Install the extension after provisioning:

CREATE EXTENSION IF NOT EXISTS vector;
CREATE INDEX ON embeddings USING ivfflat (embedding vector_cosine_ops);
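The `vector_cosine_ops` operator class indexes pgvector's cosine distance, which is one minus the cosine similarity of two vectors. As a reference for what the database computes, here is a plain-Python sketch of that distance; it is for understanding and spot-checking results, not a replacement for the in-database operator.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


if __name__ == "__main__":
    print(cosine_distance([1.0, 0.0], [1.0, 0.0]))  # identical direction
    print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # orthogonal vectors
```

Identical directions give a distance of 0 and orthogonal vectors give 1, so smaller values mean closer matches when ordering search results.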

ElastiCache Redis and DynamoDB

ElastiCache Redis runs in private subnets and requires a subnet group scoped to them. DynamoDB is a regional service with no subnet placement; traffic reaches it through a gateway endpoint (free), so no interface endpoint is needed.

How Do You Manage Secrets and Security Without Exposing Credentials?

Kubernetes Secret objects are base64-encoded, not encrypted by default, so anyone with RBAC read access to them can decode the values with a single command. In regulated environments, the open-source External Secrets Operator addresses this by reading credentials from AWS Secrets Manager and syncing them into Kubernetes Secrets on a refresh interval, with Secrets Manager remaining the source of truth. Credentials never appear in manifest files, Git history, or container image layers.
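To make the base64 point concrete, the sketch below decodes the `data` field of a Secret the way any reader with access could. The sample value is made up for the example.

```python
import base64


def decode_secret_data(data: dict) -> dict:
    """Decode the base64 values found under a Kubernetes Secret's `data` field."""
    return {
        key: base64.b64decode(value).decode("utf-8")
        for key, value in data.items()
    }


if __name__ == "__main__":
    # Encode a made-up credential the way it would appear in a Secret manifest
    encoded = base64.b64encode(b"s3cr3t").decode("ascii")
    print("stored:", encoded)
    print("decoded:", decode_secret_data({"password": encoded}))
```

No key is involved at any point, which is why base64 offers no confidentiality.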

AWS WAF on the ALB

Attach a Web ACL to your Application Load Balancer with at minimum:

Core Rule Set (CRS) — protects against OWASP Top 10
Known Bad Inputs — blocks common injection payloads

The ALB sits in public subnets (it receives external traffic), but the security group only allows 443 inbound. Backend EKS nodes only allow traffic from the ALB security group.

Kubernetes Secrets Without Kubernetes Secrets

Storing secrets as Kubernetes Secret objects is fine for development, but they're base64-encoded, not encrypted, and cluster admins can read them. In a regulated environment, use External Secrets Operator to pull from AWS Secrets Manager instead:

# ExternalSecret — pulls from Secrets Manager into a Kubernetes Secret
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-credentials
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager
    kind: ClusterSecretStore
  target:
    name: db-credentials
    creationPolicy: Owner
  data:
    - secretKey: password
      remoteRef:
        key: fintech/prod/db
        property: password

Pods mount the resulting Kubernetes Secret normally — but the actual credential lives in Secrets Manager, has rotation enabled, and never touches a manifest file.

TLS and HTTPS Enforcement

Provision an ACM certificate for your domain and configure the ALB to redirect HTTP to HTTPS. Set HTTPS-only enforcement at the ALB listener level — don't rely on application code to enforce it.

How Do You Run Amazon Bedrock and Transcribe in an Air-Gapped Cluster?

Amazon Bedrock and Amazon Transcribe both support VPC interface endpoints, meaning all LLM inference and speech-to-text requests from private EKS workloads never leave the AWS network. For regulated industries, this keeps AI processing within the same network boundary as the rest of the application — data residency compliance is maintained without routing inference traffic through the public internet.

The endpoint for Bedrock inference is created the same way as the earlier ones, using the bedrock-runtime service name. Transcribe follows the same pattern with com.amazonaws.us-east-1.transcribe.

# Bedrock VPC endpoint
aws ec2 create-vpc-endpoint \
  --service-name com.amazonaws.us-east-1.bedrock-runtime \
  --vpc-endpoint-type Interface \
  --vpc-id vpc-xxx \
  --subnet-ids subnet-xxx subnet-yyy \
  --security-group-ids sg-endpoints

IAM policies for Bedrock should be scoped to specific model ARNs — don't grant bedrock:* broadly.
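A scoped policy can be generated per workload, as in the sketch below. The model ID in the usage section is only an example; substitute the models your application actually calls, and treat the action list as a starting point for invocation-only access.

```python
import json


def bedrock_invoke_policy(model_ids, region: str = "us-east-1") -> dict:
    """Build an IAM policy allowing invocation of specific foundation models only."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "bedrock:InvokeModel",
                    "bedrock:InvokeModelWithResponseStream",
                ],
                # Foundation model ARNs carry no account ID
                "Resource": [
                    f"arn:aws:bedrock:{region}::foundation-model/{model_id}"
                    for model_id in model_ids
                ],
            }
        ],
    }


if __name__ == "__main__":
    policy = bedrock_invoke_policy(["anthropic.claude-3-haiku-20240307-v1:0"])
    print(json.dumps(policy, indent=2))
```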

Wrapping Up

Air-gapped EKS deployments are non-trivial but entirely repeatable once the architecture decisions are made upfront. The three things that trip teams most often:

Forgetting to mirror controller images before cluster bootstrap — the cluster won't come up
Not planning the CI/CD pipeline for VPC-internal operation — standard hosted runners won't reach the private API endpoint
Missing a VPC endpoint for a service a workload calls at runtime — pods start fine, then fail on first API call

Get the VPC endpoint list right, mirror your images, and design your pipeline for VPC-internal operation from day one. The rest is standard Kubernetes operations.

Frequently Asked Questions

Can I use public subnets at all in an air-gapped deployment?

Yes. Public subnets host the Application Load Balancer (which receives inbound traffic) and NAT Gateways (which your public-subnet resources can use). The air-gapped constraint applies to your private subnets — EKS nodes, RDS, Redis, and application pods run there with no outbound internet route.

How do EKS nodes pull container images without internet access?

Nodes pull images exclusively from private ECR repositories through the ECR interface endpoints (ecr.api and ecr.dkr) and the S3 gateway endpoint, since ECR stores image layers in S3. No internet path is used. All images, including third-party controllers like cluster-autoscaler, must be mirrored into private ECR before deployment.

What's the cost overhead of VPC endpoints?

Interface endpoints are billed at $0.01/hour per AZ, plus $0.01/GB of data processed. A typical fintech EKS deployment in two AZs with 10 interface endpoints runs roughly $140–160/month in endpoint costs alone. Gateway endpoints (S3, DynamoDB) are free. Factor this into your infrastructure budget.
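The estimate can be reproduced with a few lines of arithmetic. The sketch below uses the rates quoted above and assumes 730 hours per month; the data volume is a parameter you would fill in from your own traffic.

```python
HOURS_PER_MONTH = 730  # assumed average month length in hours


def monthly_endpoint_cost(endpoints: int, azs: int, gb_processed: float = 0.0,
                          hourly_rate: float = 0.01, per_gb_rate: float = 0.01) -> float:
    """Estimate monthly interface endpoint cost in USD."""
    hourly_charges = endpoints * azs * hourly_rate * HOURS_PER_MONTH
    data_charges = gb_processed * per_gb_rate
    return round(hourly_charges + data_charges, 2)


if __name__ == "__main__":
    print(monthly_endpoint_cost(endpoints=10, azs=2))
    print(monthly_endpoint_cost(endpoints=10, azs=2, gb_processed=1000))
```

Ten endpoints across two AZs come to about $146 per month in hourly charges, and each additional terabyte processed adds roughly $10, which is where the $140–160 range comes from.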

How do developers access the private EKS cluster?

Two patterns work well: (1) a bastion EC2 instance in a public subnet, accessible via SSH, with kubectl configured to the private API endpoint; or (2) AWS Systems Manager Session Manager, which requires no open inbound ports and leaves an audit trail — preferred in regulated environments.

Does air-gapping affect Kubernetes cluster upgrades?

Yes. Managed EKS node group upgrades pull new AMIs from AWS. This works natively since AWS AMIs are sourced from within the AWS network. However, any new controller images introduced by a Kubernetes version bump must be mirrored to private ECR before upgrading.
