Financial services data breaches cost an average of $6.08 million per incident, 22% above the global average across all industries (IBM Cost of a Data Breach Report 2024). For regulated institutions, the answer isn't just better firewalls. It's network architecture that eliminates the attack surface at the infrastructure level.
Air-gapped AWS EKS deployments — where private subnets have zero internet egress and all traffic routes through VPC endpoints — are becoming the standard for regulated financial services workloads. This guide walks through the full architecture, from VPC design to CI/CD pipeline, based on a real deployment we executed for a fintech platform at Prodinit.
Key Takeaways
- Air-gapped EKS requires VPC interface and gateway endpoints for every AWS service your workloads touch — there's no fallback to the public internet
- Your CI/CD pipeline must be redesigned from scratch: images push to private ECR via VPC endpoint, and deployment runs through Systems Manager or a bastion inside the VPC
- Kubernetes External Secrets Operator + AWS Secrets Manager is the cleanest pattern for pod-level secret injection without exposing credentials in manifests
- Every data store (RDS, DynamoDB, ElastiCache) must be reachable only over private paths: RDS and ElastiCache in private subnets behind security group rules, DynamoDB through its gateway endpoint, and no public endpoints
What Does "Air-Gapped" Mean in an AWS Context?
An air-gapped VPC means your private subnets have no route to the internet: their route tables contain no default route to a NAT gateway or to an internet gateway. All communication between your workloads and AWS services (S3, ECR, Secrets Manager, CloudWatch, Bedrock) must route through VPC endpoints.
AWS supports two endpoint types (AWS VPC Endpoints documentation):
- Gateway endpoints — for S3 and DynamoDB only; free, added as route table entries
- Interface endpoints — for all other AWS services via AWS PrivateLink; billed per hour per AZ
| Endpoint Type | Services | Cost | Configuration |
|---|---|---|---|
| Gateway | S3, DynamoDB only | Free | Route table entry — no ENI, no AZ cost |
| Interface (PrivateLink) | ECR, Secrets Manager, SSM, CloudWatch, STS, ELB, Bedrock, Transcribe, and more | $0.01/hr per AZ + $0.01/GB processed | Elastic network interface in each private subnet |
For a typical EKS deployment, you'll need endpoints for: ECR API, ECR Docker, S3 (a gateway endpoint is enough, and image pulls depend on it because ECR stores image layers in S3), Secrets Manager, Systems Manager, CloudWatch Logs, STS, ELB, Bedrock (if using AI services), and Transcribe (if using speech-to-text).
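As a rough sketch of how that list can be kept honest, the snippet below builds the endpoint service names for a region and reports which ones have not been provisioned yet. The `com.amazonaws.<region>.<service>` naming follows the pattern used in the CLI examples in this guide; the function names and the `already_created` set are illustrative assumptions, and in practice the provisioned set would come from your own inventory of endpoints.

```python
# Service suffixes for the interface endpoints named in the text.
# Note: Session Manager additionally uses the "ssmmessages" service,
# which is not in the list above; add it if you rely on Session Manager.
REQUIRED_INTERFACE_SERVICES = [
    "ecr.api",               # ECR API
    "ecr.dkr",               # ECR Docker registry
    "secretsmanager",        # Secrets Manager
    "ssm",                   # Systems Manager
    "logs",                  # CloudWatch Logs
    "sts",                   # STS (IRSA)
    "elasticloadbalancing",  # ELB
    "bedrock-runtime",       # only if workloads call Bedrock
    "transcribe",            # only if workloads call Transcribe
]


def endpoint_service_name(region: str, suffix: str) -> str:
    """Build the full service name passed to `aws ec2 create-vpc-endpoint`."""
    return f"com.amazonaws.{region}.{suffix}"


def missing_endpoints(region: str, provisioned: set) -> list:
    """Return required service names that are absent from the provisioned set."""
    required = [endpoint_service_name(region, s) for s in REQUIRED_INTERFACE_SERVICES]
    return [name for name in required if name not in provisioned]


# Hypothetical inventory: only the two ECR endpoints exist so far.
already_created = {
    "com.amazonaws.us-east-1.ecr.api",
    "com.amazonaws.us-east-1.ecr.dkr",
}

for name in missing_endpoints("us-east-1", already_created):
    print("missing:", name)
```

Running a check like this before deploying a new workload is one way to catch a missing endpoint before a pod discovers it at runtime.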
Why This Matters for Regulated Workloads
Network isolation is a hard requirement under FFIEC guidelines and SEC cybersecurity rules for financial institutions. An air-gapped VPC enforces this at the infrastructure layer — there's no misconfigured security group that can accidentally allow outbound internet access, because the route simply doesn't exist.
On the client deployment, we discovered mid-project that several Helm charts we'd planned to use for in-cluster controllers (the AWS Load Balancer Controller, formerly the ALB Ingress Controller, and cluster-autoscaler) attempt to pull their own images from public registries at install time. We had to mirror every controller image into private ECR before the cluster could bootstrap. This is a class of problem that only surfaces when you actually try to deploy, not in planning.
How Do You Design a Zero-Egress VPC for AWS EKS?
Regulated financial services environments under FFIEC and SEC cybersecurity guidelines require network isolation enforced at the infrastructure level — not as a policy overlay but as a structural property of the network. A multi-AZ VPC with private subnets carrying no internet route, combined with VPC interface endpoints for every AWS service, eliminates the outbound internet path entirely rather than restricting it.
Start with a multi-AZ VPC with distinct public and private subnet tiers.
Subnet Design
VPC: 10.0.0.0/16
├── Public Subnets (one per AZ)
│   ├── 10.0.1.0/24 (us-east-1a)
│   ├── 10.0.2.0/24 (us-east-1b)
│   └── NAT Gateways (for public-subnet resources only)
└── Private Subnets (one per AZ)
    ├── 10.0.10.0/24 (us-east-1a)
    ├── 10.0.11.0/24 (us-east-1b)
    ├── No internet route in route table
    └── VPC endpoints attached
Private subnet route tables should contain only two kinds of entries: the local route for the VPC CIDR, and the gateway endpoint routes for S3 and DynamoDB. Nothing else.
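A small audit along these lines can confirm that rule mechanically. The sketch below assumes route tables shaped like the JSON returned by `aws ec2 describe-route-tables` (a `Routes` list whose entries carry keys such as `DestinationCidrBlock`, `GatewayId`, and `NatGatewayId`); the sample tables and their IDs are made up for illustration.

```python
def internet_routes(route_table: dict) -> list:
    """Return the routes that would give a subnet a path to the internet."""
    offending = []
    for route in route_table.get("Routes", []):
        # A default route in either address family is never acceptable here.
        is_default = (
            route.get("DestinationCidrBlock") == "0.0.0.0/0"
            or route.get("DestinationIpv6CidrBlock") == "::/0"
        )
        # Any route that targets an internet-facing device is also flagged.
        via_internet_device = (
            route.get("GatewayId", "").startswith("igw-")
            or "NatGatewayId" in route
            or "EgressOnlyInternetGatewayId" in route
        )
        if is_default or via_internet_device:
            offending.append(route)
    return offending


# Hypothetical route tables for illustration only.
private_rt = {
    "RouteTableId": "rtb-private-example",
    "Routes": [
        {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"},
        {"DestinationPrefixListId": "pl-s3-example", "GatewayId": "vpce-s3-example"},
    ],
}
leaky_rt = {
    "RouteTableId": "rtb-leaky-example",
    "Routes": private_rt["Routes"]
    + [{"DestinationCidrBlock": "0.0.0.0/0", "NatGatewayId": "nat-example"}],
}

for table in (private_rt, leaky_rt):
    print(table["RouteTableId"], "offending routes:", len(internet_routes(table)))
```

The local route and the gateway endpoint route pass because neither is a default route nor targets an internet gateway or NAT gateway.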
VPC Endpoints to Provision
Create interface endpoints in your private subnets for each required AWS service:
# ECR API: required for image pulls
aws ec2 create-vpc-endpoint --vpc-id vpc-xxx \
  --service-name com.amazonaws.us-east-1.ecr.api \
  --vpc-endpoint-type Interface \
  --subnet-ids subnet-xxx subnet-yyy \
  --security-group-ids sg-endpoints

# ECR Docker registry: required for image pulls
aws ec2 create-vpc-endpoint --vpc-id vpc-xxx \
  --service-name com.amazonaws.us-east-1.ecr.dkr \
  --vpc-endpoint-type Interface \
  --subnet-ids subnet-xxx subnet-yyy \
  --security-group-ids sg-endpoints

# Repeat the same call, changing only --service-name, for:
# Secrets Manager
# Systems Manager (for bastion-free access)
# CloudWatch Logs
# STS (required for IRSA)
# ELB (required for ALB Ingress Controller)
Create a dedicated security group for endpoints that allows HTTPS (443) inbound from your VPC CIDR. Don't open it wider than needed.
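To make "don't open it wider than needed" checkable, here is a minimal sketch that flags ingress rules on the endpoint security group that are broader than HTTPS from inside the VPC. The rule dictionaries use a simplified, hypothetical shape (`from_port`, `to_port`, `cidr`) rather than the exact EC2 API format, and the check assumes IPv4 CIDRs throughout.

```python
import ipaddress


def overly_broad_rules(ingress_rules: list, vpc_cidr: str) -> list:
    """Return rules that allow anything other than 443 from within the VPC CIDR."""
    vpc = ipaddress.ip_network(vpc_cidr)
    findings = []
    for rule in ingress_rules:
        source = ipaddress.ip_network(rule["cidr"])
        port_ok = rule["from_port"] == 443 and rule["to_port"] == 443
        # subnet_of is True when the source range sits entirely inside the VPC.
        source_ok = source.subnet_of(vpc)
        if not (port_ok and source_ok):
            findings.append(rule)
    return findings


# Hypothetical rules for illustration.
rules = [
    {"from_port": 443, "to_port": 443, "cidr": "10.0.0.0/16"},   # intended rule
    {"from_port": 443, "to_port": 443, "cidr": "0.0.0.0/0"},     # source too wide
    {"from_port": 0, "to_port": 65535, "cidr": "10.0.10.0/24"},  # ports too wide
]

for finding in overly_broad_rules(rules, "10.0.0.0/16"):
    print("too broad:", finding)
```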
IAM Roles
Set up three distinct IAM role categories before touching EKS:
- Developer access role: scoped to read operations, no production deploy permissions
- CI/CD role: ECR push, EKS kubectl apply, Secrets Manager read
- Node instance role: ECR pull, CloudWatch logging, S3 read for application buckets
Use IAM Roles for Service Accounts (IRSA) for in-cluster components. This ties Kubernetes service accounts to IAM roles without storing credentials anywhere.
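The binding between a service account and a role lives in the role's trust policy. The sketch below generates one under stated assumptions: the account ID, the OIDC provider ID, and the `kube-system/cluster-autoscaler` service account are placeholders, and the function name is hypothetical. The policy structure (a federated OIDC principal, the `sts:AssumeRoleWithWebIdentity` action, and `sub`/`aud` conditions) follows the standard IRSA pattern.

```python
import json


def irsa_trust_policy(account_id: str, oidc_provider: str,
                      namespace: str, service_account: str) -> dict:
    """Build a trust policy that lets exactly one service account assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Pin the role to one namespace and service account.
                        f"{oidc_provider}:sub": f"system:serviceaccount:{namespace}:{service_account}",
                        f"{oidc_provider}:aud": "sts.amazonaws.com",
                    }
                },
            }
        ],
    }


# Placeholder values; substitute your cluster's OIDC issuer (without https://).
policy = irsa_trust_policy(
    account_id="123456789012",
    oidc_provider="oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE0000000000000000000000000",
    namespace="kube-system",
    service_account="cluster-autoscaler",
)
print(json.dumps(policy, indent=2))
```

Because token exchange goes through STS, this is also why the STS interface endpoint is on the required list.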
How Do You Run an EKS Cluster with No Public Internet Path?
As of 2024, 80% of organizations run Kubernetes in production (CNCF Annual Survey 2024). The hard part isn't Kubernetes — it's running it without any public internet path.
Cluster Creation
When creating the EKS cluster, set the API server endpoint access to private only:
eksctl create cluster \
  --name fintech-prod \
  --region us-east-1 \
  --vpc-private-subnets subnet-xxx,subnet-yyy \
  --node-private-networking \
  --endpoint-private-access true \
  --endpoint-public-access false
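With public endpoint access disabled, kubectl only works from inside the VPC or from a network connected to it. This is intentional. Access the cluster via a bastion host or AWS Systems Manager Session Manager. If your eksctl version does not accept the endpoint flags shown above on create cluster, the same setting can be declared in a ClusterConfig file under vpc.clusterEndpoints (privateAccess: true, publicAccess: false).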
Node Groups
Place node groups in private subnets with autoscaling enabled:
# nodegroup.yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: fintech-prod
  region: us-east-1
managedNodeGroups:
  - name: workers
    instanceTypes: ["m6i.xlarge", "m6i.2xlarge"]
    minSize: 2
    maxSize: 10
    desiredCapacity: 3
    privateNetworking: true
    iam:
      withAddonPolicies:
        autoScaler: true
        cloudWatch: true
Bootstrapping In-Cluster Controllers
This is where most air-gapped deployments stall. cluster-autoscaler and the AWS Load Balancer Controller both try to pull images from public registries during helm install. You must mirror them to ECR first:
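# Run where both the public registry and ECR are reachable
aws ecr create-repository --repository-name cluster-autoscaler --region us-east-1
aws ecr get-login-password --region us-east-1 | docker login \
  --username AWS --password-stdin 123456789.dkr.ecr.us-east-1.amazonaws.com
# Pull, retag, and push to private ECR
docker pull registry.k8s.io/autoscaling/cluster-autoscaler:v1.29.0
docker tag registry.k8s.io/autoscaling/cluster-autoscaler:v1.29.0 \
  123456789.dkr.ecr.us-east-1.amazonaws.com/cluster-autoscaler:v1.29.0
docker push 123456789.dkr.ecr.us-east-1.amazonaws.com/cluster-autoscaler:v1.29.0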
Override the image in Helm values to point to your private ECR before installing.
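With more than a couple of controllers, it helps to generate the mirror commands from a list rather than retyping them. The following is a minimal sketch, not a finished tool: it maps each public image reference to a private ECR reference by keeping only the final repository name and tag (the same convention as the cluster-autoscaler example above) and prints the pull, tag, and push commands. The function names are hypothetical, and the AWS Load Balancer Controller tag is only an illustrative value.

```python
def private_image_ref(public_ref: str, registry: str) -> str:
    """Map a public image reference to the private registry, keeping name:tag."""
    name_and_tag = public_ref.rsplit("/", 1)[-1]
    return f"{registry}/{name_and_tag}"


def mirror_commands(public_ref: str, registry: str) -> list:
    """Return the docker commands that mirror one image into the private registry."""
    target = private_image_ref(public_ref, registry)
    return [
        f"docker pull {public_ref}",
        f"docker tag {public_ref} {target}",
        f"docker push {target}",
    ]


REGISTRY = "123456789.dkr.ecr.us-east-1.amazonaws.com"  # placeholder from the text
IMAGES = [
    "registry.k8s.io/autoscaling/cluster-autoscaler:v1.29.0",
    "public.ecr.aws/eks/aws-load-balancer-controller:v2.7.0",  # illustrative tag
]

for image in IMAGES:
    for command in mirror_commands(image, REGISTRY):
        print(command)
```

The printed commands can be reviewed and then piped into a shell on the machine that performs the mirroring. Each target repository still has to exist in ECR before the push.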
How Do You Build a CI/CD Pipeline Without Internet Access?
Standard GitHub Actions hosted runners and most CI/CD platforms assume outbound internet access for image pulls and API calls — assumptions that silently break the moment you remove internet egress. The working architecture requires a self-hosted runner deployed inside the VPC, all image pushes to private ECR via VPC endpoint, and all cluster deployments executed through AWS Systems Manager Session Manager with zero inbound ports.
Concretely, GitHub Actions' hosted runners can't reach a private EKS API endpoint, and CodePipeline build stages running inside the isolated VPC can't pull from Docker Hub. The pipeline has to be designed around both constraints.
The pattern that works:
Developer pushes code
↓
GitHub Actions (or CodePipeline)
↓
Build image in CI environment (with internet access)
↓
Push image to Private ECR via VPC endpoint
↓
Trigger deployment (CodePipeline or self-hosted runner in VPC)
↓
kubectl/Helm apply via Systems Manager Session Manager
↓
EKS pulls image from private ECR (no internet needed)
↓
ALB routes traffic
Self-Hosted Runner in VPC
If using GitHub Actions, deploy a self-hosted runner inside the VPC. It can reach the private EKS API endpoint and ECR via VPC endpoints:
# .github/workflows/deploy.yml
name: deploy
on:
  push:
    branches: [main] # assumption: adjust to your release branch
permissions:
  id-token: write # needed only if the role is assumed via GitHub OIDC
  contents: read
jobs:
  deploy:
    runs-on: self-hosted # runner inside VPC
    steps:
      - name: Check out the repository (provides ./helm/my-app)
        uses: actions/checkout@v4
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789:role/cicd-deploy-role
          aws-region: us-east-1
      - name: Login to ECR
        run: |
          aws ecr get-login-password | docker login \
            --username AWS \
            --password-stdin 123456789.dkr.ecr.us-east-1.amazonaws.com
      - name: Deploy to EKS
        run: |
          aws eks update-kubeconfig --name fintech-prod
          helm upgrade --install my-app ./helm/my-app \
            --set image.tag=${{ github.sha }}
First Deployment Validation
Don't call the pipeline "working" until you've traced the full path end to end: image push → ECR → EKS pod pull → running pod → ALB health check → live traffic. Each hop can fail independently in an air-gapped setup.
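One way to structure that trace is to run the hops in order and stop at the first one that fails, so the report points at the broken link rather than at a downstream symptom. The sketch below shows only the control flow: the hop names come from the path described above, while the check callables are stand-ins that you would replace with real probes (for example, a registry query or a pod status lookup).

```python
from typing import Callable, Dict, Optional

# Hops in the order traffic and artifacts move through the pipeline.
HOPS = [
    "image push",
    "ECR",
    "EKS pod pull",
    "running pod",
    "ALB health check",
    "live traffic",
]


def first_failing_hop(checks: Dict[str, Callable[[], bool]]) -> Optional[str]:
    """Run each hop's check in order and return the first hop that fails."""
    for hop in HOPS:
        if not checks[hop]():
            return hop
    return None


# Stand-in checks: everything passes except the ALB health check.
example_checks = {hop: (lambda: True) for hop in HOPS}
example_checks["ALB health check"] = lambda: False

failed = first_failing_hop(example_checks)
print("first failing hop:", failed if failed else "none, path is healthy")
```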
How Should Data Stores Be Configured in an Air-Gapped VPC?
Every data store in a regulated EKS deployment must be unreachable from the public internet by construction, not only through security group rules. For RDS PostgreSQL, place the instance in a DB subnet group made of private subnets and set publicly_accessible = false explicitly at the resource level. Security groups can be modified later; with publicly_accessible disabled, the instance has no public IP address and its endpoint resolves only to a private address, so the exposure stays closed regardless of future policy drift. ElastiCache Redis belongs in a subnet group made of private subnets, and DynamoDB is reached through its gateway endpoint.
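A quick audit can confirm that no RDS instance has drifted to public accessibility. This sketch assumes input shaped like the JSON from `aws rds describe-db-instances` (a `DBInstances` list whose entries include `DBInstanceIdentifier` and `PubliclyAccessible`); the sample data and the second instance name are invented for illustration.

```python
def publicly_accessible_instances(describe_output: dict) -> list:
    """Return identifiers of RDS instances flagged as publicly accessible."""
    return [
        db["DBInstanceIdentifier"]
        for db in describe_output.get("DBInstances", [])
        if db.get("PubliclyAccessible")
    ]


# Hypothetical describe-db-instances output, trimmed to the relevant fields.
sample_output = {
    "DBInstances": [
        {"DBInstanceIdentifier": "fintech-postgres", "PubliclyAccessible": False},
        {"DBInstanceIdentifier": "legacy-reporting", "PubliclyAccessible": True},
    ]
}

for identifier in publicly_accessible_instances(sample_output):
    print("publicly accessible:", identifier)
```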
RDS PostgreSQL with pgvector
For AI-augmented fintech applications, pgvector enables vector similarity search inside Postgres — useful for semantic search over transaction data, document embeddings, or fraud pattern matching.
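# Terraform
resource "aws_db_instance" "postgres" {
  identifier        = "fintech-postgres"
  engine            = "postgres"
  engine_version    = "16.1"
  instance_class    = "db.r6g.large"
  allocated_storage = 100 # GiB; illustrative value, size for your workload
  multi_az          = true

  # Master credentials: RDS generates the password and stores it in Secrets Manager
  username                    = "postgres_admin" # illustrative name
  manage_master_user_password = true

  # Network placement: private subnets only, no public IP
  db_subnet_group_name   = aws_db_subnet_group.private.name
  vpc_security_group_ids = [aws_security_group.rds.id]
  publicly_accessible    = false
  storage_encrypted      = true

  # Custom parameter group; pgvector itself is enabled with CREATE EXTENSION
  parameter_group_name = aws_db_parameter_group.postgres_pgvector.name
}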
Install the extension after provisioning:
CREATE EXTENSION IF NOT EXISTS vector;
CREATE INDEX ON embeddings USING ivfflat (embedding vector_cosine_ops);
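The vector_cosine_ops operator class indexes rows by cosine distance, which is one minus the cosine similarity of two vectors. As a worked illustration of what the index is ordering by, the sketch below computes that distance in plain Python and ranks a few toy embeddings against a query. The vectors and labels are made up, and real embeddings would have hundreds of dimensions.

```python
import math


def cosine_distance(a: list, b: list) -> float:
    """Cosine distance: 0 for same direction, 1 for orthogonal, 2 for opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Toy two-dimensional embeddings for illustration.
query = [1.0, 0.0]
documents = {
    "same direction": [2.0, 0.0],
    "orthogonal": [0.0, 3.0],
    "opposite": [-1.0, 0.0],
}

# Nearest first, which is the order a cosine-distance search returns.
ranked = sorted(documents, key=lambda name: cosine_distance(query, documents[name]))
for name in ranked:
    print(name, round(cosine_distance(query, documents[name]), 3))
```

Magnitude does not matter for this metric, which is why the vector [2.0, 0.0] is at distance zero from the query.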
ElastiCache Redis and DynamoDB
ElastiCache Redis runs in private subnets and requires a subnet group scoped to them. DynamoDB is a regional service that does not run inside your VPC; private workloads reach it through the gateway endpoint (free), so no interface endpoint is needed.
How Do You Manage Secrets and Security Without Exposing Credentials?
Kubernetes Secret objects are base64-encoded, not encrypted, so anyone with RBAC read access to them can decode the values with a single command. In regulated environments, External Secrets Operator keeps credentials out of your manifests by pulling them from AWS Secrets Manager and syncing them into Kubernetes Secrets on a refresh interval, so Secrets Manager remains the source of truth. Credentials never appear in manifest files, Git history, or container image layers.
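To see why base64 offers no protection, consider this short sketch. It builds a Secret-shaped dictionary with a made-up password and then recovers the plaintext with nothing more than the standard library, which is effectively what anyone who can read the object or a committed manifest can do.

```python
import base64

# A Secret manifest as it might appear in a repository (values are invented).
secret_manifest = {
    "apiVersion": "v1",
    "kind": "Secret",
    "metadata": {"name": "db-credentials"},
    "data": {
        "password": base64.b64encode(b"example-password").decode("ascii"),
    },
}


def decode_secret_data(manifest: dict) -> dict:
    """Decode every base64 value under the Secret's `data` field."""
    return {
        key: base64.b64decode(value).decode("utf-8")
        for key, value in manifest.get("data", {}).items()
    }


print("stored value: ", secret_manifest["data"]["password"])
print("decoded value:", decode_secret_data(secret_manifest)["password"])
```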
AWS WAF on the ALB
Attach a Web ACL to your Application Load Balancer with at minimum:
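- Core Rule Set (CRS): covers common web exploits, including many of the vulnerability classes in the OWASP Top 10
- Known Bad Inputs: blocks request patterns known to be invalid or associated with exploitation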
The ALB sits in public subnets (it receives external traffic), but the security group only allows 443 inbound. Backend EKS nodes only allow traffic from the ALB security group.
Kubernetes Secrets Without Kubernetes Secrets
Storing secrets as Kubernetes Secret objects is fine for development, but they're base64-encoded, not encrypted, and cluster admins can read them. In a regulated environment, use External Secrets Operator to pull from AWS Secrets Manager instead:
# ExternalSecret: pulls from Secrets Manager into a Kubernetes Secret
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-credentials
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager
    kind: ClusterSecretStore
  target:
    name: db-credentials
    creationPolicy: Owner
  data:
    - secretKey: password
      remoteRef:
        key: fintech/prod/db
        property: password
Pods mount the resulting Kubernetes Secret normally, but the source of truth lives in Secrets Manager, where rotation can be enabled, and the credential never touches a manifest file.
TLS and HTTPS Enforcement
Provision an ACM certificate for your domain and configure the ALB to redirect HTTP to HTTPS. Set HTTPS-only enforcement at the ALB listener level — don't rely on application code to enforce it.
How Do You Run Amazon Bedrock and Transcribe in an Air-Gapped Cluster?
Amazon Bedrock and Amazon Transcribe both support VPC interface endpoints, meaning all LLM inference and speech-to-text requests from private EKS workloads never leave the AWS network. For regulated industries, this keeps AI processing within the same network boundary as the rest of the application — data residency compliance is maintained without routing inference traffic through the public internet.
Provision these endpoints the same way as the ECR endpoints. The example below creates the Bedrock runtime endpoint; for Transcribe, use the service name com.amazonaws.us-east-1.transcribe.
# Bedrock VPC endpoint
aws ec2 create-vpc-endpoint \
  --service-name com.amazonaws.us-east-1.bedrock-runtime \
  --vpc-endpoint-type Interface \
  --vpc-id vpc-xxx \
  --subnet-ids subnet-xxx subnet-yyy \
  --security-group-ids sg-endpoints
IAM policies for Bedrock should be scoped to specific model ARNs — don't grant bedrock:* broadly.
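A scoped policy might look like the one generated below. This is a sketch under assumptions: the function name is hypothetical, and the model ID is only an example of the foundation-model identifier format, so substitute the models your workloads are approved to call. The policy grants the two invoke actions on explicit foundation-model ARNs instead of a wildcard.

```python
import json


def bedrock_invoke_policy(region: str, model_ids: list) -> dict:
    """Build an IAM policy that allows invoking only the listed foundation models."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "InvokeApprovedModelsOnly",
                "Effect": "Allow",
                "Action": [
                    "bedrock:InvokeModel",
                    "bedrock:InvokeModelWithResponseStream",
                ],
                # Foundation-model ARNs have an empty account field.
                "Resource": [
                    f"arn:aws:bedrock:{region}::foundation-model/{model_id}"
                    for model_id in model_ids
                ],
            }
        ],
    }


# Example model ID for illustration; replace with your approved list.
policy = bedrock_invoke_policy(
    "us-east-1",
    ["anthropic.claude-3-sonnet-20240229-v1:0"],
)
print(json.dumps(policy, indent=2))
```

Attach a policy like this to the IRSA role of the workload that calls Bedrock rather than to the node instance role, so only that workload's pods can invoke models.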
Wrapping Up
Air-gapped EKS deployments are non-trivial but entirely repeatable once the architecture decisions are made upfront. The three things that trip teams up most often:
- Forgetting to mirror controller images before cluster bootstrap — the cluster won't come up
- Not planning the CI/CD pipeline for VPC-internal operation — standard hosted runners won't reach the private API endpoint
- Missing a VPC endpoint for a service a workload calls at runtime — pods start fine, then fail on first API call
Get the VPC endpoint list right, mirror your images, and design your pipeline for VPC-internal operation from day one. The rest is standard Kubernetes operations.
Prodinit builds production-ready AWS infrastructure for fintech and regulated industries. If you're planning an air-gapped deployment and want a second opinion on your architecture, book a free 30-minute call.
Frequently Asked Questions
Can an air-gapped VPC still have public subnets?
Yes. Public subnets host the Application Load Balancer (which receives inbound traffic) and NAT Gateways (which your public-subnet resources can use). The air-gapped constraint applies to your private subnets: EKS nodes, RDS, Redis, and application pods run there with no outbound internet route.
How do EKS nodes pull container images without internet access?
Nodes pull images exclusively from private ECR repositories via the ECR VPC interface endpoints. No internet path is used. All images, including third-party controllers like cluster-autoscaler, must be mirrored into private ECR before deployment.
How much do VPC interface endpoints cost?
Interface endpoints are billed at $0.01/hour per AZ, plus $0.01/GB of data processed. A typical fintech EKS deployment in two AZs with 10 interface endpoints runs roughly $140–160/month in endpoint costs alone. Gateway endpoints (S3, DynamoDB) are free. Factor this into your infrastructure budget.
How do you reach a private EKS cluster with kubectl?
Two patterns work well: (1) a bastion EC2 instance in a public subnet, accessible via SSH, with kubectl configured to the private API endpoint; or (2) AWS Systems Manager Session Manager, which requires no open inbound ports and leaves an audit trail, and is preferred in regulated environments.
Can you upgrade an air-gapped EKS cluster?
Yes. Managed EKS node group upgrades pull new AMIs from AWS. This works natively since AWS AMIs are sourced from within the AWS network. However, any new controller images introduced by a Kubernetes version bump must be mirrored to private ECR before upgrading.
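The interface endpoint estimate in the FAQ can be reproduced with simple arithmetic. The sketch below uses the rates quoted in this guide ($0.01 per hour per AZ and $0.01 per GB processed) and assumes a 730-hour month, a common billing approximation; the 500 GB traffic figure is an invented example, and actual rates vary by region.

```python
HOURS_PER_MONTH = 730  # approximation: 8760 hours per year / 12


def monthly_endpoint_cost(endpoints: int, azs: int, gb_processed: float = 0.0,
                          hourly_rate: float = 0.01, per_gb_rate: float = 0.01) -> float:
    """Estimate monthly interface endpoint cost in USD."""
    # Each endpoint is billed per hour in every AZ where it has an ENI.
    hourly_charges = endpoints * azs * hourly_rate * HOURS_PER_MONTH
    data_charges = gb_processed * per_gb_rate
    return round(hourly_charges + data_charges, 2)


# 10 endpoints across 2 AZs, before any data processing charges.
print(monthly_endpoint_cost(endpoints=10, azs=2))

# The same footprint with a hypothetical 500 GB of traffic through the endpoints.
print(monthly_endpoint_cost(endpoints=10, azs=2, gb_processed=500))
```

The hourly component alone comes to $146 per month for this footprint, which is where the $140–160 range comes from once modest data processing is included.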
