Part IX · Build, Deploy, Operate
Chapter 111 · ~21 min read

Secrets management: 1Password, Vault, External Secrets Operator

"A secret in a git repository is a secret in 10,000 git repositories. Treat it accordingly."

Every serving platform has secrets. Database passwords, API keys, TLS private keys, OAuth client secrets, signing keys for JWTs, webhook signing secrets, cloud credentials for one system that doesn’t have IAM yet. How those secrets are created, where they live at rest, how they reach the running workloads, and how they rotate — that is secrets management. It is boring until it goes wrong, at which point it is the most interesting problem on the team and somebody’s resume is already being updated.

This chapter is the practical tour. What the storage options are (Vault, cloud-provider secret managers, 1Password), how Kubernetes secrets are actually delivered to pods (Sealed Secrets vs External Secrets Operator vs SOPS vs CSI drivers), why “secret stores” and “KMS” are different things even though people conflate them, and what rotation actually looks like in production. Most teams get the first 80% of this right and then leave a long tail of broken rotations, leaked credentials, and .env files in S3 buckets. The goal is to understand the space well enough to avoid the long tail.

Outline:

  1. What a secret actually is.
  2. The storage options — Vault, cloud secret managers, 1Password.
  3. KMS vs secret stores — different jobs.
  4. Kubernetes and the secret delivery problem.
  5. Sealed Secrets.
  6. External Secrets Operator (ESO).
  7. CSI Secrets Store driver.
  8. SOPS — Git-friendly encryption.
  9. Rotation hygiene.
  10. Auditing, leak detection, and incident response.
  11. The mental model.

111.1 What a secret actually is

A secret is any value whose disclosure materially harms the system or its users. That definition is worth unpacking, because it’s broader than “password.” It includes:

  • Access credentials: database passwords, API keys, cloud provider credentials, TLS private keys.
  • Signing material: JWT signing keys, webhook HMAC secrets, SSH keys, cryptographic identity material.
  • Encryption keys: symmetric keys for application-level encryption, master keys that encrypt other keys (see KMS, §111.3).
  • Identity tokens: session tokens, OAuth tokens, bearer tokens.
  • Configuration that enables privileged access: a connection string to an internal Redis that has no auth but is only reachable from inside the VPC.

And it does not include:

  • Configuration that is not itself privileged: the hostname of your RDS, the name of your S3 bucket, feature flags. These are configuration, not secrets. Treating them as secrets is expensive (every read goes through the secret store) and makes debugging harder. They belong in plain config.

The distinction matters because secret-management tooling is slower, more operationally expensive, and less ergonomic than config tooling. Overclassifying everything as a secret produces a system where nobody knows which values are actually sensitive, rotation is impossible, and the secret store becomes a dumping ground. Classify secrets narrowly and specifically.

A useful exercise: take every value in your configuration and ask “if this were tweeted, would I need to rotate anything?” If yes, it’s a secret. If no, it’s config.
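The tweet test can be sketched in a few lines. This is a toy illustration only — the hint list, the `grants_access` flag, and the config names are all hypothetical, not from any real tool:

```python
# Toy classifier for the "if this were tweeted, would I rotate?" test.
# Hint patterns and example config values are illustrative only.
SECRET_HINTS = ("password", "private_key", "client_secret", "signing_key",
                "api_key", "token", "credential")

def is_secret(name: str, grants_access: bool = False) -> bool:
    """A value is a secret if disclosure forces rotation: either its name
    matches a known credential pattern, or it enables privileged access
    on its own (e.g. an unauthenticated internal endpoint)."""
    lowered = name.lower()
    return grants_access or any(hint in lowered for hint in SECRET_HINTS)

config = {
    "DB_HOST": False,          # hostname: plain config
    "DB_PASSWORD": False,      # matches "password": secret
    "S3_BUCKET_NAME": False,   # plain config
    "JWT_SIGNING_KEY": False,  # matches "signing_key": secret
    "REDIS_URL": True,         # no auth, but reachable = privileged access
}

secret_names = sorted(k for k, grants in config.items() if is_secret(k, grants))
# secret_names: DB_PASSWORD, JWT_SIGNING_KEY, REDIS_URL
```

Everything the classifier rejects stays in plain config, where it is cheap to read and easy to debug.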

111.2 The storage options

Four main categories, each with its own niche.

HashiCorp Vault. The Cadillac of secret stores. Self-hosted, multi-backend (it talks to many storage engines), with first-class support for dynamic secrets (it generates short-lived database credentials on demand), secret rotation, audit logging, and fine-grained ACLs. Vault is what you run when secrets management is a central concern and you want a single authoritative system for the whole org.

The strengths: dynamic secrets (every app gets its own short-lived DB credentials), broad ecosystem (the Vault Kubernetes integration, the CLI, the API), strong audit log, policy language that handles complex access patterns, supports custom secret engines. The weaknesses: it’s an operational beast. Running Vault in HA mode with Raft storage, sealing/unsealing, backup/restore, and the auto-unseal story via cloud KMS is its own project. Many teams get Vault stood up, realize the operational cost, and either put it in a managed form (HCP Vault) or revisit whether they need it.

Cloud-provider secret managers. AWS Secrets Manager, GCP Secret Manager, Azure Key Vault. These are managed services that the cloud provider runs for you. Simpler than Vault: you store a secret, you read a secret, you rotate via a Lambda (or equivalent), you audit via CloudTrail. Less powerful than Vault (no dynamic secrets out of the box, limited secret engines), but much cheaper to operate because you don’t run the infrastructure.

For teams already fully committed to one cloud, the cloud secret manager is the default. It integrates with IAM, it’s audited by CloudTrail, rotation is a built-in primitive, and there is no control plane to run. The only limitation is multi-cloud: if you have workloads across AWS and GCP, you either run two secret stores or you run Vault as the common layer.

1Password, Bitwarden, Doppler, and similar. Tools that started as password managers and grew secrets features. 1Password Connect in particular is popular for small teams because the UX is good (engineers already have 1Password installed) and the pricing is reasonable. It exposes a local API daemon and an SDK that application code can call to fetch secrets.

These tools are great for human-readable secrets (the OAuth token for the internal admin dashboard, the SSH key for the bastion host) and serve as a reasonable starter secret store for teams without a dedicated platform team. They are less well suited to high-volume programmatic access or to dynamic secrets. The niche is “humans need to share this, and also one or two services consume it.”

Environment variables and .env files. The bad option. Common, hard to eradicate, and responsible for the majority of leaked credentials in public git repositories. Any secret in a .env file on a laptop or CI runner is one git add . away from being public. Real secrets management starts with banning this pattern and providing an alternative that’s at least as easy.

The common mature setup: a cloud secret manager as the primary store for programmatic access, plus 1Password for human-shared credentials and break-glass accounts, plus Vault only if multi-cloud or dynamic-secrets requirements force it.

111.3 KMS vs secret stores

People conflate these. They do different jobs.

A secret store (Vault, Secrets Manager, etc.) stores arbitrary secret values. You put “my database password” in, you get “my database password” out. It’s a database for confidential data.

A KMS (Key Management Service — AWS KMS, GCP Cloud KMS, Azure Key Vault Keys, HSM appliances) stores cryptographic keys and performs cryptographic operations with those keys without ever exposing them. You don’t get the key out. You hand the KMS a plaintext, the KMS encrypts it with the key and returns the ciphertext. You hand the KMS a ciphertext, it returns the plaintext. The key itself never leaves the KMS (or its attached HSM).

The difference matters because the security properties are different. A compromised secret store is bad — the attacker gets your secrets. A compromised KMS is also bad, but an attacker who steals an encrypted ciphertext without compromising the KMS can’t decrypt it. KMS is the root of trust; secret stores sit above it.

The common pattern is envelope encryption:

  1. Application generates a random data encryption key (DEK) and encrypts the actual secret data with it (AES-GCM, etc.).
  2. Application asks KMS to encrypt the DEK with a key encryption key (KEK) that lives in the KMS. The KMS returns the encrypted DEK.
  3. Application stores the ciphertext + the encrypted DEK in the secret store (or anywhere, really — it’s safe to store alongside the ciphertext).
  4. To decrypt: the application asks KMS to decrypt the DEK, then uses the decrypted DEK to decrypt the ciphertext.
Envelope encryption: the plaintext is encrypted with a DEK, the DEK is encrypted by a KMS-held KEK, and only the encrypted DEK is stored alongside the ciphertext — so even a full dump of the secret store is useless without the KMS key, which never leaves its HSM boundary.
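The four numbered steps can be exercised end-to-end in a stdlib-only sketch. To keep it self-contained, an HMAC-SHA256 keystream stands in for AES-GCM and `FakeKMS` stands in for a real KMS API call — none of this is production crypto, only the shape of the protocol:

```python
import hashlib
import hmac
import os
import secrets

def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR with an HMAC-SHA256 keystream. A stdlib stand-in for AES-GCM
    so the sketch is self-contained -- NOT for production use."""
    out = bytearray()
    for block in range((len(data) + 31) // 32):
        ks = hmac.new(key, nonce + block.to_bytes(4, "big"), hashlib.sha256).digest()
        chunk = data[block * 32:(block + 1) * 32]
        out.extend(b ^ k for b, k in zip(chunk, ks))
    return bytes(out)

class FakeKMS:
    """Stand-in for a real KMS: holds the KEK and never exposes it."""
    def __init__(self):
        self._kek = secrets.token_bytes(32)   # never leaves this object
    def encrypt(self, plaintext: bytes) -> bytes:
        nonce = os.urandom(12)
        return nonce + keystream_xor(self._kek, nonce, plaintext)
    def decrypt(self, blob: bytes) -> bytes:
        return keystream_xor(self._kek, blob[:12], blob[12:])

kms = FakeKMS()

# 1. Generate a per-secret DEK and encrypt the payload locally.
dek = secrets.token_bytes(32)
nonce = os.urandom(12)
ciphertext = keystream_xor(dek, nonce, b"db-password-hunter2")

# 2. Ask the KMS to wrap the DEK; only the wrapped DEK is ever stored.
wrapped_dek = kms.encrypt(dek)
record = {"nonce": nonce, "ciphertext": ciphertext, "wrapped_dek": wrapped_dek}

# 3-4. Decrypt: unwrap the DEK via the KMS, then decrypt locally.
plaintext = keystream_xor(kms.decrypt(record["wrapped_dek"]),
                          record["nonce"], record["ciphertext"])
assert plaintext == b"db-password-hunter2"
```

Note that `record` contains no usable key material: steal it without KMS access and you hold only ciphertexts.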

Why this matters for secret stores. The secret store itself is usually running on top of envelope encryption under the hood — Vault’s transit engine does this, AWS Secrets Manager does this with AWS KMS. When you “store a secret in AWS Secrets Manager,” what actually happens is the secret is encrypted with a DEK, the DEK is encrypted by AWS KMS, and the ciphertexts are stored in a DynamoDB-like backend. The KMS key is the root of trust; the secret store handles the storage and access control.

The operational upshot: secure your KMS keys as carefully as your secret store. A compromised KMS key compromises every secret encrypted under it. KMS key rotation is a thing — most cloud KMSes offer automatic annual key rotation, which rotates the underlying key material while keeping the key ARN stable.

111.4 Kubernetes and the secret delivery problem

A secret stored in Vault or a cloud secret manager has to reach the pod that needs it. How? This is the delivery problem, and it has three main solutions in Kubernetes.

First, a note on the native Secret resource. A Kubernetes Secret is a YAML object with base64-encoded data. Base64 is not encryption. A Kubernetes Secret is exactly as confidential as the etcd it lives in and as the access controls on the namespace. Kubernetes Secrets are perfectly fine as the runtime delivery mechanism (pods mount them as env vars or files) provided the etcd is encrypted at rest (KMS-backed encryption provider) and RBAC is tight. They are not a store — you would never commit them to git — but they are a reasonable last-mile container.

The delivery solutions then differ by how the secret gets into the Kubernetes Secret:

  1. Sealed Secrets — encrypt the secret at author time, commit the ciphertext to git, let a controller in the cluster decrypt at apply time.
  2. External Secrets Operator (ESO) — put the actual secret in an external store (Vault, Secrets Manager), have a controller in the cluster pull and sync it into a Kubernetes Secret.
  3. CSI Secrets Store driver — mount secrets from an external store directly into pod filesystems via a CSI driver, no Kubernetes Secret in between.

Plus one honorable mention that isn’t Kubernetes-specific but is widely used: SOPS, which encrypts YAML/JSON files field-by-field using KMS or GPG, so you can commit the encrypted file to git and decrypt at deploy time.

Each has different tradeoffs on where the trust boundary sits and who has read access.

111.5 Sealed Secrets

Bitnami’s Sealed Secrets is the simplest approach. A controller runs in the cluster with a private key. You encrypt your secrets against the corresponding public key (using the kubeseal CLI), commit the ciphertext to git as a SealedSecret CRD, and on apply the controller decrypts it into a regular Secret.

apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: db-credentials
  namespace: prod
spec:
  encryptedData:
    password: AgBy3i4OJSWK+PiTySYZZA9rO43cGDEq...

The strengths: simple, fully offline (no external store required), git-native (the ciphertext is just a file), works with GitOps naturally. Drop the file in the repo, ArgoCD applies it, the controller decrypts it into a Secret, the pod consumes it.

The weaknesses: rotation is manual. If you change a secret value, you have to re-encrypt it and commit a new ciphertext. There’s no store of record — if the cluster is destroyed and the controller’s private key is lost, all existing SealedSecrets are unrecoverable unless you kept backups of the private key. Rotating the private key itself requires re-encrypting every SealedSecret in the fleet, which is painful at scale.

Sealed Secrets is appropriate for small teams with a small number of rarely-changing secrets. It’s less appropriate as the core of a production secret-management story at scale.

111.6 External Secrets Operator (ESO)

ESO is the modern standard. The architecture: an operator runs in the cluster and watches ExternalSecret CRDs. Each ExternalSecret points to a secret in an external store (Vault, AWS Secrets Manager, GCP Secret Manager, Azure Key Vault, 1Password, many others) and names a Kubernetes Secret to sync it into. The operator fetches from the store on a schedule (and on change), creates or updates the Kubernetes Secret, and the pod consumes the Secret as normal.

apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: aws-secrets
  namespace: prod
spec:
  provider:
    aws:
      service: SecretsManager
      region: us-east-1
      auth:
        jwt:
          serviceAccountRef:
            name: eso-sa
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-credentials
  namespace: prod
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets
    kind: SecretStore
  target:
    name: db-credentials
    creationPolicy: Owner
  data:
    - secretKey: password
      remoteRef:
        key: prod/db/main
        property: password

What this gives you:

  • Store is the source of truth. Secrets live in AWS Secrets Manager (or wherever). The Kubernetes Secret is a replica, refreshed periodically.
  • Rotation works. Rotate the secret in the store, and ESO picks it up on the next refresh. The pod then has to reload it (via a rollout, a signal, or a sidecar).
  • No decryption key lives in the cluster. The cluster authenticates to the store via a service account + IAM (for AWS/GCP/Azure), which means the “key material” is IAM, not a private key file.
  • Audit lives in the store. Every read of a secret is a Secrets Manager API call, logged by CloudTrail.
  • Cross-cluster consistency. Multiple clusters read from the same store and get the same secret.

The tradeoffs:

  • The store is a new dependency. If the store is down, ESO can’t sync: first-time secrets never materialize (blocking new deployments) and refreshes stall. Pods consuming an already-synced Secret keep working with the stale value.
  • Refresh lag. Rotation in the store isn’t instantaneous in the cluster; there’s a refresh interval. Can be tuned down to 1 minute.
  • Secret is still in etcd. ESO still creates a Kubernetes Secret as the delivery mechanism. If you don’t trust etcd, this doesn’t help. (The fix is the CSI driver, §111.7.)

ESO is the default choice for most modern Kubernetes platforms. It’s a CNCF incubation project, widely deployed, and actively maintained. The ecosystem of supported stores is broad — Vault, AWS Secrets Manager, GCP Secret Manager, Azure Key Vault, 1Password Connect, Akeyless, Doppler, IBM Cloud Secrets Manager, many more.

graph LR
  Store[(Secret Store<br/>AWS SM / Vault)] -->|pulled on schedule| ESO[External Secrets<br/>Operator]
  ESO -->|authenticates via IAM/OIDC| Store
  ESO -->|creates/updates| KS[Kubernetes Secret]
  KS -->|mounts as env/file| Pod[Pod]

ESO’s model: the store is the source of truth; ESO syncs it into a Kubernetes Secret on a refresh interval; rotation in the store propagates to the cluster automatically, with no manifest change or redeploy.

111.7 CSI Secrets Store driver

The CSI Secrets Store driver sidesteps the Kubernetes Secret entirely. A CSI (Container Storage Interface) volume is mounted into the pod, and the driver reads secrets from the external store and writes them as files into the mount. The pod reads the files. No Kubernetes Secret is ever created.

apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: db-credentials
spec:
  provider: aws
  parameters:
    objects: |
      - objectName: "prod/db/main"
        objectType: "secretsmanager"
        jmesPath:
          - path: "password"
            objectAlias: "password"
---
# Pod spec:
volumes:
  - name: secrets
    csi:
      driver: secrets-store.csi.k8s.io
      readOnly: true
      volumeAttributes:
        secretProviderClass: db-credentials

The benefits:

  • No secret in etcd. The secret never becomes a Kubernetes Secret object (unless you configure the driver to also sync it, which is optional).
  • Tighter blast radius. etcd compromise doesn’t expose secrets for workloads using CSI-only mode.
  • Native to the pod lifecycle. Secrets are mounted when the pod starts and unmounted when it terminates.

The costs:

  • More complex driver stack. The CSI driver is a DaemonSet plus provider-specific plugins. More moving parts.
  • No env-var support. CSI gives you files, not env vars. Some apps really want env vars. A workaround is to set syncSecret.enabled: true, which creates a Kubernetes Secret from the mounted files — but then you’ve given up the main benefit.
  • Less mature ecosystem than ESO. Fewer providers, smaller community.

CSI Secrets Store driver is the right choice when the “secret in etcd” concern is operative — high-security environments, regulated industries, cases where a compromised etcd must not expose secrets. For most teams, ESO with etcd encryption at rest is sufficient and has better ergonomics.

111.8 SOPS — Git-friendly encryption

Mozilla’s SOPS (Secrets OPerationS) is a command-line tool that encrypts YAML, JSON, or ENV files field-by-field using a KMS (AWS KMS, GCP KMS, Azure Key Vault, age, GPG). The encrypted file is structurally identical to the plaintext but with values replaced by ciphertexts. You commit the encrypted file to git; at deploy time, the tool decrypts it in memory.

# secrets.yaml (SOPS-encrypted)
database:
  host: db.internal
  username: app
  password: ENC[AES256_GCM,data:WzZFkw==,iv:mJhX...,tag:9vQ==,type:str]
sops:
  kms:
    - arn: arn:aws:kms:us-east-1:123456789012:key/abcdef-...

Notice that host and username are plaintext (they’re not sensitive), and only password is encrypted. That’s field-level encryption — you commit the file, anyone can read the structure and non-sensitive fields, but only someone with KMS decrypt permissions can read the password. Diffs in git show structural changes cleanly.

SOPS integrates with Helm, Kustomize, Terraform, and most deployment tools via plugins or wrappers. ArgoCD has a SOPS plugin (or you can use the helm-secrets plugin). The workflow: commit the encrypted file, the deploy tool decrypts on apply, the resulting plaintext is consumed.

The strengths:

  • Git is the source of truth. No external store needed. The secret lives in your repo, encrypted.
  • KMS is the trust root. Access is IAM-based; revoking KMS access revokes read access.
  • Field-level. Non-sensitive structure is readable.
  • Simple rotation. Update the value, re-encrypt, commit, apply.

The weaknesses:

  • No dynamic secrets. The value in the file is the value.
  • Rotation requires a commit. Every secret change produces a git commit, which isn’t what you want for credentials that rotate daily.
  • Audit is git history, not a store audit log. Reads aren’t audited (you can only audit who had git access).
  • KMS costs per decrypt. High-frequency decrypts add up.

SOPS is the right choice for small-to-medium teams that want a Git-only workflow and don’t need fancy rotation or dynamic secrets. It’s particularly popular with GitOps-first teams because it fits cleanly into the “everything in git” model.

Note: age is a modern alternative to GPG for SOPS’s encryption backend. If you’re using SOPS without cloud KMS, use age rather than GPG. GPG’s UX and key management are famously painful.

111.9 Rotation hygiene

Rotation is where secret management stops being easy. Creating a secret once is trivial. Rotating it every 90 days, without downtime, without waking an on-call engineer, across every service that uses it — that’s the real problem.

The categories of secrets, by rotation difficulty:

Static and rarely rotated: TLS private keys for internal services (rotated on certificate renewal, maybe annually), SSH keys, signing keys for long-lived tokens. These rotate rarely, and rotation is a planned event.

Scheduled rotation: database passwords, API keys with expiration. Typically on a 30-90 day cycle. Rotation is automated by a tool (AWS Secrets Manager’s managed rotation, Vault’s database secrets engine).

Dynamic / per-request: Vault’s database dynamic secrets — every application instance gets its own short-lived credential from Vault, valid for an hour or a day, and Vault creates the user in the database on demand. The “rotation” is the lease expiring and a new credential being fetched. No explicit rotation step.

Dynamic secrets are the gold standard, and Vault’s database secrets engine is the canonical implementation. The idea: instead of a shared database password that every pod knows, each pod fetches a credential from Vault at startup, Vault creates a user in the database just for that pod (with a TTL), and when the lease expires, the credential is invalidated. A leaked credential is worthless after its TTL; a compromised pod doesn’t leak a shared password.

The catch: dynamic secrets require application support. The app must know how to renew its lease (Vault’s sidecar agent handles this for you), it must handle credential rotation mid-runtime (or restart cleanly), and the database must support user-level permissions that match what Vault wants to create. Retrofitting dynamic secrets into an existing app is a project. Greenfield apps should plan for it from the start.
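A toy model makes the lease lifecycle concrete. `ToyDynamicSecrets` is not the Vault API (real apps use the Vault agent or an SDK); it only demonstrates the fetch/renew/expire mechanics:

```python
import time

class ToyDynamicSecrets:
    """Toy model of Vault-style dynamic DB credentials: each fetch mints a
    fresh per-client credential with a TTL; expired leases are dead.
    Illustrative only -- not the Vault API."""
    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._leases = {}      # lease_id -> expiry (monotonic timestamp)
        self._counter = 0
    def fetch(self):
        self._counter += 1
        lease_id = f"lease-{self._counter}"
        self._leases[lease_id] = time.monotonic() + self.ttl
        # In Vault, this is where a real per-pod DB user would be created.
        return lease_id, {"user": f"app-{self._counter}", "password": "random"}
    def renew(self, lease_id) -> bool:
        if self.is_valid(lease_id):
            self._leases[lease_id] = time.monotonic() + self.ttl
            return True
        return False               # too late: the credential is gone
    def is_valid(self, lease_id) -> bool:
        expiry = self._leases.get(lease_id)
        return expiry is not None and time.monotonic() < expiry

vault = ToyDynamicSecrets(ttl_seconds=0.05)
lease, creds = vault.fetch()
assert vault.is_valid(lease)       # fresh credential works
time.sleep(0.06)
assert not vault.is_valid(lease)   # lease expired: a leaked copy is worthless
assert not vault.renew(lease)      # an expired lease cannot be renewed
```

The key property shows up in the last two assertions: once the TTL passes, possession of the credential buys an attacker nothing.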

For scheduled rotation without dynamic secrets, the pattern is:

  1. Rotation trigger (time-based, from Secrets Manager or a CronJob): generate a new secret value.
  2. Write the new value to the store, keeping the old value accessible for a grace period.
  3. Notify or restart dependents so they pick up the new value. This is the hard part — the app has to either re-read the secret on a signal, restart, or the operator restarts it.
  4. Revoke the old value after the grace period.
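The four steps can be sketched against an in-memory stand-in for the store. The class and function names are hypothetical; a real pipeline would call Secrets Manager or Vault and trigger a rollout instead of a callback:

```python
import secrets

class Store:
    """In-memory stand-in for a secret store that supports a grace period."""
    def __init__(self, value):
        self.current = value
        self.previous = None       # old value kept alive during the grace period
    def lookup(self, value) -> bool:
        return value in (self.current, self.previous)

def rotate(store: Store, restart_dependents) -> str:
    new_value = secrets.token_urlsafe(24)                     # 1. generate
    store.previous, store.current = store.current, new_value  # 2. write, keep old
    restart_dependents()                                      # 3. dependents reload
    return new_value

def end_grace_period(store: Store):
    store.previous = None                                     # 4. revoke old value

store = Store("old-password")
restarted = []
new = rotate(store, lambda: restarted.append("rolled"))

assert store.lookup(new)                 # new value valid immediately
assert store.lookup("old-password")      # old value still valid during grace
end_grace_period(store)
assert not store.lookup("old-password")  # old value revoked after grace
```

The grace period is the zero-downtime trick: at no point is there an instant where neither value works.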

The “restart to pick up the new secret” pattern is common but problematic. If every secret rotation forces a pod restart, rotation becomes a rolling-update event, which is disruptive. The alternative is apps that reload secrets on a SIGHUP or on a file-watcher — but this requires the app to be written to support it.
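A minimal version of the file-watcher approach, assuming the secret is mounted as a file (the path here is illustrative): poll the file’s mtime and re-read only on change, so rotation propagates without a restart.

```python
import os
import tempfile
import time

class ReloadingSecret:
    """Re-reads a mounted secret file whenever its mtime changes,
    so rotation propagates without a process restart."""
    def __init__(self, path: str):
        self.path = path
        self._mtime = None
        self._value = None
    @property
    def value(self) -> bytes:
        mtime = os.stat(self.path).st_mtime_ns
        if mtime != self._mtime:          # cheap stat() per access; re-read on change
            with open(self.path, "rb") as f:
                self._value = f.read().strip()
            self._mtime = mtime
        return self._value

observed = []
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "db-password")
    with open(path, "w") as f:
        f.write("old-password\n")
    secret = ReloadingSecret(path)
    observed.append(secret.value)         # initial read
    time.sleep(0.05)                      # let the filesystem mtime advance
    with open(path, "w") as f:
        f.write("new-password\n")         # simulate rotation by ESO/CSI driver
    observed.append(secret.value)         # new value picked up, no restart

assert observed == [b"old-password", b"new-password"]
```

The cost is that every consumer of the secret must go through an accessor like this instead of caching the value at startup — which is exactly the application cooperation the text describes.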

For small teams, rotation is often “configured but not actually happening.” The rotation Lambda exists, but nobody checks whether it’s actually running, and if it fails silently, nobody notices until something is 180 days past its rotation deadline. Alert on rotation failures explicitly — a failed rotation is a production incident.

111.10 Auditing, leak detection, and incident response

Three pieces of the operational story.

Auditing. Every access to a secret should be logged, with caller identity, secret identity, and timestamp. Secrets Manager does this via CloudTrail. Vault does it via audit devices. ESO logs its reads on the controller side. The purpose: after a breach, you need to know which secrets the attacker could have accessed. If the audit log is missing, every secret has to be assumed compromised.

Leak detection. Secrets do get committed to git by mistake. Tools like git-secrets, gitleaks, trufflehog, and GitHub’s own secret scanning look for known secret patterns in commits and alert. Run them as a pre-commit hook on developer machines and as a required CI check on every PR. Enable GitHub secret scanning on all repositories; when GitHub detects a leaked credential, it will often notify the cloud provider automatically to revoke it.

Incident response. The playbook for a leaked secret:

  1. Immediately revoke the secret in the store. Don’t rotate — revoke. An attacker with the old value must lose access immediately.
  2. Create a replacement in the store with a new value.
  3. Update dependents to use the new value (rolling restart, or whatever your app needs).
  4. Check the audit log for unauthorized access before the revocation. Was the secret used by the attacker?
  5. Check for blast radius: if the leaked secret grants access to other systems (an IAM access key that can read S3, a DB password that can read PII), treat the entire blast radius as potentially compromised.
  6. Post-mortem: how did the leak happen? Was it a human error, a tooling gap, a missing pre-commit hook? Fix the process so the same class of leak can’t recur.

Leaked credentials get exploited fast. Public GitHub leaks are typically scanned and abused within minutes. Time-to-revoke is the metric that matters — you want minutes, not hours. Automated revocation pipelines (triggered by leak-detection alerts) are a genuine maturity marker for a platform team.

111.11 The mental model

Eight points to take into Chapter 112:

  1. Classify secrets narrowly. Overclassifying makes the system unusable. Config is not a secret.
  2. Vault for full control, cloud secret managers for simplicity, 1Password for human-shared credentials. Choose based on team size and operational appetite.
  3. KMS is different from a secret store. KMS holds keys and performs crypto; the secret store holds values. Envelope encryption ties them together.
  4. ESO is the default delivery mechanism for Kubernetes. Store is the source of truth; ESO syncs to a Kubernetes Secret.
  5. Sealed Secrets is for small teams with few, stable secrets. SOPS is for Git-native workflows with KMS as the trust root. CSI driver is for high-security environments where etcd cannot be trusted.
  6. Rotation is the hard problem. Dynamic secrets (Vault’s DB engine) are the gold standard. Scheduled rotation requires application cooperation.
  7. Audit every access. After a breach, the audit log is the only source of truth for blast-radius assessment.
  8. Time-to-revoke is the leak-response metric. Automate detection and revocation.

In Chapter 112, everything that was running inside the cluster has to be exposed to the outside world. Ingress, gateways, tunnels, meshes.


Read it yourself

  • The HashiCorp Vault documentation, especially the Kubernetes auth method and the database secrets engine.
  • The External Secrets Operator documentation and the SecretStore / ExternalSecret API reference.
  • Mozilla SOPS README and the age encryption tool.
  • AWS Secrets Manager user guide, specifically the rotation Lambda patterns.
  • The Kubernetes documentation on encrypting secrets at rest with a KMS provider.
  • Filippo Valsorda’s blog posts on age and modern encryption defaults.
  • The OWASP Secrets Management Cheat Sheet.

Practice

  1. Classify the following as secret or not-secret: database password, S3 bucket name, database hostname, feature flag value, TLS certificate (public cert), TLS private key, OAuth client ID, OAuth client secret. Justify each.
  2. Explain envelope encryption. Why is the DEK generated per-message and the KEK held in the KMS?
  3. Write an ExternalSecret CRD that syncs a JSON secret from AWS Secrets Manager (with three fields) into a Kubernetes Secret with those three fields as keys.
  4. Compare Sealed Secrets, ESO, and CSI Secrets Store driver on five axes: source of truth, trust boundary, rotation support, etcd exposure, operational complexity.
  5. Design a rotation pipeline for a database password that rotates every 30 days with zero downtime. Which components are needed, what’s the sequence of operations, what can go wrong?
  6. Your CI logs accidentally printed a production database password. Write the incident response playbook for the next 30 minutes.
  7. Stretch: Set up Vault in dev mode locally, configure the database secrets engine against a Postgres, fetch a dynamic credential from a sample app, and verify that the credential is automatically revoked after the TTL expires. Document every step.