Chapter 106: The OCI image lifecycle: registries, digest pinning, the digest-update pattern

Chapter 102 covered what a container is. Chapter 101 covered how to build its contents. This chapter covers what happens next: the journey from docker build to kubectl rollout — across registries, digest resolution, manifest patches, and the git commit that closes the loop. This is the unglamorous middle layer of the deploy pipeline, and getting it right is the difference between “our deploys are boring” and “our deploys are a source of outages.”

By the end of this chapter the reader knows why digests matter more than tags, how a typical “build → push → digest → patch → PR” pipeline is wired up, what tools like crane and cosign and ko actually do, and how to reason about supply-chain attestation without drowning in spec acronyms. The concepts here flow directly into Chapter 107 (GitOps), which assumes all of this is in place.

Outline:

The anatomy of an OCI image.
Tags vs digests — why one is a fact and the other is a wish.
Registries — GHCR, ECR, GCR, Harbor, Artifactory.
The push step and content-addressable storage.
The digest-update pattern.
crane, skopeo, oras, ko.
Signing with cosign and sigstore.
SBOMs, provenance, and SLSA.
Reproducibility as a property.
The full lifecycle from PR merge to running pod.
The mental model.

106.1 The anatomy of an OCI image

An OCI image is three kinds of files sitting in a content-addressable store, addressed by SHA256 digests. Understanding the layout is the key to understanding everything that follows.

The three kinds:

Layers: gzipped tarballs of filesystem content. Each layer is a diff from the layer below. The base image contributes one or more layers; each RUN, COPY, or ADD instruction in a Dockerfile adds another layer. A typical Go image has 2-3 layers; a Python image has 5-10.
Config: a JSON blob containing the image’s runtime config (entrypoint, environment, user, working directory, exposed ports) and the ordered list of layer diff IDs, which are the SHA256 hashes of the uncompressed layer tarballs (the manifest, by contrast, lists the digests of the compressed layers). Also includes the image’s history and labels.
Manifest: a JSON blob that points to the config and the layers. This is the entry point; when you pull an image, you pull the manifest first, then the config and layers it references.

{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "config": {
    "mediaType": "application/vnd.oci.image.config.v1+json",
    "digest": "sha256:c8a5d3f2e1...",
    "size": 2847
  },
  "layers": [
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:a591fd4c62...",
      "size": 32654123
    },
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
      "digest": "sha256:b2d1e9ab3d...",
      "size": 1245678
    }
  ]
}

Every digest is a SHA256 of the content it refers to. Change one byte anywhere — in a layer, in the config, in the manifest — and the digest changes. This is what makes digests immutable: you cannot update “what sha256:abc123... means.” The digest is the content.

The manifest’s own digest is the image’s canonical identifier. When you pull my-service@sha256:abc123..., you are asking for the content whose manifest hashes to abc123.... The registry either has it or doesn’t. No ambiguity, no mutability, no spoofing.
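To make the content-addressing concrete, here is a minimal Python sketch. The manifest body is made up for illustration and the digest helper is a hypothetical name; the only fact it relies on is that a digest is the SHA256 of the exact bytes being addressed.

```python
import hashlib
import json


def digest(content: bytes) -> str:
    """Return an OCI-style digest string for a blob of bytes."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


# A made-up, trimmed-down manifest. What gets hashed is the serialized
# bytes, not the parsed JSON, so even a whitespace change alters the digest.
manifest_bytes = json.dumps(
    {"schemaVersion": 2, "layers": [{"digest": "sha256:" + "a" * 64, "size": 10}]},
    sort_keys=True,
).encode()

original = digest(manifest_bytes)
# Flip a single character (size 10 -> 11) and hash again.
tampered = digest(manifest_bytes.replace(b'"size": 10', b'"size": 11'))

print(original)
print(tampered)
print(original != tampered)
```

The two printed digests share nothing recognizable, which is the point: there is no way to edit content and keep its name.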

An OCI image is three content-addressed objects — manifest, config, and layers; the manifest's SHA256 is the only immutable identifier, so deploying by digest rather than tag is the difference between determinism and a ticking bug.

Multi-arch images add one more level of indirection: an image index (or “manifest list”). The index is a manifest of manifests, listing one manifest per platform (linux/amd64, linux/arm64). The registry returns the index when you pull; the client picks the manifest that matches its platform. The index has its own digest; pinning by index digest gives you a single reference that resolves to the right platform for any puller.
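The client-side platform selection can be sketched in a few lines of Python. The index below is a hypothetical, trimmed-down example with placeholder digests, and select_manifest is an illustrative name; real clients also consider fields such as the CPU variant, which this sketch ignores.

```python
from typing import Optional

# Hypothetical image index with placeholder digests.
INDEX = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.index.v1+json",
    "manifests": [
        {
            "digest": "sha256:" + "1" * 64,
            "platform": {"os": "linux", "architecture": "amd64"},
        },
        {
            "digest": "sha256:" + "2" * 64,
            "platform": {"os": "linux", "architecture": "arm64"},
        },
    ],
}


def select_manifest(index: dict, os_name: str, arch: str) -> Optional[str]:
    """Return the digest of the first manifest matching os/architecture."""
    for entry in index.get("manifests", []):
        platform = entry.get("platform", {})
        if platform.get("os") == os_name and platform.get("architecture") == arch:
            return entry["digest"]
    return None  # no manifest for this platform


print(select_manifest(INDEX, "linux", "arm64"))
```

An amd64 node and an arm64 node resolving the same index digest end up with different platform manifests, but both results are fixed by the index content.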

106.2 Tags vs digests

A tag is a human-readable name that points to a digest. my-service:v1.2.3 is a tag. my-service:latest is a tag. my-service:main-a7b3c9f is a tag. Tags are mutable: the registry holds a mapping (repository, tag) -> digest, and anyone with push permission can update it.

This mutability is the source of most deploy-pipeline bugs. A team pushes my-service:v1.2.3, tests it, approves it, rolls it out. A week later someone re-tags v1.2.3 to a newer build (accidentally or not). A node restarts, pulls my-service:v1.2.3, gets the new content, and the service behaves differently from last week. The tag “moved.” Nothing in the system notices.

A digest is the SHA256 of the manifest. my-service@sha256:a7b3c9f... is a digest reference. It is immutable. The registry can garbage-collect the content behind it (which fails subsequent pulls), but it cannot make the digest point somewhere else. If you want reproducibility — the same exact image running in production today as in the test yesterday — you pin by digest.

The rule: tags are for humans, digests are for machines. Tags live in documentation, changelogs, and developer memory. Digests live in the deployed Kubernetes manifests, the GitOps repo, the audit trail. A Kubernetes Deployment spec that says image: my-service:v1.2.3 is a ticking bug; one that says image: my-service@sha256:a7b3c9f... is deterministic.

The operational corollary: the build pipeline must resolve the tag to a digest and rewrite the deployment spec before commit. This is the digest-update pattern that §106.5 gets into.
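The “digests for machines” rule is easy to enforce mechanically. The following is a minimal lint sketch, assuming image references appear on plain image: lines; it uses a regular expression rather than a YAML parser, and the function and sample names are hypothetical.

```python
import re

# Matches "image: <ref>" lines, optionally as a list item, optionally quoted.
IMAGE_LINE = re.compile(
    r"^[ \t]*(?:-[ \t]*)?image:[ \t]*[\"']?([^\s\"']+)", re.MULTILINE
)
# A pinned reference ends in @sha256:<64 lowercase hex characters>.
PINNED = re.compile(r"@sha256:[0-9a-f]{64}$")


def unpinned_images(manifest_text: str) -> list[str]:
    """Return every image reference that is not pinned by digest."""
    return [ref for ref in IMAGE_LINE.findall(manifest_text) if not PINNED.search(ref)]


PINNED_REF = "registry/proxy@sha256:" + "a" * 64
SAMPLE = f"""\
containers:
  - name: app
    image: registry/my-service:v1.2.3
  - name: sidecar
    image: "{PINNED_REF}"
"""

print(unpinned_images(SAMPLE))  # ['registry/my-service:v1.2.3']
```

Run as a check in the deployment repo's CI, a non-empty result would fail the PR before a tag-based reference ever reaches the cluster.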

106.3 Registries

A registry is a server that stores images and serves them over HTTP/HTTPS using the OCI distribution spec. Every major cloud has one: GHCR (GitHub Container Registry), ECR (AWS), GCR / Artifact Registry (Google), ACR (Azure). Self-hosted options include Harbor (open-source, feature-rich, the most common on-prem choice), Artifactory (JFrog, commercial), and Distribution (the reference CNCF implementation).

The differences matter in practice but are mostly operational:

Authentication. GHCR uses GitHub tokens. ECR uses AWS IAM. GCR uses GCP service accounts. Kubernetes pulls through imagePullSecrets or cloud-native workload identity (IRSA, GKE Workload Identity, Azure Federated Identity).
Region and latency. A pod in us-east-1 pulling from an ECR repo in us-west-2 is slow. For multi-region deployments, replicate the registry to each region. ECR supports cross-region replication; Harbor has replication rules; GHCR is a single global endpoint with CDN edges.
Garbage collection. Untagged layers pile up and cost money. Every registry has GC, but the details differ — ECR’s lifecycle policies are declarative; Harbor has a scheduled GC; GHCR cleans up untagged manifests over time.
Pull rate limits. Docker Hub is notorious for rate-limiting anonymous pulls, which bites a K8s cluster pulling lots of public images. Always use a pull-through cache (ECR’s is built in; Harbor has a proxy cache project) to avoid the problem.
Vulnerability scanning. Most registries have built-in scanners (Trivy, Grype, Clair). The scan runs on push and produces a report attached to the image. Chainguard and similar “zero-CVE” images exist specifically to sidestep the constant flood of alerts.

For a new project: pick the registry that matches your cloud (ECR on AWS, Artifact Registry on GCP), or Harbor if you need self-hosted. Avoid Docker Hub for anything non-trivial because of the rate limits.

106.4 The push step and content-addressable storage

When docker push my-service:v1.2.3 runs, the sequence is:

The client hashes each layer tarball.
For each layer, the client asks the registry “do you already have this digest?” via a HEAD request.
If yes, the layer is skipped (the registry already has the content). This is how layer caching works across images — the base layer sha256:abc... is stored once and shared by every image that uses it.
For each missing layer, the client uploads it. The upload uses a chunked protocol so it can resume on failure.
The client uploads the config blob.
The client uploads the manifest, which lists the config and layer digests.
The registry associates the manifest with the tag v1.2.3.
The registry returns the manifest’s digest. This is the key moment: the push response includes Docker-Content-Digest: sha256:xyz..., which is the immutable identifier for what was just pushed.

The digest is what the build pipeline records. A typical pipeline captures it from the docker push output or by running crane digest my-service:v1.2.3 after the push, and then uses it to patch deployment manifests.

The content-addressable property means push is idempotent. Pushing the same image twice is a no-op for the bytes — the layers are already there. Only the tag gets re-associated. This is why “rebuild and push” is cheap even for huge images: only changed layers upload.
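The push sequence and its idempotence can be modeled with a toy in-memory registry. This is a sketch of the idea only: ToyRegistry and push are hypothetical names, the manifest is a simplified JSON document, and none of it speaks the real distribution protocol.

```python
import hashlib
import json


def digest(content: bytes) -> str:
    return "sha256:" + hashlib.sha256(content).hexdigest()


class ToyRegistry:
    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}  # digest -> content
        self.tags: dict[str, str] = {}     # tag -> manifest digest
        self.uploads = 0                   # layer/config blobs actually transferred

    def has_blob(self, blob_digest: str) -> bool:
        """Stand-in for the HEAD existence check."""
        return blob_digest in self.blobs

    def upload_blob(self, content: bytes) -> None:
        self.blobs[digest(content)] = content
        self.uploads += 1

    def put_manifest(self, tag: str, manifest: bytes) -> str:
        """Store the manifest, point the tag at it, return its digest."""
        manifest_digest = digest(manifest)
        self.blobs[manifest_digest] = manifest
        self.tags[tag] = manifest_digest
        return manifest_digest


def push(registry: ToyRegistry, tag: str, layers: list[bytes], config: bytes) -> str:
    # Upload only the blobs the registry does not already hold.
    for blob in [*layers, config]:
        if not registry.has_blob(digest(blob)):
            registry.upload_blob(blob)
    manifest = json.dumps(
        {"config": digest(config), "layers": [digest(layer) for layer in layers]},
        sort_keys=True,
    ).encode()
    return registry.put_manifest(tag, manifest)


registry = ToyRegistry()
first = push(registry, "v1.2.3", [b"base layer", b"app v1"], b"config")
again = push(registry, "v1.2.3", [b"base layer", b"app v1"], b"config")
print(first == again, registry.uploads)  # True 3
newer = push(registry, "v1.2.4", [b"base layer", b"app v2"], b"config")
print(registry.uploads)                  # 4
```

The second push transfers nothing, and the third transfers only the changed application layer, which mirrors why rebuild-and-push stays cheap.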

106.5 The digest-update pattern

Here is the core pattern that ties the build pipeline to GitOps (Chapter 107). The canonical flow, which most mature teams implement some version of:

A developer merges a PR to the app repo. The merge triggers CI.
CI builds the image: docker build -t registry/my-service:git-sha-$SHA . where $SHA is the commit SHA.
CI pushes: docker push registry/my-service:git-sha-$SHA.
CI captures the digest: DIGEST=$(crane digest registry/my-service:git-sha-$SHA).
CI opens a PR against the deployment repo (separate from the app repo) that updates the Kubernetes manifest:

# before
image: registry/my-service@sha256:OLDDIGEST

# after
image: registry/my-service@sha256:NEWDIGEST

The deployment repo PR runs its own checks (kubeconform, policy validation, etc.).
The PR is auto-merged (for dev environments) or requires review (for prod).
ArgoCD/Flux detects the merge and syncs the cluster to the new manifest.
Kubernetes rolls out pods with the new image digest.

The two-repo split is the important part. The app repo holds code; the deployment repo holds what’s actually running. They are coupled via the digest, which is the only coordination point.

sequenceDiagram
  participant Dev as Developer
  participant AppCI as App CI
  participant Reg as Registry
  participant Bot as Digest-update bot
  participant DeployRepo as Deploy repo
  participant Argo as ArgoCD / Flux

  Dev->>AppCI: merge PR to main
  AppCI->>Reg: docker build + push :git-sha
  Reg-->>AppCI: sha256:xyz...
  AppCI->>Bot: trigger with sha256:xyz
  Bot->>DeployRepo: PR: update image to sha256:xyz
  DeployRepo->>Argo: merge → sync
  Argo->>Reg: pull sha256:xyz
  Argo-->>Dev: pods running new image

The digest-update pattern decouples the app CI pipeline from the cluster: CI ends at “push a commit to the deploy repo”, and the GitOps controller takes it from there — no CI system holds cluster credentials. The deployment repo is always a complete, accurate description of the cluster state — if you want to know “what’s running in prod right now?”, you read the deployment repo.

The digest-update commit is usually made by a bot (Renovate, a custom GitHub App, or a bespoke script). The bot has push access to the deployment repo and opens PRs like “chore(my-service): update image to sha256:abc123…” with a link back to the app commit. The PR description auto-populates with the app’s changelog between the previous and new commits. This is the audit trail: every deployment is a reviewable git commit.

The bot’s implementation is often a few hundred lines of Go or Python. Core logic:

1. After push, call `crane digest <repo>:<tag>` to get SHA256.
2. Clone the deployment repo.
3. Edit the relevant YAML: sed / yq / kustomize edit.
4. Commit with a conventional message.
5. Push to a branch, open a PR via the Git API.
6. Optionally auto-merge if target environment is "dev".

Simple, boring, and the foundation of every reliable deploy pipeline.
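Step 3 of the bot (editing the YAML) is where most of the subtlety lives. Below is a minimal Python sketch of that one step, using a regular expression instead of yq or kustomize; update_image_digest is a hypothetical helper, and it assumes the image appears on an image: line for the given repository.

```python
import re

DIGEST = re.compile(r"^sha256:[0-9a-f]{64}$")


def update_image_digest(manifest_text: str, repo: str, new_digest: str) -> str:
    """Rewrite every 'image: <repo>[:tag][@digest]' to '<repo>@<new_digest>'."""
    if not DIGEST.match(new_digest):
        raise ValueError(f"not a sha256 digest: {new_digest!r}")
    pattern = re.compile(
        r"(image:\s*[\"']?" + re.escape(repo) + r")"  # keep 'image: repo'
        r"(?:[:@][^\s\"']+)?"                          # drop old tag and/or digest
        r"(?=[\s\"']|$)"                               # do not match longer repo names
    )
    patched, count = pattern.subn(
        lambda m: f"{m.group(1)}@{new_digest}", manifest_text
    )
    if count == 0:
        raise LookupError(f"no image line found for {repo}")
    return patched


before = (
    "    image: registry/my-service@sha256:" + "0" * 64 + "\n"
    "    image: registry/my-service-worker:v2\n"
)
after = update_image_digest(before, "registry/my-service", "sha256:" + "f" * 64)
print(after)
```

The lookahead is the design choice worth noting: without it, a repository whose name is a prefix of another (my-service and my-service-worker) would be patched by mistake. Failing loudly when nothing matches keeps the bot from opening empty PRs.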

106.6 crane, skopeo, oras, ko

A handful of tools cover the registry-operations surface area:

crane (from the go-containerregistry project) is the Swiss Army knife. It does everything a docker push does without needing a Docker daemon. Key commands:

crane digest my-service:v1.2.3          # print the manifest digest
crane pull my-service:v1.2.3 image.tar  # download without Docker
crane push image.tar my-service:v1.2.4  # upload without Docker
crane copy a:v1 b:v1                    # copy between repos or registries, no local tarball
crane manifest my-service:v1.2.3        # print the raw manifest
crane config my-service:v1.2.3          # print the config JSON

crane copy is especially valuable: it copies an image from one repository or registry to another without a Docker daemon and without writing the image to local disk. Within a single registry, layers can be mounted across repositories so that no layer bytes move at all; between different registries, the layers stream through the machine running crane. Either way the manifest is copied unchanged, so the digest is preserved. This is how you promote an image from a dev registry to a prod registry without rebuilding.

skopeo is the Red Hat equivalent, with a similar feature set and a slightly different CLI. It is often preferred in RHEL / Fedora / podman environments. skopeo copy docker://src docker://dst does the same job as crane copy.

oras pushes and pulls arbitrary artifacts (Helm charts, WASM modules, SBOMs, attestations) to and from OCI registries. It treats the registry as a generic content-addressable store, not just a home for container images. The same artifact model underlies the signing and attestation described in §106.7.

ko is a Go-specific build tool that skips the Dockerfile entirely. You run ko build ./cmd/server and ko compiles the Go binary, wraps it in a distroless image, pushes it, and prints the digest. No Dockerfile, no intermediate image. For pure-Go services this is the fastest path from source to pushed image — often under 10 seconds on a warm cache. It also handles multi-arch builds (--platform=linux/amd64,linux/arm64) in one command.

ko build --platform=linux/amd64,linux/arm64 \
  --bare \
  --image-refs=/tmp/refs \
  ./cmd/server
# prints: registry/my-service@sha256:abc...

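--bare publishes to exactly the repository named in KO_DOCKER_REPO, without appending the Go import path. --image-refs writes the resulting image references (repository plus digest) to a file, which is convenient input for the digest-update bot.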
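On the consuming side, the bot has to split each published reference back into repository and digest. A minimal sketch follows, assuming the refs file holds one repo@sha256:... reference per line; both function names are hypothetical.

```python
def split_digest_ref(ref: str) -> tuple[str, str]:
    """Split 'repo@sha256:<hex>' into (repo, 'sha256:<hex>'), validating the digest."""
    repo, sep, digest = ref.strip().partition("@")
    hex_part = digest.removeprefix("sha256:")
    valid = (
        bool(repo)
        and sep == "@"
        and digest.startswith("sha256:")
        and len(hex_part) == 64
        and all(c in "0123456789abcdef" for c in hex_part)
    )
    if not valid:
        raise ValueError(f"not a digest-pinned reference: {ref!r}")
    return repo, digest


def parse_image_refs(text: str) -> list[tuple[str, str]]:
    """Parse the contents of a refs file, skipping blank lines."""
    return [split_digest_ref(line) for line in text.splitlines() if line.strip()]


refs_text = "registry/my-service@sha256:" + "c" * 64 + "\n"
print(parse_image_refs(refs_text))
```

Rejecting anything that is not digest-pinned at this boundary means a tag can never leak into the deployment repo through the bot.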

For a modern Go service the whole “build → push → digest → commit” pipeline can be four lines of CI:

# ko prints the full image reference (repo@sha256:...) on stdout; logs go to stderr
IMAGE=$(ko build --bare ./cmd/server)
cd deploy-repo
yq -i ".spec.template.spec.containers[0].image = \"$IMAGE\"" kustomize/base/deployment.yaml
git commit -am "chore: update my-service to $IMAGE" && git push

106.7 Signing with cosign and sigstore

Signing images is about supply-chain security. Digest pinning guarantees which bytes run, but says nothing about who produced them: anyone who can push to the registry, move a tag, or get a digest into the deploy repo can have the cluster pull and run malicious content. With signing, the cluster verifies a cryptographic signature before running an image, and anything not signed by a trusted identity is rejected.

Cosign (part of the Sigstore project) is the standard tool. It signs OCI images by attaching a signature as an OCI artifact (via the same oras model from §106.6) — the signature lives in the registry alongside the image, keyed by the image’s digest. A verifier fetches the image and the signature, checks the signature against a public key or identity, and allows or rejects.

# Sign an image
cosign sign --key cosign.key registry/my-service@sha256:abc...

# Verify
cosign verify --key cosign.pub registry/my-service@sha256:abc...

The keyed mode uses a traditional public/private keypair. The more interesting mode is keyless signing via sigstore: the signer authenticates with an OIDC provider (GitHub Actions, GitLab CI, Google), cosign generates an ephemeral signing key, Sigstore’s certificate authority (Fulcio) issues a short-lived certificate binding that key to the OIDC identity, and the signature is recorded in a public transparency log (Rekor). Verifiers check that the identity in the certificate matches an allowlist (“must be signed by github-actions from my-org/my-repo”). No long-lived secrets, auditable provenance.

# Keyless sign (in GitHub Actions); --yes skips the interactive confirmation prompt
cosign sign --yes registry/my-service@sha256:abc...
# Keyless verify
cosign verify \
  --certificate-identity "https://github.com/my-org/my-repo/.github/workflows/ci.yml@refs/heads/main" \
  --certificate-oidc-issuer "https://token.actions.githubusercontent.com" \
  registry/my-service@sha256:abc...

On the cluster side, policy controllers like Kyverno or Connaisseur enforce verification at admission time. A policy says “every pod in the prod namespace must run a cosign-verified image signed by this identity.” Pods with unverified images are rejected before they schedule. This is the enforcement mechanism that turns signing from a ceremony into a guarantee.

106.8 SBOMs, provenance, and SLSA

Beyond signing the image, the supply-chain-security world wants two more artifacts attached to every image: the SBOM and the provenance.

An SBOM (Software Bill of Materials) lists every package and version that went into the image. Think of it as dpkg -l output plus language-specific dependencies — the Go modules, Python packages, system libraries, everything. The standard formats are SPDX and CycloneDX. Tools like syft generate SBOMs from an image:

syft registry/my-service@sha256:abc... -o spdx-json > sbom.json
cosign attest --predicate sbom.json --type spdxjson registry/my-service@sha256:abc...

cosign attest attaches the SBOM to the image as a signed attestation, so anyone can pull the image and verify what’s in it. When a CVE is disclosed, a team with SBOMs for every image can answer “which of our services are vulnerable?” in seconds by querying a database of (image, package) -> cve. Without SBOMs, the same question takes days.
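The “answer in seconds” claim rests on nothing more exotic than an inverted index from package to image. Here is a minimal sketch, assuming the SBOMs have already been reduced to (name, version) pairs; the image names, versions, and the libxyz package are made up.

```python
from collections import defaultdict

# Hypothetical, pre-parsed SBOM data: image reference -> (package, version) pairs.
SBOMS = {
    "registry/my-service@sha256:aaa": [("libxyz", "1.4.2"), ("openssl", "3.0.13")],
    "registry/billing@sha256:bbb": [("libxyz", "1.5.0")],
    "registry/frontend@sha256:ccc": [("openssl", "3.0.13")],
}


def build_index(sboms: dict[str, list[tuple[str, str]]]) -> dict[tuple[str, str], set[str]]:
    """Invert the SBOMs: (package, version) -> set of images containing it."""
    index: dict[tuple[str, str], set[str]] = defaultdict(set)
    for image, packages in sboms.items():
        for name, version in packages:
            index[(name, version)].add(image)
    return index


def affected_images(index, package: str, bad_versions: set[str]) -> list[str]:
    """Return the images that contain any vulnerable version of the package."""
    hits: set[str] = set()
    for version in bad_versions:
        hits |= index.get((package, version), set())
    return sorted(hits)


index = build_index(SBOMS)
print(affected_images(index, "libxyz", {"1.4.1", "1.4.2"}))
# ['registry/my-service@sha256:aaa']
```

A production setup would match version ranges from an advisory feed instead of an explicit set, but the lookup has the same shape.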

Provenance is the “how was this built” metadata. SLSA (Supply-chain Levels for Software Artifacts) defines levels of provenance rigor. The four-level ladder below follows SLSA v0.1; SLSA v1.0 keeps levels 1 to 3 as its Build track and drops level 4:

SLSA 1: build process is documented.
SLSA 2: build runs in a hosted service with authenticated provenance.
SLSA 3: build runs in a hardened, isolated environment with verifiable provenance.
SLSA 4: the build is reproducible and two-party-reviewed.

Most real teams target SLSA 3. The provenance attestation, generated by the build system (GitHub Actions provides this natively, as do Google Cloud Build and Buildkite), records: the source repository and commit, the build job ID, the builder identity, the hash of the inputs, and the hash of the outputs. Cosign attaches it to the image. A verifier can check “this image was built from commit a7b3c9f of my-org/my-repo by GitHub Actions” and enforce it as an admission policy.
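The verifier's check can be sketched as a comparison between fields of the provenance statement and the values a policy expects. The sketch below assumes an already signature-verified statement laid out along the lines of SLSA v1.0 provenance (subject, runDetails.builder.id, buildDefinition.resolvedDependencies); the field layout is an assumption to confirm against the spec, and every URL, ID, and hash is made up.

```python
def check_provenance(
    statement: dict,
    image_digest: str,
    expected_source: str,
    expected_commit: str,
    trusted_builders: set[str],
) -> list[str]:
    """Return a list of policy violations; an empty list means the check passed."""
    problems = []

    # 1. The attestation must be about this image.
    subject_hashes = {
        s.get("digest", {}).get("sha256") for s in statement.get("subject", [])
    }
    if image_digest.removeprefix("sha256:") not in subject_hashes:
        problems.append("provenance does not cover this image digest")

    predicate = statement.get("predicate", {})

    # 2. The build must have run on a builder we trust.
    builder_id = predicate.get("runDetails", {}).get("builder", {}).get("id")
    if builder_id not in trusted_builders:
        problems.append(f"untrusted builder: {builder_id}")

    # 3. The source repository and commit must be the expected ones.
    deps = predicate.get("buildDefinition", {}).get("resolvedDependencies", [])
    if not any(
        d.get("uri") == expected_source
        and d.get("digest", {}).get("gitCommit") == expected_commit
        for d in deps
    ):
        problems.append("source repository or commit does not match")

    return problems


# Hypothetical values throughout.
COMMIT = "a7b3c9f" + "0" * 33
BUILDER = "https://example.com/hypothetical-builder"
SOURCE = "git+https://github.com/my-org/my-repo"
STATEMENT = {
    "subject": [{"name": "registry/my-service", "digest": {"sha256": "e" * 64}}],
    "predicate": {
        "buildDefinition": {
            "resolvedDependencies": [{"uri": SOURCE, "digest": {"gitCommit": COMMIT}}]
        },
        "runDetails": {"builder": {"id": BUILDER}},
    },
}

print(check_provenance(STATEMENT, "sha256:" + "e" * 64, SOURCE, COMMIT, {BUILDER}))  # []
```

Verifying the signature on the attestation itself is deliberately out of scope here; in practice that step comes first, and a policy controller performs both.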

The combined “image + signature + SBOM + provenance” is the modern supply-chain package. Each piece is an OCI artifact in the same registry, keyed by the image’s digest. The image is verifiable to its source, its contents, and its build environment.

106.9 Reproducibility as a property

Reproducibility is the property that building the same source at the same commit produces the same image bytes. It is closely related to but distinct from hermeticity (Chapter 101). Hermetic means the build is a pure function; reproducible means that pure function produces bit-identical output on every run.

Why it matters. A reproducible image can be independently rebuilt by a verifier — “I built the source from commit a7b3c9f and got digest sha256:abc...; the published image is sha256:abc...; they match; I believe the image corresponds to the source.” Without reproducibility, the only evidence that an image comes from a particular source is the provenance attestation from the original build, which depends on trusting the builder. With reproducibility, anyone can verify.

The obstacles are usually small details: build timestamps embedded in binaries, uid/gid of files in tarballs, file ordering in archives, nondeterministic ordering of directory listings and glob expansions. Most of these are fixable with environment variables (SOURCE_DATE_EPOCH) or build flags. Go binaries built with -trimpath and -ldflags=-buildid= are mostly reproducible. Python’s .pyc files have timestamps but can be normalized. Multi-arch images add complications because each platform has its own manifest, so reproducibility is per-platform.
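The tarball obstacles (timestamps, ownership, file ordering) are easy to demonstrate. The sketch below builds an uncompressed layer-like tar in memory with those fields normalized; deterministic_layer is a hypothetical helper, and the epoch parameter plays the role that SOURCE_DATE_EPOCH plays in real build tools.

```python
import hashlib
import io
import tarfile


def deterministic_layer(files: dict[str, bytes], epoch: int = 0) -> bytes:
    """Build a tar archive whose bytes depend only on file names, contents, and epoch."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name in sorted(files):          # fixed ordering, independent of dict order
            info = tarfile.TarInfo(name)
            info.size = len(files[name])
            info.mtime = epoch              # fixed timestamp instead of "now"
            info.uid = info.gid = 0         # fixed ownership
            info.uname = info.gname = ""
            info.mode = 0o644
            tar.addfile(info, io.BytesIO(files[name]))
    return buf.getvalue()


def layer_digest(files: dict[str, bytes], epoch: int = 0) -> str:
    return "sha256:" + hashlib.sha256(deterministic_layer(files, epoch)).hexdigest()


a = layer_digest({"app/main.py": b"print('hi')\n", "app/util.py": b"X = 1\n"})
b = layer_digest({"app/util.py": b"X = 1\n", "app/main.py": b"print('hi')\n"})
c = layer_digest({"app/main.py": b"print('hi')\n", "app/util.py": b"X = 1\n"}, epoch=1)
print(a == b)  # True: insertion order no longer matters
print(a == c)  # False: a different timestamp still changes the digest
```

The archive is left uncompressed on purpose: gzip headers carry their own timestamp, which is one more field a reproducible build has to pin.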

For most teams, exact byte-for-byte reproducibility is not a requirement. Close-enough-to-verify (same provenance, same dependency digests, similar size) is enough. The full SLSA 4 story is aspirational outside specialized environments (TUF, in-toto, critical infrastructure). For now, aim for SLSA 3: hermetic builds with signed provenance, verified at deploy time. That is the achievable state of the art in 2026.

106.10 The full lifecycle from PR merge to running pod

Putting it all together. The end-to-end journey of a code change:

Developer merges PR to app repo. Commit SHA is a7b3c9f.
CI triggers. Builds, tests, lints, type-checks.
Build the image. ko build or docker build; produces a multi-arch image tagged registry/my-service:a7b3c9f.
Push to registry. Content-addressable storage dedupes unchanged layers. Manifest digest returned: sha256:xyz....
Sign. cosign sign creates a keyless signature using the CI job’s OIDC identity and stores it in the registry next to the image.
Attest. syft generates an SBOM. Build system emits provenance. cosign attest attaches both.
Scan. Registry runs Trivy/Grype; fails the pipeline if high-severity CVEs found (configurable).
Open deploy PR. Bot opens PR against deploy repo: “update my-service to sha256:xyz...”.
Deploy-repo CI runs. kubeconform validates the manifests; Kyverno checks policies; OPA/Gatekeeper confirms image is signed.
Auto-merge (dev) or review (prod). Depends on environment.
ArgoCD/Flux detects the merge. Within seconds (dev) or on next sync window (prod).
Argo applies the change. kubectl apply of the patched Deployment. The Deployment’s pod template now points to the new digest.
Kubernetes rolls out. New pods are created with the new image. Old pods drain and terminate.
Admission controller verifies signatures. Kyverno (or equivalent) intercepts the pod creation, pulls the signature from the registry, verifies against the trusted identity, and allows or denies the pod.
Kubelet pulls the image. The node’s container runtime (containerd) pulls the image from the registry, unpacks the layers into the local content store, and calls runc to create the container (Chapter 102).
Readiness probe passes. The service is now live.

Every step is automated. Every step is auditable. Every step produces an artifact that can be traced back through git commits to the original PR. “What’s running in prod?” is answered by reading the deploy repo at HEAD. “How did this image get there?” is answered by following the provenance attestation to the build job to the source commit.

This is the pattern. It looks like bureaucracy until the first time you need to answer “was our platform affected by this 0-day in libxyz?” and you can answer in 5 minutes with SBOM queries. The investment pays for itself on the first incident.

106.11 The mental model

Eight points to take into Chapter 107:

An OCI image is layers + config + manifest, each content-addressed by SHA256.
Tags are human wishes; digests are machine facts. Deploy by digest, always.
The registry is content-addressable storage. Unchanged layers upload once; push is idempotent.
The digest-update pattern couples the app repo to the deploy repo via a bot. One commit per deployment.
crane and ko are the core tools. ko for Go, crane for everything else.
Cosign + sigstore gives keyless, transparent signing. Verify at admission time.
SBOMs say what is in the image; signed provenance from a hardened builder (SLSA 3) says how it was built. Together they are the minimum modern supply-chain story.
The full lifecycle is 16 steps, all automated, all auditable. Boring deploys are the goal.

Chapter 107 picks up at step 11 and goes deep on GitOps — the philosophy and tooling that drive the sync from the deploy repo to the cluster.

Read it yourself

The OCI Distribution Specification at github.com/opencontainers/distribution-spec. Short and readable; explains the registry HTTP protocol.
The go-containerregistry repo (github.com/google/go-containerregistry) — crane and the underlying library. The README and examples are a tour of OCI operations.
ko documentation at ko.build. Start with the “Why ko?” page.
Sigstore documentation at docs.sigstore.dev. Particularly the “Cosign” and “Keyless” sections.
The SLSA specification at slsa.dev. The levels page is the one to read first.
“Understanding the OCI Image Spec,” a blog post series by Jake Moshenko.

Practice

Pull a public image (say, alpine:latest) and print its manifest and config with crane manifest and crane config. Note the layer digests and sizes.
Pull alpine:latest twice (with docker rmi in between). Why is the second pull fast? Which layer digests are shared?
Build a Go hello-world with ko. Measure the end-to-end time from source to pushed image. Compare to a multi-stage Dockerfile.
Write a shell script that: builds an image, captures its digest, and patches a Kubernetes Deployment manifest to use the digest. 20 lines or under.
Sign a public image with keyless cosign (GitHub OIDC). Verify the signature with the correct identity; try again with a wrong identity and observe the failure.
Generate an SBOM for an image with syft. Count the number of packages. Grep for one you know is in there.
Stretch: Set up a toy Kyverno policy that only admits pods whose image is signed by a specific cosign identity. Deploy a signed pod (should succeed) and an unsigned one (should fail).