Part VI · Distributed Systems & Request Lifecycle
Chapter 74 · ~24 min read

AuthN vs AuthZ: the distinction interviewers test on

"Authentication asks 'who are you?' Authorization asks 'what can you do?' Confuse them in an interview and the interviewer stops listening."

Authentication and authorization are two of the most frequently conflated words in software. They are distinct steps with distinct failure modes, distinct threat models, and distinct implementations. Interviewers love the distinction because it reveals whether a candidate has actually operated a production system or only read blog posts. Getting it right is table stakes for any senior role.

The goal here is not a shallow glossary; it is the concrete mechanics that make each step work: what a JWT actually contains, why OAuth2 has multiple grant types, what OIDC adds on top of OAuth2, the difference between session-based auth and token-based auth, how mTLS works between services, and how tokens get revoked in a system that is nominally stateless.

Outline:

  1. The distinction in one page.
  2. Sessions vs tokens.
  3. JWT anatomy.
  4. OAuth2 grants and when to use each.
  5. OIDC is OAuth2 plus identity.
  6. API keys, their uses and limits.
  7. mTLS for service-to-service auth.
  8. Token lifetimes, refresh flows, revocation.
  9. Authorization models: RBAC, ABAC, ReBAC.
  10. The policy decision point.
  11. Common pitfalls in interviews and in production.
  12. The mental model.

74.1 The distinction in one page

Authentication (AuthN) is the process of establishing who is making a request. The caller presents a credential — a password, a token, a certificate — and the server verifies it against some trust anchor. If verification succeeds, the request is associated with a principal: a user ID, a service ID, sometimes a tenant ID. AuthN produces an identity.

Authorization (AuthZ) is the process of deciding whether that identity is allowed to perform the requested action on the requested resource. The input to AuthZ is the identity (from AuthN), the action (HTTP verb + route, or an RPC method), and the resource (which record, which tenant, which file). The output is a boolean: allowed or denied.

These two steps run in this order. AuthN always comes first. If AuthN fails, the correct response is 401 Unauthorized (badly named — it should be “unauthenticated” but HTTP shipped with the wrong word). If AuthN succeeds but AuthZ denies, the correct response is 403 Forbidden. Mixing up 401 and 403 is the most common bug in custom auth code.

AuthN and AuthZ run in sequence. AuthN verifies the credential and produces an identity (401 if verification fails); AuthZ combines that identity with the action and resource to produce an allow/deny decision (403 if denied); only then does the handler run. AuthN asks "who are you?"; AuthZ asks "what can you do?"
AuthN and AuthZ are separate steps with separate errors: a valid token for the wrong resource is 403, not 401.

A production request hits a gateway (Chapter 73). The gateway extracts a token from the Authorization header, validates it (AuthN), attaches the resulting principal to the request context, and forwards. The backend then performs AuthZ using that principal: “Is user 42 allowed to read document 7?” The two steps are separated so that one identity system can serve many authorization policies. The gateway does not need to know about document 7; the backend does not need to know how user 42 was authenticated. The boundary between them is clean.
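
The ordering is mechanical enough to sketch in a few lines. TOKENS, ACL, and handle below are illustrative stand-ins for a token validator and a permission store, not any real framework:

```python
# Toy stand-ins: a token-to-principal map (AuthN) and a permission set (AuthZ).
TOKENS = {"tok-alice": "user_42"}            # valid credential -> principal
ACL = {("user_42", "read", "doc_7")}         # (principal, action, resource)

def handle(token, action, resource):
    principal = TOKENS.get(token)            # AuthN: who are you?
    if principal is None:
        return 401                           # unauthenticated
    if (principal, action, resource) not in ACL:
        return 403                           # authenticated, but not allowed
    return 200                               # only now does the real handler run
```

A valid token with the wrong permissions falls through to the 403 branch, never the 401 branch; that is the distinction in executable form.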

Interviewers test this by asking variants of: “A user calls /admin/delete-everything with a valid token. What goes wrong?” The answer they want is: “Token validity is AuthN. The question is whether the identity in the token is authorized for that action — that is a separate check.” If you conflate the two, the whole admin interface becomes a vulnerability, because a valid token for any user could trigger admin actions.

74.2 Sessions vs tokens

Two architectures for carrying identity across requests.

Session-based authentication stores the identity on the server. When a user logs in, the server creates a session record (user ID, expiration, metadata) in a store (Redis, Memcached, a database) and hands the client a random session ID in a cookie. On every subsequent request, the client sends the cookie, the server looks up the session in the store, and retrieves the identity. The cookie is an opaque reference; the actual identity never leaves the server.

Advantages: the server has full control. Invalidation is instant — delete the session record and the user is logged out everywhere. The cookie is small. The identity can carry arbitrary server-side state without bloating the request.

Disadvantages: every request hits the session store, which has to be fast, highly available, and shared across all servers handling the user’s requests. Scaling it across regions is painful. A store outage is a global outage.

Token-based authentication stores the identity in the request itself, signed by the server. When a user logs in, the server mints a token containing the identity (user ID, claims, expiration), signs it, and hands it back. On every subsequent request, the client sends the token; the server verifies the signature and trusts the embedded claims without any lookup. The token is self-describing.

Advantages: stateless. No session store. Horizontally scalable across regions trivially. The verification is a signature check, which is fast and local.

Disadvantages: revocation is hard. Once a token is issued, it is valid until its expiration. If a user logs out or an account is compromised, there is no server-side flip; you either rely on short expiration and refresh, or maintain a denylist that reintroduces the session store you were trying to avoid. The token is larger (kilobytes, not bytes). Claims go stale — if the user’s role changes, the old token still has the old role.

Production systems often use both. User-facing web apps use sessions (HTTPOnly cookies, fast revocation). APIs use tokens (stateless, multi-region). An OAuth2 login flow can produce both: a session for the web app and an access token for its backend calls.
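
The session half of that split fits in a few lines. A sketch with an in-process dict standing in for Redis; login, current_user, and logout are hypothetical names:

```python
import secrets
import time

SESSIONS = {}  # server-side store; in production this is Redis/Memcached

def login(user_id, ttl=3600):
    sid = secrets.token_urlsafe(32)          # opaque random ID, no embedded identity
    SESSIONS[sid] = {"user": user_id, "exp": time.time() + ttl}
    return sid                               # goes to the client in an HTTPOnly cookie

def current_user(sid):
    s = SESSIONS.get(sid)
    if s is None or s["exp"] < time.time():
        return None                          # unknown or expired session -> 401
    return s["user"]

def logout(sid):
    SESSIONS.pop(sid, None)                  # instant revocation: the store is authoritative
```

The logout path shows why revocation is trivial here: the store is authoritative, so deleting the record ends the session everywhere.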

74.3 JWT anatomy

The JSON Web Token (RFC 7519) is the most common token format. A JWT is three base64url-encoded strings separated by dots:

eyJhbGciOiJSUzI1NiIs... . eyJzdWIiOiJ1c2VyXzQyI... . MEUCIGx3...
 <---- header ---->     <---- payload ---->     <---- signature ---->
A JWT has three parts. The header names the algorithm and key (alg: RS256, kid: 2026-key, typ: JWT). The payload carries the claims (iss: https://auth.example.com, sub: user_42, aud: api.example.com, exp: 1728003600, plus custom claims like tenant: acme): who, for whom, expiring when. The signature is sign(b64(header) + "." + b64(payload), private_key). The token is NOT encrypted; anyone can read the payload, and the signature only proves it was not tampered with.
Verification must check the signature, then iss, aud, exp, and nbf — in that order; skipping any one is a security bug.

Header: a JSON object that declares the signing algorithm and the key ID.

{"alg": "RS256", "kid": "2026-01-key", "typ": "JWT"}

Payload (also called claims): a JSON object with key/value pairs. Standard claims are defined in the RFC:

  • iss — issuer. Who minted the token. Typically a URL like https://auth.example.com.
  • sub — subject. The principal. Usually a user or service ID.
  • aud — audience. Who the token is for. A recipient that sees a token whose aud does not match its own ID must reject it.
  • exp — expiration, as Unix seconds.
  • nbf — not-before. Token is invalid before this time.
  • iat — issued-at, as Unix seconds.
  • jti — JWT ID, a unique identifier used for revocation denylists.

Plus whatever application-specific claims you want: tenant, scopes, email, role.

Signature: the cryptographic proof that the header and payload were not tampered with. Computed as sign(base64url(header) + "." + base64url(payload), key). The signing algorithm is declared in the header’s alg. Modern systems use RS256 (RSA-SHA256), ES256 (ECDSA-SHA256), or EdDSA (Ed25519). Never use HS256 across trust boundaries — it requires both sides to share the secret, which is fine for a single service but a footgun when tokens cross services. Also: never accept alg: none. Every popular JWT library has had a bug where alg: none skipped signature verification. Always validate alg against an allowlist.

Verification on the server: fetch the public key (via JWKS — JSON Web Key Set, typically at https://issuer/.well-known/jwks.json), look up the key by kid, verify the signature, check exp, nbf, iss, and aud. Only then trust the claims. A common disaster: trusting claims without verifying the signature, because “we only get requests through our own gateway.” Chapter 75 explains why this assumption breaks.

The JWT is not encrypted. Anyone who intercepts it can read the claims. If the claims are sensitive (PII, internal IDs), use JWE (JSON Web Encryption) or simply do not put them in the token. Treat a JWT like a signed postcard: the signature guarantees it is from you, but anyone can read the message.
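
The full verify pipeline can be sketched with only the standard library. This uses HS256 solely to keep the example self-contained; as noted above, tokens that cross services should use asymmetric algorithms and a vetted library with JWKS fetching. mint and verify are illustrative names:

```python
import base64
import hashlib
import hmac
import json
import time

def b64url(data: bytes) -> bytes:
    return base64.urlsafe_b64encode(data).rstrip(b"=")

def b64url_decode(s: bytes) -> bytes:
    return base64.urlsafe_b64decode(s + b"=" * (-len(s) % 4))

def mint(claims: dict, key: bytes) -> str:
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps(claims).encode())
    sig = b64url(hmac.new(key, header + b"." + payload, hashlib.sha256).digest())
    return (header + b"." + payload + b"." + sig).decode()

def verify(token: str, key: bytes, iss: str, aud: str, leeway=60) -> dict:
    h64, p64, s64 = (part.encode() for part in token.split("."))
    header = json.loads(b64url_decode(h64))
    if header.get("alg") not in {"HS256"}:              # allowlist; never trust alg blindly
        raise ValueError("algorithm not allowed")
    expected = hmac.new(key, h64 + b"." + p64, hashlib.sha256).digest()
    if not hmac.compare_digest(b64url(expected), s64):  # 1. signature, always first
        raise ValueError("bad signature")
    claims = json.loads(b64url_decode(p64))
    now = time.time()
    if claims.get("iss") != iss:                        # 2. issuer
        raise ValueError("wrong issuer")
    if claims.get("aud") != aud:                        # 3. audience
        raise ValueError("wrong audience")
    if now > claims.get("exp", 0) + leeway:             # 4. expiration, with skew leeway
        raise ValueError("expired")
    if now < claims.get("nbf", 0) - leeway:             # 5. not-before
        raise ValueError("not yet valid")
    return claims                                        # only now trust the claims
```

Every check here corresponds to one of the interview pitfalls in 74.11; delete any single line and you have reintroduced a known vulnerability class.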

74.4 OAuth2 grants and when to use each

OAuth2 (RFC 6749) is an authorization framework for delegating access between parties. The canonical example: a user wants a third-party app to post to their calendar without giving the app their calendar password. OAuth2 solves this by inserting an authorization server that mints short-lived access tokens the app can use against the resource server.

OAuth2 defines several grant types. Four are still relevant.

Authorization Code (the “real” one). Used when a human logs in to a web or mobile app. The flow:

OAuth2 Authorization Code flow with PKCE: the user authenticates at the auth server, not at the app, keeping credentials out of the app entirely. The sequence between browser, auth server, and app: ① redirect to /authorize?code_challenge=H(verifier); ② the user logs in at the auth server; ③ redirect back to app/callback?code=short-lived-code; ④ the browser hands the code to the app backend; ⑤ the app POSTs /token with {code, code_verifier}; ⑥ it receives access_token + refresh_token.
The user's credentials never touch the app; the app only receives a short-lived code that it exchanges for tokens, keeping the credential surface minimal.
1. App redirects user to auth server's /authorize endpoint.
2. User authenticates at the auth server (not the app).
3. Auth server redirects back to the app with a short-lived code.
4. App exchanges the code for an access token at /token.
5. App uses the access token against the resource server.

PKCE (Proof Key for Code Exchange, RFC 7636) is a mandatory extension for public clients (mobile apps, SPAs, CLIs). The client generates a random code_verifier, sends its SHA-256 hash as code_challenge in step 1, and sends the raw code_verifier in step 4. This prevents a code-interception attack even on clients that cannot keep a secret. Every modern Authorization Code flow should use PKCE, including confidential clients.
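
The verifier/challenge pair from steps 1 and 4 is easy to generate correctly. A sketch with hypothetical function names; the hashing and unpadded base64url encoding follow RFC 7636:

```python
import base64
import hashlib
import secrets

def make_pkce_pair():
    # RFC 7636: verifier is 43-128 unreserved characters;
    # challenge is BASE64URL(SHA256(verifier)) with padding stripped.
    verifier = secrets.token_urlsafe(32)                   # 43 chars of entropy
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge

def server_check(stored_challenge, presented_verifier):
    # The auth server recomputes the challenge at /token and compares.
    digest = hashlib.sha256(presented_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii") == stored_challenge
```

An attacker who intercepts the authorization code still fails server_check, because the raw verifier never left the legitimate client.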

Client Credentials. Used for service-to-service. A backend service has a client ID and a client secret; it calls /token directly and gets an access token for itself. No user involved. This is the right flow for a cron job calling an API, a backend calling another backend (if it is not using mTLS), or a CLI tool with a service account.

Refresh Token. Not a grant in its own right — it is how you exchange an old refresh token for a new access token without user interaction. Issued alongside an access token during the Authorization Code flow. When the access token expires, the client uses the refresh token to get a new one. Refresh tokens are long-lived (days to months); access tokens are short-lived (minutes). This is the core lifetime story: short access tokens limit the blast radius of theft, long refresh tokens avoid forcing users to re-login constantly.

Device Authorization Grant (RFC 8628). Used for input-constrained devices: smart TVs, IoT, CLI tools. The device shows a URL and a code; the user opens the URL on another device, enters the code, and authorizes. The device polls /token until the user finishes. This is the flow the GitHub CLI and the gcloud CLI use when no local browser is available.

Implicit and Resource Owner Password Credentials are deprecated and should not be used in new systems. If you see either in a codebase, plan a migration.

74.5 OIDC is OAuth2 plus identity

OAuth2 authorizes access. It says nothing about who the user is to the app. If the app wants the user’s identity — their email, name, avatar — OAuth2 alone does not give it cleanly; the access token is for the resource server, not the app.

OpenID Connect (OIDC) layers a thin protocol on top of OAuth2 that adds identity. The key addition: in addition to an access token, the token endpoint returns an ID token, which is a JWT containing identity claims (sub, email, name, email_verified, etc.). The ID token is signed by the OIDC provider and the app verifies it locally. The app now has a trustworthy identity for the user without a separate call.

OIDC also standardizes:

  • The /userinfo endpoint (a protected endpoint that returns the full user profile when given an access token).
  • The /.well-known/openid-configuration discovery document (a JSON file describing all the provider’s endpoints and supported features).
  • Standard scopes: openid, email, profile, address, phone.

When someone says “Sign in with Google,” “Sign in with GitHub,” or “Sign in with Microsoft,” it is OIDC. Auth0, Okta, Keycloak, and AWS Cognito all implement OIDC. If you build a login flow, use OIDC via one of these providers rather than rolling your own password database. The prize is not saving code; it is avoiding being in the credential-storage business.

74.6 API keys, their uses and limits

An API key is a long, random, opaque string that a server uses to identify a client. No expiration, no claims, no signature. The server stores a hash of the key and compares on each request. That is it.

API keys are appropriate for:

  • Machine-to-machine calls where the caller is stable and trusted (backend service, CI job, internal tool).
  • Developer-facing APIs where a human reads a key from a dashboard and pastes it into a config file.
  • Simple access control where a token’s claims are not needed.

API keys are inappropriate for:

  • End-user authentication. Users have browsers and can handle real OAuth2 flows; do not burden them with long strings.
  • Fine-grained claims. Keys carry no scope, no tenant, no audience. You have to look them up in a database every time.
  • High-stakes actions. Key theft is permanent until you rotate; there is no short expiration.

When you use API keys, do three things: store only the hash (not the raw key — if your DB leaks, the attacker should not be able to authenticate), prefix with a non-secret identifier (sk_live_abc123... — so logs can record which key was used without leaking the secret), and support rotation (users need the ability to create a new key, switch traffic, then revoke the old one with zero downtime). Stripe and GitHub both do this well; copy their patterns.

Key rotation also needs a grace period. A user who rotates their key should be able to have two keys valid simultaneously for at least a few minutes while they update configurations. Systems that force immediate cutover make rotation so painful that nobody does it, and then keys never rotate.
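
Those three practices plus the grace period can be sketched together. KEYS, create_key, check_key, and revoke are hypothetical names; the sk_live_ prefix borrows Stripe's convention:

```python
import hashlib
import secrets

KEYS = {}  # key_id -> {"hash": sha256 of the full key, "active": bool}

def create_key():
    key_id = secrets.token_hex(4)                          # non-secret; safe to log
    raw = f"sk_live_{key_id}_{secrets.token_urlsafe(24)}"  # shown to the user exactly once
    KEYS[key_id] = {"hash": hashlib.sha256(raw.encode()).hexdigest(),
                    "active": True}                        # never store the raw key
    return key_id, raw

def check_key(raw):
    if not raw.startswith("sk_live_"):
        return False
    key_id = raw[len("sk_live_"):].split("_", 1)[0]        # the loggable part
    rec = KEYS.get(key_id)
    if rec is None or not rec["active"]:
        return False
    return hashlib.sha256(raw.encode()).hexdigest() == rec["hash"]

def revoke(key_id):
    if key_id in KEYS:
        KEYS[key_id]["active"] = False
```

Because records are keyed by key_id, two keys can be active at once during rotation: create the new key, switch traffic, then revoke the old one.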

74.7 mTLS for service-to-service auth

Mutual TLS authenticates both ends of a TLS connection using X.509 certificates. The client presents a certificate to the server, the server presents a certificate to the client, and both are verified against a trusted CA (or a SPIFFE ID, or whatever trust root your mesh uses).

For service-to-service auth, mTLS is often the right answer. Why:

  • No tokens to manage. The certificate is the credential. No need to mint JWTs, refresh them, or pass them in headers.
  • Bidirectional. The server is also authenticated to the client. Prevents MITM even inside a “trusted” network.
  • Cryptographically strong. A stolen certificate is only useful if the attacker also has the private key, which never crosses the network.
  • Fast after handshake. TLS session resumption means the handshake cost amortizes across many requests.

Service meshes (Istio, Linkerd, Consul Connect) automate mTLS: every pod gets a sidecar proxy with an auto-rotated certificate issued by the mesh CA, and all pod-to-pod traffic is wrapped in mTLS transparently. Your application code sees plain HTTP; the mesh adds mTLS at the connection layer. This is the SPIFFE / SPIRE pattern: every service has a verifiable identity (spiffe://cluster/ns/default/sa/my-service) that it can present to other services.

mTLS handles authentication. It does not handle authorization. Once you know which service is calling you, you still need to decide whether that service is allowed to do what it is asking. AuthZ is a separate step. Chapter 75 unpacks this further because the interesting case is when the service is calling on behalf of a user and the user’s identity has to flow through.
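
On the server side, the gap between plain TLS and mTLS in Python's ssl module is essentially one setting. A sketch; the certificate paths in the comments are placeholders for whatever your PKI or mesh issues:

```python
import ssl

def require_client_certs(ctx, ca_bundle=None):
    # Without CERT_REQUIRED the server never asks the client for a certificate,
    # so TLS authenticates only one direction. This flag makes it mutual.
    if ca_bundle:
        ctx.load_verify_locations(ca_bundle)   # trust root that signed client certs
    ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx

# Typical wiring (paths are placeholders):
#   ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
#   ctx.load_cert_chain("server.pem", "server.key")   # our identity, shown to clients
#   require_client_certs(ctx, "mesh-ca.pem")          # verify theirs
```

A service mesh performs exactly this wiring in the sidecar, with certificates it mints and rotates itself.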

74.8 Token lifetimes, refresh flows, revocation

Three design parameters shape the security/usability tradeoff.

Access token lifetime. Short enough that a stolen token expires quickly, long enough that every request is not a token-fetching round trip. Typical values: 5 to 60 minutes for user-facing web apps, 15 minutes to an hour for machine-to-machine. Shorter is safer but increases load on the auth server.

Refresh token lifetime. Long enough that users are not forced to re-login constantly, short enough that a stolen refresh token eventually stops working. Typical values: hours to weeks. Some systems use rotating refresh tokens — every time you use a refresh token, the server mints a new one and invalidates the old one. If an attacker uses a stolen refresh token, the legitimate user’s next use of their refresh token fails, which is a detectable event.
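
Rotation plus reuse detection can be sketched in a few lines. VALID, FAMILY, issue, and refresh are hypothetical names; the key idea is that a rotated-out token resurfacing signals theft, so the whole token family dies:

```python
import secrets

VALID = {}    # refresh_token -> user; only the newest token in a chain is valid
FAMILY = {}   # refresh_token -> family id; reuse revokes the whole chain

def issue(user, family=None):
    tok = secrets.token_urlsafe(32)
    VALID[tok] = user
    FAMILY[tok] = family or tok            # first token names the family
    return tok

def refresh(old):
    if old not in VALID:
        fam = FAMILY.get(old)
        if fam is not None:
            # A rotated-out token came back: assume theft, kill the family.
            for tok, f in list(FAMILY.items()):
                if f == fam:
                    VALID.pop(tok, None)
        return None
    user = VALID.pop(old)                  # single use: the old token dies here
    return issue(user, FAMILY[old])
```

The detectable event described above is the second branch: the attacker's (or the victim's) next refresh fails, and the family revocation forces a fresh login.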

Revocation. The hard one. A user clicks “log out” or an admin disables an account. What happens to tokens already issued? Options:

  1. Do nothing, wait for expiration. Acceptable only with short access token lifetimes. The window is the lifetime.
  2. Denylist. Maintain a server-side list of revoked jti values. Every request checks the list. Reintroduces the statefulness JWTs were avoiding, but the denylist only needs entries for tokens that are both revoked and not yet expired — usually a small set.
  3. Token introspection (RFC 7662). The resource server calls back to the auth server’s /introspect endpoint to check if a token is still valid. Maximum accuracy, maximum latency. Suitable for high-value operations; too slow for every request.
  4. Short-lived access tokens plus refresh token revocation. Revoke the refresh token; the access token keeps working until it expires (minutes, with short lifetimes), and the user is effectively logged out after that. Good enough for most applications.
  5. Per-user version claim. Include a uv claim (user version) in the token. Increment it in the auth DB on revocation. Services check the claim against the current version from a fast cache. Only the cache is stateful; tokens remain self-contained.

Production systems pick one and live with the tradeoffs. Stripe-style APIs use short access tokens plus rotating refreshes. High-security systems use introspection on sensitive operations and cached claims elsewhere. There is no free lunch: anything fully stateless cannot revoke instantly; anything that revokes instantly has some form of server-side state.

74.9 Authorization models: RBAC, ABAC, ReBAC

Once identity is known, how is the allow/deny decision actually made?

graph TD
  A[Request arrives] --> B{AuthN passed?}
  B -->|no| C[401 Unauthorized]
  B -->|yes| D{AuthZ check}
  D -->|denied| E[403 Forbidden]
  D -->|allowed| F[Handler]

AuthN and AuthZ are separate decision points; passing AuthN is not sufficient to proceed — the caller must also pass AuthZ.

RBAC (Role-Based Access Control). Users are assigned roles (admin, editor, viewer); roles have permissions (create_doc, delete_doc). A request is allowed if the user’s role has the required permission. Simple, explicit, easy to audit. Breaks down when permissions depend on the specific resource, not just the action. “Can Alice edit this document?” is not well answered by RBAC alone.
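
The RBAC decision itself is two set lookups. A sketch with hypothetical roles and permissions:

```python
ROLE_PERMS = {
    "admin":  {"create_doc", "delete_doc", "read_doc"},
    "editor": {"create_doc", "read_doc"},
    "viewer": {"read_doc"},
}
USER_ROLES = {"user_42": {"editor"}}

def rbac_allows(user, permission):
    # Allowed iff any of the user's roles grants the permission.
    return any(permission in ROLE_PERMS.get(role, set())
               for role in USER_ROLES.get(user, set()))
```

Note what the function never sees: a resource. That absence is exactly why "can Alice edit this document?" needs ABAC or ReBAC.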

ABAC (Attribute-Based Access Control). Decisions are functions of attributes: user attributes (department, clearance), resource attributes (owner, classification), environment attributes (time, IP). A policy might say: “Allow read if resource.owner == user.id OR user.role == ‘admin’.” More expressive than RBAC, harder to reason about and audit. The canonical language is XACML; modern systems use OPA (Open Policy Agent) with Rego.

ReBAC (Relationship-Based Access Control). Decisions follow relationships in a graph. “Alice can edit Doc7 because Alice is a member of Team3 which is an editor of Project9 which owns Doc7.” Google’s Zanzibar paper and its reimplementations (SpiceDB, OpenFGA, Warrant) are the modern incarnation. Best for social/collaborative products where access cascades through relationships.
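
The "Alice can edit Doc7" chain is a path query over relationship tuples. A heavily simplified sketch: real Zanzibar-style systems evaluate typed rewrite rules per relation, while this shows only the graph walk; all tuples and names are illustrative:

```python
# Zanzibar-style tuples: (object, relation, subject)
TUPLES = {
    ("doc_7",     "owner",  "project_9"),
    ("project_9", "editor", "team_3"),
    ("team_3",    "member", "alice"),
}

def reachable(obj, subject, seen=None):
    # Walk from the resource toward the user through any chain of relations.
    seen = seen or set()
    for (o, _rel, s) in TUPLES:
        if o == obj and (o, s) not in seen:
            seen.add((o, s))               # guard against relationship cycles
            if s == subject or reachable(s, subject, seen):
                return True
    return False
```

Here alice reaches doc_7 via team_3 and project_9; a production engine would additionally check that each hop's relation type actually grants "edit".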

Most production systems are hybrids. RBAC for coarse-grained platform permissions (“can access the admin API”), ReBAC or ABAC for resource-level decisions (“can edit this specific document”). The mistake is trying to force one model to do everything; simple authZ checks stay simple, complex ones get their own policy engine.

74.10 The policy decision point

A design decision that matters: where does authZ execute?

In the handler — the endpoint code checks if user.has_permission(X): ... inline. Easy to write, easy to forget, easy to get wrong. One endpoint missing the check is a vulnerability. Reviews and tests help; they do not guarantee.

As a middleware — every request passes through an authZ layer before the handler. Centralizes the logic; makes it harder to forget. Works well for coarse checks (“is the user authenticated and in the right tenant?”) but hard to parameterize for resource-level checks.

As a dedicated policy decision point (PDP) — a standalone service (or sidecar) answers “can user X do action Y on resource Z?” given a request context. The handler calls the PDP for every sensitive action. The policy is written once, centrally, and can be updated without redeploying services. OPA and OpenFGA are designed for this.

The PDP pattern scales best for large platforms. The tradeoff is an extra network call per authZ decision. Mitigate with caching or by embedding the PDP as a library (OPA can run in-process via Wasm or Go). For small systems, middleware plus careful handler-level checks is fine.

The other pattern is policy enforcement points (PEPs) — the things that actually block the request. The gateway is a PEP for coarse checks. The handler is a PEP for fine-grained checks. PEPs call PDPs. This vocabulary (NIST’s) is useful when designing a system where policy authoring, decision, and enforcement are separated by team or by deployment unit.
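
An in-process PDP can be as small as a registry of policy functions that enforcement points call. A sketch in the spirit of embedding a policy engine as a library; decide, policy, and the claim shapes are hypothetical:

```python
POLICIES = {}   # action -> policy function; policy lives in one place

def policy(action):
    def register(fn):
        POLICIES[action] = fn
        return fn
    return register

def decide(identity, action, resource):
    # The PEP (gateway or handler) calls this for every sensitive action.
    fn = POLICIES.get(action)
    return bool(fn and fn(identity, resource))   # unknown action: deny by default

@policy("read_doc")
def read_doc(identity, resource):
    # Owners may read their own documents; admins may read anything.
    return (resource.get("owner") == identity.get("sub")
            or "admin" in identity.get("roles", ()))
```

Unknown actions deny by default, closing the "new endpoint with no scope definition" gap before it opens.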

74.11 Common pitfalls in interviews and in production

The pitfalls interviewers listen for:

  • Confusing 401 and 403. 401 means unauthenticated; 403 means authenticated but not allowed. A valid token for the wrong resource is 403.
  • Trusting a JWT without verifying the signature. “We’re behind a gateway so it’s fine.” Chapter 75 explains why this is wrong.
  • Accepting alg: none. Read any JWT library’s changelog; this bug has happened everywhere.
  • Not validating aud. A token minted for service A is accepted by service B because B did not check the audience claim. Cross-service confused deputy attack.
  • Long-lived access tokens with no revocation story. “Access tokens live 30 days” is a sign someone did not think about theft.
  • Roles hard-coded in JWTs that never refresh. User is demoted; their existing token still claims admin for another hour.
  • API keys in URLs. Logs everywhere, browser history, referer headers. Always in Authorization or a custom header.
  • alg specified by the token itself. Always validate against an allowlist of acceptable algorithms; never trust the header’s claim.
  • Symmetric signing across services. HS256 requires a shared secret; any service with the secret can mint valid tokens for any other. Use asymmetric signing (RS256, ES256, EdDSA).

The production pitfalls:

  • JWKS caching with no refresh. When the issuer rotates keys, services keep verifying against the old keys until they restart. Cache with a TTL and a refresh-on-unknown-kid path.
  • Clock skew. exp checks fail spuriously on servers with drifted clocks. Allow ~60 s of leeway (most libraries do by default).
  • Scopes that drift from APIs. New endpoints added without scope definitions; clients get unrestricted access by default. Deny-by-default in the authZ layer.
  • No audit log. When an incident happens and you need to know who did what, the absence of authZ logs is a nightmare. Log every decision with the identity, action, resource, and result.

74.12 The mental model

Eight points to take into Chapter 75:

  1. AuthN establishes identity; AuthZ decides what that identity can do. They run in that order, produce different errors (401 vs 403), and should be implemented separately.
  2. Session vs token is the first architectural choice. Sessions are stateful and easy to revoke; tokens are stateless and hard to revoke. Most systems mix both.
  3. A JWT is three base64 strings: header, payload, signature. Verify the signature, the algorithm, the issuer, the audience, and the expiration — every time.
  4. OAuth2 is authorization delegation. Authorization Code + PKCE for users, Client Credentials for services, Device Flow for input-constrained devices. Implicit and ROPC are dead.
  5. OIDC is OAuth2 with an ID token. That is the entire difference that matters. Use it to not be in the password business.
  6. API keys are fine for machine clients if you hash them, prefix them, support rotation, and never log the raw key.
  7. mTLS handles service-to-service AuthN cleanly. Service meshes automate it. AuthZ is still a separate step after mTLS confirms identity.
  8. Authorization models (RBAC, ABAC, ReBAC) trade simplicity for expressiveness. Most real systems hybrid them, often with a policy decision point like OPA or OpenFGA.

In Chapter 75 the question becomes: when a request traverses five services, how does the original user’s identity propagate without being lost or forged along the way?


Read it yourself

  • RFC 7519 — JSON Web Token (JWT).
  • RFC 6749 — The OAuth 2.0 Authorization Framework. Long but readable.
  • RFC 8252 — OAuth 2.0 for Native Apps (why PKCE exists).
  • The OpenID Connect Core 1.0 specification.
  • The OWASP ASVS (Application Security Verification Standard), authentication and session management sections.
  • Google’s Zanzibar paper, “Zanzibar: Google’s Consistent, Global Authorization System.”
  • Auth0’s JWT handbook (free e-book) for a friendlier treatment.

Practice

  1. You have an endpoint /v1/documents/:id. A user presents a valid token but is not the owner of the document. Which HTTP status do you return, and why?
  2. Write the pseudocode for JWT verification. Include every check that must happen before you trust the claims.
  3. A CLI tool needs to authenticate to your API on behalf of a user’s laptop (no browser). Which OAuth2 grant? Justify.
  4. Explain the aud (audience) claim in plain terms. Why is failing to validate it a security bug?
  5. An access token lives 24 hours with no denylist. Why is this dangerous, and what is the minimal fix?
  6. Compare RBAC and ReBAC for a Google Docs-style product. Which resource-level decisions cannot be expressed in RBAC alone?
  7. Stretch: Stand up Keycloak or a local OIDC provider, implement an Authorization Code + PKCE flow in a small web app, and verify the ID token’s signature against the JWKS endpoint from scratch (not with a high-level library).