Part VII · The Data Plane
Chapter 86 ~20 min read

Document stores: MongoDB and DynamoDB compared

"MongoDB lets you model what you want. DynamoDB forces you to model what you query. Both are right, depending on what you know in advance"

Relational databases make perfect sense in interviews and are the wrong choice for roughly half of the data a modern ML system stores. Request metadata, user sessions, feature flags, model registries, experiment configs, job state — these are documents, not rows. They have irregular shapes, they evolve at the speed of product, they are read far more than they are joined, and they usually live in the hot path of a serving system where every millisecond of latency shows up on a dashboard. Two database families dominate this space: the document-model camp, led by MongoDB, and the wide-column / key-value camp, led by DynamoDB. They solve similar-looking problems with completely different mental models.

This chapter compares them in the level of detail a senior engineer needs to pick one for a new system without regret. The surface comparison — “Mongo is flexible, Dynamo is scalable” — is useless. The real differences are in what kind of queries they natively support, how they charge you, what they require you to know upfront, and where they quietly lose data if you do not read the docs. Knowing those differences lets you design a storage layer that survives scale, survives product change, and survives the bill.

Outline:

  1. Document model vs key-value model.
  2. MongoDB: the data model and the query language.
  3. MongoDB: indexing, replica sets, sharding.
  4. DynamoDB: the data model and the access API.
  5. DynamoDB: single-table design.
  6. Consistency models compared.
  7. Cost models: RCU/WCU vs per-document.
  8. Query patterns and where each breaks.
  9. When each is the right call.
  10. The mental model.

86.1 Document model vs key-value model

A document is a nested map of string keys to arbitrary values, where values can be scalars, lists, or other maps. JSON is the reference format. BSON (MongoDB’s binary JSON) is the wire format. A document has no fixed schema — two documents in the same collection can have completely different shapes — though MongoDB has added optional schema validation and most production deployments use it.

A key-value store, in the Dynamo sense, stores items under a key. The key has internal structure (partition key + sort key in DynamoDB), and the value is a map of attributes. The attributes are typed and named, but the set of attributes on an item is flexible: different items can have different attribute sets, like documents. The difference from a pure KV store (like Redis) is that Dynamo knows about the attributes and can index them.

Both models are schemaless at the engine level and schema-ful at the application level. In practice, you always end up with a de-facto schema maintained in application code. The difference is what the database lets you do with that schema:

  • Mongo treats every document as independently queryable. You can find({"status": "ready", "tags": "english", "size": {"$gt": 100}}) and Mongo will scan or use indexes to satisfy it. The query language is rich (boolean operators, array operators, full-text, geo, graph lookups via aggregation) and it mostly works without advance planning.
  • Dynamo only supports two query primitives: GetItem (by primary key) and Query (by partition key + optional sort key condition). You cannot ask “find all users whose email ends in @acme.com.” If your access pattern requires a condition on a non-key attribute, you must either create a secondary index specifically for that query or do a full table Scan, which is slow, expensive, and bad.

The consequence is that MongoDB is permissive, and DynamoDB is prescriptive. Mongo lets you store first and figure out the queries later. Dynamo forces you to figure out the queries first and then design the keys to make them cheap. Neither is wrong. They fit different decision contexts.

[Figure: schema-first vs query-first design philosophy. MongoDB (permissive): a document such as { _id, tenant, model, latency_ms, ts, user: {…} } can be queried by any field, e.g. find({ tenant: "acme", latency_ms: {$gt: 1000} }), and an index can be added later; it just works. DynamoDB (prescriptive): an item keyed PK=USER#u1, SK=ORDER#2026-04-10 supports only two efficient operations, GetItem(PK, SK) at O(1) and Query(PK, SK begins_with…) at O(k); everything else is a Scan, slow and expensive.]
MongoDB lets you add indexes for new query patterns post-hoc; DynamoDB exposes only GetItem and Query — any access pattern not encoded in the primary key or a GSI requires a full Scan.

86.2 MongoDB: the data model and the query language

A MongoDB cluster is a collection of databases, each containing collections, each containing documents. A document is a BSON map with a required _id field (which is the primary key, typically an ObjectId, a 12-byte value encoding a creation timestamp, a per-process random value, and a counter; older drivers used a machine id in place of the random value). Collections are schemaless by default. Documents can nest maps and arrays to arbitrary depth.
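One consequence of that _id layout is that creation time can be read straight out of the identifier, without a separate timestamp field. A minimal sketch; the hex string used here is hypothetical, constructed purely to illustrate the decoding:

```python
from datetime import datetime, timezone

def objectid_timestamp(oid_hex: str) -> datetime:
    """Recover the creation time embedded in a MongoDB ObjectId.

    The first 4 bytes (8 hex characters) are a big-endian Unix timestamp
    in seconds; the remaining 8 bytes are a per-process random value and
    an incrementing counter, which is why ObjectIds sort roughly in
    insertion order.
    """
    if len(oid_hex) != 24:
        raise ValueError("an ObjectId is 12 bytes = 24 hex characters")
    return datetime.fromtimestamp(int(oid_hex[:8], 16), tz=timezone.utc)

# Hypothetical ObjectId whose timestamp bytes encode 2026-04-10T09:33:12Z:
print(objectid_timestamp("69d8c3d8" + "0" * 16))  # 2026-04-10 09:33:12+00:00
```

This is also why sorting on _id is a cheap stand-in for sorting by creation time when documents are inserted as they occur.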

A typical ML-system document — say, an inference request log:

{
  "_id": ObjectId("6612..."),
  "request_id": "req_7c3e91",
  "tenant": "acme",
  "model": "llama-3-70b",
  "ts": ISODate("2026-04-10T09:33:12Z"),
  "prompt_tokens": 412,
  "completion_tokens": 183,
  "latency_ms": 2184,
  "tool_calls": [
    {"name": "search", "latency_ms": 42},
    {"name": "fetch_url", "latency_ms": 310}
  ],
  "user": {"id": "u_9a21", "tier": "pro"}
}

Queries are expressed as BSON documents too. The query language is match-based: you hand in a document that describes the shape you want, and the engine returns documents that match.

db.requests.find({
  "tenant": "acme",
  "model": "llama-3-70b",
  "ts": {"$gte": ISODate("2026-04-10T00:00:00Z")},
  "latency_ms": {"$gt": 1000}
}).sort({"ts": -1}).limit(100)
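The match semantics are easy to internalize with a toy evaluator. This is a sketch for intuition only, handling bare-value equality and a few comparison operators; the real engine also handles dotted paths, arrays, type ordering, and index selection:

```python
def matches(doc: dict, query: dict) -> bool:
    """Toy evaluator for a slice of MongoDB's match-based query language:
    a bare value means equality, an operator document like {"$gt": 1000}
    applies a comparison. Illustrative only, not how the server works."""
    ops = {
        "$gt":  lambda v, x: v is not None and v > x,
        "$gte": lambda v, x: v is not None and v >= x,
        "$lt":  lambda v, x: v is not None and v < x,
        "$lte": lambda v, x: v is not None and v <= x,
    }
    for field, cond in query.items():
        value = doc.get(field)
        if isinstance(cond, dict):  # operator document
            if not all(ops[op](value, operand) for op, operand in cond.items()):
                return False
        elif value != cond:         # bare value: equality match
            return False
    return True

doc = {"tenant": "acme", "model": "llama-3-70b", "latency_ms": 2184}
print(matches(doc, {"tenant": "acme", "latency_ms": {"$gt": 1000}}))  # True
print(matches(doc, {"latency_ms": {"$gt": 5000}}))                    # False
```

The point of the sketch: the query is itself a document, and matching is a structural comparison between the query document and each candidate.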

The full aggregation framework ($lookup, $group, $match, $project, $unwind, etc.) adds join-like operations, grouping, and transformation. It is not SQL, but it is close enough in expressive power for most operational workloads. What it is not is cheap if you design your data poorly — $lookup against an unindexed foreign collection is a full scan.

The key feature is that the query language does not require you to know your queries in advance. If you decide tomorrow to filter by user.tier == "pro", you add an index and it just works. This is what makes Mongo the default choice for product-facing CRUD.

86.3 MongoDB: indexing, replica sets, sharding

MongoDB indexes are B-tree indexes on one or more fields, including nested paths and array elements. Common index types:

  • Single-field: createIndex({"user.id": 1}).
  • Compound: createIndex({"tenant": 1, "ts": -1}). Order matters — the prefix rule says that a compound index on (A, B, C) can serve queries on A, (A, B), and (A, B, C), but not on B alone.
  • Multi-key: an index on an array field indexes each element. createIndex({"tool_calls.name": 1}) lets you query documents where any tool call has a given name.
  • TTL: createIndex({"ts": 1}, {expireAfterSeconds: 2592000}) auto-deletes documents older than 30 days. Very useful for log tables.
  • Text: full-text search. Rarely used; most teams layer Elasticsearch or OpenSearch instead.
  • Geo: 2D and spherical indexes for location queries.
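The prefix rule from the compound-index bullet is mechanical enough to sketch in code. This simplified checker assumes equality predicates only; with range predicates and sort orders the real planner's rules are more subtle:

```python
def index_can_serve(index_fields, query_fields) -> bool:
    """Simplified prefix rule: a compound index can serve a query whose
    filtered fields form a leading prefix of the index's field order.
    (Equality predicates only; ranges and sorts add further rules.)"""
    query_fields = set(query_fields)
    return any(
        set(index_fields[:n]) == query_fields
        for n in range(1, len(index_fields) + 1)
    )

idx = ["tenant", "ts", "model"]   # createIndex({tenant: 1, ts: -1, model: 1})
print(index_can_serve(idx, {"tenant"}))         # True
print(index_can_serve(idx, {"tenant", "ts"}))   # True
print(index_can_serve(idx, {"ts"}))             # False: not a leading prefix
```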

The trap is that indexes cost write throughput and storage. A collection with 10 indexes pays ~10× write amplification. For high-write workloads (log ingestion, audit trails), keep indexes to a minimum.

Replication in Mongo is via replica sets: a primary and N secondaries. Writes go to the primary; reads can go to secondaries (with configurable consistency). If the primary fails, the secondaries elect a new primary using a Raft-like protocol. A typical production replica set has 3 nodes (one primary, two secondaries), spread across availability zones. Failover takes 10-30 seconds; during the election window, writes fail unless the driver retries them (retryable writes are on by default in modern drivers).

Sharding splits a collection across multiple shards, each itself a replica set. You choose a shard key (typically a field like tenant or user_id with an optional hash). Mongo auto-balances chunks of the key range across shards. Sharding is necessary beyond ~2 TB per collection or past the write throughput of a single primary, but it adds operational complexity: cross-shard queries require a scatter-gather, and shard key choice is a one-time decision that is painful to change.
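The balancing behavior of a hashed shard key is easy to see in a toy model. The sketch below is illustrative only: it uses MD5 and a modulo, whereas MongoDB's hashed shard keys use a 64-bit hash whose space is split into range-based chunks, but the moral is the same: many distinct keys spread evenly, while one hot key always lands on one shard:

```python
import hashlib

def shard_for(key, num_shards):
    """Toy hashed-shard assignment: hash the key, take it modulo the
    shard count. Deterministic, so the same key always routes to the
    same shard, which is exactly why a hot key makes a hot shard."""
    digest = hashlib.md5(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

counts = [0] * 4
for i in range(1000):
    counts[shard_for(f"tenant_{i}", 4)] += 1
print(counts)  # roughly balanced, each shard near 250
```

Replace the 1000 distinct tenants with a single tenant written 1000 times and one counter goes to 1000 while the rest stay at zero: the hot-shard failure mode in miniature.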

For ML systems, a typical Mongo deployment is a 3-node replica set for the metadata store, unsharded, sitting behind the API gateway. Sharding only comes in when the request log or event log grows past what a single primary can handle, and at that point many teams migrate the hot log off Mongo to a column store or stream (Kafka + ClickHouse).

86.4 DynamoDB: the data model and the access API

DynamoDB is a fully-managed, serverless, wide-column key-value store built by AWS on the principles of the Dynamo paper. There is nothing to install, no replica set to manage, no shards to balance. You create a table, you specify a primary key, you start reading and writing. Scale is effectively unbounded (AWS has customers running tables at hundreds of thousands of requests per second).

The primary key is one or two attributes:

  • Partition key (PK): hashed to assign the item to a partition.
  • Optional sort key (SK): items with the same partition key are stored together, ordered by sort key.

GetItem fetches one item by full primary key. Query fetches items with a given partition key, optionally filtered by a condition on the sort key (=, <, <=, >, >=, between, begins_with). Scan reads the whole table, page by page, applying any filter only after the items have been read (you pay for every item scanned, not every item returned). BatchGetItem retrieves up to 100 items per call; BatchWriteItem puts or deletes up to 25.

Items can have arbitrary additional attributes beyond the key — strings, numbers, binary, booleans, lists, maps, sets — but only the primary key and explicit secondary indexes can be queried efficiently. Everything else requires a scan.
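The two efficient operations have correspondingly small request shapes. A sketch of building them for DynamoDB's low-level JSON API; the helper names are illustrative, and nothing here calls AWS:

```python
def get_item_params(table, pk, sk):
    """Request shape for GetItem: the full primary key, nothing else."""
    return {"TableName": table, "Key": {"PK": {"S": pk}, "SK": {"S": sk}}}

def query_params(table, pk, sk_prefix=None):
    """Request shape for Query: partition-key equality, plus an optional
    begins_with condition on the sort key."""
    params = {
        "TableName": table,
        "KeyConditionExpression": "PK = :pk",
        "ExpressionAttributeValues": {":pk": {"S": pk}},
    }
    if sk_prefix is not None:
        params["KeyConditionExpression"] += " AND begins_with(SK, :sk)"
        params["ExpressionAttributeValues"][":sk"] = {"S": sk_prefix}
    return params

print(query_params("requests", "USER#u1", "ORDER#")["KeyConditionExpression"])
# PK = :pk AND begins_with(SK, :sk)
```

With boto3 these dicts can be splatted straight into the low-level client, e.g. boto3.client("dynamodb").query(**query_params("requests", "USER#u1", "ORDER#")). Note how little room the API leaves: there is nowhere in either shape to express a condition on a non-key attribute.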

Two types of secondary index:

  • Local Secondary Index (LSI): same partition key, different sort key. Must be defined at table creation time. Shares throughput with the base table.
  • Global Secondary Index (GSI): different partition key and/or sort key. Created at any time, has its own provisioned throughput, replicates asynchronously from the base table (eventually consistent by default).

GSIs are how you support multiple access patterns in Dynamo. If your base table is keyed by user_id and you also need to query by email, you create a GSI with email as the partition key. The index is a separate copy of the data, updated asynchronously after each write.

86.5 DynamoDB: single-table design

The Dynamo school of thought — most forcefully articulated by Rick Houlihan and codified in The DynamoDB Book — is that a well-designed Dynamo application should use a single table to store all entity types. Instead of one table per entity (users, orders, products), you have one big table with a generic schema and use the partition key and sort key to encode entity identity and relationships.

A canonical single-table design for an e-commerce-style application:

PK            SK                        type    data
USER#u123     PROFILE                   user    username, email, created_at
USER#u123     ORDER#2026-04-10#o9876    order   total, status
USER#u123     ORDER#2026-04-09#o9875    order   total, status
ORDER#o9876   ITEM#p42                  item    qty, price
ORDER#o9876   ITEM#p17                  item    qty, price

Now the access patterns are:

  • Get user profile: GetItem(PK="USER#u123", SK="PROFILE").
  • Get user’s recent orders: Query(PK="USER#u123", SK begins_with "ORDER#"), with ScanIndexForward=false to get newest first.
  • Get all items in an order: Query(PK="ORDER#o9876", SK begins_with "ITEM#").

One table, three access patterns, all O(1) or O(k) where k is the number of returned items. No joins, no scans, no cross-table coordination.
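The encoding behind those keys is plain string concatenation. A minimal sketch, with illustrative helper names rather than any standard library:

```python
def user_pk(user_id):   return f"USER#{user_id}"
def order_pk(order_id): return f"ORDER#{order_id}"
def item_sk(product):   return f"ITEM#{product}"

def order_sk(iso_date, order_id):
    # ISO date first, so lexicographic sort-key order equals chronological
    # order and Query(..., ScanIndexForward=False) returns newest first.
    return f"ORDER#{iso_date}#{order_id}"

print(user_pk("u123"))                  # USER#u123
print(order_sk("2026-04-10", "o9876"))  # ORDER#2026-04-10#o9876
# Lexicographic comparison matches time order:
print(order_sk("2026-04-10", "o9876") > order_sk("2026-04-09", "o9875"))  # True
```

The detail that matters is the sort key format: because the date is ISO-formatted and comes before the order id, begins_with("ORDER#") and a descending sort give "recent orders first" with no date attribute and no index beyond the primary key.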

The design pattern generalizes: every access pattern gets a specific PK/SK combination, and if two entities need to appear together in a query result, they share a partition key. If you need to query the same entity from multiple angles, you create a GSI with alternate PK/SK keyed off other attributes.

Why do it this way? The answer is cost and latency. Each Dynamo Query is a single network round trip to one partition. A multi-table design with client-side joins is multiple round trips and multiple reads. At 100 ms of user-facing latency budget, every round trip matters. At billions of reads per month, every read matters for cost.

[Figure: DynamoDB single-table design. Multiple entity types share one table; the PK and SK encode entity identity and relationships so every access pattern is one round trip. The figure reproduces the table above (user, order, and item rows under USER#u123 and ORDER#o9876) and shows that Query(PK="USER#u123", SK begins_with "ORDER#") fetches all of a user's orders in one round trip, with no joins.]
Every entity type (user, order, item) shares one table: the PK/SK encoding collapses multi-table joins into a single Query call at O(k), which is why single-table design wins on latency and cost.

The pattern is also brutal to change. Once you have petabytes of data under a single-table schema, rearranging the schema is a full migration. This is why Dynamo forces you to plan your access patterns up front — because you cannot easily change them later.

86.6 Consistency models compared

MongoDB offers per-operation read concerns and write concerns:

  • Read concern local: return whatever is on this node, including unreplicated writes. Fastest, weakest.
  • Read concern majority: return only data that a majority of the replica set has acknowledged. Stronger, slower.
  • Read concern linearizable: stronger still; reads always reflect the most recent majority write. Expensive.
  • Write concern w:1: acknowledged when the primary has the write. Fast.
  • Write concern w:majority: acknowledged when a majority has the write. Durable, slower.
  • w:majority, j:true: majority plus journaled to disk. Durable against power loss.

For ML metadata stores, w:majority with readConcern: majority is the right default. It costs ~10-30% throughput but gives you the consistency you need for correctness.

Mongo also supports multi-document transactions (since 4.0 on replica sets, 4.2 on sharded clusters). They are real ACID transactions, but they are not cheap and not designed for hot-path use. Use them for registration flows, billing, anything where multi-doc atomicity matters. Don’t use them per-request in your hot path.

DynamoDB offers three kinds of read, each priced differently:

  • Eventually consistent read: the default. May return stale data up to a second old. Costs 0.5 RCU (read capacity unit).
  • Strongly consistent read: reflects all prior writes in the same region. Costs 1 RCU (twice as much).
  • Transactional read: part of a transaction, reflects a consistent snapshot across multiple items. Costs 2 RCU.

Writes are always durable: a successful write is replicated to 3 AZs before returning. Dynamo also supports transactions via TransactWriteItems and TransactGetItems, which do ACID across up to 100 items in a single call. Cost is double the per-item cost.
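A TransactWriteItems request is structurally just a list of wrapped writes. A sketch of the request shape with an illustrative helper name; nothing here calls AWS, and a real request would mix Put, Update, Delete, and ConditionCheck elements as needed:

```python
def transact_put(table, items):
    """Request shape for TransactWriteItems: up to 100 writes that commit
    or fail as a unit. Each element wraps one write operation (here, a
    Put) against a table."""
    if len(items) > 100:
        raise ValueError("TransactWriteItems is capped at 100 items")
    return {
        "TransactItems": [
            {"Put": {"TableName": table, "Item": item}} for item in items
        ]
    }

req = transact_put("billing", [
    {"PK": {"S": "USER#u1"}, "SK": {"S": "BALANCE"}},
    {"PK": {"S": "USER#u1"}, "SK": {"S": "LEDGER#2026-04-10"}},
])
print(len(req["TransactItems"]))  # 2
```

The balance-plus-ledger pair above is the canonical use case: two items that must never disagree, worth the doubled write cost.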

For metadata tables that back hot-path serving, eventually consistent reads are usually fine and save 50% on read cost. For systems that need read-your-writes (user just updated their profile and immediately loaded it), use strongly consistent reads.

86.7 Cost models: RCU/WCU vs per-document

The cost models are the most surprising difference in practice.

DynamoDB charges by request units and storage:

  • Read Capacity Unit (RCU): one strongly consistent read of an item up to 4 KB, or two eventually consistent reads of up to 4 KB. Items over 4 KB consume more RCUs proportionally.
  • Write Capacity Unit (WCU): one write of an item up to 1 KB. Items over 1 KB consume more WCUs.
  • Storage: $0.25/GB/month.
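The rounding rules matter for small items, and a short calculator makes them concrete. Transactional reads and writes cost double these numbers:

```python
import math

def rcus_per_read(item_kb, strongly_consistent=False):
    """RCUs consumed by one read: 1 RCU per 4 KB (rounded up) for a
    strongly consistent read, half that for an eventually consistent
    one. A 4.1 KB item therefore costs as much as an 8 KB item."""
    units = math.ceil(item_kb / 4)
    return units if strongly_consistent else units / 2

def wcus_per_write(item_kb):
    """WCUs consumed by one write: 1 WCU per 1 KB, rounded up."""
    return math.ceil(item_kb)

print(rcus_per_read(2))                            # 0.5
print(rcus_per_read(2, strongly_consistent=True))  # 1
print(rcus_per_read(6))                            # 1.0
print(wcus_per_write(2.5))                         # 3
```

Two practical consequences fall out of the rounding: reads are eight times cheaper per byte than writes at the 4 KB boundary, and shaving an item from 4.1 KB to 4.0 KB halves its read cost.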

You pay for either provisioned capacity (you reserve X RCU/WCU, pay a flat rate) or on-demand (you pay per request, higher per-request cost, no need to provision). On-demand is easier but ~7× the cost of provisioned at the same steady load. Most production tables use provisioned with autoscaling.

A back-of-envelope for a read-heavy table: 1000 reads/sec of 2 KB items, eventually consistent, on-demand, 30 days. Each read consumes 0.5 read request units, so that is 1000 × 0.5 × 86,400 × 30 ≈ 1.3 billion request units; at $0.25 per million, roughly $325/month for the reads alone. For comparison, the same workload in Mongo is a single replica set, call it $300/month of RAM, and you're paying for the provisioned hardware regardless of whether you hit it.

MongoDB pricing depends on how you deploy it:

  • MongoDB Atlas (hosted): per-instance hourly cost based on RAM and storage. An M30 (8 GB RAM) is ~$0.54/hour, ~$400/month. Scale by choosing larger instances.
  • Self-hosted on K8s: you pay for the nodes and operate it yourself. Dramatically cheaper at scale, dramatically more operational load.

The cost math: Dynamo is cheaper at low throughput and gets expensive at high throughput. Mongo has a fixed floor (the replica set cost) but scales cheaply to a high ceiling per replica set. The crossover point is workload-dependent, but a rough rule: Dynamo wins for spiky workloads under ~1000 req/sec; Mongo wins for steady workloads above that.

For ML platforms where you can’t predict traffic (a new product launch, a viral demo), Dynamo’s on-demand model is a risk reducer: if traffic spikes 100×, you get 100× the bill but the system keeps working. Mongo’s fixed capacity fails hard at 100× and you have to scale up the replica set, which takes minutes.

[Figure: cost crossover between DynamoDB on-demand and MongoDB Atlas. Monthly cost plotted against request throughput: the Dynamo on-demand line rises linearly while the MongoDB line is a fixed floor. The two cross at roughly 1000 req/s of steady load; below the crossover Dynamo wins, above it Mongo wins.]
DynamoDB on-demand costs scale linearly with traffic while MongoDB Atlas charges a fixed floor — the crossover is roughly 1000 req/s, making Dynamo the better choice for spiky or low-volume workloads and Mongo better for high-steady-state load.

86.8 Query patterns and where each breaks

The failure modes are where the rubber meets the road.

Mongo breaks when:

  • Working set exceeds RAM. Mongo's performance depends on the hot data fitting in the WiredTiger cache; once it doesn't, every query hits disk and latency explodes. The cure is more RAM or sharding.
  • You add indexes during traffic. Background index builds used to lock the collection; modern versions are online, but still expensive.
  • Hot shard. A poorly chosen shard key concentrates writes on one node. The cluster is fine on average; one node is on fire.
  • Query planner picks the wrong index. Mongo’s query planner caches plans, and occasionally it caches a bad one. explain() is your friend.
  • Unbounded $lookup. An aggregation pipeline that does a lookup against a large unindexed collection will run for hours.

Dynamo breaks when:

  • You need a query pattern you didn’t plan for. The fix is always a new GSI, and GSIs take time to backfill — hours or days for a large table.
  • Partition hot spots. Even after the 2018 adaptive capacity improvements, a single hot partition key still throttles. user_id is fine; "global_counter" is not.
  • Item size exceeds 400 KB. Dynamo has a hard 400 KB per-item limit. If your document is bigger, you need to split it or store the payload in S3 and reference it.
  • Large Scan operations. Scans are expensive, slow, and compete with real traffic. Never do scans in the hot path.
  • You want real text search or analytical joins. Dynamo does neither. Export to OpenSearch or Redshift for those.

The two systems fail in opposite directions. Mongo fails when you push it past its operational limits. Dynamo fails when you ask it a question you didn’t plan for. Both can bite.

86.9 When each is the right call

Pick MongoDB when:

  • Your access patterns are not fully known yet. You’re in product iteration.
  • Your schema is genuinely document-shaped (deep nesting, arrays, variable fields).
  • You need rich query (text search, aggregations, $lookup).
  • You prefer a single-cluster model where you control the hardware and the ops.
  • Your team has Mongo experience already.

Pick DynamoDB when:

  • Your access patterns are known and stable. You’ve thought them through.
  • You need effectively unbounded scale without provisioning work.
  • You don’t want to operate a database.
  • Your workload is spiky and you want on-demand pricing.
  • You’re already on AWS and want IAM, VPC endpoints, and integration for free.

Pick neither when:

  • Your workload is truly relational with many joins. Use Postgres.
  • Your workload is time-series. Use a time-series DB (Chapter 87).
  • Your workload is wide analytics on hundreds of columns. Use ClickHouse or a lakehouse (Chapter 90).
  • Your workload is a key-value cache hot path with sub-millisecond latency. Use Redis (Chapter 89).

For ML systems, a typical pattern is: DynamoDB for the hot serving metadata (request routing, rate limits, feature flags, user tier lookups) and MongoDB or Postgres for the platform control plane (experiments, datasets, model registry, run history) where schema evolves and rich queries matter. Different databases for different jobs, with clear ownership.

86.10 The mental model

Eight points to take into Chapter 87:

  1. Document model (Mongo) lets you model what you want; key-value model (Dynamo) forces you to model what you query.
  2. Mongo gives you a full query language; Dynamo gives you GetItem, Query, and Scan.
  3. Single-table design is the Dynamo school of thought: one table, generic PK/SK, every access pattern is one round trip.
  4. Mongo indexes are cheap to add, expensive to write through. Dynamo GSIs are cheap to query, expensive to add later.
  5. Consistency models differ: Mongo has per-operation read/write concerns; Dynamo has eventual vs strongly consistent reads.
  6. Cost models differ sharply. Dynamo charges per request; Mongo charges per instance. Crossover depends on throughput.
  7. Failure modes are opposite: Mongo fails when you exceed operational limits; Dynamo fails when you ask unplanned questions.
  8. Most production ML platforms use both: Dynamo for hot serving metadata, Mongo or Postgres for the control plane.

In Chapter 87 the focus narrows to a specific query shape: time-series data.


Read it yourself

  • Alex DeBrie, The DynamoDB Book (2020). The canonical text on single-table design and Dynamo access patterns.
  • DeCandia et al., Dynamo: Amazon’s Highly Available Key-value Store (2007). The paper that started it all.
  • Kyle Banker et al., MongoDB in Action, 2nd edition. The standard MongoDB reference.
  • Rick Houlihan’s AWS re:Invent talks on DynamoDB single-table design — especially DAT401 and DAT403.
  • The MongoDB documentation on read and write concerns.
  • The DynamoDB documentation on on-demand vs provisioned pricing.

Practice

  1. A social network needs to store users, posts, and comments. Design a single-table DynamoDB schema supporting these queries: get a user’s profile; get a user’s recent posts; get all comments on a post; get a post’s author.
  2. The same schema in MongoDB. Compare: how many collections do you use, what indexes, what queries?
  3. Compute the DynamoDB cost for a table with 100 GB of data, 5000 reads/sec (eventually consistent, 2 KB items), and 500 writes/sec (1 KB items) on on-demand pricing. Then on provisioned at 80% utilization.
  4. Why is Dynamo’s 400 KB item limit there? What are the architectural reasons, and what do you do when you have a 5 MB document?
  5. Walk through a Mongo query planner picking a wrong index. How do you diagnose with explain()? How do you force a specific index?
  6. When is TransactWriteItems in Dynamo worth its 2× cost compared to a sequence of PutItem calls? Construct a scenario.
  7. Stretch: Stand up MongoDB and DynamoDB Local side by side. Implement a small experiment tracker (experiments, runs, metrics) in both. Compare the query code for “show me all runs for experiment X sorted by timestamp descending,” and measure the p99 latency of that query with 1 million runs in each system.