Capped collections

A capped collection is a special type of MongoDB collection that has a fixed size and supports high-throughput operations. When it reaches its maximum size, it automatically removes the oldest documents to make space for new ones. Capped collections are ideal for use cases like logging, caching, and real-time analytics where you need a FIFO (First-In, First-Out) data structure.

Characteristics of Capped Collections

  1. Fixed Size: The size of the capped collection is predetermined. Once the size limit is reached, older documents are automatically removed.

  2. Preserves Insertion Order: Documents are stored in the order they were inserted, which makes it easy to retrieve documents based on insertion order.

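  3. High Throughput: Because documents are kept in insertion order, queries in that order need no index, and inserts avoid the cost of maintaining one. This is what lets capped collections sustain high insertion rates.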

  4. No Updates That Increase Size: You can update documents in a capped collection, but updates that increase the document size are not allowed, as this would violate the fixed size constraint.

  5. No Individual Deletes: You can't remove individual documents, but you can still drop the entire collection (and recreate it if you need to start from empty).
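The FIFO behavior described above can be illustrated with a small in-memory model. This is only a sketch of the idea, not how MongoDB implements capped collections: the CappedBuffer class and its methods are hypothetical names, and document size is approximated by the length of the JSON text rather than by BSON bytes.

```javascript
// Toy in-memory model of capped-collection eviction.
// Hypothetical illustration only; MongoDB's storage engine works differently.
class CappedBuffer {
  constructor(maxBytes) {
    this.maxBytes = maxBytes; // plays the role of the `size` option
    this.docs = [];           // kept in insertion order
    this.usedBytes = 0;
  }

  // Approximate a document's size by the length of its JSON text.
  // (Assumption: MongoDB measures BSON bytes instead.)
  static sizeOf(doc) {
    return JSON.stringify(doc).length;
  }

  insert(doc) {
    const bytes = CappedBuffer.sizeOf(doc);
    if (bytes > this.maxBytes) {
      throw new Error("document is larger than the whole buffer");
    }
    // Evict the oldest documents until the new one fits.
    while (this.usedBytes + bytes > this.maxBytes) {
      const oldest = this.docs.shift();
      this.usedBytes -= CappedBuffer.sizeOf(oldest);
    }
    this.docs.push(doc);
    this.usedBytes += bytes;
  }

  // Newest first, comparable to sorting by { $natural: -1 }.
  newestFirst() {
    return [...this.docs].reverse();
  }
}

const log = new CappedBuffer(40);
for (let i = 1; i <= 5; i++) {
  log.insert({ n: i }); // each document serializes to 7 characters
}
log.insert({ n: 6 });   // does not fit, so { n: 1 } is evicted first
console.log(log.newestFirst());
```

In this model the sixth insert pushes out the oldest document, so the buffer ends up holding documents 2 through 6.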

Creating a Capped Collection

You can create a capped collection using the createCollection method with the capped and size options:

db.createCollection("myCappedCollection", { capped: true, size: 100000 })

Here, size is the maximum size in bytes for the capped collection.
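In addition to size, createCollection accepts an optional max option that limits the number of documents; the size limit takes precedence if it is reached first. The sketch below shows one way to build the options and confirm the result with isCapped(). The cappedOptions helper and the value 500 are assumptions made for illustration, and the guard around db only exists so the file also loads outside mongosh.

```javascript
// Build the options document for a capped collection.
// `size` (bytes) is required; `max` (document count) is optional.
// cappedOptions is a hypothetical helper, not part of the MongoDB API.
function cappedOptions(sizeBytes, maxDocs) {
  const options = { capped: true, size: sizeBytes };
  if (maxDocs !== undefined) {
    options.max = maxDocs;
  }
  return options;
}

const options = cappedOptions(100000, 500);

// `db` is only defined inside mongosh; skip the server calls elsewhere.
if (typeof db !== "undefined") {
  db.createCollection("myCappedCollection", options);
  // isCapped() reports whether the collection is capped.
  print(db.myCappedCollection.isCapped());
}
```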

Converting a Regular Collection to Capped

You can convert an existing collection to a capped collection using the convertToCapped command:

db.runCommand({ convertToCapped: 'myCollection', size: 100000 })

Querying a Capped Collection

Querying a capped collection is the same as querying a regular collection. However, you can take advantage of the natural order in which documents are stored:

db.myCappedCollection.find().sort({ $natural: -1 })

This query retrieves the most recently inserted documents first.
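In practice you usually want only the last few entries rather than the whole collection. A minimal sketch, assuming a hypothetical helper named latestDocuments and an arbitrary limit of 10, combines the $natural sort with limit():

```javascript
// Return the `n` most recently inserted documents from a collection.
// latestDocuments is a hypothetical helper written for this example.
function latestDocuments(collection, n) {
  return collection
    .find()
    .sort({ $natural: -1 }) // reverse insertion order: newest first
    .limit(n)               // stop after n documents
    .toArray();
}

// `db` is only defined inside mongosh; skip the query elsewhere.
if (typeof db !== "undefined") {
  printjson(latestDocuments(db.myCappedCollection, 10));
}
```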

Use Cases

  1. Logging: Store log entries and automatically remove the oldest when the collection fills up.

  2. Real-time Analytics: Use for real-time metrics where only the most recent data is relevant.

  3. Caching: Store frequently accessed data up to a certain limit.

Considerations

  • Indexes: By default, capped collections only have an index on the _id field. You can add additional indexes, but remember that every index must be updated on each insert, which reduces the write throughput that capped collections are usually chosen for.

  • No Sharding: Capped collections cannot be sharded, which means they are not suitable for horizontal scaling.