2.1. DATA STORAGE

2.1.1. Document-Oriented Storage
MongoDB’s document-oriented design, which we’ve discussed earlier, works seamlessly with BSON, the binary format used for storing documents. BSON, short for Binary JSON, is an open standard, and you can find its specification at bsonspec.org.

2.1.2. Why BSON?
- Efficiency Over Space-Saving:
MongoDB prioritizes speed over space efficiency. Although a BSON document can occupy more space than its JSON equivalent, BSON enables faster document traversal and indexing. This tradeoff is generally acceptable because MongoDB is designed to scale across machines, and storage is relatively inexpensive. WiredTiger, MongoDB’s default storage engine, supports multiple compression libraries, with index and data compression enabled by default. Compression can be configured both at the server level and per collection, allowing you to balance CPU usage against disk space savings.
- Ease of Conversion:
BSON is easy to convert to a programming language’s native data format. If MongoDB used pure JSON, a higher-level conversion would be necessary, potentially slowing down operations. MongoDB drivers are available for many programming languages (e.g., Python, Ruby, PHP, C, C++, and C#), and each works slightly differently. The simple binary format of BSON allows native data structures to be built quickly in each language, making the driver code simpler and faster.
- BSON Extension:
BSON provides some extensions to JSON, such as the ability to store binary data and additional data types. While BSON can store any JSON document, a valid BSON document may not be valid JSON. This distinction is handled by MongoDB’s drivers, which convert data to and from BSON without using JSON as an intermediary format.
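To make the points above concrete, here is a minimal sketch of a BSON encoder written by hand with Python's standard library. It is not how any MongoDB driver is implemented, and it covers only three of the element types defined at bsonspec.org (UTF-8 string, generic binary, and 32-bit integer). The function names `encode_element`, `encode_document`, and `find_int32` are hypothetical and chosen for illustration only.

```python
import json
import struct


def encode_element(name: str, value) -> bytes:
    """Encode one field as a BSON element: type byte, C-string name, payload."""
    key = name.encode("utf-8") + b"\x00"
    if isinstance(value, str):
        # 0x02: int32 byte length (including the trailing NUL), then the bytes.
        data = value.encode("utf-8") + b"\x00"
        return b"\x02" + key + struct.pack("<i", len(data)) + data
    if isinstance(value, bytes):
        # 0x05: int32 byte length, one subtype byte (0x00 = generic), raw bytes.
        return b"\x05" + key + struct.pack("<i", len(value)) + b"\x00" + value
    if isinstance(value, int) and not isinstance(value, bool):
        # 0x10: little-endian int32 (struct raises if the value does not fit).
        return b"\x10" + key + struct.pack("<i", value)
    raise TypeError(f"type not covered by this sketch: {type(value).__name__}")


def encode_document(doc: dict) -> bytes:
    """Encode a flat dict: int32 total length, elements, terminating NUL."""
    body = b"".join(encode_element(k, v) for k, v in doc.items())
    return struct.pack("<i", 4 + len(body) + 1) + body + b"\x00"


def find_int32(buf: bytes, wanted: str):
    """Return the int32 field `wanted`, skipping other values by length prefix."""
    pos = 4  # skip the document length
    while buf[pos] != 0:
        type_byte = buf[pos]
        name_end = buf.index(b"\x00", pos + 1)
        name = buf[pos + 1:name_end].decode("utf-8")
        pos = name_end + 1
        if type_byte == 0x10:
            if name == wanted:
                return struct.unpack_from("<i", buf, pos)[0]
            pos += 4
        elif type_byte == 0x02:
            (length,) = struct.unpack_from("<i", buf, pos)
            pos += 4 + length  # jump over the string without decoding it
        elif type_byte == 0x05:
            (length,) = struct.unpack_from("<i", buf, pos)
            pos += 4 + 1 + length  # length prefix + subtype byte + payload
        else:
            raise ValueError(f"type not covered by this sketch: {type_byte:#x}")
    return None


small = {"a": 1}
print(len(encode_document(small)), "bytes as BSON")
print(len(json.dumps(small, separators=(",", ":"))), "bytes as JSON")

doc = {"title": "x" * 1000, "blob": b"\x00\x01", "views": 42}
print(find_int32(encode_document(doc), "views"))
```

In this sketch, `{"a": 1}` takes 12 bytes as BSON but only 7 as compact JSON, which illustrates the space cost. In exchange, `find_int32` reaches the `views` field by jumping over the 1000-character string and the binary value using their length prefixes, without parsing their contents. The `blob` field holds raw bytes, which plain JSON cannot represent directly.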

2.1.3. Order of Documents
Documents in MongoDB have no guaranteed order unless a query sorts them explicitly. This follows from how documents are stored in data files. If documents were stored in a strict order, any increase in a document’s size would require all subsequent documents to be shifted to make room, which would be inefficient.
In the older MMAPv1 storage engine, MongoDB stores documents in records, which are part of extents. Each extent contains documents from a single collection to maintain a degree of contiguity, enabling faster collection scans. Records are slightly larger than the documents they contain to allow for some growth, and this extra space is referred to as “padding.” However, if a document outgrows its record, it is moved to a bigger record, which may change the document order as a side effect.
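The effect of padding on document order can be illustrated with a toy model. The sketch below is not MongoDB's storage code. The names `Record`, `ToyCollection`, and `PADDING_FACTOR`, and the value 1.5, are assumptions made up for this example. It only shows why an update that outgrows its record can change the order in which a scan returns documents.

```python
from dataclasses import dataclass

PADDING_FACTOR = 1.5  # hypothetical: each record gets 50% spare room


@dataclass
class Record:
    capacity: int  # bytes reserved for the document, including padding
    doc: bytes     # the document currently stored in the record


class ToyCollection:
    """Toy model: records are kept in the order they sit 'on disk'."""

    def __init__(self):
        self.records = []

    def insert(self, doc: bytes) -> None:
        # Allocate a record slightly larger than the document.
        capacity = int(len(doc) * PADDING_FACTOR)
        self.records.append(Record(capacity=capacity, doc=doc))

    def update(self, index: int, new_doc: bytes) -> None:
        record = self.records[index]
        if len(new_doc) <= record.capacity:
            # The document still fits: rewrite in place, order unchanged.
            record.doc = new_doc
            return
        # The document outgrew its record: move it to a new, bigger record.
        del self.records[index]
        self.insert(new_doc)

    def scan(self) -> list:
        # A collection scan returns documents in storage order.
        return [record.doc for record in self.records]


coll = ToyCollection()
for doc in (b"aaaa", b"bbbb", b"cccc"):
    coll.insert(doc)

coll.update(0, b"aaaaa")    # fits within the padding
print(coll.scan())
coll.update(0, b"a" * 10)   # too large: the document is relocated
print(coll.scan())
```

After the first update the scan order is unchanged because the grown document still fits in its padded record. After the second update the first document has moved behind the other two.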
Additionally, MongoDB has a maximum document size of 16 MB. This limit acts as a sanity check rather than a technical limitation. For storing larger data, MongoDB uses GridFS.
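As a rough illustration of how data above the document size limit can be handled, the sketch below splits a payload into numbered chunks that each fit comfortably inside one document, which is the basic idea behind GridFS. It uses only the standard library and does not talk to a MongoDB server. `split_into_chunks` is a hypothetical helper, and the 255 KiB chunk size is assumed here to match the GridFS default. Real drivers also record a file identifier and file metadata alongside the chunks.

```python
MAX_DOCUMENT_SIZE = 16 * 1024 * 1024  # the 16 MB document limit, in bytes
CHUNK_SIZE = 255 * 1024               # assumed chunk size (255 KiB)


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    """Split `data` into ordered chunk dicts with a sequence number `n`."""
    return [
        {"n": n, "data": data[offset:offset + chunk_size]}
        for n, offset in enumerate(range(0, len(data), chunk_size))
    ]


def reassemble(chunks: list) -> bytes:
    """Rebuild the original payload by concatenating chunks in order of `n`."""
    ordered = sorted(chunks, key=lambda chunk: chunk["n"])
    return b"".join(chunk["data"] for chunk in ordered)


payload = b"x" * (MAX_DOCUMENT_SIZE + 1024)  # too large for a single document
chunks = split_into_chunks(payload)

print(len(payload) > MAX_DOCUMENT_SIZE)
print(all(len(chunk["data"]) <= CHUNK_SIZE for chunk in chunks))
print(reassemble(chunks) == payload)
```

Because each chunk carries its own sequence number, the chunks can be stored as independent documents and reassembled later regardless of the order in which they are read back.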