A Practical Guide to Reclaim Disk Space from MongoDB
Dedicated to Ops who are weighed down with worry about high-disk usage of MongoDB.
This post applies to MongoDB 3.2 or better and all following operations run on Mongo Shell. You need to confirm your Mongo’s engine is WiredTiger:
|
|
Where to start
Find the main concern, use show dbs
to locate the biggest databases, and then locate the biggest collections after use your_biggest_db
:
|
|
Reclaim Disk Space
There are mainly two ways to go: MongoDB compact and move/delete data.
During reclaiming process, you need to pay attention to the extra IO stress brought by reclaiming. High IO stress enlarges database read/write latency and may make MongoDB slaves out of sync. You may monitor the sync health using following script on any node in the cluster:
|
|
MongoDB Compact
MongoDB seldom returns disk space to operating system during deleting, these disk pages are marked dirty and reserved for writing in the future. Compact command lets WiredTiger return these space. (like PostgreSQL’s Vacuum)
Compact needs to be executed on every node in the cluster.
Compact locks the whole database and the sync also stops if you are running it on a slave node. When you are running a single-node MongoDB, you should execute compact during scheduled maintenance time.
However, in MongoDB cluster, you can do rolling compact which brings zero system down time. Start with slave nodes(don’t forget rs.slaveOk()
), and finally run rs.stepDown()
in master node then compact.
The following script compacts all collections in all non-system databases of your MongoDB, and it sleeps between long-running compacts to reduce OpLog sync lagging:
|
|
Move or Delete Data
Moving and deleting data requires business feasibility assessment.
Moving and deleting should be executed before compact.
When deleting aged data, if the accesses to the collection can be held temporarily, we can swap the old collection with a newly created one. Dropping the whole collection is much faster than deleting records, which brings a lot more IO stress. Furthermore, the disk space taken by the old collection will be returned to operating system immediately.
|
|
If it is not feasible to hold access to collection, we have only to delete data one by one. Note sleeping between every deleteMany
reduces overall IO stress.
|
|
Now the work is done, let’s get a drink!