Document-oriented NoSQL database MongoDB offers a flexible schema design and a rich query language, and it supports client drivers in multiple programming languages. Whether you're migrating to MongoDB or configuring it for the first time, this Refcard covers everything you need to know about setup and view options, using the shell, query and update operators, indexing, replica set maintenance, backups, user management, and much more, all accompanied by code samples and tips for success.
MongoDB is a popular document-oriented NoSQL database that offers flexible schema design. It stores data in JSON-like documents, provides a rich query language, and supports client drivers in multiple programming languages. This Refcard covers MongoDB v4.4 onward up to v6.0. It is intended to help you get the most out of MongoDB and assumes that you already know the basics.
If you're just starting out, first, explore these resources:
In this section, we cover configuration options for MongoDB.
Startup options for MongoDB can be set on the command line or in a configuration file. The syntax is slightly different between the two:
Table 1
| Command Line | Config File |
| --- | --- |
| --dbpath <path> | dbpath=<path> |
| --auth | auth=true |
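As a rough illustration of the mapping in Table 1, the sketch below converts a list of command-line arguments into config-file lines. The helper name `toConfigLines` is hypothetical, and it only handles the two shapes shown in the table (an option followed by a value, and a bare flag).

```javascript
// Hypothetical helper: convert mongod command-line arguments into
// config-file lines, following the two patterns shown in Table 1.
function toConfigLines(argv) {
  const lines = [];
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (!arg.startsWith("--")) continue; // skip anything that is not an option
    const key = arg.slice(2);
    const next = argv[i + 1];
    if (next !== undefined && !next.startsWith("--")) {
      lines.push(`${key}=${next}`); // option with a value
      i++; // the value has been consumed
    } else {
      lines.push(`${key}=true`); // bare flag
    }
  }
  return lines;
}

console.log(toConfigLines(["--dbpath", "/data/db", "--auth"]).join("\n"));
// dbpath=/data/db
// auth=true
```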
Run mongod --help for a full list of options. Here are some of the most useful:
Table 2
| Option | Description |
| --- | --- |
| --config <filename> | File for runtime configuration options |
| --dbpath <path> | The directory where the mongod instance stores its data |
| --logpath <path> | The path where MongoDB creates a file to send diagnostic logging information |
| --logappend | Appends new entries to the end of the existing log file when the mongod instance restarts |
| --fork | Enables daemon mode that runs the mongod process in the background |
| --auth | Enables authorization to control user access to database resources and operations |
| --keyFile <path> | Path to a shared secret that enables authentication on replica sets and sharding |
| --bind_ip <options> | Specifies where mongod should listen for client connections; this could be hostnames, IP addresses, and/or full Unix domain socket paths |
If you started mongod with a bunch of options six months ago, how can you see which options you used? The shell has a helper:
> db.serverCmdLineOpts()
{ "argv" : [ "./mongod", "--port", "30000" ], "parsed" : { }, "ok" : 1 }
The parsed field is a list of arguments read from a config file.
This section covers MongoDB shell usage. Note that the mongo shell was deprecated in MongoDB v5.0 and removed in MongoDB v6.0. The replacement shell is mongosh.
There are a number of functions that give you a little help if you forget a command:
> // basic help
> help
Shell Help:
use Set current database
show 'show databases'/'show dbs': Print a list of all available databases.
'show collections'/'show tables': Print a list of all collections for current database.
'show profile': Prints system.profile information.
'show users': Print a list of all users for current database.
'show roles': Print a list of all roles for current database.
'show log <type>': log for current connection, if type is not set uses 'global'
'show logs': Print all logs.
exit Quit the MongoDB shell with exit/exit()/.exit
quit Quit the MongoDB shell with quit/quit()
...
Note that there are separate help functions for databases, collections, replica sets, sharding, administration, and more. Although not listed explicitly, there is also help for cursors:
> // list common cursor functions
> db.foo.find().help()
You can use these functions and helpers as built-in cheat sheets. You can find the full list here: https://www.mongodb.com/docs/mongodb-shell/reference/access-mdb-shell-help
If you don't understand what a function is doing, you can run it without the parentheses in the shell to see its source code:
> // run the function
> db.serverCmdLineOpts()
{ "argv" : [ "./mongod" ], "parsed" : { }, "ok" : 1 }
> // see its source
> db.serverCmdLineOpts
This can be helpful for seeing what arguments the function expects or what errors it can throw, as well as how to run it from another language.
The shell has limited multi-line support, so it can be difficult to program in. The shell helper edit makes this easier by opening a text editor in which you can edit a variable. For example:
> x = function() { /* some function we're going to fill in */ }
> edit x
<opens emacs with the contents of x>
Modify the variable in your editor, then save and exit. The variable will be set in the shell.
Either the EDITOR environment variable or a MongoDB shell variable EDITOR must be set to use edit. You can set it in the MongoDB shell as follows:
> EDITOR="/usr/bin/emacs"
Note that edit is not available from JavaScript scripts, only in the interactive shell.
If a .mongoshrc.js file exists in your home directory, it will run on shell startup automatically. Use it to initialize any helper functions you use regularly and remove functions you don't want to accidentally use. Use the --norc option to prevent .mongoshrc.js from being loaded.
For example, if you would prefer to not have dropDatabase() available by default, you could add the following lines to your .mongoshrc.js file:
DB.prototype.dropDatabase = function() {
    print("No dropping DBs!");
}
db.dropDatabase = DB.prototype.dropDatabase;
The example above will change the dropDatabase() helper to only print a message, not to drop databases. Note: This technique should not be used for security because a determined user can still drop a database without the helper. However, removing dangerous admin commands, as shown in the example above, can help prevent fat-fingering.
Here are a few helpers you may want to disable in .mongoshrc.js:
DB.prototype.shutdownServer
DBCollection.prototype.drop
DBCollection.prototype.ensureIndex
DBCollection.prototype.reIndex
DBCollection.prototype.dropIndexes
The shell prompt can be customized by setting the prompt variable to a function that returns a string:
prompt = function() {
    try {
        db.getLastError();
    }
    catch (e) {
        print(e);
    }
    return (new Date())+"$";
}
If you set a prompt, it will be executed each time the prompt is drawn (thus, the example above would give you the time the last operation completed).
Try to include the db.getLastError() function call in your prompt. This is included in the default prompt and takes care of server reconnection and returning errors from writes. Also, always put any code that could throw an exception in a try/catch block, as shown in the example above. It's annoying to have your prompt turn into an exception!
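The try/catch advice can be sketched in plain JavaScript. The factory name `makeSafePrompt` and the `getStatus` callback are hypothetical; the point is only that the prompt function always returns a string, even when the code it wraps throws.

```javascript
// Hypothetical factory: wraps any status lookup in try/catch so the
// prompt always returns a string, even when the lookup throws.
function makeSafePrompt(getStatus) {
  return function () {
    let status;
    try {
      status = getStatus();
    } catch (e) {
      status = "error: " + e.message; // show the problem instead of throwing
    }
    return status + " " + new Date().toISOString() + "$ ";
  };
}

// In the shell, the result could be assigned to `prompt`; here it is just called.
const demoPrompt = makeSafePrompt(() => {
  throw new Error("not connected");
});
console.log(demoPrompt());
```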
This section covers how to get detailed information on operations, index usage, replication status, and more.
You can see current operations with the currentOp function:
> db.currentOp()
{
    "inprog" : [
        {
            "opid" : 123,
            "active" : false,
            "locktype" : "write",
            "waitingForLock" : false,
            "secs_running" : 200,
            "op" : "query",
            "ns" : "foo.bar",
            "query" : {
            }
            ...
        },
        ...
    ]
}
Using the opid field from above, you can kill operations:
> db.killOp(123)
Not all operations can be killed or will be killed immediately. In general, operations that are waiting for a lock cannot be killed until they acquire the lock.
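To pick out candidates for killOp, the output of currentOp can be filtered like any other JavaScript object. The sketch below uses a hypothetical helper, `longRunningOpIds`, and made-up sample data shaped like the output above; it relies only on the opid and secs_running fields.

```javascript
// Hypothetical helper: pick the opids of operations that have been running
// for at least `minSeconds`, using the fields shown in the sample output.
function longRunningOpIds(currentOpResult, minSeconds) {
  return currentOpResult.inprog
    .filter((op) => (op.secs_running || 0) >= minSeconds)
    .map((op) => op.opid);
}

// Made-up sample data shaped like the db.currentOp() output above.
const sampleOps = {
  inprog: [
    { opid: 123, secs_running: 200, op: "query", ns: "foo.bar" },
    { opid: 124, secs_running: 2, op: "insert", ns: "foo.baz" },
  ],
};
console.log(longRunningOpIds(sampleOps, 60)); // [ 123 ]
```

In the shell, the returned ids could then be passed to db.killOp() one at a time.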
Use explain() to see which index MongoDB is using for a query. verbosity specifies the mode, which determines the amount of returned information. Possible modes include allPlansExecution (default), queryPlanner, and executionStats.
> db.runCommand({
    explain: {
        count: "users",
        query: { age: { $gt: 30 } },
    },
    verbosity: "queryPlanner",
});
{
explainVersion: '1',
queryPlanner: {
namespace: 'test.users',
indexFilterSet: false,
maxIndexedOrSolutionsReached: false,
maxIndexedAndSolutionsReached: false,
maxScansToExplodeReached: false,
winningPlan: { stage: 'COUNT', inputStage: { stage: 'EOF' } },
rejectedPlans: []
},
command: { count: 'users', query: { age: { '$gt': 30 } }, '$db': 'test' },
serverInfo: {
host: 'bdc9e348c602',
port: 27017,
version: '7.0.4',
gitVersion: '38f3e37057a43d2e9f41a39142681a76062d582e'
},
serverParameters: {
internalQueryFacetBufferSizeBytes: 104857600,
internalQueryFacetMaxOutputDocSizeBytes: 104857600,
internalLookupStageIntermediateDocumentMaxSizeBytes: 104857600,
internalDocumentSourceGroupMaxMemoryBytes: 104857600,
internalQueryMaxBlockingSortMemoryUsageBytes: 104857600,
internalQueryProhibitBlockingMergeOnMongoS: 0,
internalQueryMaxAddToSetBytes: 104857600,
internalDocumentSourceSetWindowFieldsMaxMemoryBytes: 104857600,
internalQueryFrameworkControl: 'trySbeEngine'
},
ok: 1
}
There are several important fields in the output of explain():
explainVersion is the output format version.
command is the command being explained.
queryPlanner provides information about the selected and rejected plans by the query optimizer.
executionStats provides execution details of the accepted and rejected plans.
serverInfo provides information about the MongoDB instance.
serverParameters provides details about the internal parameters.
Here are some common cursor types in MongoDB:
db.collection.find(). It iterates over query results in batches, retrieving data on demand from the server.
Use hint() to force a particular index to be used for a query:
> db.foo.find().hint({x:1})
You can turn on system profiling to see operations currently happening on a database. Note that there is a performance penalty to profiling, but it can help isolate slow queries.
> db.setProfilingLevel(2) // profile all operations
> db.setProfilingLevel(1) // profile operations that take longer than 100ms
> db.setProfilingLevel(1, 500) // profile operations that take longer than 500ms
> db.setProfilingLevel(0) // turn off profiling
> db.getProfilingLevel() // see current profiling setting
Profile entries are stored in a capped collection called system.profile within the database in which profiling was enabled. Profiling can be turned on and off for each database.
To find replication lag information for each secondary node, connect to the primary node of the replica set and run this command:
> rs.printSecondaryReplicationInfo()
source: m1.demo.net:27002
syncedTo: Mon Feb 01 2023 10:20:40 GMT-0800 (PST)
20 secs (0 hrs) behind the primary
The above command prints a formatted summary of each secondary's replication lag. You can also use db.printReplicationInfo() to print a summary of the replica set member's oplog. Its output is identical to that of rs.printReplicationInfo().
To see a member's view of the entire set, connect to it and run the following command:
> rs.status()
This command returns a structured JSON output and shows you what it thinks the state and status of the other members are. Running rs.status() on a secondary node will show you which node the secondary is syncing from in the syncSourceHost field.
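Because rs.status() returns a plain document, lag can also be computed by hand. The sketch below assumes each member entry carries name, stateStr, and optimeDate fields; the helper name `secondaryLagSeconds` and the sample data are made up for illustration.

```javascript
// Hypothetical helper: estimate how far each secondary is behind the primary,
// in seconds, from an rs.status()-shaped document.
// Assumes members expose `name`, `stateStr`, and `optimeDate`.
function secondaryLagSeconds(status) {
  const primary = status.members.find((m) => m.stateStr === "PRIMARY");
  if (!primary) return {}; // no primary visible, so lag cannot be computed
  const lag = {};
  for (const m of status.members) {
    if (m.stateStr !== "SECONDARY") continue;
    lag[m.name] = (primary.optimeDate - m.optimeDate) / 1000;
  }
  return lag;
}

// Made-up sample data; real output has many more fields.
const sampleStatus = {
  members: [
    { name: "m0.demo.net:27001", stateStr: "PRIMARY", optimeDate: new Date("2023-02-01T18:21:00Z") },
    { name: "m1.demo.net:27002", stateStr: "SECONDARY", optimeDate: new Date("2023-02-01T18:20:40Z") },
  ],
};
console.log(secondaryLagSeconds(sampleStatus)); // { 'm1.demo.net:27002': 20 }
```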
To see your cluster's metadata (shards, databases, chunks, etc.), execute the following command from the MongoDB shell (mongosh) connected to a mongos instance of the sharded cluster:
> db.printShardingStatus()
If verbosity is set to true, it displays full details of the chunk distribution across shards along with the number of chunks on each shard:
> db.printShardingStatus(true)
sh.status can also be executed on a mongos instance to fetch sharding configuration. Its output is the same as that of printShardingStatus:
> sh.status()
You can also connect to the mongos and see data about your shards, databases, collections, or chunks by using use config, then querying the relevant collections:
> use config
switched to db config
> show collections
changelog
chunks
collections
csrs.indexes
databases
migrationCoordinators
mongos
rangeDeletions
settings
shards
tags
version
Always connect to a mongos to get sharding information. Never connect or write directly to a config server; always use sharding commands and helpers.
After maintenance, sometimes mongos processes that were not actually performing the maintenance will not have an updated version of the config. Either bouncing these servers or running the flushRouterConfig command is generally a quick fix to this issue:
> use admin
> db.runCommand({flushRouterConfig:1})
Often this problem will manifest as setShardVersion failed errors. Don't worry about setShardVersion errors in the logs, but they should not trickle up to your application. Note that you shouldn't get the errors from a driver unless the mongos it's connecting to cannot reach any config servers.
The table below provides several index options. For a complete list, refer to: https://www.mongodb.com/docs/manual/reference/method/db.collection.createIndex
Table 3
| Index Option | Description |
| --- | --- |
| unique | If set to true, the index rejects documents whose indexed field(s) duplicate an existing value; it is false by default. |
| name | If not specified, MongoDB generates the index name by concatenating the names of indexed fields and the sort order. |
| partialFilterExpression | If specified, the index will only reference documents that match the provided filter expression. |
| sparse | If set to true, the index will only reference documents with the specified field; it is false by default. |
| expireAfterSeconds | Time to live (in seconds) that controls how long MongoDB retains documents in this collection. |
| hidden | Controls whether the index is hidden from the query planner. |
| storageEngine | Specifies storage engine during index creation. |
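The default-name rule mentioned in Table 3 (field names concatenated with their sort order) can be sketched in a few lines. The function name `defaultIndexName` is hypothetical and this is only an illustration of the naming pattern, not MongoDB's own code.

```javascript
// Hypothetical helper: build the default-style index name by joining each
// indexed field with its sort order, separated by underscores.
function defaultIndexName(keys) {
  return Object.entries(keys)
    .map(([field, direction]) => `${field}_${direction}`)
    .join("_");
}

console.log(defaultIndexName({ age: 1, name: -1 })); // age_1_name_-1
```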
Queries are generally of the form: {key : {$op : value}}
For example: {age : {$gte : 18}}
There are three exceptions to this rule ($and, $or, and $nor), which are all top level: {$or : [{age: {$gte : 18}}, {age : {$lt : 18}, parentalConsent:true}]}
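To make the two query shapes concrete, here is a tiny in-memory matcher. It is illustrative only: it is not how MongoDB evaluates queries, the name `matches` is made up, and it supports just four comparison operators, plain equality, and $or.

```javascript
// Illustrative only: a tiny matcher for the query shapes above.
const comparators = {
  $gt: (a, b) => a > b,
  $gte: (a, b) => a >= b,
  $lt: (a, b) => a < b,
  $lte: (a, b) => a <= b,
};

function matches(doc, query) {
  return Object.entries(query).every(([key, cond]) => {
    // Top-level $or: at least one sub-query must match.
    if (key === "$or") return cond.some((sub) => matches(doc, sub));
    if (cond !== null && typeof cond === "object") {
      // {key: {$op: value}} form
      return Object.entries(cond).every(([op, value]) =>
        comparators[op](doc[key], value)
      );
    }
    return doc[key] === cond; // {key: value} equality
  });
}

const sampleQuery = {
  $or: [{ age: { $gte: 18 } }, { age: { $lt: 18 }, parentalConsent: true }],
};
console.log(matches({ age: 21 }, sampleQuery)); // true
console.log(matches({ age: 15, parentalConsent: true }, sampleQuery)); // true
console.log(matches({ age: 15 }, sampleQuery)); // false
```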
Updates are always of the form: {$mod : {key : value}}
For example: {$inc : {age : 1}}
The symbols in Table 4 indicate the following:
Table 4
| Operator | Example Query | Result |
| --- | --- | --- |
| | {numSold : {$lt:3}} | ✓ X X |
| | {hand : {$all : ["10","J","Q","K","A"]}} | ✓ X |
| $all | {hand : {$all : ["10","J","Q","K","A"]}} | ✓ X |
| $nor | { $nor: [{ status: "active" }, { age: { $gte: 65 } }] } | ✓ X |
| $mod | {age : {$mod : [10, 0]}} | ✓ X |
| $exists | {phone: {$exists: true}} | ✓ X |
| $type* | {age : {$type : 2}} | ✓ X |
| $size | {"top-three":{$size:3}} | ✓ X |
| $regex | {role: /admin.*/i} or {role: {$regex:'admin.*', $options: 'i' }} | ✓ X |
| $all | { genres: { $all: ["fiction", "mystery"] } } | ✓ X |
| $size | { players: { $size: 5 } } | ✓ X |
Table 5 includes commonly used MongoDB update operations:
Table 5
| Modifier | Start Doc | Example Mod | Result |
| --- | --- | --- | --- |
| $set | {x:"foo"} | {$set:{x:[1,2,3]}} | {x:[1,2,3]} |
| $unset | {x:"foo"} | {$unset:{x:true}} | {} |
| $inc | {countdown:5} | {$inc:{countdown:-1}} | {countdown:4} |
| | {votes:[-1,-1,1]} | {$push:{votes:-1}} | {votes:[-1,-1,1,-1]} |
| | {blacklist:["ip1","ip2","ip3"]} | {$pull:{blacklist:"ip2"}} | {blacklist:["ip1","ip3"]} |
| $pop | {queue:["1pm","3pm","8pm"]} | {$pop:{queue:-1}} | {queue:["3pm","8pm"]} |
| | {ints:[0,1,3,4]} | {$addToSet:{ints:{$each:[1,2,3]}}} | {ints:[0,1,3,4,2]} |
| $rename | {nmae:"sam"} | {$rename:{nmae:"name"}} | {name:"sam"} |
| $bit | {permission:6} | {$bit:{permission:{or:1}}} | {permission:7} |
| $min | {"temp":25} | {$min: { temp:20}} | {"temp":20} |
| $setOnInsert | {"name":"bob"} | {$setOnInsert: {resetPassword: true }} | {"name": "bob", "resetPassword": true} |
| $sort | { "scores": [5, 8, 3, 9] } | | { "scores": [3, 5, 8, 9] } |
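The effect of a few of these modifiers can be mimicked on a plain object. The sketch below is illustrative only: `applyUpdate` is a made-up name, it covers just $set, $unset, $inc, $push, $pull (exact-value matches), and $pop, and it ignores nested fields and every other rule the server applies.

```javascript
// Illustrative only: apply a few update modifiers to a plain object.
function applyUpdate(doc, update) {
  const out = JSON.parse(JSON.stringify(doc)); // copy so the input is untouched
  for (const [mod, fields] of Object.entries(update)) {
    for (const [key, arg] of Object.entries(fields)) {
      switch (mod) {
        case "$set": out[key] = arg; break;
        case "$unset": delete out[key]; break;
        case "$inc": out[key] = (out[key] || 0) + arg; break;
        case "$push": out[key] = [...(out[key] || []), arg]; break;
        case "$pull": out[key] = (out[key] || []).filter((v) => v !== arg); break;
        case "$pop": // -1 removes the first element, 1 removes the last
          out[key] = arg === -1 ? out[key].slice(1) : out[key].slice(0, -1);
          break;
        default: throw new Error("unsupported modifier: " + mod);
      }
    }
  }
  return out;
}

console.log(applyUpdate({ countdown: 5 }, { $inc: { countdown: -1 } })); // { countdown: 4 }
console.log(applyUpdate({ queue: ["1pm", "3pm", "8pm"] }, { $pop: { queue: -1 } }));
// { queue: [ '3pm', '8pm' ] }
```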
The aggregation framework can be used to perform everything from simple queries to complex aggregations. To use it, pass the aggregate() function a pipeline of aggregation stages:
> db.collection.aggregate([{$match:{x:1}},
... {$limit:10},
... {$group:{_id : "$age"}}])
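To show how documents flow from one stage to the next, the sketch below runs the same three stages over an in-memory array. It is illustrative only: `runPipeline` is a made-up name, $match handles equality only, $group handles only a single-field _id, and the sample documents are invented.

```javascript
// Illustrative only: each stage receives the output of the previous one.
function runPipeline(docs, pipeline) {
  return pipeline.reduce((current, stage) => {
    if (stage.$match) {
      return current.filter((d) =>
        Object.entries(stage.$match).every(([k, v]) => d[k] === v)
      );
    }
    if (stage.$limit !== undefined) return current.slice(0, stage.$limit);
    if (stage.$group) {
      const field = stage.$group._id.slice(1); // "$age" -> "age"
      const seen = new Set(current.map((d) => d[field]));
      return [...seen].map((value) => ({ _id: value }));
    }
    throw new Error("unsupported stage");
  }, docs);
}

const people = [
  { x: 1, age: 30 },
  { x: 1, age: 30 },
  { x: 2, age: 41 },
  { x: 1, age: 25 },
];
const grouped = runPipeline(people, [
  { $match: { x: 1 } },
  { $limit: 10 },
  { $group: { _id: "$age" } },
]);
console.log(grouped); // [ { _id: 30 }, { _id: 25 } ]
```

The order of groups here follows insertion order in the sketch; on a real server, the order of $group output should not be relied on without a $sort stage.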
Table 6 contains a list of operators for the available stages:
Table 6
| Operator | Description |
| --- | --- |
| {$project : projection} | Includes, excludes, renames, and munges fields |
| {$match : match} | Queries and takes an argument identical to that passed to find() |
| {$limit : num} | Limits results to num |
| {$skip : num} | Skips num results |
| {$sort : sort} | Sorts results by the given fields |
| {$group : group} | Groups results using the expressions given (see Table 7) |
| {$unwind : field} | Explodes an embedded array into its own top-level documents |
To refer to a field, use the syntax $fieldName. For example, this projection would return the existing time field with a new name, "time since epoch": {$project: {"time since epoch": "$time"}}
$project and $group can both take expressions, which can use the $fieldName syntax as shown below:
Table 7
| Expression OP Example | Description |
| --- | --- |
| $add : ["$age", 1] | Adds 1 to the age field. |
| $divide : ["$sum", "$count"] | Divides the sum field by count. |
| $mod : ["$sum", "$count"] | The remainder of dividing sum by count. |
| $multiply : ["$mph", 24, 365] | Multiplies mph by 24*365. |
| $subtract : ["$price", "$discount"] | Subtracts discount from price. |
| $strcasecmp : ["ZZ", "$name"] | 1 if name is less than ZZ, 0 if name is ZZ, -1 if name is greater than ZZ. |
| $substr : ["$phone", 0, 3] | Gets the area code (first three characters) of phone. |
| $toLower : "$str" | Converts str to all lowercase. |
| $toUpper : "$str" | Converts str to all uppercase. |
| $ifNull : ["$mightExist", {$add : ["$doesExist", 1]}] | If mightExist is not null, it returns mightExist. Otherwise, it returns the result of the second expression. |
| $cond : [exp1, exp2, exp3] | If exp1 evaluates to true, it returns exp2. Otherwise, it returns exp3. |
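A few of the expressions in Table 7 can be mimicked with a small evaluator. This is illustrative only: `evalExpr` is a made-up name, it covers $add, $multiply, $subtract, $divide, $ifNull, and $cond, and it does not reproduce the server's type handling.

```javascript
// Illustrative only: evaluate a small subset of the expressions in Table 7
// against a single document.
function evalExpr(expr, doc) {
  if (typeof expr === "string" && expr.startsWith("$")) {
    return doc[expr.slice(1)]; // "$field" reference
  }
  if (expr === null || typeof expr !== "object") return expr; // literal value
  const [op, args] = Object.entries(expr)[0];
  const vals = () => args.map((a) => evalExpr(a, doc));
  switch (op) {
    case "$add": return vals().reduce((a, b) => a + b, 0);
    case "$multiply": return vals().reduce((a, b) => a * b, 1);
    case "$subtract": { const [a, b] = vals(); return a - b; }
    case "$divide": { const [a, b] = vals(); return a / b; }
    case "$ifNull": {
      const first = evalExpr(args[0], doc);
      return first !== null && first !== undefined ? first : evalExpr(args[1], doc);
    }
    case "$cond":
      return evalExpr(args[0], doc) ? evalExpr(args[1], doc) : evalExpr(args[2], doc);
    default: throw new Error("unsupported operator: " + op);
  }
}

const sampleDoc = { doesExist: 4, price: 20, discount: 5 };
console.log(evalExpr({ $subtract: ["$price", "$discount"] }, sampleDoc)); // 15
console.log(evalExpr({ $ifNull: ["$mightExist", { $add: ["$doesExist", 1] }] }, sampleDoc)); // 5
```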
One of the ways to back up MongoDB data is to make a copy of the database files while they are in a consistent state (i.e., not in the middle of being written to).
1. Use the fsyncLock() command, which flushes all in-flight writes to disk and prevents new ones:
> db.fsyncLock()
{
info: 'now locked against writes, use db.fsyncUnlock() to unlock',
lockCount: Long('1'),
seeAlso: 'http://dochub.mongodb.org/core/fsynccommand',
ok: 1
}
2. Copy data files to a new location.
3. Use the fsyncUnlock() command to unlock the database:
> db.fsyncUnlock()
{ info: 'fsyncUnlock completed', lockCount: Long('0'), ok: 1 }
Note: To restore from this backup, copy the files to the correct server's dbpath and start the mongod.
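The three steps above can be wrapped so that the unlock always runs, even if the copy fails. This is a minimal sketch: `backupWithLock` and `copyFiles` are hypothetical names, and the stand-in `fakeDb` object exists only so the example runs outside the shell.

```javascript
// Hypothetical wrapper around the three steps above. `db` is any object that
// exposes fsyncLock() and fsyncUnlock(); `copyFiles` is a caller-supplied
// function that copies the data files.
function backupWithLock(db, copyFiles) {
  db.fsyncLock(); // step 1: flush writes and block new ones
  try {
    copyFiles(); // step 2: copy the data files
  } finally {
    db.fsyncUnlock(); // step 3: always unlock, even if the copy fails
  }
}

// Stand-in objects so the sketch runs outside the shell.
const calls = [];
const fakeDb = {
  fsyncLock: () => calls.push("lock"),
  fsyncUnlock: () => calls.push("unlock"),
};
backupWithLock(fakeDb, () => calls.push("copy"));
console.log(calls.join(" -> ")); // lock -> copy -> unlock
```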
Alternatively, if you have a filesystem that does filesystem snapshots, your journal is on the same volume, and you haven't done anything stripy with RAID, you can take a snapshot without locking. In this case, when you restore, the journal will replay operations to make the data files consistent.
There are several other options for backing up your MongoDB data:
mongodump and mongorestore – mongodump is used to create a binary export of MongoDB data, while mongorestore is used to import this data back into a MongoDB instance.
Filesystem snapshots capture a consistent state of MongoDB data files at a point in time for fast and efficient backups.
Replica sets allow a MongoDB deployment to remain available during the majority of a maintenance window.
To permanently stop a member from being elected, change its priority to 0:
> var config = rs.config()
> config.members[2].priority = 0
> rs.reconfig(config)
To prevent a secondary from being elected temporarily, connect to it and issue the freeze command:
> rs.freeze(10*60) // # of seconds to not become primary
The freeze command can be handy if you don't want to change priorities permanently but need to do maintenance on the primary node.
If a member is currently primary and you don't want it to be, use stepDown:
> rs.stepDown(10*60) // # of seconds to not try to become primary again
For maintenance, often, it is desirable to start up a secondary and be able to do writes on it (e.g., for building indexes). To accomplish this, you can start up a secondary as a stand-alone mongod temporarily.
If the secondary was originally started with the following arguments:
$ mongod --dbpath /data/db --replSet setName --port 30000
Then shut it down cleanly and restart it with:
$ mongod --dbpath /data/db --port 30001
Note that the dbpath does not change but the port does, and the replSet option is removed (all other options can remain the same). This mongod will come up as a stand-alone server. The rest of the replica set will be looking for a member on port 30000, not 30001, so it will just appear to be "down" to the rest of the set.
When you are finished with maintenance, restart the server with the original arguments.
To check current user privileges:
> db.runCommand(
... {
... usersInfo:"manager",
... showPrivileges:true
... }
... )
To create a user administrator for a database:
> use sensors
switched to db sensors
> db.createUser(
... {
... user: "sensorsUserAdmin",
... pwd: "password",
... roles:
... [
... {
... role: "userAdmin",
... db: "sensors"
... }
... ]
... }
... )
To view user roles:
> use sensors
switched to db sensors
> db.getUser("sensorsUserAdmin")
{
"_id" : "sensors.sensorsUserAdmin",
"user" : "sensorsUserAdmin",
"db" : "sensors",
"roles" : [
{
"role" : "userAdmin",
"db" : "sensors"
}
]
}
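The document returned by getUser can be inspected like any other object. As a small sketch, the hypothetical helper `hasRole` below checks whether a user document of the shape shown above includes a given role on a given database.

```javascript
// Hypothetical helper: check whether a getUser()-shaped document
// contains a specific role on a specific database.
function hasRole(userDoc, role, dbName) {
  return userDoc.roles.some((r) => r.role === role && r.db === dbName);
}

// Same shape as the db.getUser() output above.
const sampleUser = {
  _id: "sensors.sensorsUserAdmin",
  user: "sensorsUserAdmin",
  db: "sensors",
  roles: [{ role: "userAdmin", db: "sensors" }],
};
console.log(hasRole(sampleUser, "userAdmin", "sensors")); // true
console.log(hasRole(sampleUser, "read", "admin")); // false
```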
To show role privileges:
> db.getRole( "userAdmin", { showPrivileges: true } )
To grant a role:
> db.grantRolesToUser(
... "sensorsUserAdmin",
... [
... { role: "read", db: "admin" }
... ]
... )
To revoke a role:
> db.revokeRolesFromUser(
... "sensorsUserAdmin",
... [
... { role: "userAdmin", db: "sensors" }
... ]
... )
Below are common limitations in MongoDB. For a full list, see https://www.mongodb.com/docs/manual/reference/limits/.
allowDiskUse option allows aggregation pipeline stages to use temporary files for processing.
., $, or \0 (the null character). Names can only contain characters that can be used on your filesystem as filenames. Admin, config, and local are reserved database names. (Note that you can store your own data in them, but you should never drop them.)
$ or null, start with the system. prefix, or be an empty string. Names prefixed with system. are reserved by MongoDB and cannot be dropped, even if you created the collection. Periods are often used for organization in collection names, but they have no semantic importance.
null.
For next steps and more information, see the following resources: