MongoDB is a popular open-source, document-oriented NoSQL database that stores data in a flexible BSON (JSON-like) format. Because related data can live together in a single document, reading and writing whole records can be faster than the equivalent multi-table operations in a traditional relational database.
Data Model:
Schema:
Relationships:
Scalability:
Querying:
Document:
Example:
Collection:
A users collection may contain multiple documents like the one above.
MongoDB stores data in BSON (Binary JSON) format, which is optimized for speed and efficient traversal. Collections are stored in databases as files on disk using the WiredTiger storage engine.
BSON (Binary JSON) is the binary-encoded data format used by MongoDB to store and transmit documents. It extends JSON with additional data types and is optimized for efficient storage and fast query processing.
Significance of BSON in MongoDB
MongoDB supports a variety of BSON data types, which extend standard JSON types:
To create a new database and collection in MongoDB, you can use the MongoDB shell (mongosh, or the legacy mongo shell):
use mydatabase
db.createCollection("mycollection")
These commands switch to mydatabase and create a new collection named mycollection. MongoDB creates the database itself when the first collection or document is written to it.
_id uniquely identifies each document within a collection. It acts as the default primary key and helps in indexing and quick retrieval.
You can use the insertOne() or insertMany() methods.
Example:
db.users.insertOne({ name: "John", age: 28 })
Sharding is a method for distributing data across multiple servers in MongoDB. It allows for horizontal scaling by splitting large datasets into smaller, more manageable pieces called shards.
CRUD operations in MongoDB are used to create, read, update, and delete documents.
Basic querying in MongoDB involves using the find method to retrieve documents that match certain criteria.
Example:
db.collection.find({ age: { $gte: 20 } })
This query retrieves all documents from the collection where the age field is greater than or equal to 20.
An index in MongoDB is a data structure that improves the speed of data retrieval operations on a collection. You can create an index using the createIndex method.
For example, to create an index on the name field:
db.collection.createIndex({ name: 1 })
MongoDB provides several mechanisms to ensure data consistency:
To perform data import and export in MongoDB, you can use the mongoimport and mongoexport tools. These tools allow you to import data from JSON, CSV or TSV files into MongoDB and export data from MongoDB collections to JSON or CSV files.
Import Data:
mongoimport --db mydatabase --collection mycollection --file data.json
This command imports data from data.json into the mycollection collection in the mydatabase database.
Export Data:
mongoexport --db mydatabase --collection mycollection --out data.json
This command exports data from the mycollection collection in the mydatabase database to data.json.
The aggregation pipeline is a framework for data aggregation, modeled on the concept of data processing pipelines. Documents enter a multi-stage pipeline that transforms the documents into aggregated results. Each stage performs an operation on the input documents and passes the results to the next stage.
db.orders.aggregate([
{ $match: { status: "A" } },
{ $group: { _id: "$cust_id", total: { $sum: "$amount" } } },
{ $sort: { total: -1 } }
])
Aggregation pipelines are powerful and flexible, enabling complex data processing tasks to be executed within MongoDB.
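To make the stage-by-stage flow concrete, the sketch below reproduces the same three stages in plain JavaScript over an in-memory array. It only illustrates what each stage does to the documents, not how MongoDB executes a pipeline, and the sample orders are made up for the example.

```javascript
// Hypothetical sample data standing in for the orders collection.
const orders = [
  { cust_id: "A123", amount: 500, status: "A" },
  { cust_id: "A123", amount: 250, status: "A" },
  { cust_id: "B212", amount: 200, status: "A" },
  { cust_id: "A123", amount: 300, status: "D" },
];

// Stage 1, like { $match: { status: "A" } }: keep only matching documents.
const matched = orders.filter((order) => order.status === "A");

// Stage 2, like { $group: { _id: "$cust_id", total: { $sum: "$amount" } } }.
const totals = new Map();
for (const order of matched) {
  totals.set(order.cust_id, (totals.get(order.cust_id) || 0) + order.amount);
}
const grouped = [...totals].map(([custId, total]) => ({ _id: custId, total }));

// Stage 3, like { $sort: { total: -1 } }: largest total first.
const result = grouped.sort((a, b) => b.total - a.total);

console.log(result);
```

Each stage consumes the output of the previous one, which is why placing the filtering step first reduces the work done by the later stages.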
MongoDB Intermediate Interview Questions explore advanced concepts and features, such as schema design, aggregation pipelines, indexing strategies, and transaction management. These questions help gauge your ability to utilize MongoDB efficiently in more complex scenarios.
The Aggregation Framework in MongoDB processes documents through a multi-stage pipeline to filter, group, sort, reshape, and compute results. It enables complex data transformations and analytics to be performed efficiently within the database itself.
Aggregation operations in MongoDB are performed using the aggregate method. This method takes an array of pipeline stages, each stage representing a step in the data processing pipeline.
db.sales.aggregate([
{ $match: { status: "completed" } },
{ $group: { _id: "$product", totalSales: { $sum: "$amount" } } },
{ $sort: { totalSales: -1 } }
])
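Write Concern in MongoDB defines how many nodes must acknowledge a write before it is considered successful. Common levels are unacknowledged (w: 0), acknowledged (w: 1), journaled (j: true), and replica-acknowledged (w: "majority" or a specific number of nodes). Acknowledged (w: 1) was the default before MongoDB 5.0; since 5.0 the default for most deployments is w: "majority".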
Importance of Write Concern
TTL (Time To Live) indexes in MongoDB are special indexes that automatically remove documents from a collection after a certain period. They are commonly used for data that needs to expire after a specific time, such as session information, logs, or temporary data. To create a TTL index, specify the expiration time in seconds with the expireAfterSeconds option. For example, on a createdAt field:
db.sessions.createIndex({ "createdAt": 1 }, { expireAfterSeconds: 3600 })
This index removes documents from the sessions collection 1 hour (3600 seconds) after the time stored in their createdAt field.
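The expiry rule can be sketched in plain JavaScript. The helper name isExpired is hypothetical and only mirrors the comparison described above. In a real deployment a background task performs the deletions periodically (roughly once a minute), so a document can outlive its expiry time by a short while.

```javascript
// Hypothetical helper: would a TTL index with the given expireAfterSeconds
// treat this document as expired at time `now`?
function isExpired(doc, expireAfterSeconds, now = new Date()) {
  // A document whose indexed field is missing or not a date never expires.
  // (Arrays of dates, which MongoDB also accepts, are ignored in this sketch.)
  if (!(doc.createdAt instanceof Date)) return false;
  const expiresAt = doc.createdAt.getTime() + expireAfterSeconds * 1000;
  return now.getTime() >= expiresAt;
}

const session = { _id: 1, createdAt: new Date("2024-01-01T10:00:00Z") };
console.log(isExpired(session, 3600, new Date("2024-01-01T10:30:00Z"))); // false
console.log(isExpired(session, 3600, new Date("2024-01-01T11:00:01Z"))); // true
```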
Schema design and data modeling in MongoDB involve defining how data is organized and stored in a document-oriented database. Unlike SQL databases, MongoDB offers flexible schema design, which can be both an advantage and a challenge.
Key considerations for schema design include:
| WiredTiger | MMAPv1 |
|---|---|
| Document-level concurrency, allowing multiple operations simultaneously. | Collection-level concurrency, limiting performance under heavy write operations. |
| Supports data compression. | Does not support data compression. |
| Better performance and efficiency for most workloads | Limited performance under heavy workloads. |
| Uses write-ahead logging for better data integrity. | Basic journaling, less advanced. |
| Modern and default storage engine (default since MongoDB 3.2). | Legacy engine, deprecated in MongoDB 4.0 and removed in 4.2. |
| Advanced implementation with additional features. | Simple implementation but lacks advanced features. |
MongoDB supports multi-document ACID transactions by allowing us to perform a series of read and write operations across multiple documents and collections in a transaction. This ensures data consistency and integrity. To use transactions we typically start a session, begin a transaction, perform the operations and then commit or abort the transaction.
Example in JavaScript:
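A minimal sketch with the official Node.js driver is shown here. The function name transferFunds, the bank database, the accounts collection, and the balance field are all hypothetical. The client argument is assumed to be a connected MongoClient pointing at a replica set or sharded cluster, since transactions are not available on standalone servers.

```javascript
// Hypothetical example: move `amount` between two account documents atomically.
// `client` is assumed to be a connected MongoClient from the "mongodb" package.
async function transferFunds(client, fromId, toId, amount) {
  const accounts = client.db("bank").collection("accounts");
  const session = client.startSession();
  try {
    // withTransaction starts the transaction, commits it if the callback
    // resolves, and aborts it if the callback throws.
    await session.withTransaction(async () => {
      // Passing { session } makes each write part of the transaction.
      const debit = await accounts.updateOne(
        { _id: fromId, balance: { $gte: amount } },
        { $inc: { balance: -amount } },
        { session }
      );
      if (debit.matchedCount !== 1) {
        throw new Error("Insufficient funds or unknown account");
      }
      await accounts.updateOne(
        { _id: toId },
        { $inc: { balance: amount } },
        { session }
      );
    });
  } finally {
    // Always release the session, whether the transaction committed or not.
    await session.endSession();
  }
}
```

Either both updates become visible together or neither does, which is the guarantee the paragraph above describes.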
MongoDB Compass is a graphical user interface (GUI) tool for MongoDB that provides an easy way to visualize, explore, and manipulate your data. It offers features such as:
MongoDB Atlas is a fully managed cloud database service provided by MongoDB. It offers automated deployment, scaling, and management of MongoDB clusters across various cloud providers (AWS, Azure, Google Cloud). Key differences from self-hosted MongoDB include:
Access control and user authentication in MongoDB are implemented through a role-based access control (RBAC) system. You create users and assign roles that define their permissions. To set up access control:
db.createUser({
user: "admin",
pwd: passwordPrompt(), // prompt for the password instead of hard-coding it
roles: [{ role: "userAdminAnyDatabase", db: "admin" }]
});
Capped collections in MongoDB are fixed-size collections that automatically overwrite the oldest documents when the specified size limit is reached. They maintain insertion order and are useful for scenarios where you need to store a fixed amount of recent data, such as logging, caching, or monitoring data.
Example of creating a capped collection:
db.createCollection("logs", { capped: true, size: 100000 });
Geospatial indexes in MongoDB are special indexes that support querying of geospatial data, such as locations and coordinates. They enable efficient queries for proximity, intersections, and other spatial relationships. MongoDB supports two types of geospatial indexes: 2d for flat geometries and 2dsphere for spherical geometries.
Example of creating a 2dsphere index:
db.places.createIndex({ location: "2dsphere" });
Handling backups and disaster recovery in MongoDB involves regularly creating backups of your data and having a plan for restoring data in case of failure. Methods include:
Upgrading MongoDB to a newer version involves several steps to ensure a smooth transition:
Change Streams in MongoDB allow applications to listen for real-time changes to data in collections, databases, or entire clusters. They provide a powerful way to implement event-driven architectures by capturing insert, update, replace, and delete operations. To use Change Streams, you typically open a change stream cursor and process the change events as they occur.
Example:
This example listens for changes in the orders collection and logs the change events.
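A sketch of such a listener with the official Node.js driver might look like the following. The function name logOrderChanges is hypothetical, and db is assumed to be a Db handle obtained from a connected MongoClient. Change streams need a replica set or sharded cluster to work.

```javascript
// `db` is assumed to be a Db handle from a connected MongoClient.
function logOrderChanges(db) {
  // watch() opens a change stream on the orders collection.
  const changeStream = db.collection("orders").watch();

  changeStream.on("change", (change) => {
    // operationType is e.g. "insert", "update", "replace", or "delete".
    console.log(change.operationType, change.documentKey);
  });

  // The caller can stop listening with changeStream.close().
  return changeStream;
}
```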
Hashed sharding keys in MongoDB distribute data across shards using a hashed value of the shard key field, ensuring even data distribution and preventing issues caused by data locality or uneven distribution.
Example:
sh.enableSharding("mydb"); // required before MongoDB 6.0
db.getSiblingDB("mydb").mycollection.createIndex({ _id: "hashed" });
sh.shardCollection("mydb.mycollection", { _id: "hashed" });
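Why hashing evens out the distribution can be seen in a small Node.js sketch. It is an illustration only: MongoDB computes its own 64-bit hash and assigns ranges of hashed values to chunks rather than taking a modulo, and the helper shardFor, the use of SHA-256, and the shard count of four are assumptions made for the example.

```javascript
const crypto = require("crypto");

// Hypothetical helper: map a shard key value to one of `shardCount` shards
// by hashing it. The modulo step is a simplification for illustration.
function shardFor(keyValue, shardCount) {
  const digest = crypto.createHash("sha256").update(String(keyValue)).digest();
  // Interpret the first 4 bytes of the digest as an unsigned integer.
  return digest.readUInt32BE(0) % shardCount;
}

// Sequential keys (1, 2, 3, ...) end up spread across all shards
// instead of piling onto the shard that owns the highest range.
const counts = [0, 0, 0, 0];
for (let id = 1; id <= 1000; id++) {
  counts[shardFor(id, counts.length)] += 1;
}
console.log(counts); // four counts that sum to 1000
```

The trade-off is that neighbouring key values land on different shards, so range queries on a hashed shard key have to contact every shard.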
Optimizing MongoDB queries involves several strategies:
Map-Reduce in MongoDB is a data processing paradigm used to perform complex data aggregation operations. It consists of two phases: the map phase processes each input document and emits key-value pairs, and the reduce phase processes all emitted values for each key and outputs the final result. Map-Reduce has been deprecated since MongoDB 5.0 in favor of the aggregation pipeline.
Example:
This example calculates the total price for each category in a collection.
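The two phases can be sketched in plain JavaScript over an in-memory array. This is an emulation rather than a call to MongoDB: the sample items and the runMapReduce helper are hypothetical, and the map function receives the document and an emit callback as arguments, whereas in the MongoDB shell it would read the document from this and call a global emit.

```javascript
// Hypothetical sample documents with a category and a price.
const items = [
  { category: "books", price: 12 },
  { category: "games", price: 40 },
  { category: "books", price: 8 },
];

// Map phase: called once per document, emits (key, value) pairs.
function mapFn(doc, emit) {
  emit(doc.category, doc.price);
}

// Reduce phase: combines all values emitted for one key.
function reduceFn(key, values) {
  return values.reduce((sum, price) => sum + price, 0);
}

// Hypothetical helper that wires the two phases together in memory.
function runMapReduce(docs, map, reduce) {
  const emitted = new Map();
  for (const doc of docs) {
    map(doc, (key, value) => {
      if (!emitted.has(key)) emitted.set(key, []);
      emitted.get(key).push(value);
    });
  }
  return [...emitted].map(([key, values]) => ({
    _id: key,
    value: reduce(key, values),
  }));
}

const totalsByCategory = runMapReduce(items, mapFn, reduceFn);
console.log(totalsByCategory);
```

In current MongoDB versions the same totals are usually computed with a $group stage that uses $sum.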
Journaling in MongoDB ensures data durability and crash recovery by writing changes to a journal file before updating the actual database files. In case of a crash, MongoDB replays the journal to restore data consistency. While it improves data safety, it may slightly affect performance due to extra disk I/O operations.
Full-Text Search in MongoDB is implemented using text indexes. These indexes allow you to perform text search queries on string content within documents.
Example:
db.collection.createIndex({ content: "text" });
db.collection.find({ $text: { $search: "mongodb" } });
A text index is created on the content field, and a text search query is performed to find documents containing the word "mongodb."
Considerations for deploying MongoDB in a production environment include:
Monitoring and troubleshooting performance issues in MongoDB involve:
Migrating data from a relational database to MongoDB involves several steps:
MongoDB Query-Based Interview Questions focus on your ability to write efficient and optimized queries to interact with databases. These tasks include retrieving specific data using filters, sorting and paginating results, and utilizing projections to select desired fields.
The following dataset represents a collection named employees, containing documents about employees in an organization. Each document includes details such as the employee's name, age, position, salary, department, and hire date. The output of each query below is shown against this original dataset.
[
{
"_id": 1,
"name": "John Doe",
"age": 28,
"position": "Software Engineer",
"salary": 80000,
"department": "Engineering",
"hire_date": ISODate("2021-01-15")
},
{
"_id": 2,
"name": "Jane Smith",
"age": 34,
"position": "Project Manager",
"salary": 95000,
"department": "Engineering",
"hire_date": ISODate("2019-06-23")
},
{
"_id": 3,
"name": "Emily Johnson",
"age": 41,
"position": "CTO",
"salary": 150000,
"department": "Management",
"hire_date": ISODate("2015-03-12")
},
{
"_id": 4,
"name": "Michael Brown",
"age": 29,
"position": "Software Engineer",
"salary": 85000,
"department": "Engineering",
"hire_date": ISODate("2020-07-30")
},
{
"_id": 5,
"name": "Sarah Davis",
"age": 26,
"position": "UI/UX Designer",
"salary": 70000,
"department": "Design",
"hire_date": ISODate("2022-10-12")
}
]
Query:
db.employees.find({ department: "Engineering" })
Output:
[
{
"_id": 1,
"name": "John Doe",
"age": 28,
"position": "Software Engineer",
"salary": 80000,
"department": "Engineering",
"hire_date": ISODate("2021-01-15")
},
{
"_id": 2,
"name": "Jane Smith",
"age": 34,
"position": "Project Manager",
"salary": 95000,
"department": "Engineering",
"hire_date": ISODate("2019-06-23")
},
{
"_id": 4,
"name": "Michael Brown",
"age": 29,
"position": "Software Engineer",
"salary": 85000,
"department": "Engineering",
"hire_date": ISODate("2020-07-30")
}
]
This query finds all employees whose department field is "Engineering".
Query:
db.employees.find().sort({ salary: -1 }).limit(1)
Output:
[
{
"_id": 3,
"name": "Emily Johnson",
"age": 41,
"position": "CTO",
"salary": 150000,
"department": "Management",
"hire_date": ISODate("2015-03-12")
}
]
This query sorts all employees by salary in descending order and retrieves the top document, which is the employee with the highest salary.
Query:
db.employees.updateOne({ name: "John Doe" }, { $set: { salary: 90000 } })
Output:
{ "acknowledged" : true, "matchedCount" : 1, "modifiedCount" : 1 }
This query updates the salary of the employee named "John Doe" to 90000.
Query:
db.employees.aggregate([
{ $group: { _id: "$department", count: { $sum: 1 } } }
])
Output:
[
{ "_id": "Engineering", "count": 3 },
{ "_id": "Management", "count": 1 },
{ "_id": "Design", "count": 1 }
]
This query groups the employees by the department field and counts the number of employees in each department.
Query:
db.employees.updateMany({ department: "Engineering" }, { $set: { bonus: 5000 } })
Output:
{ "acknowledged" : true, "matchedCount" : 3, "modifiedCount" : 3 }
This query adds a new field bonus with a value of 5000 to all employees in the "Engineering" department.
Query:
db.employees.aggregate([
{ $addFields: { nameLength: { $strLenCP: "$name" } } },
{ $sort: { nameLength: -1 } },
{ $project: { nameLength: 0 } }
])
Output:
[
{
"_id": 3,
"name": "Emily Johnson",
"age": 41,
"position": "CTO",
"salary": 150000,
"department": "Management",
"hire_date": ISODate("2015-03-12")
},
{
"_id": 4,
"name": "Michael Brown",
"age": 29,
"position": "Software Engineer",
"salary": 85000,
"department": "Engineering",
"hire_date": ISODate("2020-07-30")
},
{
"_id": 5,
"name": "Sarah Davis",
"age": 26,
"position": "UI/UX Designer",
"salary": 70000,
"department": "Design",
"hire_date": ISODate("2022-10-12")
},
{
"_id": 2,
"name": "Jane Smith",
"age": 34,
"position": "Project Manager",
"salary": 95000,
"department": "Engineering",
"hire_date": ISODate("2019-06-23")
},
{
"_id": 1,
"name": "John Doe",
"age": 28,
"position": "Software Engineer",
"salary": 80000,
"department": "Engineering",
"hire_date": ISODate("2021-01-15")
}
]
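This query calculates the length of each employee's name, sorts the documents by this length in descending order, and removes the temporary nameLength field from the output. "Emily Johnson" and "Michael Brown" both have 13 characters, so their relative order is not guaranteed.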
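The ordering can be checked by hand with a few lines of plain JavaScript. This is a sketch that mirrors the pipeline's logic on the five names from the dataset; it does not query MongoDB.

```javascript
// Names from the employees dataset.
const names = ["John Doe", "Jane Smith", "Emily Johnson", "Michael Brown", "Sarah Davis"];

// Like $addFields with $strLenCP: count Unicode code points.
const withLength = names.map((name) => ({ name, nameLength: [...name].length }));

// Like { $sort: { nameLength: -1 } }. Array.prototype.sort is stable, so ties
// keep their input order here; MongoDB does not promise an order for ties.
withLength.sort((a, b) => b.nameLength - a.nameLength);

console.log(withLength.map((e) => `${e.name} (${e.nameLength})`));
// [ 'Emily Johnson (13)', 'Michael Brown (13)', 'Sarah Davis (11)',
//   'Jane Smith (10)', 'John Doe (8)' ]
```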
Query:
db.employees.aggregate([
{ $match: { department: "Engineering" } },
{ $group: { _id: null, averageSalary: { $avg: "$salary" } } }
])
Output:
[
{ "_id": null, "averageSalary": 86666.66666666667 }
]
This query filters employees to those in the "Engineering" department and calculates the average salary of these employees.
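The figure in the output can be verified with simple arithmetic. The sketch below repeats the calculation in plain JavaScript using the three Engineering salaries from the original dataset.

```javascript
// Salaries of the three Engineering employees in the original dataset.
const engineeringSalaries = [80000, 95000, 85000];

// $avg is the sum of the values divided by how many there are.
const salarySum = engineeringSalaries.reduce((total, salary) => total + salary, 0);
const averageSalary = salarySum / engineeringSalaries.length;

console.log(salarySum);     // 260000
console.log(averageSalary); // 86666.66666666667
```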
Query:
db.employees.aggregate([
{ $group: { _id: "$department", averageSalary: { $avg: "$salary" } } },
{ $sort: { averageSalary: -1 } },
{ $limit: 1 }
])
Output:
[
{ "_id": "Management", "averageSalary": 150000 }
]
This query groups employees by department, calculates the average salary for each department, sorts these averages in descending order, and retrieves the department with the highest average salary.
Query:
db.employees.aggregate([
{ $group: { _id: { $year: "$hire_date" }, totalHired: { $sum: 1 } } }
])
Output:
[
{ "_id": 2015, "totalHired": 1 },
{ "_id": 2019, "totalHired": 1 },
{ "_id": 2020, "totalHired": 1 },
{ "_id": 2021, "totalHired": 1 },
{ "_id": 2022, "totalHired": 1 }
]
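This query groups employees by the year they were hired, which is extracted from the hire_date field, and counts the total number of employees hired each year. $group does not guarantee the order of its output, so add a { $sort: { _id: 1 } } stage if the years must appear in ascending order.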
Query:
db.employees.aggregate([
{ $match: { department: "Engineering" } },
{
$group: {
_id: null,
highestSalary: { $max: "$salary" },
lowestSalary: { $min: "$salary" }
}
}
])
Output:
[
{ "_id": null, "highestSalary": 95000, "lowestSalary": 80000 }
]
This query filters employees to those in the "Engineering" department, then calculates the highest and lowest salary within this group.