MongoDB Interview Questions
MongoDB is a cross-platform, document-oriented database. Classified as a NoSQL database, MongoDB avoids the conventional table-based relational structure in favor of JSON-like documents with dynamic schemas, making data integration in certain kinds of applications quicker and simpler.
 
MongoDB was developed by the software company 10gen, starting in October 2007, as a component of a planned platform-as-a-service product. In 2009 the company shifted to an open-source development model, offering commercial support and other services.
Some advantages of MongoDB are as follows :
 
* MongoDB supports field queries, range-based queries, and string pattern matching queries for searching the data in the database
* MongoDB supports primary and secondary indexes on any field
* MongoDB uses JavaScript objects in place of procedures
* MongoDB uses a dynamic database schema
* MongoDB is very easy to scale up or down
* MongoDB has built-in support for data partitioning (sharding)
A document in MongoDB is an ordered set of keys with associated values. Most programming languages have a data structure that fits it naturally, such as a map, hash, or dictionary. In JavaScript, documents are represented as objects :

{"greeting" : "Hello world!"}
 
Complex documents will contain multiple key/value pairs :

{"greeting" : "Hello world!", "views" : 3}

A collection in MongoDB is a group of documents. If a document is the MongoDB analog of a row in a relational database, then a collection can be thought of as the analog to a table.

Documents within a single collection can have any number of different “shapes”, i.e. collections have dynamic schemas.
 
For example, both of the following documents could be stored in a single collection:
 
{"greeting" : "Hello world!", "views": 3}
{"signoff": "Good bye"}

MongoDB stores BSON (Binary JSON) objects in collections. The concatenation of the database name and the collection name, separated by a dot, is called a namespace.
Sharding is the process of storing data records across multiple machines. It is MongoDB's approach to meeting the demands of data growth. It is the horizontal partitioning of data in a database or search engine. Each partition is referred to as a shard or database shard.
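The idea of horizontal partitioning can be sketched in plain JavaScript. This is a toy illustration only, not MongoDB's actual routing algorithm: the hash function, the `routeToShard` helper, and the three in-memory "shards" are all assumptions made for the example.

```javascript
// Toy hash-based router. This is NOT MongoDB's real algorithm, only an
// illustration of splitting documents across shards by a shard key.
function hashKey(value) {
  let h = 0x811c9dc5; // FNV-1a offset basis, chosen only for this demo
  for (const ch of String(value)) {
    h ^= ch.codePointAt(0);
    h = Math.imul(h, 0x01000193) >>> 0; // keep an unsigned 32-bit integer
  }
  return h;
}

// Pick a shard number in [0, shardCount) from the document's shard key.
function routeToShard(doc, shardKey, shardCount) {
  return hashKey(doc[shardKey]) % shardCount;
}

const shardBuckets = [[], [], []]; // three hypothetical shards
const sampleUsers = [
  { username: "alice" },
  { username: "bob" },
  { username: "carol" },
  { username: "dave" },
];
for (const user of sampleUsers) {
  shardBuckets[routeToShard(user, "username", shardBuckets.length)].push(user);
}
console.log(shardBuckets.map((bucket) => bucket.length));
```

The same key always hashes to the same shard, so a router can send a query for one username to a single machine instead of all of them.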
Replication is the process of synchronizing data across multiple servers. It provides redundancy and increases data availability by keeping multiple copies of the data on different database servers. Replication helps protect the database from the loss of a single server.
The points that need to be taken into consideration are :
 
* Design your schema according to user requirements
* Combine objects into one document if you use them together; otherwise, separate them
* Do joins on write, not on read
* Optimize your schema for the most frequent use cases
* Do complex aggregation in the schema
A NoSQL database provides a mechanism for storage and retrieval of data that is modeled by means other than the tabular relations used in relational databases (such as MySQL, Oracle, etc.).
 
Types of NoSQL databases :
 
* Graph
* Document Oriented
* Key Value
* Column Oriented
MongoDB is a document-oriented database. It stores data as BSON documents, and these documents are stored in collections.
What is MySQL?

It is a feature-rich RDBMS (Relational Database Management System) originally built by MySQL AB and now owned by Oracle Corporation. MySQL keeps records of data in tables grouped into a database. It uses SQL (Structured Query Language) for accessing data and for commands like Select, Insert, Update, and Delete.
 
It also allows related information to be stored in different tables. The JOIN operation then lets you correlate that information and execute queries across different tables, which minimizes the chance of data duplication.
 
As for compatibility, it works well with almost all operating systems, including Windows, macOS, Linux, and UNIX. MySQL also supports a wide range of storage engines, such as InnoDB, Merge, Blackhole, Memory, and CSV, to name a few.


What is MongoDB?

Developed by 10gen, MongoDB is a popular document-oriented database. It creates and stores documents in BSON (Binary JSON) format and, as a result, supports all JavaScript data types. The database is often used for projects built on Node.js.
 
Apart from this, JSON allows data to be transferred between web apps and servers in a format that is easy for a human to read. MongoDB can be considered a better option in terms of reliability and efficiency when it comes to storage speed and capacity.
 
On top of that, it allows dynamic schemas, which remove the need to predefine the structure, such as fields and value types.
A covered query executes faster because the indexes it uses are kept in RAM or stored sequentially on disk. In a covered query, all the fields involved are part of an index, so MongoDB can match the query conditions and return the result fields using only that index.
No. Writes to disk are lazy by default. A write may only hit the disk a couple of seconds later. For example, if the database receives a thousand increments to an object within one second, it will only be flushed to disk once. (Note: fsync options are available both at the command line and via getLastError.)
MongoDB does not use traditional locking or complex transactions with rollback, as it is designed to be lightweight, fast, and predictable in its performance. It can be thought of as analogous to MySQL’s MyISAM autocommit model. By keeping transaction support extremely simple, performance is enhanced, especially in a system that may run across many servers.
It may take 10-30 seconds for the primary to be declared down by the other members and a new primary to be elected. During this window of time, the cluster is down for primary operations, i.e. writes and strongly consistent reads. However, eventually consistent queries may be executed against secondaries at any time (in slaveOk mode), including during this window.
A secondary is a node/member which applies operations from the current primary. This is done by tailing the replication oplog (local.oplog.rs). Replication from primary to secondary is asynchronous; however, the secondary will try to stay as close to current as possible (often this is just a few milliseconds on a LAN).
No. If ‘getLastError’ (aka ‘Safe Mode’) is not called, the server behaves exactly as if it had been called. The ‘getLastError’ call simply allows one to get a confirmation that the write operation was successfully committed. Of course, often you will want that confirmation, but the safety of the write and its durability are independent of it.
We suggest starting with Non-Sharded for simplicity and quick startup, unless your initial data set will not fit on a single server. Upgrading to Sharded from Non-Sharded is easy and seamless, so there is not a lot of advantage in setting up Sharding before your data set is large.
BSON is a binary serialization format used to store documents and make remote procedure calls in MongoDB. BSON extends the JSON model to provide additional data types, ordered fields, and to be efficient for encoding and decoding within different languages.
The mongo shell is a JavaScript shell that allows interaction with a MongoDB instance from the command line. With it, one can perform administrative functions, inspect an instance, or explore MongoDB.
 
To start the shell, run the mongo executable:
$ mongod
$ mongo
MongoDB shell version: 4.2.0
connecting to: test
>
The shell is a full-featured JavaScript interpreter, capable of running arbitrary JavaScript programs. Let’s see how basic math works on this:
> x = 100;
100
> x / 5;
20

 

The document-oriented data model of MongoDB makes it easier to split data across multiple servers. MongoDB balances data and load across the cluster, redistributing documents automatically.
 
The mongos acts as a query router, providing an interface between client applications and the sharded cluster.
 
Config servers store metadata and configuration settings for the cluster. MongoDB uses the config servers to manage distributed locks. Each sharded cluster must have its own config servers.
Once a document is stored in the database, it can be changed using one of several update methods: updateOne, updateMany, and replaceOne. updateOne and updateMany each take a filter document as their first parameter and a modifier document, which describes the changes to make, as the second parameter. replaceOne also takes a filter as the first parameter, but as the second parameter it expects a document with which it will replace the document matching the filter.
 
For example, in order to replace a document :
{
   "_id" : ObjectId("4b2b9f67a1f631733d917a7a"),
   "name" : "alice",
   "friends" : 24,
   "enemies" : 2
}
The CRUD API in MongoDB provides deleteOne and deleteMany for deleting documents. Both of these methods take a filter document as their first parameter. The filter specifies a set of criteria to match against when removing documents.
 
For example :

> db.books.deleteOne({"_id" : 3})

The find method is used to perform queries in MongoDB. Querying returns a subset of documents in a collection, from no documents at all to the entire collection. Which documents get returned is determined by the first argument to find, which is a document specifying the query criteria.
 
Example :

> db.users.find({"age" : 24})
MongoDB supports a wide range of data types as values in documents. Documents in MongoDB are similar to objects in JavaScript. Along with JSON’s essential key/value–pair nature, MongoDB adds support for a number of additional data types. The common data types in MongoDB are :
 
Null {"x" : null}

Boolean {"x" : true}

Number {"x" : 4}

String {"x" : "foobar"}

Date {"x" : new Date()}

Regular expression {"x" : /foobar/i}

Array {"x" : ["a", "b", "c"]}

Embedded document {"x" : {"foo" : "bar"}}

Object ID {"x" : ObjectId()}

Binary Data : Binary data is a string of arbitrary bytes.

Code {"x" : function() { /* ... */ }}

The find method is used to perform queries in MongoDB. Querying returns a subset of documents in a collection, from no documents at all to the entire collection. Which documents get returned is determined by the first argument to find, which is a document specifying the query criteria.
 
For example : If we have a string we want to match, such as a "username" key with the value "alice", we use that key/value pair instead:
 
> db.users.find({"username" : "alice"})
 
In MongoDB, indexes help resolve queries efficiently. An index stores a small part of the data set in a form that is easy to traverse: the values of a specific field or set of fields, ordered by the value of the field as specified in the index.
MongoDB’s indexes work almost identically to typical relational database indexes.
 
An index looks like an ordered list with references to the content, which allows MongoDB to query orders of magnitude faster. To create an index, use the createIndex collection method.
 
For example :
 
> db.users.find({"username": "user101"}).explain("executionStats")
 
Here, executionStats mode helps us understand the effect of using an index to satisfy queries.
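 
As a rough sketch of why an ordered list with references speeds up lookups, the following plain JavaScript builds a toy index over an in-memory array and searches it with binary search. The `people` array and the `buildIndex` and `findByIndex` helpers are hypothetical and only model the idea; MongoDB's real indexes are maintained by the server.

```javascript
// Hypothetical in-memory "collection": an array of documents.
const people = [
  { username: "user103", age: 41 },
  { username: "user101", age: 24 },
  { username: "user102", age: 35 },
];

// Build a toy index on one field: sorted [value, position] entries.
function buildIndex(docs, field) {
  return docs
    .map((doc, pos) => [doc[field], pos])
    .sort((a, b) => (a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0));
}

// Binary search the ordered index, then follow the reference to the document.
function findByIndex(docs, index, value) {
  let lo = 0;
  let hi = index.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (index[mid][0] === value) return docs[index[mid][1]];
    if (index[mid][0] < value) lo = mid + 1;
    else hi = mid - 1;
  }
  return null; // no document has this value
}

const usernameIndex = buildIndex(people, "username");
const hit = findByIndex(people, usernameIndex, "user101");
console.log(hit);
```

Without the index, every document would have to be inspected in turn; with it, the number of comparisons grows only logarithmically with the collection size.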
MongoDB has two types of geospatial indexes : 2dsphere and 2d. 2dsphere indexes work with spherical geometries that model the surface of the earth based on the WGS84 datum. This datum models the surface of the earth as an oblate spheroid, meaning that there is some flattening at the poles. Distance calculations using 2dsphere indexes, therefore, take the shape of the earth into account and provide a more accurate treatment of distance between, for example, two cities, than do 2d indexes. Use 2d indexes for points stored on a two-dimensional plane.
 
2dsphere allows you to specify geometries for points, lines, and polygons in the GeoJSON format. A point is given by a two-element array, representing [longitude, latitude]:
{
   "name" : "New York City",
   "loc" : {
       "type" : "Point",
       "coordinates" : [50, 2]
   }
}
A line is given by an array of points :
{
   "name" : "Hudson River",
   "loc" : {
       "type" : "LineString",
       "coordinates" : [[0,1], [0,2], [1,2]]
   }
}
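To see why a spherical model gives different distances than a flat plane, here is a small JavaScript sketch using the haversine formula. It assumes a perfect sphere with a mean radius of 6371 km, which is simpler than the WGS84 spheroid described above, and the function names are made up for the example.

```javascript
const EARTH_RADIUS_KM = 6371; // mean radius; assumes a perfect sphere, not WGS84

function toRadians(degrees) {
  return (degrees * Math.PI) / 180;
}

// Great-circle distance between two GeoJSON-style [longitude, latitude] points.
function greatCircleKm([lon1, lat1], [lon2, lat2]) {
  const dLat = toRadians(lat2 - lat1);
  const dLon = toRadians(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(lat1)) * Math.cos(toRadians(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
}

// Flat-plane distance in coordinate units, as a 2d-style comparison.
function planarDistance([x1, y1], [x2, y2]) {
  return Math.hypot(x2 - x1, y2 - y1);
}

// One degree of longitude at the equator versus at 60 degrees north.
console.log(greatCircleKm([0, 0], [1, 0]), greatCircleKm([0, 60], [1, 60]));
console.log(planarDistance([0, 0], [1, 0]), planarDistance([0, 60], [1, 60]));
```

On the plane both pairs of points are exactly one unit apart, while on the sphere the pair at 60 degrees north is only about half as far apart as the pair at the equator.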
"$set" sets the value of a field. If the field does not yet exist, it will be created. This can be useful for updating schemas or adding user-defined keys.
 
Example :
> db.users.findOne()
{
   "_id" : ObjectId("4b253b067525f35f94b60a31"),
   "name" : "alice",
   "age" : 23,
   "sex" : "female",
   "location" : "India"
}
 
To add a field to this, we use “$set” :
> db.users.updateOne({"_id" : ObjectId("4b253b067525f35f94b60a31")},
... {"$set" : {"favorite book" : "Start with Why"}})
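 
The effect of "$set" on a single document can be mimicked in plain JavaScript. The `applySet` helper below is hypothetical and handles only top-level fields (no dotted paths); it is a sketch of the semantics, not how the server implements updates.

```javascript
// Toy model of "$set": existing fields are overwritten, missing ones are created.
// Handles top-level fields only and returns a new object instead of mutating.
function applySet(doc, update) {
  return { ...doc, ...update["$set"] };
}

const alice = { name: "alice", age: 23 };

// The field does not exist yet, so it is added.
const withBook = applySet(alice, { "$set": { "favorite book": "Start with Why" } });

// The field already exists, so its value is replaced.
const older = applySet(withBook, { "$set": { age: 24 } });

console.log(withBook, older);
```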

 

Yes, old files in the moveChunk directory can be removed. These files are created as backups during normal shard balancing operations and can be deleted once those operations are done.
Currently, MongoDB provides official driver support for C, C++, C#, Java, Node.js, Perl, PHP, Python, Ruby, Scala, Go and Erlang. MongoDB can easily be used with any of these languages. There are some other community-supported drivers too, but the ones mentioned above are officially provided by MongoDB.
MongoDB allows a highly flexible and scalable document structure. For example, one document in a MongoDB collection can have five fields while another document in the same collection has ten. Also, MongoDB databases can be faster than SQL databases for some workloads, owing to efficient indexing and storage techniques.
Although both of these databases are document oriented, MongoDB is a better choice for applications which need dynamic queries and good performance on a very big database. On the other side, CouchDB is better used for applications with occasionally changing queries and pre-defined queries.
No. MongoDB can be run even on a small amount of RAM. MongoDB dynamically allocates and de-allocates RAM based on the requirements of other processes.
An ObjectId is a 12-byte BSON type made up of :
 
* a 4-byte value representing the seconds since the Unix epoch
* a 3-byte machine identifier
* a 2-byte process id
* a 3-byte counter
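 
The layout above can be made concrete by splitting the 24-character hex form of an ObjectId into its parts. The `parseObjectId` helper is a hypothetical name for this sketch, and it assumes the 4-3-2-3 byte layout listed above.

```javascript
// Split a 24-character hex ObjectId into the four fields listed above.
function parseObjectId(hex) {
  if (!/^[0-9a-f]{24}$/i.test(hex)) {
    throw new Error("expected 24 hexadecimal characters");
  }
  return {
    seconds: parseInt(hex.slice(0, 8), 16),     // 4 bytes: seconds since the epoch
    machine: hex.slice(8, 14),                  // 3 bytes: machine identifier
    processId: parseInt(hex.slice(14, 18), 16), // 2 bytes: process id
    counter: parseInt(hex.slice(18, 24), 16),   // 3 bytes: counter
  };
}

const parts = parseObjectId("4b253b067525f35f94b60a31");
const createdAt = new Date(parts.seconds * 1000); // the timestamp as a Date
console.log(parts, createdAt.toISOString());
```

Because the timestamp comes first, ObjectIds generated later compare as larger, which is why sorting by "_id" roughly follows insertion order.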
A covered query is one in which :
 
* the fields used in the query are part of an index used in the query, and
* the fields returned in the results are in the same index
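 
The two conditions can be written as a small check in JavaScript. The `isCovered` function is a hypothetical helper that looks only at top-level field names (no operators or embedded documents). It also accounts for the fact that "_id" is returned by default unless the projection excludes it.

```javascript
// Returns true when an index on `indexFields` could cover the query.
// Simplified: compares top-level field names only.
function isCovered(indexFields, filter, projection) {
  const indexed = new Set(indexFields);

  // Condition 1: every field used in the query is part of the index.
  const filterOk = Object.keys(filter).every((field) => indexed.has(field));

  // Condition 2: every returned field is part of the same index.
  const returned = Object.keys(projection).filter((field) => projection[field] === 1);
  // "_id" comes back by default unless it is explicitly excluded with 0.
  if (projection._id !== 0 && !returned.includes("_id")) returned.push("_id");
  const projectionOk = returned.length > 0 && returned.every((field) => indexed.has(field));

  return filterOk && projectionOk;
}

console.log(isCovered(["username", "age"], { username: "alice" }, { age: 1, _id: 0 }));
console.log(isCovered(["username"], { username: "alice" }, { username: 1 }));
```

The second call is not covered because "_id" is still returned and is not in the index.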
Aggregation operations process data records and return computed results. Aggregation operations group values from multiple documents together, and can perform a variety of operations on the grouped data to return a single result. MongoDB provides three ways to perform aggregation: the aggregation pipeline, the map-reduce function, and single purpose aggregation methods and commands.
A storage engine is the part of a database that is responsible for managing how data is stored on disk. For example, one storage engine might offer better performance for read-heavy workloads, and another might support a higher-throughput for write operations.
When running with journaling, MongoDB stores and applies write operations in memory and in the on-disk journal before the changes are present in the data files on disk. Writes to the journal are atomic, ensuring the consistency of the on-disk journal files. With journaling enabled, MongoDB creates a journal subdirectory within the directory defined by dbPath, which is /data/db by default.
MongoDB uses reader-writer locks that allow concurrent readers shared access to a resource, such as a database or collection, but give exclusive access to a single write operation.
Because MongoDB uses memory-mapped files, a 32-bit build of MongoDB limits the total storage size of the server to 2 GB, whereas a 64-bit build provides virtually unlimited storage size. So 64-bit is preferred over 32-bit.
In MongoDB, the primary node is the node that can accept writes. It is also known as the master node. Replication in MongoDB is single-master, so only one node can accept write operations at a time.
 
Secondary nodes are also known as slave nodes. They are read-only nodes that replicate from the primary.
Difference between MongoDB and Redis :
 
* Redis is faster than MongoDB.
* Redis is hard to code but MongoDB is easy.
* Redis has a key-value storage whereas MongoDB has a document type storage.
Difference between MongoDB and Cassandra :
 
* MongoDB is written in C++ while Cassandra is written in Java.
* MongoDB is a cross-platform document-oriented database system while Cassandra is a high-performance distributed database system.
* MongoDB is easy to administer in the case of failure while Cassandra provides high availability with no single point of failure.
In MongoDB, the following syntax is used for sorting documents :
 
>db.COLLECTION_NAME.find().sort({KEY:1})

While creating a schema in MongoDB, the points that need to be taken care of are as follows :
 
* Design the schema according to the user requirements
* Combine objects into one document if we want to use them together; otherwise, separate them
* Do joins on write, not on read
* Optimize the schema for the most frequent use cases
* Do complex aggregation in the schema
Documents are updated quickly for normalized data and slowly for denormalized data, much as in standard RDBMSes. Reading documents, on the other hand, is faster in denormalized data and slower in normalized data. Denormalized data is more difficult to maintain and takes up more room.
 
It should be noted that in MongoDB, denormalized data is more commonly anticipated. This is because RDBMSes have built-in support for normalization and enable data to be handled as a separate issue, while NoSQL DBMSes like MongoDB do not.
 
Instead, normalization necessitates that client applications carefully protect their own integrity. To help with this, audits may be performed to ensure the app data conforms to anticipated patterns of referential integrity.
The following utilities can help handle live MongoDB data.
 
* MongoHub has been migrated to a native Mac version.

* This comes in handy with tree and document views.

* Genghisapp – this is a web-based GUI that is clean, light-weight, easy to use, has keyboard shortcuts and performs fantastically well. GridFS is also supported.
MongoDB is often regarded as one of the strongest NoSQL databases due to the following characteristics:
 
* Document-oriented (DO)
* High performance (HP)
* High availability (HA)
* Scalability is easy
* Rich query language
* A 32-bit edition has 2GB data limit. After that it will corrupt the entire DB, including the existing data. A 64-bit edition won’t suffer from this bug/feature.

* Default installation of MongoDB has asynchronous and batch commits turned on. Meaning, it lies when asked to store something in DB and commits all changes in a batch at a later time in future. If there is a server crash or power failure, all those commits buffered in memory will be lost. This functionality can be disabled, but then it will perform as good as or worse than MySQL.

* MongoDB is only ideal for implementing things like analytics/caching where impact of small data loss is negligible.

* In MongoDB, it’s difficult to represent relationships between data so you end up doing that manually by creating another table to represent the relationship between rows in two or more tables.
The operations log (oplog) is a special capped collection that stores a rolling record of all the operations that change the data stored in our databases. MongoDB first applies the database operations on the primary and then records these operations in the primary's oplog. After that, the secondary members replicate and apply the operations in an asynchronous process.
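The flow described here (apply on the primary, record in a rolling log, replay on a secondary) can be modelled in a few lines of JavaScript. Everything in this sketch, including the `CappedLog` class and the map-based "nodes", is a simplified assumption and not MongoDB's internal implementation.

```javascript
// Toy rolling log: keeps only the newest `capacity` entries.
class CappedLog {
  constructor(capacity) {
    this.capacity = capacity;
    this.entries = [];
    this.nextTs = 1; // simple increasing timestamp
  }
  append(op) {
    const entry = { ts: this.nextTs++, ...op };
    this.entries.push(entry);
    if (this.entries.length > this.capacity) this.entries.shift(); // drop the oldest
    return entry;
  }
  since(ts) {
    return this.entries.filter((entry) => entry.ts > ts);
  }
}

const primaryData = new Map();
const secondaryData = new Map();
const oplog = new CappedLog(100);
let lastApplied = 0; // timestamp of the last entry the secondary has replayed

// The primary applies the write first, then records it in its oplog.
function writeOnPrimary(key, value) {
  primaryData.set(key, value);
  oplog.append({ key, value });
}

// The secondary later replays every entry it has not seen yet.
function syncSecondary() {
  for (const entry of oplog.since(lastApplied)) {
    secondaryData.set(entry.key, entry.value);
    lastApplied = entry.ts;
  }
}

writeOnPrimary("views", 1);
writeOnPrimary("views", 2);
syncSecondary();
console.log(secondaryData.get("views"), lastApplied);
```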
The following commands delete everything from a MongoDB database :
use [database];
db.dropDatabase();
Ruby code should be pretty similar.
Also, from the command line:
mongo [database] --eval "db.dropDatabase();"
To drop a database :
use [databaseName]
db.dropDatabase();
To drop a collection :
use databaseName
db.collectionName.drop();
C – Create : db.collection.insert();
R – Read : db.collection.find();
U – Update : db.collection.update();
D – Delete : db.collection.remove({"fieldname" : "value"});
In a covered query, all the fields used in the query have the index created. The results returned should also be part of the index. Due to this, MongoDB fetches the results without actually looking inside documents, thus saving time and increasing efficiency.
Replication means synchronizing the data across multiple servers. It increases data availability. If a single server is lost, data is still intact in the other servers.
 
Primary : MongoDB writes data only to the primary (master) member of the replica set.

Secondary : secondary (slave) members accept only reads. They replicate from the primary.
MongoDB uses multi-granularity locking, wherein operations can be locked at the global, database, or collection level. It is up to the storage engine to implement the level of concurrency. For example, in WiredTiger, it is at document level. For reads, there is a shared locking mode, while for writes there is an exclusive locking mode.
find() with a projection : displays only selected fields rather than all the data of a document. For example, if your document has 4 fields but you want to show only one, set the required field to 1 in the second argument to find(). Fields can instead be excluded by setting them to 0.
 
db.COLLECTION_NAME.find({}, {KEY: 1});

limit() : limit function limits the number of records fetched. For example, if you have 7 documents but want to display only the first 4 documents in a collection, use limit. Syntax :
 
db.COLLECTION_NAME.find().limit(NUMBER);
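 
What projection and limit do to a result set can be imitated on a plain array. The `project` helper and the `books` data are hypothetical, and the sketch ignores the special handling of "_id".

```javascript
// Toy inclusion projection: keep only the fields marked with 1.
function project(doc, projection) {
  const out = {};
  for (const [field, flag] of Object.entries(projection)) {
    if (flag === 1 && field in doc) out[field] = doc[field];
  }
  return out;
}

const books = [
  { title: "A", pages: 100, year: 2001 },
  { title: "B", pages: 250, year: 2005 },
  { title: "C", pages: 90, year: 2010 },
];

// Roughly what find({}, {title: 1}).limit(2) does: take the first two
// documents and keep only the requested field of each.
const firstTwoTitles = books.slice(0, 2).map((book) => project(book, { title: 1 }));
console.log(firstTwoTitles);
```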
Map-reduce is a method of performing aggregation.
 
* The map function emits the specified key-value pairs.
* The reduce function combines the values emitted for each key and returns the aggregation result.

Syntax :
db.collection.mapReduce(
 function() {emit(key,value);},
 function(key, values) {return aggregatedResult}, { out: collection }
 )
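 
The two phases can be tried out without a server using a toy map-reduce over an array. This is a sketch under assumptions: the `mapReduce` function here is a stand-in written for the example, it passes `emit` to the map function as an argument rather than exposing it globally, and the `orders` data is invented.

```javascript
// Toy map-reduce over an in-memory array of documents.
function mapReduce(docs, mapFn, reduceFn) {
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  // Map phase: each document is `this` inside the map function.
  for (const doc of docs) mapFn.call(doc, emit);
  // Reduce phase: combine all values emitted for the same key.
  const result = {};
  for (const [key, values] of groups) result[key] = reduceFn(key, values);
  return result;
}

const orders = [
  { customer: "alice", amount: 10 },
  { customer: "bob", amount: 5 },
  { customer: "alice", amount: 7 },
];

const totals = mapReduce(
  orders,
  function (emit) { emit(this.customer, this.amount); },
  (key, values) => values.reduce((sum, value) => sum + value, 0)
);
console.log(totals);
```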

 

GridFS stores and retrieves large files such as images, audio, and video files. Although a single BSON document is limited to 16 MB, GridFS can store files larger than that. GridFS breaks the file into chunks and stores each chunk as a separate document with a maximum size of 255 kB. It uses two collections, fs.chunks and fs.files, for storing chunks and metadata, respectively.
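The chunking arithmetic can be sketched as follows. The code assumes a chunk size of 255 * 1024 bytes to match the figure quoted above, and `chunkRanges` is a hypothetical helper, not part of any driver.

```javascript
const CHUNK_SIZE = 255 * 1024; // bytes per chunk, matching the size quoted above

// Compute the byte range each chunk would hold for a file of the given size.
function chunkRanges(fileSize, chunkSize = CHUNK_SIZE) {
  const ranges = [];
  for (let n = 0, start = 0; start < fileSize; n++, start += chunkSize) {
    ranges.push({ n, start, end: Math.min(start + chunkSize, fileSize) });
  }
  return ranges;
}

const twentyMiB = 20 * 1024 * 1024; // larger than the 16 MB document limit
const ranges = chunkRanges(twentyMiB);
const lastRange = ranges[ranges.length - 1];
console.log(ranges.length, lastRange.end - lastRange.start);
```

Each range would become one document in fs.chunks, with the chunk number `n` recording its position in the file, while fs.files would hold a single metadata document for the whole file.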
To start a MongoDB instance, follow the steps below :
 
* First, open the command prompt and run mongod.exe.
* Alternatively, you can move to the path where MongoDB is installed, for example, “C:\MongoDB”.
* Navigate to the bin folder, locate mongod.exe, and double-click it to execute it.
* You can also navigate to the required folder, for example, “C:\MongoDB\bin”, and type mongo to connect to MongoDB through the shell.