Introduction to MongoDB

What is MongoDB?

Very few of us have not heard this name, but many have not gone deeper to understand it. This tutorial is an introduction to the concepts as well as the syntax, which should be enough to get started. MongoDB is an open-source, document-oriented database and a leading NoSQL database. It is written in C++. It is cross-platform and provides high performance, high availability, and easy scalability. MongoDB works on the concept of collections and documents. Now what does that mean? The description above mentions several terms one after the other. Let's examine them one by one.

Document oriented Database

This is certainly the foremost. MongoDB is a database. That means it is a collection of data that facilitates easy access and update. MongoDB is a NoSQL, document-oriented database. This means it is not a traditional structured database that stores data in formal tables and columns. Instead, it stores data in an unstructured form, as collections of documents. Now what is the big deal? Is unstructured data more efficient? If that is the case, why did our ancestors take all the trouble of structuring data in the form of tables? And if structuring was the right way to go, why is everyone moving to unstructured data? How can both be correct? Has something changed?
Yes, a lot has changed. The data has changed. The volume of data has changed. The velocity of data has changed. The variety of data has changed. On the other hand, storage capacity, processing power and search algorithms have changed as well. Earlier, data arrived slowly and in a consistent shape, and database search had higher latency, so it made more sense to keep the data well organized so that the search would be quick enough. Today, the extreme velocity, variety and volume of the data leave you no time to organize it. Also, the improved search process allows you to afford unstructured storage.
In simple words: when search was the bottleneck, databases were implemented to organize the data in a way that facilitates faster search. Today, the bottleneck is in absorbing and streamlining the high volume, variety and velocity of the data being pumped into the database. Naturally, the database implementations have changed.

Collections and Documents

MongoDB stores its data in the form of collections and documents. A MongoDB database is a container of collections. Each database gets its own set of files on the file system. A single MongoDB server typically has multiple databases. A collection is a group of MongoDB documents. It is the equivalent of an RDBMS table. A collection exists within a single database. Collections do not enforce a schema. Documents within a collection can have different fields. Typically, all documents in a collection serve a similar or related purpose.
A document is a set of key-value pairs. Documents have dynamic schema. Dynamic schema means that documents in the same collection do not need to have the same set of fields or structure, and common fields in a collection's documents may hold different types of data. Any sensible developer would create a collection out of related documents. Even if they do not follow the same syntactic schema, they would be functionally related. But MongoDB does not insist on any such restriction.
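To make the idea of a dynamic schema concrete, here is a minimal sketch in plain JavaScript. It does not talk to MongoDB at all: the collection is modeled as an ordinary array, and the sample documents and the fieldTypes helper are hypothetical names invented for this illustration.

```javascript
// A "collection" modeled as a plain array of documents (hypothetical data).
const blog = [
  { title: "NoSQL Database" },
  { title: "MongoDB", description: "Introduction to MongoDB" },
  { title: 42, tags: ["big data", "storage"] }, // same key, different type
];

// Collect every field name in the collection and the value types seen for it.
function fieldTypes(docs) {
  const seen = {};
  for (const doc of docs) {
    for (const [key, value] of Object.entries(doc)) {
      const type = Array.isArray(value) ? "array" : typeof value;
      (seen[key] = seen[key] || new Set()).add(type);
    }
  }
  return Object.fromEntries(
    Object.entries(seen).map(([key, types]) => [key, [...types].sort()])
  );
}

const summary = fieldTypes(blog);
console.log(summary);
```

The three documents share a collection even though they have different fields, and the title field holds a string in two of them and a number in the third.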

Performance / Availability / Scalability

Gone are the days when a database was just another application running on the same server as the processes using its data, when the servers were rebooted every night or at least on weekends. Today, any application worth its name spans geographies, not just servers. Such applications are accessed by users spread across the globe. These applications have to tolerate massive fluctuations in the number of users, without wasting any resources or running out of them. That is where performance, availability and scalability are important. The applications have to be scalable by design. They should be able to perform well under different user loads. And they can't afford any downtime: they have to be always available. MongoDB is a sturdy database that fits well in such applications.

Installing MongoDB on Ubuntu

The community edition of MongoDB can be installed on Ubuntu or any Debian system in a few simple steps:
Import the public key
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 0C49F3730359A14518585931BC711F9BA15703C6
Create a list file for MongoDB
echo "deb [ arch=amd64,arm64 ] http://repo.mongodb.org/apt/ubuntu xenial/mongodb-org/3.4 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-3.4.list
(xenial for Ubuntu 16 - Use the appropriate path for your version of Ubuntu.)
Reload the package database
sudo apt-get update
Install the latest stable version of MongoDB
sudo apt-get install -y mongodb-org

The MongoDB Server

MongoDB runs as a Linux service. It can be handled in Ubuntu like any other service.
sudo service mongod start
sudo service mongod stop
sudo service mongod restart

Uninstall MongoDB

If you are done with it and want to uninstall the MongoDB server, it is equally simple. First, stop the MongoDB service. Then remove any MongoDB packages that you have installed. Finally, remove the database and log files.
sudo service mongod stop
sudo apt-get purge mongodb-org*
sudo rm -r /var/log/mongodb
sudo rm -r /var/lib/mongodb

Basic CRUD Operations

MongoDB comes with a command shell that can be used to connect to the database and run database commands. The shell can be started using the simple command "mongo". If you are on the same system where the MongoDB server is running, that is all you need to connect to it. It will open the mongo shell, where you can type MongoDB commands.

Before you start working, you need to connect to a particular database on the server. You can do it with the "use" command.
use test
This will switch the shell to the test database. Note that the MongoDB shell creates objects implicitly by default. That is, it will create the referred objects if they do not exist. In this case, if the test database does not exist, it will be created when you first store data in it, instead of giving a "database not found" error. This can also get you into trouble when you mistype a name. MongoDB provides ways to avoid that. We will look at those down the line.
Let us now look at the standard CRUD operations that any normal database provides:

Insert

Now that we have connected to the test database, we can insert a document into the collection called blog.
> db.blog.insert({ "title" : "NoSQL Database"})
WriteResult({ "nInserted" : 1 })

> db.blog.insert({ "title" : "MongoDB"})
WriteResult({ "nInserted" : 1 })

> db.blog.insert({ "title" : "Big Data"})
WriteResult({ "nInserted" : 1 })
Again, the collection need not exist before we fire the insert command. MongoDB handles that for us. The insert command returns an object - WriteResult({ "nInserted" : 1 }) - that contains the status of the operation.
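The implicit creation described above can be sketched with a small in-memory stand-in written in plain JavaScript. This is only a model of the behaviour, not MongoDB's implementation; the collections map and the insert function are hypothetical names used for illustration.

```javascript
// In-memory stand-in for a database: collection name -> array of documents.
const collections = new Map();

// Insert a document, creating the collection on first use.
function insert(collectionName, doc) {
  if (!collections.has(collectionName)) {
    collections.set(collectionName, []); // implicit creation, no error raised
  }
  collections.get(collectionName).push(doc);
  return { nInserted: 1 };
}

const result = insert("blog", { title: "NoSQL Database" });
insert("blog", { title: "MongoDB" });
insert("blogs", { title: "Big Data" }); // mistyped name: a second collection appears

console.log(result);
console.log([...collections.keys()]);
```

The last insert shows the flip side of this convenience: a mistyped collection name does not fail, it quietly creates a new collection.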

Find

We can read records from the database using the find command.
> db.blog.find()
{ "_id" : ObjectId("59b79bc90426610fcc7125ef"), "title" : "NoSQL Database" }
{ "_id" : ObjectId("59b79bd50426610fcc7125f0"), "title" : "MongoDB" }
{ "_id" : ObjectId("59b79bdc0426610fcc7125f1"), "title" : "Big Data" }
Note that there is an additional field _id that the database generates by itself. We also have a choice to add it manually. Both approaches have their own pros and cons. We will have a look into those when we take up database design. When we find, we can also narrow the output using a kind of query. That helps us locate the exact record that we are looking for.
> db.blog.find({"title" : "Big Data"})
{ "_id" : ObjectId("59b79bdc0426610fcc7125f1"), "title" : "Big Data" }
You can also use inequalities for querying. For example:
> db.blog.find({"title" : {$gt : "Big Data"}})
{ "_id" : ObjectId("59b79bc90426610fcc7125ef"), "title" : "NoSQL Database" }
{ "_id" : ObjectId("59b79bd50426610fcc7125f0"), "title" : "MongoDB" }
MongoDB also provides for various other features like query combinations, sorting, cursors, etc.
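The way the two queries above select documents can be approximated in plain JavaScript. The sketch below is a simplified model that understands only exact equality and the $gt operator; matchesQuery and the posts array are hypothetical names, and the real server supports far more operators and uses indexes rather than a linear scan.

```javascript
const posts = [
  { title: "NoSQL Database" },
  { title: "MongoDB" },
  { title: "Big Data" },
];

// Return true when `doc` satisfies `query`. Supports only plain equality
// and the $gt operator, which is enough for the two queries shown above.
function matchesQuery(doc, query) {
  return Object.entries(query).every(([field, condition]) => {
    const value = doc[field];
    if (condition !== null && typeof condition === "object") {
      return "$gt" in condition && value > condition.$gt;
    }
    return value === condition;
  });
}

const exact = posts.filter((doc) => matchesQuery(doc, { title: "Big Data" }));
const greater = posts.filter((doc) =>
  matchesQuery(doc, { title: { $gt: "Big Data" } })
);

console.log(exact.map((doc) => doc.title));
console.log(greater.map((doc) => doc.title));
```

For strings, "greater than" means later in character-by-character order, which is why both "MongoDB" and "NoSQL Database" qualify while "Big Data" itself does not.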

Update

> db.blog.update(
... {"title" : "MongoDB"},
... {$set : {"description" : "Introduction to MongoDB"}}
... )
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })

> db.blog.find()
{ "_id" : ObjectId("59b79bc90426610fcc7125ef"), "title" : "NoSQL Database" }
{ "_id" : ObjectId("59b79bdc0426610fcc7125f1"), "title" : "Big Data" }
{ "_id" : ObjectId("59b79bd50426610fcc7125f0"), "title" : "MongoDB", "description" : "Introduction to MongoDB" }
The update() command takes two arguments: the criteria and the update. The WriteResult object shows that only one record was matched and modified. The db.blog.find() confirms that.
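The criteria-plus-update pattern can be modeled in a few lines of plain JavaScript. This is a sketch under stated assumptions, not the server's logic: updateFirst and the articles array are hypothetical names, only equality criteria and $set are handled, and only the first matching document is changed, which mirrors the shell's default of updating a single document.

```javascript
const articles = [
  { title: "NoSQL Database" },
  { title: "MongoDB" },
  { title: "Big Data" },
];

// Apply a {$set: {...}} update to the first document whose fields equal
// the criteria, and report counts shaped like the shell's WriteResult.
function updateFirst(docs, criteria, update) {
  const target = docs.find((doc) =>
    Object.entries(criteria).every(([key, value]) => doc[key] === value)
  );
  if (!target) {
    return { nMatched: 0, nUpserted: 0, nModified: 0 };
  }
  const changes = Object.entries(update.$set);
  const modified = changes.some(([key, value]) => target[key] !== value);
  for (const [key, value] of changes) {
    target[key] = value; // add the field or overwrite its value
  }
  return { nMatched: 1, nUpserted: 0, nModified: modified ? 1 : 0 };
}

const change = { $set: { description: "Introduction to MongoDB" } };
const writeResult = updateFirst(articles, { title: "MongoDB" }, change);
const repeated = updateFirst(articles, { title: "MongoDB" }, change);

console.log(writeResult);
console.log(repeated);
```

Running the same update twice shows the difference between the two counters: the second call still matches the document, but modifies nothing because the field already holds that value.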

Remove

Finally, we can delete a document from the collection using the remove command.
> db.blog.remove({"title" : "Big Data"})
WriteResult({ "nRemoved" : 1 })

> db.blog.find()
{ "_id" : ObjectId("59b79bc90426610fcc7125ef"), "title" : "NoSQL Database" }
{ "_id" : ObjectId("59b79bd50426610fcc7125f0"), "title" : "MongoDB", "description" : "Introduction to MongoDB" }
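The effect of remove can be sketched the same way in plain JavaScript. The names removeMatching and entries are hypothetical, and the sample data deliberately contains a duplicate title to show that, by default, remove deletes every document that matches the criteria and not just the first one.

```javascript
const entries = [
  { title: "NoSQL Database" },
  { title: "Big Data" },
  { title: "MongoDB" },
  { title: "Big Data" }, // duplicate, to show that all matches are removed
];

// Delete every document whose fields equal the criteria and report the count.
function removeMatching(docs, criteria) {
  const keep = docs.filter(
    (doc) => !Object.entries(criteria).every(([key, value]) => doc[key] === value)
  );
  const nRemoved = docs.length - keep.length;
  docs.splice(0, docs.length, ...keep); // replace the contents in place
  return { nRemoved };
}

const removal = removeMatching(entries, { title: "Big Data" });

console.log(removal);
console.log(entries.map((doc) => doc.title));
```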
That was a brief introduction to the MongoDB database. Of course there is a lot more to MongoDB than just these petty CRUD operations. The following blogs contain details of other facets of MongoDB.
