Unit 4 (MongoDB)
UNIT-IV
MongoDB
MongoDB: An introduction
• MongoDB, one of the most popular NoSQL databases, is an open-source
document-oriented database. The term 'NoSQL' means 'non-relational':
MongoDB is not based on the table-like relational database structure but
provides a different mechanism for storing and retrieving data.
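• MongoDB Community Server is available free of charge. It was released
under the GNU Affero General Public License (AGPL) and, since October 2018,
under the Server Side Public License (SSPL). A commercial license is also
available from the vendor.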
• 10gen, the company that created MongoDB, has defined it as:
• "MongoDB is a scalable, open source, high performance, document-oriented
database." - 10gen
• MongoDB was designed to work with commodity servers. It is now used by
companies of all sizes, across many industries.
MongoDB is a NoSQL database that scales by adding more servers and improves
productivity with its flexible document model.
History of MongoDB
• The initial development of MongoDB began in 2007, when the company was
building a platform as a service (PaaS) similar to Windows Azure.
• MongoDB was developed by a New York-based organization named 10gen,
which is now known as MongoDB Inc. It was initially developed as part of
that PaaS. Later, in 2009, it was introduced to the market as an open-source
database server maintained and supported by the company.
• The first production-ready version of MongoDB is considered to be version
1.4, which was released in March 2010.
• MongoDB 2.4.9, released on January 10, 2014, was the latest stable version
at that time.
Features of MongoDB
1. Support for ad hoc queries
In MongoDB, you can search by field or by range, and regular expression
searches are also supported.
2. Indexing
Without an index, a database has to scan every document in a collection to
select those that match the query, which is inefficient. Indexes make
searching efficient, and MongoDB uses them to process large volumes of data
quickly.
3. Replication and High Availability: MongoDB increases data availability by
keeping multiple copies of the data on different servers. This redundancy
protects the database from hardware failures. If one server goes down, the
data can still be retrieved from the other active servers that hold a copy.
4. Document Oriented: MongoDB stores a subject in a minimal number of
documents instead of breaking it up into multiple relational structures as
an RDBMS does. For example, it stores all the information about a computer
in a single document called Computer and not in distinct relational
structures like CPU, RAM, Hard disk, etc.
5. Scalability: MongoDB scales horizontally using sharding (partitioning data
across various servers). Data is partitioned into data chunks using the shard
key, and these data chunks are evenly distributed across shards that reside
across many physical servers. Also, new machines can be added to a
running database.
6. Aggregation: Aggregation operations process data records and return the
computed results. Grouping in an aggregation is similar to the GROUP BY
clause in SQL. A few aggregation expressions are sum, avg, min, max, etc.
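The aggregation feature can be sketched in plain JavaScript. The block below is only an illustration of what grouping with sum and avg computes; it does not talk to a MongoDB server. The orders data, the field names and the groupSumAvg helper are all hypothetical. In the shell, the corresponding query would look roughly like db.orders.aggregate([{ $group: { _id: "$category", total: { $sum: "$price" }, avg: { $avg: "$price" } } }]), assuming a collection named orders.

```javascript
// Hypothetical sample data standing in for a collection.
const orders = [
  { category: "book", price: 300 },
  { category: "book", price: 200 },
  { category: "pen", price: 20 },
];

// Group documents by keyField and compute the sum and average of valueField,
// which is what a $group stage with $sum and $avg would return.
function groupSumAvg(docs, keyField, valueField) {
  const groups = new Map();
  for (const doc of docs) {
    const key = doc[keyField];
    const g = groups.get(key) ?? { _id: key, total: 0, count: 0 };
    g.total += doc[valueField];
    g.count += 1;
    groups.set(key, g);
  }
  return [...groups.values()].map(({ _id, total, count }) => ({
    _id,
    total,
    avg: total / count,
  }));
}

const result = groupSumAvg(orders, "category", "price");
console.log(result);
```

As in SQL's GROUP BY, each distinct key produces exactly one output record.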
Where do we use MongoDB?
MongoDB is preferred over RDBMS in the following scenarios:
• Big Data: If you have a huge amount of data to store, consider MongoDB
before an RDBMS. MongoDB has a built-in solution for partitioning and
sharding your database.
• Unstable Schema: Adding a new column in an RDBMS is hard, whereas MongoDB
is schema-less. Adding a new field is easy and does not affect existing
documents.
• Distributed data: Since multiple copies of the data are stored across
different servers, data can be recovered quickly and safely even if there is
a hardware failure.
Language Support by MongoDB:
• Drivers, either official or community-maintained, are available for
popular programming languages such as C, C++, Rust, C#, Java, Node.js, Perl,
PHP, Python, Ruby, Scala, Go, and Erlang.
Installing MongoDB:
• Just go to http://www.mongodb.org/downloads and select your operating
system out of Windows, Linux, Mac OS X and Solaris. A detailed
explanation about the installation of MongoDB is given on their site.
Install MongoDB On Windows
• To install MongoDB on Windows, first download the latest release of
MongoDB from https://www.mongodb.com/download-center.
• Enter the required details and select the Server tab. There you can choose
the version of MongoDB, the operating system, and the packaging.
Date_Of_Birth: "1995-09-26" },
Contact: {
"e-mail": "[email protected]",
phone: "9848022338"
},
Address: {
city: "Hyderabad",
Area: "Madapur",
State: "Telangana"
}}
Normalized Data Model
• In this model, the sub-documents are stored as separate documents and are
linked to the original document using references.
Employee:
{
_id: <ObjectId101>,
Emp_ID: "10025AE336"
}
Personal_details:
{
_id: <ObjectId102>,
empDocID: "ObjectId101",
First_Name: "Radhika",
Last_Name: "Sharma",
Date_Of_Birth: "1995-09-26"
}
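A minimal sketch of how such a reference is followed, written in plain JavaScript with the two collections modelled as arrays. The _id values are simplified to strings and the detailsFor helper is hypothetical. Against a real database, the same step would typically be a second find() on the referenced collection or a $lookup stage.

```javascript
// Two "collections" modelled as arrays; _id values are simplified to strings.
const employees = [{ _id: "ObjectId101", Emp_ID: "10025AE336" }];
const personalDetails = [
  {
    _id: "ObjectId102",
    empDocID: "ObjectId101",
    First_Name: "Radhika",
    Last_Name: "Sharma",
    Date_Of_Birth: "1995-09-26",
  },
];

// Follow the reference: find the personal details document whose empDocID
// matches the _id of the given employee document.
function detailsFor(employee) {
  return personalDetails.find((d) => d.empDocID === employee._id);
}

const details = detailsFor(employees[0]);
console.log(details.First_Name, details.Last_Name);
```

The price of normalization is this extra lookup; the benefit is that the personal details exist in only one place.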
DataTypes in MongoDB
In MongoDB, documents are stored in BSON, a binary-encoded serialization of
JSON-like documents. BSON is also the format used to exchange data between
client and server. The BSON data format supports various data types.
1. String: This is the most commonly used data type in MongoDB. BSON strings
are UTF-8, so the drivers for each programming language convert from the
string format of the language to UTF-8 while serializing and de-serializing
BSON. The string must be valid UTF-8.
2. Integer: In MongoDB, the integer data type is used to store an integer
value. Integers can be stored in two forms: 32-bit signed integer and 64-bit
signed integer.
3. Double: The double data type is used to store the floating-point values.
4. Boolean: The boolean data type is used to store either true or false.
5. Null: The null data type is used to store the null value.
6. Array: An array is an ordered list of values. It can hold values of the
same or different data types. In MongoDB, an array is created using square
brackets ([]).
7. Object: The Object data type stores embedded documents, also known as
nested documents. An embedded document is a document contained inside
another document.
8. Object Id: Whenever we create a new document in a collection, MongoDB
automatically creates a unique ObjectId for that document if the document
does not already have one. It is stored in the _id field, which every
document has. An ObjectId is 12 bytes long, is displayed as a 24-character
hexadecimal string, and consists of:
• 4 bytes for the timestamp value (seconds since the Unix epoch).
• 5 bytes for a random value. In older versions, these were 3 bytes for the
machine Id and 2 bytes for the process Id.
• 3 bytes for an incrementing counter.
• You can also supply your own _id value, but it must be unique within the
collection.
10. Binary Data: This datatype is used to store binary data.
11. Date: The Date data type stores a date and time as a signed 64-bit
integer that represents the number of milliseconds since the Unix epoch
(January 1, 1970, UTC). A negative value represents a date before 1970. A
date can be returned either as a string or as a date object. Some methods
for dates:
• Date(): Returns the current date as a string.
• new Date(): Returns a date object. Uses the ISODate() wrapper.
• new ISODate(): Also returns a date object. Uses the ISODate() wrapper.
12. Min & Max key: Min key compares lower than all other BSON values, and
Max key compares higher than all other BSON values. Both are internal data
types.
13. Symbol: This data type is similar to the string data type. It is
generally not supported by the mongo shell, but if the shell gets a symbol
from the database, it converts it into a string.
14. Regular Expression: This data type is used to store regular expressions.
15. JavaScript: This data type is used to store JavaScript code in a
document without the scope.
17. Timestamp: This data type is used to store a timestamp, which is useful
for recording when data was modified. The value is 64 bits long, and
timestamp values are always unique within a single mongod instance.
18. Decimal: This data type stores a 128-bit decimal-based floating-point
value. It was introduced in MongoDB version 3.4.
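The Object Id layout and the signed Date value described above can be checked with a short plain-JavaScript sketch. It assumes the current 4-byte timestamp, 5-byte random value, 3-byte counter layout. The parseObjectId helper is hypothetical and not part of any driver; the sample id is one of the _id values shown in these notes.

```javascript
// Split a 24-character hexadecimal ObjectId string into its three parts.
function parseObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected 24 hexadecimal characters");
  }
  const seconds = parseInt(hex.slice(0, 8), 16);   // 4-byte timestamp
  const random = hex.slice(8, 18);                 // 5-byte random value
  const counter = parseInt(hex.slice(18, 24), 16); // 3-byte counter
  return { seconds, createdAt: new Date(seconds * 1000), random, counter };
}

const parts = parseObjectId("5fa2c0e19577bba42e1db54a");
console.log(parts.createdAt.toISOString(), parts.random, parts.counter);

// Date values are signed: a negative millisecond count is a date before 1970.
const before1970 = new Date(-86400000); // one day before the epoch
console.log(before1970.toISOString());
```

Because the timestamp occupies the leading bytes, ObjectIds created later generally sort after ObjectIds created earlier.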
MongoDB Operators
MongoDB Query and Projection Operator
MongoDB query operators include comparison, logical, element, evaluation,
geospatial, array, bitwise, and comment operators.
MongoDB Comparison Operators
$eq
The $eq specifies the equality condition. It matches documents where the
value of a field equals the specified value.
Syntax:
• { <field> : { $eq: <value> } }
Example:
• db.books.find ( { price: { $eq: 300 } } )
$gt
• The $gt chooses a document where the value of the field is greater than the specified value.
Syntax:
• { field: { $gt: value } }
Example:
• db.books.find ( { price: { $gt: 200 } } )
$gte
• The $gte operator chooses the documents where the field value is greater than or equal to
the specified value.
Syntax:
• { field: { $gte: value } }
Example:
• db.books.find ( { price: { $gte: 250 } } )
$in
• The $in operator chooses the documents where the value of a field equals any value in the
specified array.
Syntax:
• { field: { $in: [ <value1>, <value2>, ... ] } }
Example:
• db.books.find( { price: { $in: [100, 200] } } )
$lt
• The $lt operator chooses the documents where the value of the field is less than the
specified value.
Syntax:
• { field: { $lt: value } }
Example:
• db.books.find ( { price: { $lt: 20 } } )
$lte
• The $lte operator chooses the documents where the field value is less than or equal to a
specified value.
Syntax:
• { field: { $lte: value } }
Example:
• db.books.find ( { price: { $lte: 250 } } )
$ne
• The $ne operator chooses the documents where the field value is not equal to the
specified value.
Syntax:
• { <field>: { $ne: <value> } }
Example:
• db.books.find ( { price: { $ne: 500 } } )
$nin
• The $nin operator chooses the documents where the field value is not in the specified
array or where the field does not exist.
Syntax:
• { field : { $nin: [ <value1>, <value2>, .... ] } }
Example:
• db.books.find ( { price: { $nin: [ 50, 150, 200 ] } } )
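The comparison operators above can be modelled in a few lines of plain JavaScript. This is a simplified sketch of their semantics for scalar fields only; it ignores arrays, nested fields and BSON type ordering, and it is not how the server evaluates queries. The books data and the matches helper are hypothetical.

```javascript
// Each comparison operator as a predicate on (fieldValue, operand).
const comparisonOperators = {
  $eq: (v, x) => v === x,
  $ne: (v, x) => v !== x,
  $gt: (v, x) => v > x,
  $gte: (v, x) => v >= x,
  $lt: (v, x) => v < x,
  $lte: (v, x) => v <= x,
  $in: (v, xs) => xs.includes(v),
  $nin: (v, xs) => !xs.includes(v),
};

// Return true when doc satisfies every { field: { $op: operand } } condition.
function matches(doc, query) {
  return Object.entries(query).every(([field, conditions]) =>
    Object.entries(conditions).every(([op, operand]) => {
      const test = comparisonOperators[op];
      if (!test) throw new Error(`unsupported operator: ${op}`);
      return test(doc[field], operand);
    })
  );
}

// Hypothetical sample data; the last book has no price field.
const books = [
  { title: "A", price: 100 },
  { title: "B", price: 250 },
  { title: "C", price: 300 },
  { title: "D" },
];

const titles = (query) => books.filter((b) => matches(b, query)).map((b) => b.title);
console.log(titles({ price: { $gt: 200 } }));
console.log(titles({ price: { $nin: [50, 150, 200] } }));
```

Note that book D, which has no price field, is selected by $nin and $ne but not by $gt, $gte, $lt, $lte, $eq or $in.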
Also, we can create a collection using both parameters, that is, the name
and the options. The example below inserts several records into the employee
collection:
db.employee.insert(
  [
    {name:"Sandeep Sharma", email:"[email protected]", age:28, salary:5333.94},
    {name:"Manish Fartiyal", email:"[email protected]", age:26, salary:5555.4},
    {name:"Santosh Kumar", email:"[email protected]", age:30, salary:7000.74},
    {name:"Dhirendra Chauhan", email:"[email protected]", age:29, salary:4848.44}
  ]
)
Select Record/s
• The db.collection.find() method is used to fetch data from the database.
Called with no parameters, db.collection.find() fetches all documents from
the collection. Called as db.collection.find(query, projection), it fetches
only the documents that match the query and returns only the fields selected
by the projection.
> db.employee.find()
{ "_id" : ObjectId("5fa2c0e19577bba42e1db54a"), "name" : "Atul Rai", "email" :
"[email protected]", "age" : 28, "salary" : 5000.54 }
{ "_id" : ObjectId("5fa2c2409577bba42e1db54b"), "name" : "Sandeep Sharma",
"email" : "[email protected]", "age" : 28, "salary" : 5333.94 }
{ "_id" : ObjectId("5fa2c2409577bba42e1db54c"), "name" : "Manish Fartiyal", "email"
: "[email protected]", "age" : 26, "salary" : 5555.4 }
{ "_id" : ObjectId("5fa2c2409577bba42e1db54d"), "name" : "Santosh Kumar", "email"
: "[email protected]", "age" : 30, "salary" : 7000.74 }
{ "_id" : ObjectId("5fa2c2409577bba42e1db54e"), "name" : "Dhirendra Chauhan",
"email" : "[email protected]", "age" : 29, "salary" : 4848.44 }
Fetch with condition
db.employee.find({email:"[email protected]"})
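The projection parameter of find(query, projection) can be illustrated with a plain-JavaScript sketch. It models only simple inclusion projections (fields set to 1, with _id included unless set to 0); the project helper is hypothetical, and the sample values are taken from the employee output shown above. In the shell, the corresponding call would look like db.employee.find({ age: { $gt: 28 } }, { name: 1, salary: 1, _id: 0 }).

```javascript
// Two documents taken from the employee output above (_id simplified).
const employee = [
  { _id: "5fa2c2409577bba42e1db54d", name: "Santosh Kumar", age: 30, salary: 7000.74 },
  { _id: "5fa2c2409577bba42e1db54c", name: "Manish Fartiyal", age: 26, salary: 5555.4 },
];

// Apply an inclusion projection: keep the fields marked 1, and keep _id
// unless the projection explicitly sets it to 0.
function project(doc, projection) {
  const out = {};
  if (projection._id !== 0 && "_id" in doc) out._id = doc._id;
  for (const [field, flag] of Object.entries(projection)) {
    if (field !== "_id" && flag === 1 && field in doc) out[field] = doc[field];
  }
  return out;
}

// Condition first (age greater than 28), then projection.
const found = employee
  .filter((e) => e.age > 28)
  .map((e) => project(e, { name: 1, salary: 1, _id: 0 }));
console.log(found);
```

Returning only the needed fields keeps results small, which matters most when documents are large.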
Update Record/s
db.employee.update({email:"[email protected]"},{$set:{salary: 8000.99}})
a. Unique Indexes
• This index property causes MongoDB to reject duplicate values for the
indexed field. In other words, a unique index prevents the insertion of a
document whose indexed field duplicates an existing value. In other respects,
a unique index can be used like any other MongoDB index.
• We create an index using createIndex() for the name field and set unique to
be true.
• > db.dataflair.createIndex({name:1},{unique:true})
b. Partial Indexes
• A partial index only indexes the documents that match a filter expression.
The filter is given with the partialFilterExpression option when the index
is created.
c. Sparse Indexes
• The sparse property ensures that the index only contains entries for
documents with the indexed field. The index will skip the documents
without the indexed field.
• We can combine this option with the unique index option in order to reject
documents with duplicate values for a field while ignoring documents that do
not have the indexed key.
d. TTL Indexes
• TTL ("time to live") indexes are special indexes in MongoDB. They are
used to automatically delete documents from a collection after a specified
amount of time. The expiration time is given in seconds with the
expireAfterSeconds option, and the indexed field must hold a date.
• This property is ideal for certain types of information, such as
machine-generated data, logs and session information, that only need to
remain in a database for a finite amount of time.
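The four index properties can be compared side by side in a plain-JavaScript sketch. It only models which documents each kind of index would accept, contain or expire; it is not how MongoDB implements indexes. The users data, the example.com addresses, the field names and both helpers are hypothetical. The corresponding createIndex() options are { unique: true }, { partialFilterExpression: { active: true } }, { sparse: true } and { expireAfterSeconds: 1800 }.

```javascript
// Hypothetical documents; document 2 has no email field.
const users = [
  { _id: 1, name: "Asha", email: "asha@example.com", active: true },
  { _id: 2, name: "Ravi", active: false },
  { _id: 3, name: "Meera", email: "meera@example.com", active: false },
];

// Partial: only documents matching the filter get an index entry.
const partialEntries = users.filter((u) => u.active === true).map((u) => u._id);

// Sparse: only documents that contain the indexed field get an index entry.
const sparseEntries = users.filter((u) => "email" in u).map((u) => u._id);

// Unique: reject an insert whose indexed value is already present.
function insertUnique(docs, doc, field) {
  if (docs.some((d) => d[field] === doc[field])) {
    throw new Error(`duplicate value for ${field}: ${doc[field]}`);
  }
  docs.push(doc);
}

// TTL: a document expires once its date field is at least
// expireAfterSeconds in the past.
function isExpired(doc, dateField, expireAfterSeconds, now) {
  const ageMs = now.getTime() - doc[dateField].getTime();
  return ageMs >= expireAfterSeconds * 1000;
}

const now = new Date("2024-01-01T01:00:00Z");
const session = { _id: 10, createdAt: new Date("2024-01-01T00:00:00Z") };
console.log(partialEntries, sparseEntries, isExpired(session, "createdAt", 1800, now));
```

The session document is one hour old, so it counts as expired with a 1800-second limit but not with a 7200-second limit.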