Python - Coding for MongoDB

Published on Jan. 28, 2018, 1:46 p.m.

Updated on April 10, 2020, 2:16 a.m.

MongoDB

MongoDB is an open source database that uses a document-based model. A relational database (RDBMS) such as MySQL, PostgreSQL or Oracle organizes data into tables, each with a defined set of columns, plus relationships between those tables. Instead of tables and rows, MongoDB uses collections (roughly, tables) and documents (roughly, rows of data).
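To make the collection/document idea a bit more concrete, here is a minimal sketch in plain Python with no MongoDB involved. It models a collection as a list and each document as a dict. The `indoor` and `age` fields are made up purely for illustration. The point is that, unlike rows in a table, documents in the same collection do not have to share the same fields.

```python
# A "collection" modeled as a plain list of dict "documents".
# This only illustrates the data model; it is not how MongoDB stores data.
pets = [
    {"name": "Spike", "breed": "English Bulldog"},
    {"name": "Koko", "breed": "American Domestic Shorthair", "indoor": True},  # hypothetical field
    {"name": "Rusty", "breed": "Chihuahua", "age": 4},  # hypothetical field
]

# Each document carries its own set of fields.
field_sets = [sorted(doc.keys()) for doc in pets]
for fields in field_sets:
    print(fields)
```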

Using MongoDB, like most other NoSQL databases, can reduce or eliminate the often-complex object-relational mapping (ORM) layer that maps objects to relational tables. MongoDB's flexible data model allows schemas to change as business requirements change.

In this post I want to show you how to install MongoDB and how to add and manipulate data, both from the MongoDB client and from Python. You will see that both are simple to use and not very different from each other.

Install MongoDB

First, let's install MongoDB. These instructions assume you are using Linux. If you're on Windows, follow MongoDB's installation instructions for Windows instead.

On Linux ...

$ curl -O https://fastdl.mongodb.org/linux/mongodb-linux-x86_64-3.6.2.tgz
$ tar -zxvf mongodb-linux-x86_64-3.6.2.tgz
$ rm mongodb-linux-x86_64-3.6.2.tgz 
$ mv mongodb-linux-x86_64-3.6.2/ mongodb

To make mongod and mongo available from the command line, add the following to your .bashrc file (or to .bash_profile if you're on a RHEL-based system):

export PATH=<mongodb-install-directory>/bin:$PATH

To apply the change to your current shell, run:

$ source ~/.bashrc

or

$ source ~/.bash_profile
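If you want to double-check the PATH change from Python, a small sketch like the following should do it. It only uses the standard library; the function name `find_mongo_binaries` is my own and not part of any MongoDB tooling.

```python
import shutil


def find_mongo_binaries(names=("mongod", "mongo")):
    """Map each binary name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_mongo_binaries().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If either name comes back as not found, the export line probably points at the wrong directory or the shell has not re-read its startup file yet.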

Start MongoDB

Here's how to start the MongoDB service.

First, create a data directory where MongoDB can store your database files:

$ mkdir -p data/db

Now, let's start MongoDB:

$ mongod --dbpath data/db

Verify that MongoDB started successfully by looking for a line like this near the end of the output:

...
2018-01-28T09:55:10.898-0700 I NETWORK  [initandlisten] waiting for connections on port 27017
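Besides reading the log, you can check from Python that something is listening on port 27017. This is a rough sketch using only the standard library; `port_is_open` is a name I made up. Keep in mind that an open port only tells you that some process accepted the connection, not that it is definitely mongod.

```python
import socket


def port_is_open(host="127.0.0.1", port=27017, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    print("something is listening on 27017:", port_is_open())
```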

Use MongoDB

We'll need to use the mongo client so that we can interface with MongoDB.

$ mongo --host 127.0.0.1:27017

The first time you start it, you may see this error:

 [main] Error loading history file: FileOpenFailed: Unable to fopen() file /home/hseritt/.dbshell: No such file or directory

You can ignore it. It just means that no history file has been created yet. The next time you start the client, you shouldn't see the error.

Database

A database in MongoDB is similar in structure to a database in an RDBMS. Instead of containing tables, it contains collections.

When you start the client you should get a prompt.

This command shows you which database you're using:

> db
test

We're going to create a database of pets, so let's switch to a database called pets. Note that you don't have to create it first: MongoDB creates the 'pets' database automatically the first time you store data in it.

> use pets
switched to db pets
> db
pets
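One way to see when the database actually shows up on the server is to ask for the list of database names. The sketch below assumes PyMongo, the Python driver installed later in this post, and a mongod running locally. `database_exists` is a helper name I made up; `list_database_names()` is the PyMongo call it relies on.

```python
def database_exists(client, name):
    """Return True if the server currently lists a database called `name`.

    `client` is expected to be a pymongo.MongoClient.
    """
    return name in client.list_database_names()


# Usage (assumes pymongo is installed and mongod is running locally):
#   from pymongo import MongoClient
#   client = MongoClient('localhost', 27017)
#   print(database_exists(client, 'pets'))
#   client.close()
```

On a fresh server I would expect this to report False for pets until the first document has been inserted.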

Collection

As mentioned, a collection is similar to an RDBMS table: it is a group of similar documents. A document is similar to a row of data in an RDBMS.

Create

Let's create a document related to dogs. Note that when we insert through db.dogs, the "dogs" collection is created automatically if it doesn't exist yet.

Here we're creating a dog document that has a name and a breed. We'll use insertOne() to add a dog to our collection, then run db.dogs.find() to list all dogs created so far.

> db.dogs.insertOne({'name': 'Spike', 'breed': 'English Bulldog'})
{
    "acknowledged" : true,
    "insertedId" : ObjectId("5a6e0117d53c52ede0133ad3")
}
> db.dogs.find()
{ "_id" : ObjectId("5a6e0117d53c52ede0133ad3"), "name" : "Spike", "breed" : "English Bulldog" }
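A side note on that ObjectId: its first 4 bytes are a Unix timestamp in seconds, so the id also records when the document was created. Here is a small standard-library sketch that decodes it; `object_id_created_at` is my own helper name. If you have PyMongo installed, its ObjectId class exposes the same value as `generation_time`.

```python
from datetime import datetime, timezone


def object_id_created_at(hex_id):
    """Decode the creation time embedded in a 24-character ObjectId hex string."""
    if len(hex_id) != 24:
        raise ValueError("an ObjectId is 12 bytes, i.e. 24 hex characters")
    seconds = int(hex_id[:8], 16)  # first 4 bytes: big-endian Unix timestamp
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


print(object_id_created_at("5a6e0117d53c52ede0133ad3").isoformat())
```

For Spike's id this decodes to 2018-01-28 16:57:59 UTC, a few minutes after the mongod startup line shown earlier (09:55:10 -0700).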

insertOne() is good for creating one document at a time. If we need to create several documents at once, we can use db.dogs.insertMany() instead:

> db.dogs.insertMany(
... [
... {'name': 'Sparky', 'breed': 'Beagle'},
... {'name': 'Rusty', 'breed': 'Chihuahua'},
... ]
... )

# output below:

{
    "acknowledged" : true,
    "insertedIds" : [
        ObjectId("5a6e0146d53c52ede0133ad4"),
        ObjectId("5a6e0146d53c52ede0133ad5")
    ]
}

Now, we can do a find() and get all of the dogs:

> db.dogs.find()
{ "_id" : ObjectId("5a6e0117d53c52ede0133ad3"), "name" : "Spike", "breed" : "English Bulldog" }
{ "_id" : ObjectId("5a6e0146d53c52ede0133ad4"), "name" : "Sparky", "breed" : "Beagle" }
{ "_id" : ObjectId("5a6e0146d53c52ede0133ad5"), "name" : "Rusty", "breed" : "Chihuahua" }

Now, let's create cats. Using db.cats.insertMany() will automatically create a cats collection:

> db.cats.insertMany(
... [
... {'name': 'Koko', 'breed': 'American Domestic Shorthair'},
... {'name': 'Gracie', 'breed': 'American Domestic Shorthair'},
... {'name': 'Cheshire', 'breed': 'Snow Leopard'},
... ]
... )
{
    "acknowledged" : true,
    "insertedIds" : [
        ObjectId("5a6e016fd53c52ede0133ad6"),
        ObjectId("5a6e016fd53c52ede0133ad7"),
        ObjectId("5a6e016fd53c52ede0133ad8")
    ]
}
> db.cats.find()
{ "_id" : ObjectId("5a6e016fd53c52ede0133ad6"), "name" : "Koko", "breed" : "American Domestic Shorthair" }
{ "_id" : ObjectId("5a6e016fd53c52ede0133ad7"), "name" : "Gracie", "breed" : "American Domestic Shorthair" }
{ "_id" : ObjectId("5a6e016fd53c52ede0133ad8"), "name" : "Cheshire", "breed" : "Snow Leopard" }

Query

There are two ways to start a query: db.getCollection('dogs')... or the shorter db.dogs... We'll keep it simple and mostly use db.dogs. Note that both queries return the same results:

> db.getCollection('dogs').find({'name': 'Spike'})
{ "_id" : ObjectId("5a6e0117d53c52ede0133ad3"), "name" : "Spike", "breed" : "English Bulldog" }

> db.dogs.find({'name': 'Spike'})
{ "_id" : ObjectId("5a6e0117d53c52ede0133ad3"), "name" : "Spike", "breed" : "English Bulldog" }

Let's query our cats now:

> db.cats.find({'breed': 'American Domestic Shorthair'})
{ "_id" : ObjectId("5a6e016fd53c52ede0133ad6"), "name" : "Koko", "breed" : "American Domestic Shorthair" }
{ "_id" : ObjectId("5a6e016fd53c52ede0133ad7"), "name" : "Gracie", "breed" : "American Domestic Shorthair" }
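If it helps to think about what a filter like {'breed': '...'} does, here is a simplified model of it in plain Python. This is only a sketch of equality matching under my own assumptions: the `matches` function is made up, and it ignores query operators, arrays and nested fields, all of which real MongoDB queries support.

```python
def matches(document, query):
    """True if every field in `query` equals the same field in `document`.

    Simplified on purpose: no operators ($gt, $in, ...), arrays or nested paths.
    """
    return all(
        key in document and document[key] == value
        for key, value in query.items()
    )


cats = [
    {"name": "Koko", "breed": "American Domestic Shorthair"},
    {"name": "Gracie", "breed": "American Domestic Shorthair"},
    {"name": "Cheshire", "breed": "Snow Leopard"},
]

shorthairs = [
    cat["name"]
    for cat in cats
    if matches(cat, {"breed": "American Domestic Shorthair"})
]
print(shorthairs)  # ['Koko', 'Gracie']
```

An empty query matches every document in this model, which mirrors how find() with no filter returns the whole collection.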

After a while, your client screen will get crowded. When that happens, you can clear it by running:

> cls

Update

We can also modify our documents with updateOne(). For example, when our family got an unusual looking cat we named "Cheshire", my daughter said it was a snow leopard. You don't find snow leopards in Georgia, so we knew it wasn't really one. We later found out that it was an Egyptian Mau. So, we'll use the updateOne() method to change the breed from Snow Leopard to Egyptian Mau.

> db.cats.find({'name': 'Cheshire'})
{ "_id" : ObjectId("5a6e016fd53c52ede0133ad8"), "name" : "Cheshire", "breed" : "Snow Leopard" }

> db.cats.updateOne(
... {'name': 'Cheshire'},
... {
... $set: {'breed': 'Egyptian Mau'}
... }
... )
{ "acknowledged" : true, "matchedCount" : 1, "modifiedCount" : 1 }

> db.cats.find({'name': 'Cheshire'})
{ "_id" : ObjectId("5a6e016fd53c52ede0133ad8"), "name" : "Cheshire", "breed" : "Egyptian Mau" }

Delete

If we need to, we can delete a document.

> db.dogs.find()
{ "_id" : ObjectId("5a6e0117d53c52ede0133ad3"), "name" : "Spike", "breed" : "English Bulldog" }
{ "_id" : ObjectId("5a6e0146d53c52ede0133ad4"), "name" : "Sparky", "breed" : "Beagle" }
{ "_id" : ObjectId("5a6e0146d53c52ede0133ad5"), "name" : "Rusty", "breed" : "Chihuahua" }

Rusty was a rescue and we already had too many dogs, so we decided to give him to someone who really loves Chihuahuas.

> db.dogs.deleteOne({'name': 'Rusty'})
{ "acknowledged" : true, "deletedCount" : 1 }
> db.dogs.find()
{ "_id" : ObjectId("5a6e0117d53c52ede0133ad3"), "name" : "Spike", "breed" : "English Bulldog" }
{ "_id" : ObjectId("5a6e0146d53c52ede0133ad4"), "name" : "Sparky", "breed" : "Beagle" }

Note that 'Rusty' is no longer with us. :-) Before I show you some Python code that does the same things we just did with the mongo client, we'll clear the dogs and cats collections but keep them available.

> db.dogs.deleteMany({})
{ "acknowledged" : true, "deletedCount" : 2 }
> db.cats.deleteMany({})
{ "acknowledged" : true, "deletedCount" : 3 }
> db.dogs.find()
> db.cats.find()
> exit

PyMongo

Both Python and MongoDB are very simple to use. The nice thing about using Python with MongoDB is that the syntax is not much different from the query methods used natively in the mongo client. I'll show how to do the same things from Python against our MongoDB instance.

I am using Python 3.6.4, but I believe the following code will work with any Python 3.x version.

$ python
Python 3.6.4 (default, Jan  7 2018, 10:19:13) 
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.

For our MongoDB sandbox, let's use pyenv to create a virtual environment (the virtualenv subcommand comes from the pyenv-virtualenv plugin). This is optional, but it helps keep your environments from conflicting with each other.

$ pyenv virtualenv mongodb-sandbox
$ pyenv local mongodb-sandbox

Let's install the pymongo module that allows our Python code to "talk" to MongoDB.

$ pip install pymongo

Connection and Close

Create a script called demo.py and put the following code in it.

#!/usr/bin/env python

from pymongo import MongoClient

client = MongoClient('localhost', 27017)
db = client.pets

print(db.name)

client.close()

Let's run our simple connection demo. You should see something like this.

$ chmod +x demo.py
$ ./demo.py
pets
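One thing to be aware of: as far as I know, MongoClient connects lazily, and printing db.name does not talk to the server at all, so the demo above can print pets even when mongod is not running. A quick way to check the connection is to send a ping command. The helper name `server_is_reachable` is made up; `client.admin.command('ping')` and the `serverSelectionTimeoutMS` option are PyMongo features.

```python
def server_is_reachable(client):
    """Return True if the server answers a `ping` command.

    `client` is expected to be a pymongo.MongoClient. Any exception is treated
    as "not reachable" to keep the sketch short; real code would more likely
    catch pymongo.errors.ConnectionFailure specifically.
    """
    try:
        client.admin.command("ping")
    except Exception:
        return False
    return True


# Usage (assumes pymongo is installed):
#   from pymongo import MongoClient
#   client = MongoClient('localhost', 27017, serverSelectionTimeoutMS=2000)
#   print(server_is_reachable(client))
#   client.close()
```

The shorter timeout in the usage example is just so a failed check returns after about two seconds instead of waiting for the driver's default.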

Create and Query

Just as we used the MongoDB client to create documents and query them, we can do the same using Python and pymongo.

#!/usr/bin/env python

from pymongo import MongoClient

client = MongoClient('localhost', 27017)
db = client.pets

db.dogs.insert_one({'name': 'Spike', 'breed': 'English Bulldog'})
db.dogs.insert_many(
    [
        {'name': 'Sparky', 'breed': 'Beagle'},
        {'name': 'Rusty', 'breed': 'Chihuahua'},
    ]
)

db.cats.insert_many(
    [
        {'name': 'Koko', 'breed': 'American Domestic Shorthair'},
        {'name': 'Gracie', 'breed': 'American Domestic Shorthair'},
        {'name': 'Cheshire', 'breed': 'Snow Leopard'},
    ]
)

print('Dogs: ')

for document in db.dogs.find({}):
    print(document['name'])
    print(document['breed'])
    print('\n')

print('Cats: ')
for document in db.cats.find({}):
    print(document['name'])
    print(document['breed'])
    print('\n')

client.close()

When we run it, we should see output like this:

$ ./demo.py
Dogs: 
Spike
English Bulldog


Sparky
Beagle


Rusty
Chihuahua


Cats: 
Koko
American Domestic Shorthair


Gracie
American Domestic Shorthair


Cheshire
Snow Leopard
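In the mongo client, each insert printed the ids it created. PyMongo gives you the same information on the result object: `inserted_id` for insert_one() and `inserted_ids` for insert_many(). Here is a small sketch; `add_pets` is a hypothetical helper, and `collection` is assumed to be a PyMongo collection such as db.dogs.

```python
def add_pets(collection, pets):
    """Insert `pets` (a list of dicts) and return the ids MongoDB assigned.

    `collection` is expected to be a pymongo collection such as db.dogs.
    """
    result = collection.insert_many(pets)
    return result.inserted_ids


# Usage (assumes pymongo is installed and mongod is running locally):
#   from pymongo import MongoClient
#   client = MongoClient('localhost', 27017)
#   ids = add_pets(client.pets.dogs, [{'name': 'Sparky', 'breed': 'Beagle'}])
#   print(ids)
#   client.close()
```

Keep in mind that every insert gets a fresh _id, so running an insert script twice stores the same pets twice.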

Update

Just as we updated our cat's breed using the MongoDB client, we can do the same thing with our Python code.

#!/usr/bin/env python

from pymongo import MongoClient

client = MongoClient('localhost', 27017)
db = client.pets

for document in db.cats.find({'name': 'Cheshire'}):
    print(document)

db.cats.update_one(
    {'name': 'Cheshire'},
    {'$set': {'breed': 'Egyptian Mau'}}
)

for document in db.cats.find({'name': 'Cheshire'}):
    print(document)

client.close()

The output should look like:

$ ./demo.py 
{'_id': ObjectId('5a6e04aa68f1881c2524c64d'), 'name': 'Cheshire', 'breed': 'Snow Leopard'}
{'_id': ObjectId('5a6e04aa68f1881c2524c64d'), 'name': 'Cheshire', 'breed': 'Egyptian Mau'}
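Instead of querying before and after, you can also look at what update_one() returns. The result object carries `matched_count` and `modified_count`, the same numbers the mongo client printed as matchedCount and modifiedCount. A minimal sketch, with `set_breed` being a name I made up:

```python
def set_breed(collection, name, breed):
    """Set `breed` on the first document whose name matches.

    `collection` is expected to be a pymongo collection such as db.cats.
    Returns (matched_count, modified_count) from PyMongo's UpdateResult.
    """
    result = collection.update_one({"name": name}, {"$set": {"breed": breed}})
    return result.matched_count, result.modified_count


# Usage (assumes pymongo is installed and mongod is running locally):
#   from pymongo import MongoClient
#   client = MongoClient('localhost', 27017)
#   print(set_breed(client.pets.cats, 'Cheshire', 'Egyptian Mau'))
#   client.close()
```

I would expect (1, 1) the first time, (1, 0) if you run it again because the breed is already set, and (0, 0) if no cat has that name.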

Delete

Of course, we can also delete documents from our collections. Here we delete a single dog with delete_one() and then clear the whole collection with delete_many():

#!/usr/bin/env python

from pymongo import MongoClient

client = MongoClient('localhost', 27017)
db = client.pets

for document in db.dogs.find({}):
    print(document['name'])
    print(document['breed'])

db.dogs.delete_one({'name': 'Rusty'})

for document in db.dogs.find({}):
    print(document['name'])
    print(document['breed'])

db.dogs.delete_many({})

print([document for document in db.dogs.find({})])

client.close()

Our output:

$ ./demo.py 
Spike
English Bulldog
Sparky
Beagle
Rusty
Chihuahua
Spike
English Bulldog
Sparky
Beagle

# Our query result is an empty list.

[]
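As with the mongo client's deletedCount, PyMongo reports how many documents a delete removed through `deleted_count` on the result object. Here is a short sketch; `remove_pet` is a hypothetical helper, and `collection` is assumed to be a PyMongo collection.

```python
def remove_pet(collection, name):
    """Delete at most one document with this name; return how many were deleted.

    `collection` is expected to be a pymongo collection such as db.dogs.
    """
    return collection.delete_one({"name": name}).deleted_count


# Usage (assumes pymongo is installed and mongod is running locally):
#   from pymongo import MongoClient
#   client = MongoClient('localhost', 27017)
#   print(remove_pet(client.pets.dogs, 'Rusty'))
#   client.close()
```

Checking the count is a cheap way to notice that a delete matched nothing, for example because of a typo in the name.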

MongoDB is very simple to use and very simple to code with. No wonder there are MongoDB interfaces for just about every popular programming language. Its document-oriented data model lets our code stay flexible without the often painful schema changes a relational database can require. This makes it a good fit for unstructured, dynamic and frequently changing data. I hope this was helpful for you.
