
Getting Started with MongoDB & MongoMapper

Clinton R. Nixon, Development Director, September 17, 2009

As part of our NoSQL exploration, I’ve spent some time lately with MongoDB. MongoDB bills itself as a “schema-free document-oriented database.” In using MongoDB, I’ve found it to be an easy transition from an RDBMS because of the way it organizes document-based data. Here are the basics:

  • MongoDB has collections of data, not tables. Unlike CouchDB, which is also a document-oriented DB, Mongo has namespaces for data. These are schema-less, so any data could go in each namespace. In my practice, I’ve persisted objects of one class into each collection, not unlike ActiveRecord with MySQL or any other RDBMS.

  • MongoDB has indexes. Even though each collection has no schema, you can still index the data in a collection based off a field. Not all documents in a collection have to have this field.

  • MongoDB has a query language and query profiling. While you can use JavaScript to search through a collection, like CouchDB, you also have access to a rich query language that can filter based on fields, like SQL, and filter based on the contents of embedded documents, which proves to be totally freaking awesome. Instead of a complex join, you can query for all documents in the posts collection that have an embedded comment in the last month.
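To make the embedded-document query concrete, here is a minimal sketch in plain Ruby. It does not talk to a database: the posts are ordinary hashes, and the sample data, the 30-day "month", and the helper names are assumptions for illustration. The query hash uses MongoDB's dot notation and its $gt operator, and the in-memory filter shows which documents such a query would match.

```ruby
# Posts stored as documents, each with its comments embedded. Plain hashes
# stand in for what would live in a MongoDB "posts" collection.
NOW = Time.utc(2009, 9, 17)
ONE_MONTH = 30 * 24 * 60 * 60 # seconds; a rough "month" for illustration

POSTS = [
  { 'title' => 'Old news',
    'comments' => [{ 'author' => 'amy', 'created_at' => Time.utc(2009, 5, 1) }] },
  { 'title' => 'Fresh post',
    'comments' => [{ 'author' => 'bob', 'created_at' => Time.utc(2009, 9, 10) }] },
  { 'title' => 'No comments yet', 'comments' => [] }
]

# The query document one might send to MongoDB: dot notation reaches into
# the embedded comments, and $gt means "greater than".
def recent_comment_query(now = NOW)
  { 'comments.created_at' => { '$gt' => now - ONE_MONTH } }
end

# An in-memory equivalent of that query, to show which documents match.
def posts_with_recent_comments(posts, now = NOW)
  cutoff = now - ONE_MONTH
  posts.select do |post|
    post['comments'].any? { |comment| comment['created_at'] > cutoff }
  end
end

puts posts_with_recent_comments(POSTS).map { |post| post['title'] }.inspect
```

No join is involved: the comments travel with their post, so the filter only ever looks inside one document at a time.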

Given the similarities between MongoDB and a relational database, you’d think it would be easy to use in Ruby in place of ActiveRecord, and you’d be right. John Nunemaker has created a gem called MongoMapper to work as an object mapper to MongoDB. Using MongoMapper, you can create model classes like so:

class Book
  include MongoMapper::Document

  key :title, String, :required => true
  key :author, String
  key :published_at, Date
  key :user_id, String
  timestamps! # HECK YES

  belongs_to :user
  many :chapters
end  

You’ll note several things here. Keys are defined in the model, like in a DataMapper model, although they aren’t defining a schema, only a mapping for this particular model. (If the difference seems subtle, that’s because it is: MongoMapper in many ways lets you treat MongoDB as a relational DB.) The keys can be typecast as I’ve done, although they don’t have to be. I’ve defined relationships to other models, and MongoMapper is smart about this. In the case of many :chapters, it looks to see if the Chapter class is embeddable. If so, it will embed Chapter documents in my Book document. If not, it will store them in their own collection.
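The difference between the two storage choices is easier to see as data. The sketch below uses plain Ruby hashes to show roughly what each shape could look like; the ids, chapter titles, and the book_id field name are assumptions for illustration (book_id simply follows the user_id convention in the model above), not output captured from MongoMapper.

```ruby
# Shape of a Book when Chapter is embeddable: the chapters live inside
# the book document itself.
EMBEDDED_BOOK = {
  '_id' => 'book-1', # hypothetical id, for illustration
  'title' => 'Moby Dick 2',
  'chapters' => [
    { 'number' => 1, 'title' => 'Loomings, Again' },
    { 'number' => 2, 'title' => 'The Sequel' }
  ]
}

# Shape when Chapter is a full document: the book holds no chapters, and
# each chapter sits in its own collection and points back with a book_id.
REFERENCED_BOOK = { '_id' => 'book-1', 'title' => 'Moby Dick 2' }
CHAPTERS_COLLECTION = [
  { '_id' => 'ch-1', 'book_id' => 'book-1', 'number' => 1, 'title' => 'Loomings, Again' },
  { '_id' => 'ch-2', 'book_id' => 'book-1', 'number' => 2, 'title' => 'The Sequel' },
  { '_id' => 'ch-3', 'book_id' => 'book-9', 'number' => 1, 'title' => 'Someone Else' }
]

# Embedded: one read gives you the book and its chapters.
def embedded_chapter_titles(book)
  book['chapters'].map { |chapter| chapter['title'] }
end

# Referenced: a second lookup filters the chapters collection by book_id.
def referenced_chapter_titles(book, chapters)
  chapters.select { |chapter| chapter['book_id'] == book['_id'] }
          .map { |chapter| chapter['title'] }
end

puts embedded_chapter_titles(EMBEDDED_BOOK).inspect
puts referenced_chapter_titles(REFERENCED_BOOK, CHAPTERS_COLLECTION).inspect
```

Both calls return the same two titles; what changes is whether the chapters arrive with the book or need their own query.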

Just because MongoMapper defines a document with keys doesn’t mean you have to stick to those keys. Because collections are schema-less, you can add new attributes at will, like in this example:

book = Book.new(:title => "Moby Dick 2")
# => #<Book _id: , title: Moby Dick 2, author: >

book.author = "Dan Brown"
book.update_attributes(:author => "J.K. Rowling", 
                       :isbn => '1-2345-6789-0', 
                       :amazon_score => 1.25)
book.save

book = Book.find_by_title("Moby Dick 2")
# => #<Book _id: 4aafe487477a51f0e8000002, 
#         title: Moby Dick 2, 
#         author: J.K. Rowling, 
#         isbn: 1-2345-6789-0, 
#         amazon_score: 1.25> 

You can see that I can set keys defined in the class with setters, but I can set any attribute through update_attributes.
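One way to picture why declared keys get setters while other attributes do not is a toy class like the one below. This is a sketch of the idea only, not how MongoMapper is implemented; ToyDocument, ToyBook, and their methods are invented for illustration.

```ruby
# A toy document class: NOT MongoMapper's implementation, just the idea.
class ToyDocument
  # Declare a key: remember its name and define a getter and a setter.
  def self.key(name)
    declared_keys << name.to_s
    define_method(name) { @attributes[name.to_s] }
    define_method("#{name}=") { |value| @attributes[name.to_s] = value }
  end

  def self.declared_keys
    @declared_keys ||= []
  end

  attr_reader :attributes

  def initialize(attrs = {})
    @attributes = {}
    update_attributes(attrs)
  end

  # Accepts any attribute, declared or not, since nothing enforces a schema.
  def update_attributes(attrs)
    attrs.each { |name, value| @attributes[name.to_s] = value }
    self
  end

  def [](name)
    @attributes[name.to_s]
  end
end

class ToyBook < ToyDocument
  key :title
  key :author
end

book = ToyBook.new(:title => 'Moby Dick 2')
book.author = 'Dan Brown'                        # declared key: setter exists
book.update_attributes(:isbn => '1-2345-6789-0') # undeclared key: still stored

puts book.attributes.inspect
puts book.respond_to?(:isbn=) # no setter was generated for the undeclared key
```

The attributes hash is the whole document as far as storage is concerned; the declared keys only decide which convenience methods exist.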

MongoMapper’s API is roughly equivalent to ActiveRecord’s, allowing you to use it in a Rails application with little difficulty. The only things I’ve had to do are define human_name on model classes and define new_record? on embedded documents.

The only other thing you need to know to get started with MongoMapper is how to tell it what database to use. All you have to do is set MongoMapper.connection and MongoMapper.database. In my sample Rails app, I’ve put a file in config/initializers/ that looks like this:

db_config = YAML::load(File.read(RAILS_ROOT + "/config/database.yml"))

if db_config[Rails.env] &&
   db_config[Rails.env]['adapter'] == 'mongodb'
  mongo = db_config[Rails.env]
  MongoMapper.connection = Mongo::Connection.new(mongo['hostname'])
  MongoMapper.database = mongo['database']
end

You can see my database.yml file for more information on setup or check out Ben Scofield’s Rails template for MongoMapper.
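For readers without the linked file at hand, here is a hedged sketch of what such a database.yml might contain and how the adapter check plays out. Only the key names (adapter, hostname, database) come from the initializer above; the environment entries, host, and database names are made up, and mongo_settings is a hypothetical helper.

```ruby
require 'yaml'

# Hypothetical config/database.yml contents; only the key names
# (adapter, hostname, database) come from the initializer.
DATABASE_YML = <<-CONFIG
development:
  adapter: mongodb
  hostname: localhost
  database: books_development
test:
  adapter: mongodb
  hostname: localhost
  database: books_test
production:
  adapter: mysql
  database: books_production
CONFIG

# Return the settings for an environment only if it targets MongoDB;
# otherwise return nil so the caller can skip the MongoMapper setup.
def mongo_settings(yaml_text, env)
  config = YAML.load(yaml_text)[env]
  config if config && config['adapter'] == 'mongodb'
end

settings = mongo_settings(DATABASE_YML, 'development')
puts settings['hostname']
puts settings['database']
puts mongo_settings(DATABASE_YML, 'production').inspect
```

Keeping the check on the adapter name means an environment that still points at a relational database is left alone.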

That should get you started! I’ve really enjoyed using MongoDB so far. For further information, check out the MongoDB Ruby driver code, the MongoMapper code, and the code for my sample app on GitHub, and look out for more upcoming posts about how we’ve used MongoDB.

Nick Lewis said on 09/17 at 08:28 AM

Great article, Clinton. I’m definitely looking forward to this NoSQL series, as the recent push behind document-oriented stores has really got me curious. While I know this is more of a high-level approach (and just scratching the surface of MongoDB), I’d be interested in reading discussions on how well MongoDB and its counterparts like CouchDB scale, and what techniques can be used vs. an RDBMS. I’m building a new Facebook app soon and would love to play with MongoDB or Couch; I’m just a bit worried about how they handle traffic. Nonetheless, awesome writeup, and I look forward to the rest of the series!

Clinton R. Nixon said on 09/17 at 09:22 AM

Thanks, Nick. We should definitely touch on performance. MongoDB makes performance a priority. On the Philosophy page of their site, they say “By reducing transactional semantics the db provides, one can solve an interesting set of problems where performance is very important.”

I can’t really compare it to CouchDB, but I can say that in my experience, Mongo’s very fast. Built-in replication and (still in alpha) auto-sharding should help it scale well.

Mike Dirolf said on 09/17 at 10:07 AM

Great introduction to MongoDB and MongoMapper, Clinton. I think you really hit the nail on the head in terms of what makes MongoDB unique.

Nick, to follow up on what Clinton said, I work on the MongoDB team and we’re pushing on auto-sharding hard now. It’s well on its way, but once it’s fully production quality it should allow for almost infinite scalability. To see who’s using MongoDB in production now check out the wiki article on production deployments.

Nick Lewis said on 09/17 at 10:30 AM

@Mike: Thanks for the reply. I’ve been poking through the wiki and reading a bit more. Auto-sharding looks really interesting. Thanks for the link as well. A question, though: as far as MongoDB in production goes, what advice is there for scaling while auto-sharding is still in alpha? Has any other technique been used? Replication? I definitely think I’m going to roll with MongoDB on my next project; I’m just doing a little R&D now. Mike, awesome, awesome work.

@Clinton: What’s next in the series? Very excited about it, lol.

Mike Dirolf said on 09/17 at 10:40 AM

@Nick: Some people have been using replication to scale out reads. Sourceforge, for example, has a single write-only master with several read-only slaves. Single node write performance is pretty good with MongoDB, so this can take you pretty far (to really scale out writes you’ll need sharding though). Glad to hear you’re going to give MongoDB a shot - ping us on the list / irc if you have questions or need help!

John Nunemaker said on 09/17 at 11:11 AM

Clinton: Great article. A couple notes. update_attributes actually calls save internally so the save call following it is just extra.

Also, you can use bracket notation to assign keys on the fly. You don’t have to use new or update_attributes. Something like this will do the same thing.

book = Book.new(:title => 'Foobar')
book['pages'] = 345
book['genre'] = 'Awesome'
book.save

Just thought I would point those out. Thanks for the write up.

Alexander Kahn said on 09/17 at 11:29 AM

Can you go into more detail about having to set human_name and new_record?

Chris said on 09/17 at 11:51 AM

Is MongoDB really a serious database? Will it scale to tens of millions of records? Will it be quick at that scale?

I can’t see how it could get anywhere near the speed of established databases. It seems like a toy.

Clinton R. Nixon said on 09/17 at 12:18 PM

Alexander,

new_record? is defined on Documents, but not on EmbeddedDocuments. If you want a form just for an EmbeddedDocument, you’ll want to define new_record? for your form helpers. (You can handle this manually, but you’ll end up with similar logic.) There’s not a great way to do this, as EmbeddedDocuments don’t know if they’re saved. They don’t know their embedder.

You can either be super-awesome and make a callback on the embedder that sets an attribute in each of its embedded documents (which I didn’t) or write something kind of lame like:

def new_record?
  name.nil?
end

I did the latter for my demo app.
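For comparison, the callback approach could look roughly like the sketch below. It uses toy stand-ins rather than MongoMapper classes; EmbedderBook, EmbeddedChapter, mark_saved!, and the private after_save hook are all invented names, and a real model would need to wire this into whatever callbacks the library provides.

```ruby
# Toy stand-ins, not MongoMapper classes.
class EmbeddedChapter
  attr_reader :name

  def initialize(name)
    @name = name
    @saved = false
  end

  # Called by the embedder once it has been written to the database.
  def mark_saved!
    @saved = true
  end

  def new_record?
    !@saved
  end
end

class EmbedderBook
  attr_reader :chapters

  def initialize
    @chapters = []
  end

  def save
    # ... a real model would write the whole document here ...
    after_save
    true
  end

  private

  # The "callback": tell each embedded document it is now persisted.
  def after_save
    @chapters.each(&:mark_saved!)
  end
end

book = EmbedderBook.new
chapter = EmbeddedChapter.new('Loomings')
book.chapters << chapter

puts chapter.new_record? # named but not yet saved, so still a new record
book.save
puts chapter.new_record?
```

Unlike the name.nil? shortcut, this keeps reporting a chapter as new until its book has actually been saved, even if the chapter already has a name.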

I like to use a particular form builder which makes use of .human_name on models. It’s meant for ActiveRecord, but works like a charm with MongoMapper, except MM models don’t have .human_name defined. (Which makes me think, why have I not submitted a patch for this?) Here’s what I use:

def human_name
  self.to_s.humanize
end

Clinton R. Nixon said on 09/17 at 12:20 PM

@John: Thanks for the notes! I didn’t realize you could use bracket notation.

Mike Dirolf said on 09/17 at 12:37 PM

@Chris Definitely not a toy - please see my previous comment for a link to places MongoDB is already used in production (some of whom have been using it in production for almost 2 years now). Performance is one of the main reasons to choose MongoDB, actually, not a weakness - I’d suggest downloading it and giving it a try. If you have questions or concerns shoot an email to the list and we’ll help you out.

TJ Stankus said on 09/17 at 12:47 PM

Awesome write-up, Clinton. Good timing too, as I’ve got an app I’m going to try MongoDB out with.

david said on 09/17 at 01:17 PM

nice article

as additional resource, you can find two opensource projects using mongomapper here:
http://gitorious.org/shapado/shapado
and
http://gitorious.org/menki/menki

Rich said on 09/17 at 02:54 PM

Clinton,

Just curious.  What is the actual problem you’re trying to solve?  Are you running an application where an RDBMS failed to scale for you?  Or where a map/reducable key-value store with enhanced querying and relational features (the exciting future of this in my perspective) was a necessity?

Peter said on 09/17 at 03:55 PM

Clinton, how do you find it dealing with a database where you can add new attributes at will? Where a typo in code can add a new attribute rather than cause an error?

I love the idea of the flexibility available, but part of me wonders what a mess the data can/will end up in after a couple of years, unless the programmers are very strict in their usage…

John Nunemaker said on 09/17 at 04:13 PM

@Rich - It is not just about scaling. Sometimes data is easier to model with document databases than relational.

@Peter - That is what tests are for. ;) Also, MongoDB already has some features available for managing your data such as renaming collections and such and they have more planned for the future like removing/renaming keys.

daeltar said on 09/17 at 04:42 PM

How would you write tests when using MongoMapper? Any easy way to return database to known state (something like transactional fixtures)?

John Nunemaker said on 09/17 at 04:48 PM

@daeltar - Mongo has no transactions so there is no rolling back. I typically clear all model collections on setup and use factories to create my test cases.

Foo.collection.clear will empty the collection where Foo is a class that includes MongoMapper::Document.
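As a rough sketch of that setup routine, the example below uses in-memory stand-ins so it runs without a database. FakeCollection, FakeModel, MODELS, reset_collections!, and build_foo are hypothetical names; only the Foo.collection.clear call mirrors the comment above.

```ruby
# In-memory stand-ins; a real MongoMapper model's collection talks to MongoDB.
class FakeCollection
  def initialize
    @docs = []
  end

  def insert(doc)
    @docs << doc
  end

  def count
    @docs.size
  end

  def clear
    @docs.clear
  end
end

class FakeModel
  # Each model class lazily gets its own collection.
  def self.collection
    @collection ||= FakeCollection.new
  end

  def self.create(attrs)
    collection.insert(attrs)
    attrs
  end
end

class Foo < FakeModel; end
class Bar < FakeModel; end

MODELS = [Foo, Bar]

# Run before each test: there are no transactions to roll back, so start empty.
def reset_collections!
  MODELS.each { |model| model.collection.clear }
end

# A tiny factory: sensible defaults, overridable per test.
def build_foo(overrides = {})
  Foo.create({ 'name' => 'default foo' }.merge(overrides))
end

reset_collections!
build_foo
build_foo('name' => 'custom')
puts Foo.collection.count
```

In a real suite, reset_collections! would be called from the test framework's setup hook so every test begins from the same empty state.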

Rich said on 09/17 at 04:51 PM

Yes? Is this where you have tightly coupled dependent object relationships? Variable column definitions where you don’t want your database full of NULLs? Non-relational text with full search (the case where I have most often used a document-based DB)?

Ben Scofield said on 09/18 at 05:59 AM

@Rich: I’ve used Mongo for a couple of projects, generally falling into three categories (two of which you identified):

1) Tightly-coupled dependent objects – I wrote a survey-building application that handled multiple-choice questions as questions with embedded choices.

2) Domains with variable data (which you identified) - rows full of NULLs make me cry, so any time I have variable data (or STI) my document-db wheels start turning.

3) One-offs, like the app I run on my local machine to track sites around the web where I’ve left comments. I used to use SQLite for these, but they’ve generally got very flat (non-relational) persistence mappings, so a relational DB would be overkill - and being able to avoid migrations makes it just that much easier to get going.

I actually haven’t needed to resort to a document DB for full-text searching or for scalability, but there are still good reasons to pull one in.

Rich said on 09/18 at 09:14 AM

Interesting examples.  Thanks, Ben. :)

Panos said on 09/18 at 11:05 AM

Hey guys, I am evaluating MongoDB, and a search over almost 1M documents on an indexed date field (using MongoMapper) was significantly slower than MySQL (everything on a local machine). Do you have any clue why this happens?

Mike Dirolf said on 09/18 at 11:07 AM

@Panos if you don’t mind posting to the MongoDB list it will be easier to figure out what’s going on there.

suheimi said on 10/11 at 05:45 AM

Great article. I’m definitely trying out the NoSQL databases, especially MongoDB. I’m so curious. I noticed that the Mongo concept is a little bit similar to multi-dimensional, multi-valued databases like IBM U2 UniVerse. Thanks for the article; it’s awesome, and I look forward to the others in the series.

Damon said on 11/03 at 06:37 PM

@Panos/Mike I’m pretty sure it’s because MongoMapper isn’t actually creating the indexes. (MongoMapper.ensure_indexes! never gets called - will file a bug.)

Not sure if that’s intentional or not, but after creating an index on 2M documents, it was wicked fast.

Damon said on 11/03 at 08:57 PM

Update: you have to call it manually.

More details on the mailing list - http://groups.google.com/group/mongomapper/browse_thread/thread/2bebc5bdc4bb54a7
