Getting Started With Stored Procedures in MongoDB

Posted in Databases on Thursday, the 26th of August, 2010.

My most exciting technical discovery this year has been MongoDB. Mongo is proving to be a very satisfying solution to a lot of problems, so I'll no doubt be mentioning it here many times in the near future.

MongoDB has support for stored procedures, which is great, but as a newcomer, I found it wasn't entirely obvious how to get started. I also found that the typically quite excellent MongoDB documentation tended to fall short on examples of the specific things I had trouble with. So here's a quick writeup of getting stored procs up-and-running in MongoDB.

Server-Side JavaScript

The first thing you need to know is that MongoDB stored procs are written in JavaScript. This feels pretty strange at first, but in actual fact it's far more satisfying to program in a real language than in any of those vendor-specific SQL-based concoctions. It's also very powerful: MongoDB uses the same JavaScript runtime as FireFox, so you have pretty much every feature of the language, including closures, Iterators and XML support via E4X.

So in MongoDB, a stored procedure is really just a JavaScript function, which must be stored in a special collection named db.system.js. Imagine a simple function like so:

function addNumbers( x , y ) {
    return x + y;
}

To convert that to a stored proc, just insert it into db.system.js:

> db.system.js.save({_id:"addNumbers", value:function(x, y){ return x + y; }});

Stored procs can be viewed, altered and deleted in the same way as any document in any collection, as that's all they are. So we can check that our code was successfully stored by using a simple find():

> db.system.js.find()
{ "_id" : "addNumbers", "value" : function cf__3__f_(x, y) {
    return x + y;
} }

Ignore the machine-generated function name. That looks good, and we can start to actually use the proc. That's done using db.eval():

> db.eval('addNumbers(17, 25)');
42

That's all pretty simple, I think.

Preventing Injection of Vulnerabilities

One important point to be aware of is that you shouldn't directly call db.eval() on user-supplied data, as this opens you up to risks analogous with SQL injection attacks (let's call it NoSQL injection), and raises the risk of the proc choking on characters that might comprise a syntax error.

The solution to this is to modify the approach slightly, with the parameters to addNumbers() being passed in as extra parameters to db.eval() itself. In this way, MongoDB knows that the parameters are to be treated as data rather than as code. An example should make that clearer, though I was unable to find one online. Well, here's one:

> db.eval(function(x,y) { return addNumbers(x, y); }, 17, 25)
42

The syntax is kind of esoteric, as to achieve this, you have to wrap the call to addNumbers() in an anonymous function. But it works, and hopefully you can see what's going on there.

MongoDB Stored Procs from PHP

While we're here, I'll add a few notes about doing all of this from PHP, as that was the next challenge. The PHP MongoDB drivers, available via PECL, make this relatively simple, so let's have a look.

Creating a stored proc from PHP is done as follows:

<?php
 
$mongo      = new Mongo('mongodb://mymongo.server');
$database   = $mongo->selectDB('my-database-name');
$collection = $database->selectCollection('system.js');
 
$proccode = 'function addNumbers(x, y) { return x + y; }';
 
$collection->save(
                array(
                    '_id'   => 'addNumbers',
                    'value' => new MongoCode($proccode),
));

Do ensure that the "_id" is set to whatever you want the function to be called. One genuinely tricky bit here can be escaping various characters in the code that might get treated specially by PHP and/or JavaScript, including quotes, backslashes and so forth. My function contained a fairly involved regular expression, so this bit was particularly painful in my case.

Finally, we turn to calling the stored proc from PHP:

<?php
 
$toexec = 'function(x, y) { return addNumbers(x, y) }';
$args   = array(30, 12);
 
$response = $database->execute($toexec, $args);

Again we see the call to addNumbers() wrapped in an anonymous function. Remember that we don't want to bind in user-supplied data and have JavaScript happily eval() it for us as code, and this can be avoided by wrapping the arguments in an array and passing them to MongoDB::execute() separately, as we did in the example.

The data to be passed as parameters to the JavaScript function must be in the array in the order they are declared in the function signature. One other thing to note is that you mustn't be tempted to specify array keys, such as using the names of the parameters. I tried this in a well-intentioned attempt at achieving clarity, but this actually prevents the parameters being passed in correctly.

The response comes back as an array containing the return value of the function and a status indicator:

array (
    'retval' => 42,
    'ok' => 1,
)

So that's about all there is to say about that. I wanted to add a brief section about debugging stored procs, as it can be really quite difficult to see what on earth is going on when the code runs, or indeed unexpectedly doesn't run, but I'm not sure I've figured that bit out yet. If readers have any hints on debugging stored procs, then it would be great to hear about it.

Related Reading

A word of thanks must go to Kristina Chodorow for patiently fielding my beginner-level questions via the MongoDB mailing list.

Comments

Posted by Projapati on Tuesday, the 5th of April, 2011.

Looks like you are putting entire sproc\js fun code when calling the function. It might look good with you 1 line fn, what about 500 lines js fn? How should it be done for that?

Posted by Chris on Tuesday, the 10th of May, 2011.

Nice post. Thanks for your contribution it was just what I was looking for.

Posted by jim on Sunday, the 27th of May, 2012.

Thanks for the article. I agree that one of the most powerful features of MongoDB is the fact that it uses javascript as its query and stored procedure language. This opens up the possibility of allowing developers to write stored procedures rather than request them from the DBA's. Of course there are some things that really should be done in a relational database, but mongodb solves many common problems.

Posted by lenny on Saturday, the 14th of July, 2012.

Very helpful! Thanks for posting

Posted by Aaron Heckmann on Tuesday, the 29th of January, 2013.

Heads up: MongoDB 2.4 changes the javascript engine from Spidermonkey to v8, so Spidermonkey only features like E4X and Iterators will not be available.

Posted by yujie on Tuesday, the 5th of February, 2013.

mongodb procedure has a return type?

Enter your comment: