How is index performance achieved in MongoDB?

Your thoughts?

|

it's all about understanding your application and the nature of your queries. If you are running the majority of your queries on a single field, then understand the best way to index the database.

Be sure to follow the Equality, Sorting, Range (ESR) rule as well.

This basically states that your compound indexes should define index fields in this order, with equality matchers first, sorting second, and range third. This makes sense when you understand the notion of prefixes and compound indexes in Mongo.

A prefix is basically any subset of a compound index relative to the order of the keys when the index is created. for example

db.myCollection.createIndex({phone:1, email:1, name:1})

means the following prefixes...

{phone:1}
{phone:1, email:1}

these prefixes dictate the order of what fields can be used and still leverage the compound index they both participate in.

Also be sure that indexes fit on memory. If your indexes don't fit on memory then you must make sure the index increments by one each time. This allows partial indexes to be stored in RAM and things still be ok.

FInally, don't overdo indexes. You don't want to create too much book keeping in managing all the duplicate inserts, updates, deletes. Also be sure to thoroughly understand partial indexes and other optional characteristics as these can optimize and reduce performance when used appropriately.


|

Understanding ESR is key...

Equality

When creating a compound index, fields that match based on equality should be placed first.

For example, if your application runs this query frequently...

db.get({name:"Sam"}).sort({email:1})

then your compound index should order the fields like this...

db.collection.createIndex({name:1, email:1})

Sorting

Sorting should always follow equality matches. This is because sorting can be an expensive operation. By quickly eliminating 90% of the results with an equality match, the sort operation has significantly less work to do.

Range

Range operations should typically follow sorting operations. Sorting indexes can't be utilized when following range operations.

Also...

Selectivity

Keep queries selective. Always start with equality and exact matches. This plays into the ESR rule for compound index as explained above.

Apart from these fancy rules, common sense tells you to filter the result set first. MOST IMPORTANTLY, EQUALITY OPERATIONS SHOULD ELIMINATE 90% OF THE RESULT SET.

If you run a equality match on sex (male, female) then you could still filter through a bunch of records because the indexed field has low cardinality...

This...

db.collection.find({email:"sam@gmail.com"})

Not this

db.collection.find({age:{$gte:30}})

Keep things selective, especially before sorting or filtering further. This minimizes the data set your using for the other operations.

Index Properties

You can use stuff like unique and partial to your advantage. Understand all of the options available to you and leverage the performance benefits they offer.

|

1) Follow the ESR rule

2) Be selective with queries

3) Fit indexes inside RAM

4) Use indexes intelligently