For a query to use a compound index in MongoDB, it must specify a leftmost prefix of the indexed fields. The order of the fields in the index matters.
db.students.ensureIndex( { student_id : 1 } ) -> create an index on student_id in ascending order
db.students.ensureIndex( { student_id : 1, class : -1 } ) -> create a compound index
db.system.indexes.find( ) -> list all the indexes in the current database; every collection has a default index on the _id field
db.students.getIndexes( ) -> list the indexes on the students collection
db.students.dropIndex( { student_id : 1 } ) -> drop the created index
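The leftmost-prefix rule can be illustrated with a small sketch in plain JavaScript. usablePrefix is a hypothetical helper, not a MongoDB API; it only models which leading fields of a compound index a query could use, assuming the query filters on exactly the fields listed.

```javascript
// Hypothetical helper (not part of MongoDB): returns the longest leftmost
// run of index fields that also appear in the query.
function usablePrefix(indexSpec, query) {
  const prefix = [];
  for (const field of Object.keys(indexSpec)) {
    if (!(field in query)) break; // the prefix ends at the first missing field
    prefix.push(field);
  }
  return prefix;
}

const index = { student_id: 1, class: -1 };

// Both fields are given, so the whole index is usable.
console.log(usablePrefix(index, { student_id: 5, class: 2 }));
// Only the leftmost field is given: the index is still usable.
console.log(usablePrefix(index, { student_id: 5 }));
// The leftmost field is missing, so no part of the index applies.
console.log(usablePrefix(index, { class: 2 }));
```

An empty result corresponds to a query that cannot take advantage of the index.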
MongoDB allows an index on a field whose value is an array; such an index is called a multikey index.
A compound index may combine an array field with scalar fields, but within a single document at most one of the indexed fields may be an array.
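A rough model of the multikey rules, written in plain JavaScript rather than against a real server. indexEntries is a hypothetical function and the error text is illustrative; the sketch assumes non-empty arrays with distinct elements.

```javascript
// Toy model of index entries (not MongoDB code): a scalar document yields
// one entry, a document with one array field yields one entry per element.
function indexEntries(doc, fields) {
  const arrayFields = fields.filter((f) => Array.isArray(doc[f]));
  if (arrayFields.length > 1) {
    // MongoDB rejects such documents; this message is only illustrative.
    throw new Error("cannot index parallel arrays: " + arrayFields.join(", "));
  }
  if (arrayFields.length === 0) {
    return [fields.map((f) => doc[f])];
  }
  const arrayField = arrayFields[0];
  // Expand the single array field into one entry per element.
  return doc[arrayField].map((element) =>
    fields.map((f) => (f === arrayField ? element : doc[f]))
  );
}

// One scalar field plus one array field: three entries.
console.log(indexEntries({ name: "a", tags: ["x", "y", "z"] }, ["name", "tags"]));

// Two array fields in the same document: rejected.
try {
  indexEntries({ tags: ["x"], colors: ["red"] }, ["tags", "colors"]);
} catch (err) {
  console.log(err.message);
}
```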
db.stuff.ensureIndex( { thing : 1 }, { unique : true } ) -> create unique index, each key can only appear once
db.stuff.ensureIndex( { thing : 1 }, { unique : true, dropDups : true } ) -> drop the duplicates except for one
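The behaviour of unique and dropDups can be modelled in a few lines of plain JavaScript. UniqueIndex and dropDuplicates are hypothetical names for this sketch; in particular, keeping the first duplicate is an assumption of the model, not a guarantee about which document MongoDB keeps.

```javascript
// Toy model (not MongoDB code) of a unique index on one field.
class UniqueIndex {
  constructor(field) {
    this.field = field;
    this.keys = new Set();
  }
  // Throws if the key is already present, similar to a duplicate key error.
  insert(doc) {
    const key = doc[this.field];
    if (this.keys.has(key)) {
      throw new Error("duplicate key: " + String(key));
    }
    this.keys.add(key);
  }
}

// Models dropDups: keep the first document seen for each key, drop the rest.
function dropDuplicates(docs, field) {
  const seen = new Set();
  return docs.filter((doc) => {
    if (seen.has(doc[field])) return false;
    seen.add(doc[field]);
    return true;
  });
}

const stuff = [{ thing: "apple" }, { thing: "pear" }, { thing: "apple" }];
console.log(dropDuplicates(stuff, "thing").length); // 2
```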
Sparse index: index only the documents that contain the indexed field.
To decide which index to use for a query, MongoDB tries the candidate indexes in parallel on the real data, sees which one performs best, and remembers that choice.
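The trial-and-remember idea can be sketched as a toy model in plain JavaScript. This is not how the MongoDB planner is implemented: the candidates run one after another instead of in parallel, the costs are invented numbers, and choosePlan, queryShape and planCache are hypothetical names.

```javascript
// Toy model (not MongoDB internals): try each candidate plan, keep the
// cheapest, and cache the winner per query shape.
const planCache = new Map();

// The shape ignores values: { a: 1 } and { a: 7 } share one cache entry.
function queryShape(query) {
  return Object.keys(query).sort().join(",");
}

// candidates maps a plan name to a function returning a cost for the query,
// for example the number of documents the plan would examine.
function choosePlan(query, candidates) {
  const shape = queryShape(query);
  if (planCache.has(shape)) return planCache.get(shape);
  let best = null;
  let bestCost = Infinity;
  for (const [name, run] of Object.entries(candidates)) {
    const cost = run(query);
    if (cost < bestCost) {
      best = name;
      bestCost = cost;
    }
  }
  planCache.set(shape, best);
  return best;
}

// Invented costs: the student_id index examines far fewer documents.
const winner = choosePlan(
  { student_id: 5 },
  { "student_id_1": () => 1, "class_-1": () => 400 }
);
console.log(winner); // student_id_1
```

Caching by shape means a later query on the same fields skips the trial entirely, which is the "memorize" part of the note.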
db.students.stats( ) -> collection statistics, including index sizes
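Index Cardinality (number of index entries compared with number of documents)
- Regular: 1 : 1, one index entry per document
- Sparse: <= number of documents
- Multikey: can be > number of documents (one index entry per array element)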
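The three cardinality cases can be checked with a small counting sketch in plain JavaScript. entryCount is a hypothetical helper and the sample documents are invented; the model assumes non-empty arrays with distinct elements.

```javascript
// Toy model: count index entries for a regular, sparse or multikey index.
function entryCount(docs, field, { sparse = false } = {}) {
  let count = 0;
  for (const doc of docs) {
    const value = doc[field];
    if (value === undefined) {
      // A regular index still holds an entry for a document without the
      // field; a sparse index skips the document.
      if (!sparse) count += 1;
    } else if (Array.isArray(value)) {
      count += value.length; // multikey: one entry per array element
    } else {
      count += 1;
    }
  }
  return count;
}

const students = [
  { student_id: 1, nickname: "al", tags: ["a", "b", "c"] },
  { student_id: 2, tags: ["d", "e"] },
  { student_id: 3, nickname: "cy", tags: ["f"] },
];

console.log(entryCount(students, "student_id"));                 // 3
console.log(entryCount(students, "nickname", { sparse: true })); // 2
console.log(entryCount(students, "tags"));                       // 6
```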
Use hint( ) to tell MongoDB explicitly which index to use, e.g. db.students.find( { student_id : 5 } ).hint( { student_id : 1 } )
ensureIndex( { "location" : "2d" } ) -> 2D geospatial index
find( { location : { $near : [x, y] } } )
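What $near does on a flat 2d index can be approximated in plain JavaScript as sorting by straight-line distance. near is a hypothetical function for this sketch, and the sample places are invented.

```javascript
// Toy model of $near on a flat 2d index: order documents by straight-line
// distance from the point [x, y], nearest first.
function near(docs, [x, y], limit = Infinity) {
  const dist = (p) => Math.hypot(p[0] - x, p[1] - y);
  return docs
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => dist(a.location) - dist(b.location))
    .slice(0, limit);
}

const places = [
  { name: "far", location: [10, 10] },
  { name: "close", location: [1, 1] },
  { name: "middle", location: [4, 3] },
];

console.log(near(places, [0, 0], 2).map((p) => p.name));
```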
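db.places.find( { location : { $near : {
    $geometry : {
        type : 'Point',
        coordinates : [x, y] },
    $maxDistance : 2000
        }
    }
} ) -> needs a 2dsphere index on location; coordinates are [longitude, latitude] and $maxDistance is in meters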
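The effect of $maxDistance on a spherical query can be approximated in plain JavaScript with the haversine formula. This is a sketch under assumptions: haversineMeters and nearWithin are hypothetical names, the Earth radius is the common mean value of 6371 km, and MongoDB's own distance calculation may differ slightly.

```javascript
const EARTH_RADIUS_M = 6371000; // mean Earth radius, an assumption of this sketch

// Great-circle distance in meters between two [longitude, latitude] points.
function haversineMeters([lon1, lat1], [lon2, lat2]) {
  const rad = (deg) => (deg * Math.PI) / 180;
  const dLat = rad(lat2 - lat1);
  const dLon = rad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(rad(lat1)) * Math.cos(rad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
}

// Keep the places within maxDistance meters of the point, nearest first.
function nearWithin(places, point, maxDistance) {
  return places
    .map((place) => ({
      place,
      d: haversineMeters(place.location.coordinates, point),
    }))
    .filter(({ d }) => d <= maxDistance)
    .sort((p, q) => p.d - q.d)
    .map(({ place }) => place);
}

const spots = [
  { name: "cafe", location: { type: "Point", coordinates: [0, 0.01] } },
  { name: "museum", location: { type: "Point", coordinates: [0, 0.05] } },
];

// Only the cafe lies within 2000 meters of [0, 0].
console.log(nearWithin(spots, [0, 0], 2000).map((p) => p.name));
```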
db.sentences.ensureIndex( { 'words' : 'text' } ) -> create a text index to support full-text search
db.sentences.find( { $text : { $search : 'dog moss' } } ) -> match documents containing "dog" or "moss"
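The matching rule for a multi-word $search can be modelled in plain JavaScript: a document matches if it contains any of the terms. textSearch is a hypothetical function; the sketch ignores the stemming, stop words and relevance scoring that a real text index provides.

```javascript
// Toy model of $text: a document matches if it contains ANY search term.
function textSearch(docs, field, search) {
  // Lowercase and split on anything that is not a letter or digit.
  const tokens = (s) => s.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
  const terms = new Set(tokens(search));
  return docs.filter((doc) => tokens(doc[field]).some((w) => terms.has(w)));
}

const sentences = [
  { words: "The dog ran across the yard." },
  { words: "Moss grows on old trees." },
  { words: "A cat sat on the mat." },
];

console.log(textSearch(sentences, "words", "dog moss").length); // 2
```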
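Use mongotop to see where MongoDB spends most of its time, broken down by collection.
mongostat -> report server activity, such as inserts, queries, updates and deletes per second
idx miss -> how often an index was not in memory when it was needed; an important factor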
Sharding: split a large data set across several mongod instances, the shards, and run a mongos as a router that the application talks to. mongos uses the shard key to decide which shards receive each query. An insert must contain the entire shard key. For update and remove operations, if the shard key is not given, mongos broadcasts the operation to all shards.
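The routing rules can be sketched as a toy router in plain JavaScript. Everything here is an assumption made for illustration: the shard names, the key ranges, the choice of student_id as shard key, and the function names shardFor, routeInsert and routeQuery.

```javascript
// Toy model of mongos routing with range-based chunks on one shard key.
const SHARD_KEY = "student_id";
const shards = [
  { name: "s0", min: 0, max: 1000 },    // owns shard key values in [min, max)
  { name: "s1", min: 1000, max: 2000 },
];

function shardFor(value) {
  const shard = shards.find((s) => value >= s.min && value < s.max);
  if (!shard) throw new Error("no shard owns key " + value);
  return shard.name;
}

// Inserts must carry the shard key so that a single shard can be chosen.
function routeInsert(doc) {
  if (!(SHARD_KEY in doc)) throw new Error("insert is missing the shard key");
  return [shardFor(doc[SHARD_KEY])];
}

// Queries, updates and removes: targeted when the shard key is given,
// broadcast to every shard when it is not.
function routeQuery(query) {
  if (SHARD_KEY in query) return [shardFor(query[SHARD_KEY])];
  return shards.map((s) => s.name);
}

console.log(routeInsert({ student_id: 1500, class: 3 })); // goes to s1 only
console.log(routeQuery({ student_id: 42 }));              // goes to s0 only
console.log(routeQuery({ class: 3 }));                    // goes to every shard
```

Broadcast operations touch every shard, so including the shard key in queries keeps the work on one shard.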