Navigation

Index Ranking in Performance Advisor

The indexes suggested by the Performance Advisor are ordered by their respective Impact scores. Impact indicates the estimated performance improvement that the suggested index would bring.

How Performance Advisor Suggests and Ranks Indexes

The Performance Advisor monitors queries that take longer than 100 milliseconds to execute and groups these queries into common query shapes. The Performance Advisor calculates the inefficiency of each query shape by considering the following aggregated metrics from queries which match the shape:

  • Amount of time spent executing the query.
  • Number of documents scanned.
  • Number of documents returned.

To establish recommended indexes, the Performance Advisor uses these metrics in a formula to calculate the Impact, or performance improvement that creating an index matching that query shape would cause. The Performance Advisor compares the amount of time spent executing index-specific operations to the total operational latency in the deployment. When the Performance Advisor suggests indexes, the indexes are ranked by their Impact score.

Index Field Order

The type of query operation in the query shape affects the order of the fields used to construct the index. In general, fields are ranked by their cardinality.

The following table shows how the Performance Advisor ranks various operation types by order of relative importance:

Rank Operation Type Example Operator
1 Equality match $eq
2 Array query $in
3 Range query $gte
4 Type query $type
5 Exists $exists
6 All other operators $nearSphere
7 Sort sort()

Limiting Proposed Indexes

The Performance Advisor does not suggest indexes which:

  • Have more than 16 fields, and/or
  • Contain _id as a field key.

Additionally, the Performance Advisor only suggests the index if:

  • For impacted queries, the difference between scanned documents and returned documents is greater than 500, and
  • At least 60 seconds cumulatively were spent executing impacted queries over the past 24 hours.

Index De-Duplication

The Performance Advisor de-duplicates overlapping indexes before making suggestions. For example, consider if the Performance Advisor calculates the following potential suggested indexes:

{ a : 1 }
{ a : 1, b : 1 }

Since { a : 1 } is a prefix of { a : 1, b : 1 }, Performance Advisor only suggests { a : 1, b : 1 }. For more information on index prefixes, see Prefixes.

Example: New York City Taxi Data

This example uses a database named cab-db containing information about New York City taxi rides, with fields for the times of pickup and dropoff, ride distance, and a breakdown of ride costs. A typical document in the collection yellow looks like this:

{
     "_id" : ObjectId("5db9daab0b2a17b7706cd6a3"),
     "pickup_datetime" : "2014-06-30 02:09:23",
     "dropoff_datetime" : "2014-06-30 02:20:36",
     "passenger_count" : 2,
     "trip_distance" : 3,
     "fare_amount" : 12,
     "tip_amount" : 2.6,
     "total_amount" : 15.6
}

The collection contains more than 10 million documents, so an application which needs to run queries based on specific field data is going to generate some very inefficient operations unless the collection is properly indexed.

Typical queries for this application search for documents which contain a specific dropoff time, combined with one or more other fields. For example:

db.yellow.find({ "dropoff_datetime": "2014-06-19 21:45:00",
                 "passenger_count": 1,
                 "trip_distance": {"$gt": 3 }
               })

The Performance Advisor recommends the following indexes to improve performance:

Screen shot of suggested indexes

Note

By default, the Performance Advisor shows index recommendations for all collections in your cluster. To narrow the recommendations down to a specific collection, select one from the Collection dropdown menu.

The recommended indexes are listed in order of performance impact, from greatest to least. In this example, the first recommended index is estimated to decrease overall cluster operation latency time by 50%.

The first recommendation is for an index on three fields:

  • passenger_count
  • dropoff_datetime
  • trip_distance

Click the Create Index button to get a copyable shell command which you can run in the mongo shell to create the suggested index.

Creating this index improves performance by removing the need for the database engine to scan the entire collection to find documents matching the specified criteria. Queries with the shape shown in the example return results in 50 milliseconds or less on the indexed collection, as opposed to several seconds on the unindexed collection.

The suggested index command uses the background: true option so the index will build in the background and will not block any operations on the cluster. For more information about how indexes work, see the MongoDB manual.

Screen shot of index command

Note

You can also create indexes with the Data Explorer.