PHP Master | MongoDB Indexing, Part 1
Key Takeaways
- Indexing in MongoDB can greatly improve application performance and throughput by reducing the number of full documents that need to be read.
- MongoDB supports several types of indexes, including Default _id Index, Secondary Index, Compound Index, Multikey Index, and Multikey Compound Index. Each type serves a specific purpose and is used for different types of queries.
- More than one index can be defined on a collection, but a query generally uses only one of them during its execution (index intersection, described in the FAQ below, is the exception). MongoDB’s query optimizer chooses the best index at runtime.
- While indexing can dramatically improve read operations, it has its own costs. Each index occupies space and adds overhead to every insert, update, and delete operation on the collection. Therefore, indexing benefits read-heavy collections more than write-heavy ones.
This part covers the following index types:
- Default _id Index
- Secondary Index
- Compound Index
- Multikey Index
- Multikey Compound Index
The examples in this article use post documents like the following one:

```
{
    "_id": ObjectId("5146bb52d852470060001f4"),
    "comments": {
        "0": "This is the first comment",
        "1": "This is the second comment"
    },
    "post_likes": 40,
    "post_tags": {
        "0": "MongoDB",
        "1": "Tutorial",
        "2": "Indexing"
    },
    "post_text": "Hello Readers!! This is my post text",
    "post_type": "private",
    "user_name": "Mark Anthony"
}
```
Default _id Index
By default, MongoDB creates an index on the _id field of every collection. Each document has a unique _id field that acts as its primary key, by default a 12-byte ObjectId. If no other index is defined, this is the only index on the collection, so only queries on _id can use it and queries on any other field have to scan the whole collection. To view the indexes on a collection, open the MongoDB shell and call getIndexes() on that collection.
Secondary Index
When we want to use an index on a field other than _id, we have to define it ourselves. Suppose we want to search for posts by the user_name field. In this case, we define an index on the user_name field of the collection. Such custom indexes, other than the default _id index, are called secondary indexes. To demonstrate the effect of indexing, let’s first look at query performance without an index. For this, we execute a query to find all posts whose user_name is “Jim Alexandar” and call explain() on the cursor. The following fields of the explain() output are the most relevant:
- cursor – indicates the index used in the query. BasicCursor means that no index was used and MongoDB had to scan the entire collection. When an index is used, the cursor is reported as BtreeCursor instead.
- n – the number of documents the query returned (one document in this case).
- nscannedObjects – the number of documents examined by the query (in this case, all 500 documents of the collection). This cost grows with the collection, so it can be expensive when the collection is very large.
- nscanned – the number of documents or index entries scanned during the operation.
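The difference between the two cursor types can be pictured with a small in-memory model. This is only a sketch under stated assumptions: it is plain JavaScript, not MongoDB, the names `findByScan`, `buildIndex`, and `findByIndex` are hypothetical, and a `Map` stands in for the B-tree that MongoDB actually uses.

```javascript
// Toy in-memory model (not MongoDB itself): 500 posts, one of them by "Jim Alexandar".
const posts = Array.from({ length: 500 }, (_, i) => ({
  _id: i,
  user_name: i === 250 ? "Jim Alexandar" : `user_${i}`,
}));

// Without an index every document has to be inspected (the BasicCursor case).
function findByScan(docs, userName) {
  let scanned = 0;
  let matched = 0;
  for (const doc of docs) {
    scanned += 1;
    if (doc.user_name === userName) matched += 1;
  }
  return { n: matched, nscanned: scanned };
}

// A secondary index maps each field value to the positions of matching documents.
function buildIndex(docs, field) {
  const index = new Map();
  docs.forEach((doc, pos) => {
    const key = doc[field];
    if (!index.has(key)) index.set(key, []);
    index.get(key).push(pos);
  });
  return index;
}

// With the index only the matching entries are touched (the BtreeCursor case).
function findByIndex(index, userName) {
  const positions = index.get(userName) || [];
  return { n: positions.length, nscanned: positions.length };
}

const userNameIndex = buildIndex(posts, "user_name");
console.log(findByScan(posts, "Jim Alexandar"));          // { n: 1, nscanned: 500 }
console.log(findByIndex(userNameIndex, "Jim Alexandar")); // { n: 1, nscanned: 1 }
```

Both lookups return the same single document, but the scan inspects all 500 documents while the indexed lookup inspects one entry.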
Compound Index
Some queries filter on more than one field. In such cases, we can use a compound index, which covers several fields in a single index. For example, a query that filters on both the post_type and post_likes fields can be served by a compound index on those two fields.
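A compound index also supports queries on any prefix of its fields. For example, an index defined on field1, field2, and field3 (in that order) supports queries on:
- field1
- field1, field2
- field1, field2, field3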
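The prefix rule in the list above can be expressed as a short check. This is a simplified sketch, not MongoDB's query planner: the function name `indexSupportsQuery` is hypothetical, and it only considers which fields a query filters on, ignoring sort order and index direction.

```javascript
// Returns true when the queried fields are exactly a prefix of the index's field list.
// Simplified model: equality filters only; sort order and direction are ignored.
function indexSupportsQuery(indexFields, queryFields) {
  const wanted = new Set(queryFields);
  if (wanted.size === 0 || wanted.size > indexFields.length) return false;
  // The first wanted.size index fields must all be among the queried fields.
  return indexFields.slice(0, wanted.size).every((field) => wanted.has(field));
}

const compoundIndex = ["field1", "field2", "field3"];
console.log(indexSupportsQuery(compoundIndex, ["field1"]));           // true
console.log(indexSupportsQuery(compoundIndex, ["field2", "field1"])); // true, order inside the query does not matter
console.log(indexSupportsQuery(compoundIndex, ["field2"]));           // false, field2 alone is not a prefix
console.log(indexSupportsQuery(compoundIndex, ["field1", "field3"])); // false in this strict model
```

The last case is stricter than MongoDB itself, which can still use the field1 prefix of the index for part of such a query.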
For reference, the user_name query discussed in the Secondary Index section looks like this in PHP:

```php
<?php
// $collection is assumed to be a MongoCollection for the posts collection
// (legacy mongo extension); the connection setup is not shown here.

// query to find posts with user_name "Jim Alexandar"
$cursor = $collection->find(
    array("user_name" => "Jim Alexandar")
);

// use explain() to get explanation of query indexes
var_dump($cursor->explain());
```
Multikey Index
When an index is created on an array field, it is called a multikey index. Consider our post document again; we can apply a multikey index on post_tags. A multikey index indexes each element of the array, so in this case separate index entries are created for the post_tags values MongoDB, Tutorial, Indexing, and so on.
Multikey Compound Index
We can create a multikey compound index, but with the limitation that at most one field in the index can be an array. So, if field1 is a string and field2 and field3 are both arrays, we can’t define the index {field2, field3}, since both fields are arrays. An index on the post_tags and user_name fields, on the other hand, is allowed, because only post_tags is an array.
Indexing Limitations and Considerations
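The one-array-field limit on multikey compound indexes can be pictured with a small model. This is a sketch and not MongoDB's implementation: `multikeyEntries` is a hypothetical helper, and the sample post stores post_tags and comments as real arrays, which is an assumption about how the sample document is stored.

```javascript
// Toy model of multikey index entries (not MongoDB's actual implementation).
// Produces one { key, docId } entry per array element; scalar fields yield one value.
function multikeyEntries(doc, fields) {
  const arrayFields = fields.filter((field) => Array.isArray(doc[field]));
  if (arrayFields.length > 1) {
    // Mirrors the rule that at most one indexed field may hold an array.
    throw new Error(`cannot index parallel arrays: ${arrayFields.join(", ")}`);
  }
  // Start with one empty key and extend it field by field.
  let keys = [[]];
  for (const field of fields) {
    const values = Array.isArray(doc[field]) ? doc[field] : [doc[field]];
    keys = keys.flatMap((key) => values.map((value) => [...key, value]));
  }
  return keys.map((key) => ({ key, docId: doc._id }));
}

const post = {
  _id: 1,
  user_name: "Mark Anthony",
  post_tags: ["MongoDB", "Tutorial", "Indexing"],
  comments: ["This is the first comment", "This is the second comment"],
};

console.log(multikeyEntries(post, ["post_tags"]).length);              // 3
console.log(multikeyEntries(post, ["post_tags", "user_name"])[0].key); // [ 'MongoDB', 'Mark Anthony' ]
```

Calling `multikeyEntries(post, ["post_tags", "comments"])` throws in this model, because both fields are arrays.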
It is important to know that queries using regular expressions (other than simple prefix expressions anchored with ^), negation operators (e.g. $ne, $not), arithmetic operators (e.g. $mod), or JavaScript expressions in a $where clause generally cannot make efficient use of an index, and there are some other such cases. Indexes also come with their own cost. Each index occupies space and adds overhead to every insert, update, and delete operation on the collection. You need to consider the read:write ratio for each collection; indexing is beneficial to read-heavy collections, but may not be for write-heavy collections. MongoDB tries to keep indexes in RAM, so make sure that the total index size does not exceed the available RAM. If it does, parts of the indexes are paged out to disk and queries slow down. Also, a collection can have a maximum of 64 indexes.
Summary
That’s all for this part. To summarize, indexes are highly beneficial for an application if a proper indexing approach is chosen. In the next part, we’ll look at using indexes on embedded documents, sub-documents, and ordering. Stay tuned!
Frequently Asked Questions about MongoDB Indexing
What is the importance of MongoDB indexing in database management?
MongoDB indexing is a critical aspect of database management. It significantly improves the performance of database operations by providing a more efficient path to the data. Without indexes, MongoDB must perform a collection scan, i.e., scan every document in a collection, to select those documents that match the query statement. With indexes, MongoDB can limit its search to the relevant parts of the data, thereby reducing the amount of data it needs to scan. This results in faster query response times and lower CPU usage, which is particularly beneficial in large databases.
How does MongoDB indexing work?
MongoDB indexing works by creating a special data structure that holds a small portion of the collection’s data. This data structure includes the value of a specific field or set of fields, ordered by the value of the field as specified in the index. When a query is executed, MongoDB uses these indexes to limit the number of documents it must inspect. Indexes are particularly beneficial when the total size of the documents exceeds the available RAM.
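Why the ordering of an index matters can be illustrated with a sorted array and a binary search. This is only a rough stand-in for the real index structure: the helper names (`buildSortedIndex`, `lowerBound`, `rangeQuery`) and the sample data are hypothetical.

```javascript
// Sorted [value, docId] pairs stand in for the ordered index structure.
function buildSortedIndex(docs, field) {
  return docs
    .map((doc) => [doc[field], doc._id])
    .sort((a, b) => a[0] - b[0]);
}

// Binary search for the position of the first entry whose value is >= target.
function lowerBound(entries, target) {
  let lo = 0;
  let hi = entries.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (entries[mid][0] < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Ids of documents with min <= value <= max, read from one contiguous slice.
function rangeQuery(entries, min, max) {
  const ids = [];
  for (let i = lowerBound(entries, min); i < entries.length && entries[i][0] <= max; i++) {
    ids.push(entries[i][1]);
  }
  return ids;
}

const likedPosts = [
  { _id: "a", post_likes: 40 },
  { _id: "b", post_likes: 5 },
  { _id: "c", post_likes: 120 },
  { _id: "d", post_likes: 75 },
];
const likesIndex = buildSortedIndex(likedPosts, "post_likes");
console.log(rangeQuery(likesIndex, 30, 100)); // [ 'a', 'd' ]
```

Because the entries are ordered, a range query only reads the slice of the index that can match instead of inspecting every document.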
What are the different types of indexes in MongoDB?
MongoDB supports several types of indexes that you can use to improve the performance of your queries. These include Single Field, Compound, Multikey, Text, 2d, and 2dsphere indexes. Each type of index serves a specific purpose and is used for different types of queries. For example, Single Field and Compound indexes are used for queries on single or multiple fields, respectively. Multikey indexes are used for arrays, and Text indexes are used for string content.
How do I create an index in MongoDB?
You can create an index in MongoDB using the createIndex() method. This method creates an index on a specified field if the index does not already exist. The method takes two parameters: the field or fields to index and an options document that allows you to specify additional options.
Can I create multiple indexes in MongoDB?
Yes, you can create multiple indexes in MongoDB. However, it’s important to note that while indexes improve query performance, they also consume system resources, particularly disk space and memory. Therefore, it’s crucial to create indexes judiciously and only on those fields that will be frequently queried.
How do I choose which fields to index in MongoDB?
The choice of which fields to index in MongoDB largely depends on your application’s query patterns. Fields that are frequently queried or used in sort operations are good candidates for indexing. Additionally, fields with a high degree of uniqueness are also good candidates for indexing as they can significantly reduce the number of documents MongoDB needs to scan when executing a query.
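One rough way to compare candidate fields is to estimate how unique their values are. The sketch below is an illustration under assumptions, not a MongoDB feature: `selectivity` is a hypothetical helper and the sample data is made up.

```javascript
// Ratio of distinct values to documents: 1 means every value is unique.
function selectivity(docs, field) {
  if (docs.length === 0) return 0;
  const distinct = new Set(docs.map((doc) => doc[field]));
  return distinct.size / docs.length;
}

// Hypothetical sample data, shaped like the post documents used in this article.
const samplePosts = [
  { user_name: "Mark Anthony", post_type: "private" },
  { user_name: "Jim Alexandar", post_type: "public" },
  { user_name: "Jane Roe", post_type: "public" },
  { user_name: "John Doe", post_type: "public" },
];

console.log(selectivity(samplePosts, "user_name")); // 1
console.log(selectivity(samplePosts, "post_type")); // 0.5
```

In this sample, user_name narrows a query down far more than post_type, so it is the stronger candidate, provided the application actually queries on it.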
How can I check if an index exists in MongoDB?
You can check if an index exists in MongoDB using the getIndexes() method. This method returns a list of all indexes on a collection, including the _id index which is created by default.
Can I delete an index in MongoDB?
Yes, you can delete an index in MongoDB using the dropIndex() method. This method removes the specified index from a collection.
What is index intersection in MongoDB?
Index intersection is a feature in MongoDB that allows the database to use more than one index to fulfill a query. This can be particularly useful when no single index can satisfy a query, but the intersection of two or more indexes can.
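Conceptually, intersecting two indexes means looking up each condition in its own index and keeping only the documents found by both. The following is a simplified sketch with hypothetical names (`buildIdIndex`, `intersect`) and made-up data; it does not reflect how MongoDB implements or chooses intersection plans.

```javascript
// Each single-field index maps a value to the set of matching document ids.
function buildIdIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    if (!index.has(doc[field])) index.set(doc[field], new Set());
    index.get(doc[field]).add(doc._id);
  }
  return index;
}

// Keep only the ids present in both sets, iterating over the smaller one.
function intersect(a, b) {
  const [small, large] = a.size <= b.size ? [a, b] : [b, a];
  return new Set([...small].filter((id) => large.has(id)));
}

const blogPosts = [
  { _id: 1, post_type: "private", user_name: "Mark Anthony" },
  { _id: 2, post_type: "public", user_name: "Mark Anthony" },
  { _id: 3, post_type: "private", user_name: "Jim Alexandar" },
];

const byType = buildIdIndex(blogPosts, "post_type");
const byUser = buildIdIndex(blogPosts, "user_name");

// Private posts by Mark Anthony, answered from two separate single-field indexes.
const matches = intersect(
  byType.get("private") || new Set(),
  byUser.get("Mark Anthony") || new Set()
);
console.log([...matches]); // [ 1 ]
```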
What is the impact of indexing on write operations in MongoDB?
While indexing significantly improves the performance of read operations, it can have an impact on write operations. This is because each time a document is inserted or updated, all indexes on the collection must also be updated. Therefore, the more indexes a collection has, the slower the write operations will be. It’s important to find a balance between read performance and write performance when creating indexes.
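The write cost described above can be made visible by counting index updates in a toy collection. This is a sketch under stated assumptions: `IndexedCollection` is a hypothetical class, the default _id index is left out, and real costs depend on much more than the number of index writes.

```javascript
// Toy collection that keeps one Map per indexed field and counts index updates.
class IndexedCollection {
  constructor(indexedFields) {
    this.docs = [];
    this.indexes = new Map(indexedFields.map((field) => [field, new Map()]));
    this.indexWrites = 0;
  }

  insert(doc) {
    const pos = this.docs.push(doc) - 1;
    // Every index on the collection has to be updated for each inserted document.
    for (const [field, index] of this.indexes) {
      const key = doc[field];
      if (!index.has(key)) index.set(key, []);
      index.get(key).push(pos);
      this.indexWrites += 1;
    }
  }
}

const noIndexes = new IndexedCollection([]);
const threeIndexes = new IndexedCollection(["user_name", "post_type", "post_likes"]);

for (let i = 0; i < 100; i++) {
  const doc = { user_name: `user_${i}`, post_type: "public", post_likes: i % 10 };
  noIndexes.insert(doc);
  threeIndexes.insert(doc);
}

console.log(noIndexes.indexWrites);    // 0
console.log(threeIndexes.indexWrites); // 300
```

The same 100 inserts cause 300 extra index updates once three secondary indexes exist, which is the trade-off to weigh against faster reads.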