Home Database Mysql Tutorial 研究了这么久的MongoDB,我也来吐下槽。

研究了这么久的MongoDB,我也来吐下槽。

Jun 07, 2016 pm 05:56 PM
mongodb Research

MongoDB做为一款NOSQL数据库,在刚接触它的时候,被它的性能深深的吸引了。在一四核,4G内存的centos虚拟机上,插入了500W每条大小200byte的数据。发现它的写性能太令我震惊了。在不做索引的情况下,前一百万条,只用了二分钟就插完了,这只是我WIN7上的一台

   MongoDB做为一款NOSQL数据库,在刚接触它的时候,被它的性能深深的吸引了。在一四核,4G内存的centos虚拟机上,插入了500W每条大小200byte的数据。发现它的写性能太令我震惊了。在不做索引的情况下,前一百万条,只用了二分钟就插完了,这只是我WIN7上的一台虚拟机,WIN7执行插入操作。在先建索引的情况下,再插入了一百万条,只是比没有索引的情况下,慢了20秒。但发现它对磁盘的占用,有点超出了我的估计!它占用的磁盘空间太大,而实际上数据大小没有这么大。磁盘占用大小差不多是数据的三倍。

  插完数据后,进行了一些读取操作。性能还是非常可观的,查询都是MS秒级的。欣喜之余,接着再插数据。坑爹的事情就发生了。32位的mongodb最大一块文件块是512M,当512M有存储空间用完时,再插数据会先划出512M的数据块。当内存被大量占用后,发现它的插入数据,变龟速了。特别是在开辟一块新的存储空间时,完全阻塞了。Mongo在内存足够的情况下,开始插入的数据能达到6000条/秒,到内存不足后,速度瞬间降到了200条/秒,如果内存进一步退化,索引比数据量大的话,香港空间,有可能完全阻塞。

  换到64位的MongoDB测试,发现他的内存充足的情况下,比32位的插入100W条速度要快到十几秒。而且64位的MongoDB,他最大的一块存储是2G的数据块。当内存不足的情况下,哥哭了~~~,绝大部分时间在阻塞。速度降到你不能忍受。关闭MongoDB,重启centos后,再接着插入,在将部分MongoDB的数据加载进内存后,又非常快了,插入速度几M/秒。好景不长,当2G的数据块用完后,再开辟一块2G的数据块时,发现MongoDB占用的内存瞬间升高,写入速度直线下降,直至阻塞。我怀疑MongoDB在开辟了那2G的空间后,同时在内存中开辟了一块2G的内存,由于当时内存不足(发现SWAP中的虚拟内存也占用过高),所以产生了阻塞情况。MongoDB可能是内存映射写入方式,所以它在内存足够的情况下,写入速度非常快。建议实际生产环境中,如果数据量大的话,给它多留点内存吧,MongoDB绝对是吃内存的老虎。

  之后重启centos,内存又降下来了,MongoDB中已经存储了500万条数据了,再进行有索引查询,发现MongoDB在数据在冷的情况下,响应很慢,多执行几次查询预热后,性能才能回升,直至像刚插入时再查询那样。500万条数据查询,返回1000行数据内的,有索引情况下,查询时间是几十MS,然后继续测试了各种复杂查询。执行下面一条语句后,哥泪牛满面了

db.jqueue.find({"$or":[{"Name":"janson7"},{"Age":{"$in":[1,2,3]}}]}).sort({"_id":-1}).explain() { "cursor" : "BtreeCursor _Name_ reverse", "isMultiKey" : false, "n" : 301, "nscannedObjects" : 5000000, "nscanned" : 5000000, "nscannedObjectsAllPlans" : 5000000, "nscannedAllPlans" : 5000000, "scanAndOrder" : false, "indexOnly" : false, "nYields" : 0, "nChunkSkips" : 0, "millis" : 50989, "indexBounds" : { "_id" : [ [ { "$maxElement" : 1 }, { "$minElement" : 1 } ] ] }, "server" : "localhost:27017" }

  发现他全表遍历了一次,反复测试后,都是这样的情况,一去掉sort,后,就是直接读索引,或者把OR操作去掉,也是读索引。我认为,排序应该是在查询到的数据中进行排序的,也就是先去索引中找到了相应的项,再把项根据我的要求排序啊,不可能出现遍历表的情况。

  然后经过了坚辛的百度和Google,终于找到了答案,原来这是MongoDB的一个Bug,从他一设计出来后,这个Bug就一直没解决过。

  园子里这位兄台的文章里写了

  它自已的官方上的反馈:https://jira.mongodb.org/browse/SERVER-1205 发现这个问题,从10年就有人提出了,直到现在,2.2.2版本了,都还没有解决。如果有要进行$or查询,再sort排序业务的兄弟,请三思,我们开始想用MongoDB,就是因为我们业务里面这个查询是一个非常频繁且关键的查询。

  在倍受打击后,改变设计方法,改变业务模式,虚拟主机,我不再进行$or查询了,我直接用Capped Collection来做一个临时映射,通过Capped表中数据进行排序,分页偏移,再用ID去主表查询。

  在使用Capped Collection时,又发现了坑爹的事。2.2之前的版本,Capped Collection是默认没有索引的,2.2后就默认加了_id,并做索引了.我用的是C#驱动,然后按照驱动说明方法,

var collectionOptions = CollectionOptions.SetCapped(true).SetMaxDocuments(1000).SetMaxSize(1000000).SetAutoIndexId(false);

  建了一个Capped表,去MongoDB里面看,发现,他还是建了索引。头大了,又开始找资料,发现了官方提供的驱动版本是1.7版本以前的,网站空间,也就是说,这个版本有可能不会支持2.2的新功能,在2.2以前,Capped默认是不建索引的,2.2是默认建索引了。查找官方驱动源码,下载地址:https://github.com/mongodb/mongo-csharp-driver

 

Sets whether the collection is capped. CollectionOptionsBuilder SetCapped(bool value) { if (value) { _document[] = value; } else { _document.Remove(); } return this; }

发现他的源码是这样写的,因为早期版本默认情况下是不建索引的,所以,如果 SetCapped传入的参数是false的话,他就直接执行了_document.Remove("capped");这一句,直接把这个参数选项从CollectionOptions项中删除了,没有带这个参数传入至数据库,而默认情况下,它是要建索引的,也就是说,在这个驱动版本,你是怎么样做Capped都会给你建索引,最后没办法,只好改了他的源码

 

Sets whether the collection is capped. CollectionOptionsBuilder SetCapped(bool value) { _document.Remove(); }

让它不管输入什么参数,这项都得输入,然后再执行时 ,发现MongoDB里面的Capped就没有建索引了。

  

Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

Video Face Swap

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Roblox: Bubble Gum Simulator Infinity - How To Get And Use Royal Keys
3 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
Mandragora: Whispers Of The Witch Tree - How To Unlock The Grappling Hook
3 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
Nordhold: Fusion System, Explained
3 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

Hot Tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Hot Topics

Java Tutorial
1667
14
PHP Tutorial
1273
29
C# Tutorial
1255
24
Use Composer to solve the dilemma of recommendation systems: andres-montanez/recommendations-bundle Use Composer to solve the dilemma of recommendation systems: andres-montanez/recommendations-bundle Apr 18, 2025 am 11:48 AM

When developing an e-commerce website, I encountered a difficult problem: how to provide users with personalized product recommendations. Initially, I tried some simple recommendation algorithms, but the results were not ideal, and user satisfaction was also affected. In order to improve the accuracy and efficiency of the recommendation system, I decided to adopt a more professional solution. Finally, I installed andres-montanez/recommendations-bundle through Composer, which not only solved my problem, but also greatly improved the performance of the recommendation system. You can learn composer through the following address:

Navicat's method to view MongoDB database password Navicat's method to view MongoDB database password Apr 08, 2025 pm 09:39 PM

It is impossible to view MongoDB password directly through Navicat because it is stored as hash values. How to retrieve lost passwords: 1. Reset passwords; 2. Check configuration files (may contain hash values); 3. Check codes (may hardcode passwords).

How to choose a database for GitLab on CentOS How to choose a database for GitLab on CentOS Apr 14, 2025 pm 04:48 PM

GitLab Database Deployment Guide on CentOS System Selecting the right database is a key step in successfully deploying GitLab. GitLab is compatible with a variety of databases, including MySQL, PostgreSQL, and MongoDB. This article will explain in detail how to select and configure these databases. Database selection recommendation MySQL: a widely used relational database management system (RDBMS), with stable performance and suitable for most GitLab deployment scenarios. PostgreSQL: Powerful open source RDBMS, supports complex queries and advanced features, suitable for handling large data sets. MongoDB: Popular NoSQL database, good at handling sea

What is the CentOS MongoDB backup strategy? What is the CentOS MongoDB backup strategy? Apr 14, 2025 pm 04:51 PM

Detailed explanation of MongoDB efficient backup strategy under CentOS system This article will introduce in detail the various strategies for implementing MongoDB backup on CentOS system to ensure data security and business continuity. We will cover manual backups, timed backups, automated script backups, and backup methods in Docker container environments, and provide best practices for backup file management. Manual backup: Use the mongodump command to perform manual full backup, for example: mongodump-hlocalhost:27017-u username-p password-d database name-o/backup directory This command will export the data and metadata of the specified database to the specified backup directory.

MongoDB and relational database: a comprehensive comparison MongoDB and relational database: a comprehensive comparison Apr 08, 2025 pm 06:30 PM

MongoDB and relational database: In-depth comparison This article will explore in-depth the differences between NoSQL database MongoDB and traditional relational databases (such as MySQL and SQLServer). Relational databases use table structures of rows and columns to organize data, while MongoDB uses flexible document-oriented models to better suit the needs of modern applications. Mainly differentiates data structures: Relational databases use predefined schema tables to store data, and relationships between tables are established through primary keys and foreign keys; MongoDB uses JSON-like BSON documents to store them in a collection, and each document structure can be independently changed to achieve pattern-free design. Architectural design: Relational databases need to pre-defined fixed schema; MongoDB supports

How to set up users in mongodb How to set up users in mongodb Apr 12, 2025 am 08:51 AM

To set up a MongoDB user, follow these steps: 1. Connect to the server and create an administrator user. 2. Create a database to grant users access. 3. Use the createUser command to create a user and specify their role and database access rights. 4. Use the getUsers command to check the created user. 5. Optionally set other permissions or grant users permissions to a specific collection.

How to encrypt data in Debian MongoDB How to encrypt data in Debian MongoDB Apr 12, 2025 pm 08:03 PM

Encrypting MongoDB database on a Debian system requires following the following steps: Step 1: Install MongoDB First, make sure your Debian system has MongoDB installed. If not, please refer to the official MongoDB document for installation: https://docs.mongodb.com/manual/tutorial/install-mongodb-on-debian/Step 2: Generate the encryption key file Create a file containing the encryption key and set the correct permissions: ddif=/dev/urandomof=/etc/mongodb-keyfilebs=512

What are the tools to connect to mongodb What are the tools to connect to mongodb Apr 12, 2025 am 06:51 AM

The main tools for connecting to MongoDB are: 1. MongoDB Shell, suitable for quickly viewing data and performing simple operations; 2. Programming language drivers (such as PyMongo, MongoDB Java Driver, MongoDB Node.js Driver), suitable for application development, but you need to master the usage methods; 3. GUI tools (such as Robo 3T, Compass) provide a graphical interface for beginners and quick data viewing. When selecting tools, you need to consider application scenarios and technology stacks, and pay attention to connection string configuration, permission management and performance optimization, such as using connection pools and indexes.

See all articles