GridFS:基于MongoDB的分布式文件存储系统
GridFS是MongoDB之上的分布式文件系统,其利用了MongoDB的分布式存储机制并通过MongoDB来存储文件数据和文件元数据,兼具文档型数
GridFS是MongoDB之上的分布式文件系统,其利用了MongoDB的分布式存储机制并通过MongoDB来存储文件数据和文件元数据,兼具文档型数据库和文件系统的优势。GridFS是当前大数据潮流和复杂数据分析需求的产物。
简单来说,GridFS通过将文件数据和文件元数据保存在MongoDB里来实现文件系统,通过复制(Replication)来应对故障切换,数据集成,还可以用来做读扩展,热备份或者作为离线批处理的数据源,通过分片来实现自动切分数据,实现大数据存储和负载均衡,通过数据库对集合中文档的管理和查询(包括MapReduce)实现轻量级文件系统接口和搜索与分析。
GridFS的一个基本思想是可以将大文件分成很多块,每一块作为一个单独的文档存储,则有就能存储大文件了。由于MongoDB支持在文档中存储二进制数据,可以最大限度减小块的存储开销。GridFS使用MongoDB的复制,分片等机制来实现分布式文件存储,使用MongoDB进行管理和复杂分析。
GridFS使用两个文档来存储文件,一个用来存储文件本身的块,另外一个用来存储分块的信息和文件的元数据,默认对应的集合分别为fs.chunks和fs.files.
Chunks集合:
{
“_id”:
“files_id”:
“n”:
“data”:
}
块集合中文档包含以下属性:chunk_id:块ID。Chunks.files_id:对应files集合中文档的_id。Chunks.n:块的编号,由GridFS管理,从0开始。Chunks.data:文件数据,是BSON二进制类型。
Chunks集合使用files_id和n作为混合索引,files集合:
{
“_id”:
“length”:
“chunkSize”:
“uploadDate”:
“md5”:
“filename”:
“contentType”:
“aliases”:
“metadata”:
}
Files集合中的文档包含以下属性,应用还可以创建额外任意的属性:files_id:唯一的文件表示。MongoDB的默认值是BOSN ObjectID。Files.length: 文件的字节数大小。Files.chunkSize:每个块的大小,默认为256KB,GridFS根据这个值将文件分成多个快,files.uploadDate:GridFS第一次存储此文件的时间,类型为ISODate。Files.md5: 文件的md5散列值,是字符串。 Files.filename:可选。人类可读的文件名。Files.contentType: 可选。合法的文件MIME类型。Files.aliases:可选。别名的字符串数组。Files.metadata:可选。自定义存储的文件元数据。
可以通过mongofiles工具或者MongoDB驱动程序来使用GridFS,GridFS主要提供5种操作接口:
List:获取文件列表
Get:获取文件
Put:写入文件
Search:根据文件名搜索文件
Delete:删除文件
因为GridFS文件的元数据存储在files集合中,因此GridFS可以非常方便地进行文件管理,比如根据文件名,上传时间,文件大小或者自定义的文件元数据进行查询,还可以利用MapReduce做复杂数据分析。这是GridFS把传统文件系统和数据库相结合得到的众多好处之一。
对比传统文件系统的优势
分布式:GridFS是基于MongoDB的分布式文件系统,可以直接使用MongoDB Replication和Sharding机制,数据可靠性和水平扩展性都得到保证。GridFS不产生磁盘碎片,因为MongoDB分配数据文件空间时以2GB为一块。
MapReduce:可以进行复杂管理和查询分析。
索引和缓存:元数据存储在MongoDB中,非常方便索引,,并且可以对文件和文件元数据进行索引,能提高系统效率。
Checksum: GridFS会为文件产生散列值,可用于校验文件以检查完整性。
开发者友好:利用Grid可以简化需求,减小开发成本。要是已经用了MongoDB,GridFS就可以不需要使用独立文件存储架构,并且使代码和数据真正分离,方便管理。
其他: GridFS可以避免用于存储用户上传内容的文件系统出现的某些问题。例如,GridFS在同一个目录下防止大量的文件是没有任何问题的。GridFS不产生磁盘碎片,因为MongoDB分配数据文件空间时以2GB为一块。

Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

Video Face Swap
Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Hot Tools

Notepad++7.3.1
Easy-to-use and free code editor

SublimeText3 Chinese version
Chinese version, very easy to use

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Hot Topics











When developing an e-commerce website, I encountered a difficult problem: how to provide users with personalized product recommendations. Initially, I tried some simple recommendation algorithms, but the results were not ideal, and user satisfaction was also affected. In order to improve the accuracy and efficiency of the recommendation system, I decided to adopt a more professional solution. Finally, I installed andres-montanez/recommendations-bundle through Composer, which not only solved my problem, but also greatly improved the performance of the recommendation system. You can learn composer through the following address:

It is impossible to view MongoDB password directly through Navicat because it is stored as hash values. How to retrieve lost passwords: 1. Reset passwords; 2. Check configuration files (may contain hash values); 3. Check codes (may hardcode passwords).

GitLab Database Deployment Guide on CentOS System Selecting the right database is a key step in successfully deploying GitLab. GitLab is compatible with a variety of databases, including MySQL, PostgreSQL, and MongoDB. This article will explain in detail how to select and configure these databases. Database selection recommendation MySQL: a widely used relational database management system (RDBMS), with stable performance and suitable for most GitLab deployment scenarios. PostgreSQL: Powerful open source RDBMS, supports complex queries and advanced features, suitable for handling large data sets. MongoDB: Popular NoSQL database, good at handling sea

Detailed explanation of MongoDB efficient backup strategy under CentOS system This article will introduce in detail the various strategies for implementing MongoDB backup on CentOS system to ensure data security and business continuity. We will cover manual backups, timed backups, automated script backups, and backup methods in Docker container environments, and provide best practices for backup file management. Manual backup: Use the mongodump command to perform manual full backup, for example: mongodump-hlocalhost:27017-u username-p password-d database name-o/backup directory This command will export the data and metadata of the specified database to the specified backup directory.

MongoDB and relational database: In-depth comparison This article will explore in-depth the differences between NoSQL database MongoDB and traditional relational databases (such as MySQL and SQLServer). Relational databases use table structures of rows and columns to organize data, while MongoDB uses flexible document-oriented models to better suit the needs of modern applications. Mainly differentiates data structures: Relational databases use predefined schema tables to store data, and relationships between tables are established through primary keys and foreign keys; MongoDB uses JSON-like BSON documents to store them in a collection, and each document structure can be independently changed to achieve pattern-free design. Architectural design: Relational databases need to pre-defined fixed schema; MongoDB supports

To set up a MongoDB user, follow these steps: 1. Connect to the server and create an administrator user. 2. Create a database to grant users access. 3. Use the createUser command to create a user and specify their role and database access rights. 4. Use the getUsers command to check the created user. 5. Optionally set other permissions or grant users permissions to a specific collection.

Encrypting MongoDB database on a Debian system requires following the following steps: Step 1: Install MongoDB First, make sure your Debian system has MongoDB installed. If not, please refer to the official MongoDB document for installation: https://docs.mongodb.com/manual/tutorial/install-mongodb-on-debian/Step 2: Generate the encryption key file Create a file containing the encryption key and set the correct permissions: ddif=/dev/urandomof=/etc/mongodb-keyfilebs=512

The main tools for connecting to MongoDB are: 1. MongoDB Shell, suitable for quickly viewing data and performing simple operations; 2. Programming language drivers (such as PyMongo, MongoDB Java Driver, MongoDB Node.js Driver), suitable for application development, but you need to master the usage methods; 3. GUI tools (such as Robo 3T, Compass) provide a graphical interface for beginners and quick data viewing. When selecting tools, you need to consider application scenarios and technology stacks, and pay attention to connection string configuration, permission management and performance optimization, such as using connection pools and indexes.
