This article brings you relevant knowledge about Redis, which mainly introduces the relevant content about persistence strategy. RDB persistence refers to the process of storing data in memory within a specified time interval. The data set snapshot is written to the disk. Let's take a look at it. I hope it will be helpful to everyone.

A brief analysis of Redis persistence strategy

Recommended learning: Redis video tutorial

Redis (Remote Dictionary Server), that is, remote dictionary service, is a Open source in-memory cache data storage service. Written in ANSI C language, it supports network, memory-based and persistent log-type, Key-Value data storage, and provides APIs in multiple languages

Redis is an in-memory database, and the data is Stored in memory, in order to avoid permanent loss of data caused by process exit, the data in Redis needs to be regularly saved from memory to the hard disk in some form (data or command). When Redis restarts next time, use persistent files to achieve data recovery. In addition, persistent files can be copied to a remote location for disaster backup purposes. There are two persistence mechanisms for Redis:

RDB (Redis Data Base) Memory Snapshot

AOF (Append Only File) Incremental Log

RDB saves the current data to the hard disk, and AOF saves each executed write command to the hard disk (similar to MySQL's Binlog). AOF persistence has better real-time performance, that is, less data is lost when the process exits unexpectedly.

RDB Persistence

Introduction

##RDB ( Redis Data Base) refers to writing a snapshot of the data set in memory to disk within a specified time interval. RDB is a memory snapshot (a binary sequence of memory data (in the form of persistence), each time a snapshot is generated from Redis to fully back up the data.

Advantages:

Disadvantages:

RDB uses a fork child process to fully back up the memory snapshot, which is a heavyweight operation and is expensive to perform frequently.

A brief analysis of Redis persistence strategy

RDB file structure

By default, Redis saves the database snapshot in the file named dump.rdb in the binary file. The RDB file structure consists of five parts:

(1)

REDIS constant string with a length of 5 bytes.

(2) 4-byte db_version, identifying the RDB file version.

(3)databases: indefinite length, including zero or more databases, as well as key-value pair data in each database.

(4) 1-byte EOF constant, indicating the end of the file content.

(5) check_sum: 8-byte unsigned integer, saves the checksum.

A brief analysis of Redis persistence strategy

Data structure example, the following is the situation where database [0] and database [3] have data:

A brief analysis of Redis persistence strategy

Creation of RDB files

Manual instruction trigger

Manually trigger RDB persistence using the

save command and bgsave command, the difference between these two commands is as follows:

save: Execute the save command to block other operations of Redis, which will cause Redis to be unable to Responds to client requests, not recommended.

bgsave: Execute the bgsave command, Redis creates a child process in the background, and saves the snapshot asynchronously. At this time, Redis can still respond to the client's request.

Automatic interval saving

By default, Redis saves the database snapshot in a binary file named dump.rdb. Redis can be set to automatically save a data set when the condition "the data set has at least M changes within N seconds" is met.

For example, the following settings will cause Redis to automatically save a data set when it meets the condition of "at least 10 keys have been changed within 60 seconds":

save 60 10.

The default configuration of Redis is as follows. Automatic saving can be triggered if one of the three settings is met:

save 60 10000
save 300 10
save 900 1

Copy after login

The data structure of the automatic saving configuration

records the server trigger The saveparams attribute of the automatic BGSAVE condition.

lastsave Attribute: Record the time when the server last executed SAVE or BGSAVE.

dirty Attributes: and how many times the server has written since the last time the RDB file was saved.

A brief analysis of Redis persistence strategy

Backup process

When backing up the RDB persistence scheme, Redis will fork a separate child process for persistence and write the data to a temporary file, replacing the old RDB file after persistence is complete. During the entire persistence process, the main process (the process that provides services to the client) does not participate in IO operations, which ensures the high performance of the Redis service. The RDB persistence mechanism is suitable for applications that do not have high requirements for data integrity but pursue efficient recovery. Scenes. The following shows the RDB persistence process:

A brief analysis of Redis persistence strategy

The key execution steps are as follows

The Redis parent process first determines: whether save is currently being executed. Or the child process of bgsave/bgrewriteaof, if it is being executed, the bgsave command will return directly. The child processes of bgsave/bgrewriteaof cannot be executed at the same time, mainly due to performance considerations: two concurrent child processes perform a large number of disk write operations at the same time, which may cause serious performance problems.
The parent process performs a fork operation to create a child process. During this process, the parent process is blocked and Redis cannot execute any commands from the client. After the parent process forks, the bgsave command returns the "Background saving started" message and no longer blocks the parent process, and can respond to other commands.
The sub-process process generates a snapshot file for memory data.
New write operations received by the parent process during this period are written using the COW mechanism.
The child process completes the snapshot writing, replaces the old RDB file, and then the child process exits.

In the step of generating RDB files, how to deal with data inconsistencies during the process of synchronizing to disk and continuous writing? Will there be any business impact when generating snapshot RDB files?

The role of Fork sub-process

As mentioned above, during the RDB persistence process, the main process will fork a sub-process to be responsible for RDB backup. This is simple Let me introduce fork:

A program in the Linux operating system, fork will generate a child process that is exactly the same as the parent process. All the data of the child process is consistent with that of the parent process, but the child process is a completely new process and has a parent-child process relationship with the original process.

For efficiency reasons, the Linux operating system uses the COW (Copy On Write) copy-on-write mechanism. The fork child process generally shares a section of physical memory with the parent process. When the memory in the process space is modified, a copy of the memory space will be copied.

In Redis, RDB persistence makes full use of this technology. Redis calls the glibc function to fork a child process during persistence, and is fully responsible for the persistence work, so that the parent process can still continue. Provide services to clients. The child process of fork initially shares the same memory with the parent process (the main process of Redis); when the client requests to modify the data in the memory during the persistence process, the COW (Copy On Write) mechanism will be used to modify the data in the memory. The data segment page is separated, that is, a piece of memory is copied for the main process to modify.

A brief analysis of Redis persistence strategy

The child process created through fork can obtain exactly the same memory space as the parent process. The modification of the memory by the parent process is invisible to the child process, and the two will not. Mutual influence;

When creating a child process through fork, it will not immediately trigger the copy of a large amount of memory. Instead, COW (Copy On Write) is used. The kernel only creates virtual space structures for the newly generated child processes. They copy the virtual actual structures of the parent process, but do not allocate physical memory for these segments. They share the physical space of the parent process. When the parent and child processes change the corresponding segments, When the behavior occurs, physical space is allocated for the corresponding segment of the child process;

AOF 持久化

简介

AOF (Append Only File) 是把所有对内存进行修改的指令（写操作）以独立日志文件的方式进行记录，重启时通过执行 AOF 文件中的 Redis 命令来恢复数据。类似MySql bin-log 原理。AOF 能够解决数据持久化实时性问题，是现在 Redis 持久化机制中主流的持久化方案。

A brief analysis of Redis persistence strategy

优点：

数据的备份更加完整，丢失数据的概率更低，适合对数据完整性要求高的场景

日志文件可读，AOF 可操作性更强，可通过操作日志文件进行修复

缺点：

AOF 日志记录在长期运行中逐渐庞大，恢复起来非常耗时，需要定期对 AOF 日志进行瘦身处理

恢复备份速度比较慢

同步写操作频繁会带来性能压力

AOF 文件内容

被写入 AOF 文件的所有命令都是以 RESP 格式保存的，是纯文本格式保存在 AOF 文件中。

Redis 客户端和服务端之间使用一种名为 RESP(REdis Serialization Protocol) 的二进制安全文本协议进行通信。

下面以一个简单的 SET 命令进行举例：

redis> SET mykey "hello"    
//客户端命令OK

Copy after login

客户端封装为以下格式（每行用 \r\n分隔）

*3$3SET$5mykey$5hello

Copy after login

AOF 文件中记录的文本内容如下

*2\r\n$6\r\nSELECT\r\n$1\r\n0\r\n       
//多出一个SELECT 0 命令，用于指定数据库，为系统自动添加
*3\r\n$3\r\nSET\r\n$5\r\nmykey\r\n$5\r\nhello\r\n

Copy after login

AOF 持久化实现

AOF 持久化方案进行备份时，客户端所有请求的写命令都会被追加到 AOF 缓冲区中，缓冲区中的数据会根据 Redis 配置文件中配置的同步策略来同步到磁盘上的 AOF 文件中，追加保存每次写的操作到文件末尾。同时当 AOF 的文件达到重写策略配置的阈值时，Redis 会对 AOF 日志文件进行重写，给 AOF 日志文件瘦身。Redis 服务重启的时候，通过加载 AOF 日志文件来恢复数据。

A brief analysis of Redis persistence strategy

AOF 的执行流程包括：

命令追加(append)

Redis 先将写命令追加到缓冲区 aof_buf，而不是直接写入文件，主要是为了避免每次有写命令都直接写入硬盘，导致硬盘 IO 成为 Redis 负载的瓶颈。

struct redisServer {
   //其他域...   sds  aof_buf;           // sds类似于Java中的String   //其他域...}

Copy after login

文件写入(write)和文件同步(sync)

根据不同的同步策略将 aof_buf 中的内容同步到硬盘；

Linux 操作系统中为了提升性能，使用了页缓存（page cache）。当我们将 aof_buf 的内容写到磁盘上时，此时数据并没有真正的落盘，而是在 page cache 中，为了将 page cache 中的数据真正落盘，需要执行 fsync / fdatasync 命令来强制刷盘。这边的文件同步做的就是刷盘操作，或者叫文件刷盘可能更容易理解一些。

AOF 缓存区的同步文件策略由参数 appendfsync 控制，有三种同步策略，各个值的含义如下：

always：命令写入 aof_buf 后立即调用系统 write 操作和系统 fsync 操作同步到 AOF 文件，fsync 完成后线程返回。这种情况下，每次有写命令都要同步到 AOF 文件，硬盘 IO 成为性能瓶颈，Redis 只能支持大约几百TPS写入，严重降低了 Redis 的性能；即便是使用固态硬盘（SSD），每秒大约也只能处理几万个命令，而且会大大降低 SSD 的寿命。可靠性较高，数据基本不丢失。

no：命令写入 aof_buf 后调用系统 write 操作，不对 AOF 文件做 fsync 同步；同步由操作系统负责，通常同步周期为30秒。这种情况下，文件同步的时间不可控，且缓冲区中堆积的数据会很多，数据安全性无法保证。

everysec：命令写入 aof_buf 后调用系统 write 操作，write 完成后线程返回；fsync 同步文件操作由专门的线程每秒调用一次。everysec 是前述两种策略的折中，是性能和数据安全性的平衡，因此是 Redis 的默认配置，也是我们推荐的配置。

文件重写(rewrite)

定期重写 AOF 文件，达到压缩的目的。

AOF 重写是 AOF 持久化的一个机制，用来压缩 AOF 文件，通过 fork 一个子进程，重新写一个新的 AOF 文件，该次重写不是读取旧的 AOF 文件进行复制，而是读取内存中的Redis数据库，重写一份 AOF 文件，有点类似于 RDB 的快照方式。

文件重写之所以能够压缩 AOF 文件，原因在于：

过期的数据不再写入文件

无效的命令不再写入文件：如有些数据被重复设值(set mykey v1, set mykey v2)、有些数据被删除了(sadd myset v1, del myset)等等

多条命令可以合并为一个：如 sadd myset v1, sadd myset v2, sadd myset v3 可以合并为 sadd myset v1 v2 v3。不过为了防止单条命令过大造成客户端缓冲区溢出，对于 list、set、hash、zset类型的 key，并不一定只使用一条命令；而是以某个常量为界将命令拆分为多条。这个常量在 redis.h/REDIS_AOF_REWRITE_ITEMS_PER_CMD 中定义，不可更改，2.9版本中值是64。

AOF 重写

前面提到 AOF 的缺点时，说过 AOF 属于日志追加的形式来存储 Redis 的写指令，这会导致大量冗余的指令存储，从而使得 AOF 日志文件非常庞大，比如同一个 key 被写了 10000 次，最后却被删除了，这种情况不仅占内存，也会导致恢复的时候非常缓慢，因此 Redis 提供重写机制来解决这个问题。Redis 的 AOF 持久化机制执行重写后，保存的只是恢复数据的最小指令集，我们如果想手动触发可以使用如下指令：

bgrewriteaof

Copy after login

文件重写时机

思考

AOF 与 WAL

Redis 为什么考虑使用 AOF 而不是 WAL 呢?

很多数据库都是采用的 Write Ahead Log（WAL）写前日志，其特点就是先把修改的数据记录到日志中，再进行写数据的提交，可以方便通过日志进行数据恢复。

但是 Redis 采用的却是 AOF（Append Only File）写后日志，特点就是先执行写命令，把数据写入内存中，再记录日志。

If you let the system execute the command first, only the command that can be executed successfully will be recorded in the log. Therefore, Redis uses post-write logging to avoid recording incorrect commands.

Another reason is that AOF only records the log after the command is executed, so it will not block the current write operation.

Interaction between AOF and RDB

In Redis with version number greater than or equal to 2.4, BGREWRITEAOF cannot be executed during the execution of BGSAVE. On the other hand, BGSAVE cannot be executed during the execution of BGREWRITEAOF. This prevents two Redis background processes from doing heavy I/O operations on the disk at the same time.

If BGSAVE is executing and the user explicitly calls the BGREWRITEAOF command, the server will reply with an OK status to the user and inform the user that BGREWRITEAOF has been scheduled for execution: BGREWRITEAOF will officially start once BGSAVE is completed.

When Redis starts, if both RDB persistence and AOF persistence are turned on, the program will give priority to using AOF files to restore the data set, because the data saved in AOF files is usually the most complete.

Hybrid Persistence

After Redis 4.0, most usage scenarios will not use RDB or AOF alone as the persistence mechanism, but will take into account both. Advantages of mixed use. The reason is that although RDB is fast, it will lose a lot of data and cannot guarantee data integrity; although AOF can ensure data integrity as much as possible, its performance is indeed a criticism, such as replaying and recovering data.

Redis has introduced the RDB-AOF hybrid persistence mode since version 4.0. This mode is built based on the AOF persistence mode. Hybrid persistence passes aof-use-rdb-preamble yes Turn on.

Then when the Redis server performs the AOF rewrite operation, it will generate the corresponding RDB data based on the current status of the database just like executing the BGSAVE command, and write these data into the newly created AOF file. As for Redis commands executed after AOF rewriting starts will continue to be appended to the end of the new AOF file in the form of protocol text, that is, after the existing RDB data.

In other words, after the RDB-AOF hybrid persistence function is turned on, the AOF file generated by the server will be composed of two parts. The data at the beginning of the AOF file is the RDB format data, and the RDB data that follows What follows is the data in AOF format.

When a Redis server that supports RDB-AOF hybrid persistence mode starts and loads an AOF file, it will check whether the beginning of the AOF file contains RDB format content.

If included, the server will load the beginning RDB data first, and then load the subsequent AOF data.
If the AOF file only contains AOF data, the server will load the AOF data directly.

The log file structure is as follows:

A brief analysis of Redis persistence strategy

Summary

Finally To sum up the two, which one is better?

The recommendation is to enable both.

If the data is not sensitive, you can choose to use RDB alone.
If you are just doing pure memory caching, you don’t need to use it at all.

Recommended learning: Redis video tutorial

The above is the detailed content of A brief analysis of Redis persistence strategy. For more information, please follow other related articles on the PHP Chinese website!

Statement of this Website

The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress images for free

Clothoff.io

AI clothes remover

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Roblox: Grow A Garden - Complete Mutation Guide

3 weeks ago By DDD

How to fix KB5055612 fails to install in Windows 10?

3 weeks ago By DDD

Roblox: Bubble Gum Simulator Infinity - How To Get And Use Royal Keys

3 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

Mandragora: Whispers Of The Witch Tree - How To Unlock The Grappling Hook

3 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

Nordhold: Fusion System, Explained

3 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

Hot Tools

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Hot Topics

Java Tutorial

1667

CakePHP Tutorial

1426

Laravel Tutorial

1328

PHP Tutorial

1273

C# Tutorial

1255

Related knowledge

How to build the redis cluster mode Apr 10, 2025 pm 10:15 PM

Redis cluster mode deploys Redis instances to multiple servers through sharding, improving scalability and availability. The construction steps are as follows: Create odd Redis instances with different ports; Create 3 sentinel instances, monitor Redis instances and failover; configure sentinel configuration files, add monitoring Redis instance information and failover settings; configure Redis instance configuration files, enable cluster mode and specify the cluster information file path; create nodes.conf file, containing information of each Redis instance; start the cluster, execute the create command to create a cluster and specify the number of replicas; log in to the cluster to execute the CLUSTER INFO command to verify the cluster status; make

How to clear redis data Apr 10, 2025 pm 10:06 PM

How to clear Redis data: Use the FLUSHALL command to clear all key values. Use the FLUSHDB command to clear the key value of the currently selected database. Use SELECT to switch databases, and then use FLUSHDB to clear multiple databases. Use the DEL command to delete a specific key. Use the redis-cli tool to clear the data.

How to read redis queue Apr 10, 2025 pm 10:12 PM

To read a queue from Redis, you need to get the queue name, read the elements using the LPOP command, and process the empty queue. The specific steps are as follows: Get the queue name: name it with the prefix of "queue:" such as "queue:my-queue". Use the LPOP command: Eject the element from the head of the queue and return its value, such as LPOP queue:my-queue. Processing empty queues: If the queue is empty, LPOP returns nil, and you can check whether the queue exists before reading the element.

How to configure Lua script execution time in centos redis Apr 14, 2025 pm 02:12 PM

On CentOS systems, you can limit the execution time of Lua scripts by modifying Redis configuration files or using Redis commands to prevent malicious scripts from consuming too much resources. Method 1: Modify the Redis configuration file and locate the Redis configuration file: The Redis configuration file is usually located in /etc/redis/redis.conf. Edit configuration file: Open the configuration file using a text editor (such as vi or nano): sudovi/etc/redis/redis.conf Set the Lua script execution time limit: Add or modify the following lines in the configuration file to set the maximum execution time of the Lua script (unit: milliseconds)

How to use the redis command line Apr 10, 2025 pm 10:18 PM

Use the Redis command line tool (redis-cli) to manage and operate Redis through the following steps: Connect to the server, specify the address and port. Send commands to the server using the command name and parameters. Use the HELP command to view help information for a specific command. Use the QUIT command to exit the command line tool.

How to implement redis counter Apr 10, 2025 pm 10:21 PM

Redis counter is a mechanism that uses Redis key-value pair storage to implement counting operations, including the following steps: creating counter keys, increasing counts, decreasing counts, resetting counts, and obtaining counts. The advantages of Redis counters include fast speed, high concurrency, durability and simplicity and ease of use. It can be used in scenarios such as user access counting, real-time metric tracking, game scores and rankings, and order processing counting.

How to set the redis expiration policy Apr 10, 2025 pm 10:03 PM

There are two types of Redis data expiration strategies: periodic deletion: periodic scan to delete the expired key, which can be set through expired-time-cap-remove-count and expired-time-cap-remove-delay parameters. Lazy Deletion: Check for deletion expired keys only when keys are read or written. They can be set through lazyfree-lazy-eviction, lazyfree-lazy-expire, lazyfree-lazy-user-del parameters.

How to optimize the performance of debian readdir Apr 13, 2025 am 08:48 AM

In Debian systems, readdir system calls are used to read directory contents. If its performance is not good, try the following optimization strategy: Simplify the number of directory files: Split large directories into multiple small directories as much as possible, reducing the number of items processed per readdir call. Enable directory content caching: build a cache mechanism, update the cache regularly or when directory content changes, and reduce frequent calls to readdir. Memory caches (such as Memcached or Redis) or local caches (such as files or databases) can be considered. Adopt efficient data structure: If you implement directory traversal by yourself, select more efficient data structures (such as hash tables instead of linear search) to store and access directory information

See all articles