
Detailed explanation of Redis' high availability and high concurrency mechanism

Mar 23, 2021, 11:04 AM
redis


1. High concurrency mechanism

Redis is single-threaded, and a standalone instance can only handle roughly tens of thousands of requests per second. This section looks at how the master-slave architecture of Redis, combined with read/write separation, raises that to hundreds of thousands of concurrent requests on large data sets.


1. Master-slave replication

The configuration of Redis master-slave replication is not the focus here; what matters is the principle and process. In a master-slave setup, one master host serves several slave machines. When a slave starts, it sends the PSYNC command to the master. If the slave is reconnecting, the master only sends it the data it is missing; if it is connecting for the first time, a full resynchronization is triggered. The master then starts a background process to generate an RDB snapshot file while buffering all write commands received during that period in memory. When the RDB file is ready, the master sends it to the slave, which first writes it to disk and then loads it into memory; finally, the master also sends the buffered write commands to the slave. If a master-slave network failure causes several slaves to reconnect at once, the master generates a single RDB to serve all of them.

Resuming after a break: both the master and the slave keep a replication offset, along with the master's run id, and the offset data is kept in the backlog. When a slave reconnects after a network failure, the master looks up the last replication offset and continues copying from that point; if the corresponding offset can no longer be found, a full resynchronization is triggered.

①The complete process of replication

(1) The slave node starts and only saves the master node's information, including the master's host and port; the replication process has not started yet.

Where do the master's host and port come from? From the slaveof configuration in redis.conf (see the configuration sketch after this list).

(2) A scheduled task inside the slave node checks every second whether there is a new master node to connect to and replicate from; if one is found, it establishes a socket connection with the master node.
(3) The slave node sends the PING command to the master node.
(4) Password authentication: if the master has requirepass set, the slave node must send the masterauth password for authentication.
(5) For the first connection, the master node performs a full replication and sends all of its data to the slave node.
(6) The master node keeps receiving write commands afterwards and asynchronously replicates them to the slave node.
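
A minimal configuration sketch for steps (1) and (4) above, assuming a master at 192.168.1.10:6379 that has requirepass set; the address and password here are placeholders:

# redis.conf on the slave node (placeholder address and password)
slaveof 192.168.1.10 6379
# spelled "replicaof 192.168.1.10 6379" in Redis 5.0 and later
masterauth mypassword
# only needed when the master has "requirepass mypassword" configured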

②Core mechanisms of data synchronization

This refers to the full copy performed when the slave connects to the master for the first time, and to the detailed mechanisms involved in that process.

(1) Both master and slave maintain an offset

The master continuously accumulates an offset on its side, and the slave does the same on its side.
The slave reports its own offset to the master every second, and the master also records the offset of each slave.

The offset is not used only for full replication; its main purpose is to let the master and slave each know how far their data has progressed, so they can detect any inconsistency between each other.
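
As a rough illustration, these offsets can be inspected with the info replication command on either side; the field names below are standard, but the addresses and numbers are made up:

redis-cli -p 6379 info replication
# role:master
# connected_slaves:1
# slave0:ip=192.168.1.11,port=6380,state=online,offset=184530,lag=1
# master_repl_offset:184530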

(2) backlog

The master node keeps a backlog, 1MB in size by default.
When the master node replicates data to a slave node, it also writes a copy of that data into the backlog.
The backlog is mainly used for incremental replication after a full replication has been interrupted.
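
The backlog size is configurable. A sketch, with the caveat that 1MB is the default and the larger value below is only an example for write-heavy workloads where reconnecting slaves should still find their offset:

# redis.conf on the master
repl-backlog-size 64mb
# how long (in seconds) to keep the backlog after the last slave disconnects
repl-backlog-ttl 3600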

(3) master run id

Running info server shows the master run id.
Locating the master node by host and IP alone is unreliable: if the master node restarts or its data set changes, the slave node should tell masters apart by run id, and a different run id means a full copy is performed.
To restart Redis without changing the run id, use the redis-cli debug reload command.
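
For illustration, checking the run id and reloading the dataset without changing it might look like this; the id shown is a placeholder:

redis-cli info server | grep run_id
# run_id:5c0d3f8e...   (placeholder value)
redis-cli debug reload
# OK -- saves and reloads the dataset without restarting, so run_id stays the same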

(4) psync

The slave node uses PSYNC to replicate from the master node, in the form psync runid offset.
The master node replies according to its own state: FULLRESYNC runid offset triggers a full replication, while CONTINUE triggers an incremental replication.
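
A sketch of that exchange; the run id and offsets are placeholders. The slave's first PSYNC carries no usable run id or offset, which is what forces the full path:

# first connection: no known run id or offset
PSYNC ? -1
+FULLRESYNC 5c0d3f8e... 10745    # full replication follows

# reconnection: the slave supplies the previous run id and its offset
PSYNC 5c0d3f8e... 10745
+CONTINUE                        # incremental replication from the backlog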

③Full copy

(1) The master executes bgsave and generates an RDB snapshot file locally.
(2) The master node sends the RDB snapshot file to the slave node. If the RDB transfer takes more than 60 seconds (repl-timeout), the slave node considers the replication to have failed; this parameter can be raised as appropriate (see the tuning sketch after this list).
(3) A machine with a gigabit network card transfers roughly 100MB per second, so a 6GB file can easily exceed 60 seconds.
(4) While the master node is generating the RDB, it caches all new write commands in memory; after the slave node has saved the RDB, these new write commands are replicated to the slave node.
(5) client-output-buffer-limit slave 256MB 64MB 60: if, during replication, the memory buffer stays above 64MB for 60 seconds or exceeds 256MB at once, replication is stopped and fails.
(6) After the slave node receives the RDB, it clears its old data and then loads the RDB into memory; until that point it continues to serve requests based on its old data.
(7) If the slave node has AOF enabled, BGREWRITEAOF is executed immediately to rewrite the AOF.
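
A tuning sketch for the two limits mentioned in steps (2) and (5); the values shown are the defaults and normally only need raising for very large datasets or slow networks:

# redis.conf on the master
repl-timeout 60
# stop replicating to a slave whose output buffer exceeds 256mb at once,
# or stays above 64mb for 60 seconds
client-output-buffer-limit slave 256mb 64mb 60
# spelled "client-output-buffer-limit replica ..." in Redis 5.0 and later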

RDB generation, transferring the RDB over the network, clearing of old data on the slave, and the slave's AOF rewrite are all very time-consuming.

If the amount of data being copied is between 4GB and 6GB, the full copy can easily take one and a half to two minutes.

④Incremental replication

(1) If the master-slave network connection is broken during full replication, incremental replication is triggered when the slave reconnects to the master.
(2) The master takes the missing portion of data directly from its own backlog and sends it to the slave node; the backlog defaults to 1MB.
(3) The master locates that data in the backlog using the offset in the psync command sent by the slave.

⑤Heartbeat

The master and slave nodes send heartbeat information to each other.

By default, the master sends a heartbeat every 10 seconds, and the slave node sends one every second.
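
The master-side interval is configurable; a sketch showing the default (the slave's once-per-second ACK interval is fixed):

# redis.conf
repl-ping-slave-period 10
# spelled "repl-ping-replica-period 10" in Redis 5.0 and later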

⑥Asynchronous replication

Each time the master receives a write command, it first writes the data internally and then sends it asynchronously to the slave nodes.

2. Read/write separation: the master is responsible for write operations, while the slaves help the master by absorbing read queries and reducing its load.
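
Slaves are read-only by default, which is what makes this split safe. A sketch, assuming the application routes writes to the master at 192.168.1.10:6379 and reads to a slave at 192.168.1.11:6380 (both addresses are placeholders):

# redis.conf on the slave (this is already the default)
slave-read-only yes
# spelled "replica-read-only yes" in Redis 5.0 and later

# writes go to the master, reads can go to any slave
redis-cli -h 192.168.1.10 -p 6379 set user:1 tom
redis-cli -h 192.168.1.11 -p 6380 get user:1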

2. High availability mechanism

Under high concurrency, a cluster with one master and multiple slaves solves the concurrency problem, but there is still only one master host. If the master goes down, the whole system can no longer perform write operations, the slaves have nothing to synchronize from, and the system as a whole becomes unavailable. Redis's high-availability mechanism is the sentinel mechanism. The sentinel is an important component of a Redis cluster; it is responsible for cluster monitoring, message notification, failover, and acting as a configuration center.

(1) Cluster monitoring: responsible for monitoring whether the Redis master and slave processes are working normally.
(2) Message notification: if a Redis instance fails, the sentinel is responsible for sending an alarm notification to the administrator.
(3) Failover: if the master node goes down, the master role is automatically transferred to a slave node.
(4) Configuration center: if a failover occurs, clients are notified of the new master address.
Sentinel is itself distributed: it runs as a cluster of sentinels that must work together.

When the master node is judged to be down, failover requires the agreement of a majority of the sentinels, which involves a distributed election.

The sentinel mechanism needs at least 3 nodes to be robust. Suppose that during testing we deploy only two nodes, one master and one slave, each running a sentinel that monitors both nodes. When the master host goes down, the sentinels need to hold an election, but the s1 sentinel on the master machine can no longer work, so only the s2 sentinel on the slave can take part. After the election, a failover must be carried out, and the majority parameter specifies how many sentinels must agree before a failover can proceed; at that point only the single s2 sentinel remains, which does not constitute a majority, so the failover cannot be performed. This is why at least 3 nodes are needed to guarantee robustness.
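
A minimal sentinel.conf sketch for a three-sentinel deployment, assuming the master runs at 192.168.1.10:6379 and a quorum of 2; all addresses and timeouts here are placeholders or defaults:

# sentinel.conf (one copy per sentinel node)
port 26379
sentinel monitor mymaster 192.168.1.10 6379 2
sentinel down-after-milliseconds mymaster 30000
sentinel failover-timeout mymaster 180000
sentinel parallel-syncs mymaster 1
# start each sentinel with: redis-sentinel /path/to/sentinel.conf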

3. Data loss issues arising from high availability and high concurrency

(1) Data loss caused by asynchronous replication

Because master -> slave replication is asynchronous, some data may not yet have been copied to the slave when the master crashes, and that portion of the data is lost.

(2) Data loss caused by split brain

Split brain means that the machine hosting a master suddenly drops off the normal network and can no longer reach the slave machines, while the master itself is in fact still running.

At this point the sentinels may decide that the master is down, start an election, and promote another slave to master.

The cluster then contains two masters, which is what is called split brain.

Although a slave has been promoted to master, clients may not have switched to the new master yet, and the data they continue to write to the old master may be lost.

When the old master recovers, it is attached to the new master as a slave, its own data is cleared, and it copies the data from the new master again.

Solution to data loss caused by asynchronous replication and split-brain

min-slaves-to-write 1
 min-slaves-max-lag 10

This requires at least 1 slave, and the replication/sync delay must not exceed 10 seconds.

If every slave's replication delay exceeds 10 seconds, the master stops accepting any write requests.

These two settings reduce the data loss caused by asynchronous replication and split brain.

(1) Reduce data loss caused by asynchronous replication

With the min-slaves-max-lag setting, once a slave's replication and ACK lag grows too long, the master assumes that too much data would be lost if it went down and rejects further write requests. This keeps the amount of data lost because it had not yet been synchronized to the slaves within a controllable range.

(2) Reduce the data loss caused by split brain

If a master suffers a split brain and loses its connection to the other slaves, the two settings above ensure that when it cannot keep sending data to the required number of slaves and no slave has sent it an ACK for more than 10 seconds, it directly rejects clients' write requests.

In this way, the old master in a split-brain situation no longer accepts new data from clients, which avoids further data loss.

In other words, once the master has lost contact with every slave and has received no ACK for 10 seconds, it rejects new write requests.

Therefore, in a split-brain scenario, at most about 10 seconds of data is lost.
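
For completeness, the same two settings under the names used in Redis 5.0 and later; the semantics are unchanged:

min-replicas-to-write 1
min-replicas-max-lag 10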

