Detailed explanation of data structures in Redis


青灯夜游

Mar 31, 2021 am 10:26 AM

java redis Scenario application data structure interview


Redis is used frequently in real-world development, so how do we choose the right data type, and which data types suit which scenarios? Interviewers also often ask questions about Redis data structures:

Why is Redis fast?
Why does the query operation slow down?
Redis Hash rehash process
Why use a hash table as the index of Redis

Once we have analyzed and understood the Redis data structures, we can choose the right data type and improve system performance when using Redis.

`Redis` underlying data structure

Redis is an in-memory key-value database. Because the key-value data is stored in memory, Redis operates on data in memory, which makes it efficient and fast.

The key is of type String, and the value types Redis supports include String, List, Hash, Set, Sorted Set, BitMap, and so on. Redis can be applied to so many business scenarios precisely because of its diverse value types.

The value data types of Redis are implemented on top of redisObject, an object system customized for Redis:

typedef struct redisObject {
    // type
    unsigned type:4;
    // encoding
    unsigned encoding:4;
    // pointer to the underlying data structure that holds the data
    void *ptr;
    // ... (remaining fields elided)
} robj;


In addition to the actual data, a redisObject needs extra memory to record metadata such as data length and space usage. It contains 8 bytes of metadata and an 8-byte pointer, and the pointer points to the location of the actual data for the specific data type.


This pointer points to where the data is stored in one of Redis's underlying data structures. The underlying data structures are SDS, the doubly linked list, the skip list, the hash table, the compressed list, and the integer set.

So how is the underlying data structure of Redis implemented?

Redis underlying data structure implementation

Let's first look at the relatively simple ones: SDS, the doubly linked list, and the integer set.

`SDS`, doubly linked list and integer set

SDS uses the len field to record the number of bytes in use, which reduces the cost of getting the string length to O(1). SDS also releases space lazily: when you free space, SDS keeps a record of it and reuses it directly the next time it is needed, so there is no need to request new space.
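
The two SDS properties above can be sketched in a few lines of Python. This is only a toy model: the class name `SimpleDynamicString` and its methods are made up for illustration and do not mirror the real C structure.

```python
class SimpleDynamicString:
    """Toy model of an SDS: a buffer plus explicit length bookkeeping."""

    def __init__(self, initial: bytes = b""):
        self.buf = bytearray(initial)  # allocated space
        self.len = len(initial)        # number of bytes in use

    @property
    def free(self) -> int:
        # Allocated but unused bytes, kept for later reuse.
        return len(self.buf) - self.len

    def length(self) -> int:
        # O(1): read the stored field instead of scanning the buffer.
        return self.len

    def truncate(self, new_len: int) -> None:
        # Lazy release: only the logical length shrinks, the buffer stays.
        if not 0 <= new_len <= self.len:
            raise ValueError("new_len out of range")
        self.len = new_len

    def append(self, data: bytes) -> None:
        needed = self.len + len(data)
        if needed > len(self.buf):
            # Grow only when the spare space is not enough.
            self.buf.extend(bytes(needed - len(self.buf)))
        self.buf[self.len:needed] = data
        self.len = needed

    def value(self) -> bytes:
        return bytes(self.buf[:self.len])
```

After a `truncate`, a later `append` that fits into the spare bytes does not grow the buffer at all, which is the point of the lazy release.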
The integer set allocates a block of contiguous memory in which the elements are stored next to each other, so no extra pointers are needed and their space overhead is avoided. Its characteristics: the memory layout is compact and saves space; accessing an element by its position is O(1) and efficient; the complexity of other operations is O(N).
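
As a rough illustration of the integer set, the sketch below stores 8-byte integers in one contiguous `array`. The class `IntSet` and its methods are hypothetical names, and keeping the elements sorted is an assumption of this sketch rather than something stated above.

```python
from array import array
from bisect import bisect_left


class IntSet:
    """Toy integer set backed by one contiguous block of 8-byte integers."""

    def __init__(self):
        # array("q") stores raw signed 64-bit values with no per-element pointers.
        self.items = array("q")

    def add(self, value: int) -> bool:
        pos = bisect_left(self.items, value)
        if pos < len(self.items) and self.items[pos] == value:
            return False  # already present, members are unique
        # O(N): every element after pos is shifted by one slot.
        self.items.insert(pos, value)
        return True

    def at(self, index: int) -> int:
        # O(1): the address is computed from the index.
        return self.items[index]

    def __contains__(self, value: int) -> bool:
        pos = bisect_left(self.items, value)
        return pos < len(self.items) and self.items[pos] == value
```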

The doubly linked list can occupy non-contiguous, non-sequential memory; the order of its elements is maintained by the previous/next pointers, at the cost of the extra space those pointers take.

Its characteristics: inserting or updating a node in place is O(1) and efficient, while lookup is O(N).
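
A minimal sketch of these trade-offs, with hypothetical names (`Node`, `DoublyLinkedList`): inserting next to a node we already hold touches only a few pointers, while finding a value has to walk the list.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.prev = None  # pointer to the previous node
        self.next = None  # pointer to the next node


class DoublyLinkedList:
    def __init__(self):
        self.head = None
        self.tail = None

    def push_back(self, value):
        node = Node(value)
        if self.tail is None:
            self.head = self.tail = node
        else:
            node.prev = self.tail
            self.tail.next = node
            self.tail = node
        return node

    def insert_after(self, node, value):
        # O(1): only the neighbouring pointers are rewired.
        new = Node(value)
        new.prev, new.next = node, node.next
        if node.next is not None:
            node.next.prev = new
        else:
            self.tail = new
        node.next = new
        return new

    def find(self, value):
        # O(N): nodes are visited one by one from the head.
        cur = self.head
        while cur is not None and cur.value != value:
            cur = cur.next
        return cur

    def to_list(self):
        out, cur = [], self.head
        while cur is not None:
            out.append(cur.value)
            cur = cur.next
        return out
```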

`Hash` table

A hash table is essentially an array. Each element of the array is called a hash bucket, and each hash bucket stores key-value pair data. The elements in a hash bucket use the dictEntry structure.

Therefore, an element in a hash bucket does not hold the key-value pair itself but pointers to the actual key and value, so saving each key-value pair carries extra space overhead of at least 24 bytes. This matters most for key-value pairs whose value is a String, each of which needs these additional 24 bytes. When the saved data is small and the extra overhead is larger than the data itself, consider switching to a different data structure to save space.

[Figure: overview of the global hash table]
Although hash table operations are fast, a potential risk appears once the amount of data in Redis grows: hash collisions and the overhead of rehash. This explains why hash table operations can slow down.

As more data is written into the hash table, hash collisions are inevitable. Redis resolves hash collisions with chained hashing: multiple elements that fall into the same hash bucket are stored in a linked list and connected to one another by pointers.

As hash collisions increase, some collision chains become too long, so finding an element on such a chain takes a long time and efficiency drops.
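
To make the effect of long chains concrete, here is a small sketch of chained hashing with a fixed number of buckets. The class `ChainedHashTable` is hypothetical, and a Python list stands in for each bucket's linked list.

```python
class ChainedHashTable:
    """Toy chained hash table with a fixed number of buckets."""

    def __init__(self, num_buckets: int = 4):
        # Each bucket holds a chain of (key, value) pairs.
        self.buckets = [[] for _ in range(num_buckets)]

    def _index(self, key) -> int:
        return hash(key) % len(self.buckets)

    def put(self, key, value) -> None:
        chain = self.buckets[self._index(key)]
        for i, (existing, _) in enumerate(chain):
            if existing == key:
                chain[i] = (key, value)  # overwrite an existing key
                return
        chain.append((key, value))       # collision: the chain grows

    def get(self, key):
        # Lookup cost is proportional to the length of this one chain.
        for existing, value in self.buckets[self._index(key)]:
            if existing == key:
                return value
        return None

    def longest_chain(self) -> int:
        return max(len(chain) for chain in self.buckets)
```

With 4 buckets and 16 keys, every lookup may have to walk a chain of several entries, which is the slowdown that rehashing is meant to remove.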

To solve the problem of overly long chains caused by hash collisions, Redis performs a rehash operation, which increases the number of hash buckets and spreads out the elements held in each bucket. So how is the rehash performed?

`Rehash`

To make the rehash operation more efficient, Redis uses two global hash tables, hash table 1 and hash table 2. The rehash works as follows:

Allocate more space to hash table 2;
Remap the data in hash table 1 and copy it to hash table 2;
Release the space of hash table 1.

However, because the amount of data involved in remapping and copying from table 1 to table 2 is large, migrating all the data in hash table 1 in one go would block the Redis thread and leave it unable to serve other requests.

In order to avoid this problem and ensure that Redis can handle client requests normally, Redis adopts progressive rehash.

Each time a request is processed, Redis also copies all the entries at one index position from hash table 1 to hash table 2, moving on to the next index position with the next request. This spreads the cost of one large copy across the handling of many requests, avoiding a time-consuming operation and keeping data access fast.
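
The idea of progressive rehash can be sketched as follows. This is a simplified model under stated assumptions, not the Redis implementation: the class `ProgressiveDict`, the field `rehash_idx`, and the rule that the new table is twice as large are choices made for this example.

```python
class ProgressiveDict:
    """Toy dictionary that migrates one bucket per request."""

    def __init__(self, size: int = 4):
        self.tables = [[[] for _ in range(size)], None]  # table 1, table 2
        self.rehash_idx = -1  # -1 means no rehash is in progress

    def start_rehash(self) -> None:
        if self.rehash_idx == -1:
            # Step 1: allocate a larger table 2 (twice the size, by assumption).
            self.tables[1] = [[] for _ in range(len(self.tables[0]) * 2)]
            self.rehash_idx = 0

    def _rehash_step(self) -> None:
        if self.rehash_idx == -1:
            return
        old, new = self.tables
        # Step 2, done piecemeal: move every entry of ONE bucket.
        for key, value in old[self.rehash_idx]:
            new[hash(key) % len(new)].append([key, value])
        old[self.rehash_idx] = []
        self.rehash_idx += 1
        if self.rehash_idx == len(old):
            # Step 3: table 1 is empty, release it and promote table 2.
            self.tables = [new, None]
            self.rehash_idx = -1

    def _find(self, key):
        for table in self.tables:
            if table is not None:
                for pair in table[hash(key) % len(table)]:
                    if pair[0] == key:
                        return pair
        return None

    def get(self, key):
        self._rehash_step()  # piggyback a little migration on each request
        pair = self._find(key)
        return pair[1] if pair is not None else None

    def put(self, key, value) -> None:
        self._rehash_step()
        pair = self._find(key)
        if pair is not None:
            pair[1] = value
            return
        # While rehashing, new keys go straight into table 2.
        target = self.tables[1] if self.rehash_idx != -1 else self.tables[0]
        target[hash(key) % len(target)].append([key, value])
```

Both tables are consulted during the migration, so a key is always found regardless of which table currently holds it.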


Now that we understand hash tables, let's look at the less common compressed list and skip list.

Compressed list and skip list

The compressed list is based on an array. Its header has three fields, zlbytes, zltail, and zllen, which record the length of the list, the offset of the list tail, and the number of entries in the list respectively. The compressed list also has a zlend at its end, marking the end of the list.

Advantages: the memory layout is compact and saves space. A block of contiguous memory is allocated and the elements are stored next to each other, so no extra pointers are needed and their space overhead is avoided. The first and last elements can be located directly from the three header fields, with O(1) complexity.
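
The header fields can be illustrated with Python's `struct` module. The sketch assumes a 4-byte zlbytes, a 4-byte zltail, a 2-byte zllen, and a one-byte 0xFF zlend; the entry encoding (a single length byte followed by the payload) is a deliberate simplification invented for this example, and the helper names are hypothetical.

```python
import struct

ZLEND = 0xFF
# zlbytes (uint32), zltail (uint32), zllen (uint16), little-endian
HEADER = struct.Struct("<IIH")


def build_list(entries):
    """Pack non-empty byte strings (each under 256 bytes) into one buffer."""
    body = b""
    tail_offset = HEADER.size
    for entry in entries:
        tail_offset = HEADER.size + len(body)  # where this entry starts
        body += bytes([len(entry)]) + entry    # simplified entry: length + data
    total = HEADER.size + len(body) + 1        # +1 for the zlend marker
    return HEADER.pack(total, tail_offset, len(entries)) + body + bytes([ZLEND])


def read_entry(buf, offset):
    length = buf[offset]
    return buf[offset + 1:offset + 1 + length]


def first(buf):
    # The first entry always starts right after the fixed-size header.
    return read_entry(buf, HEADER.size)


def last(buf):
    # zltail gives the offset of the last entry directly, no traversal needed.
    _, zltail, _ = HEADER.unpack_from(buf, 0)
    return read_entry(buf, zltail)


def count(buf):
    return HEADER.unpack_from(buf, 0)[2]
```

Reaching an entry in the middle still means stepping through the entries one by one, which is why only the first and last elements are O(1).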

The skip list is based on a linked list and adds multi-level indexes, so that data can be located quickly with a few jumps between index positions.

[Figure: skip list lookup example, querying 33]

Features: when the amount of data is large, the lookup complexity of the skip list is O(logN).
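
A lookup through the index levels might look like the sketch below. To keep it short and deterministic, node heights are assigned by position (every 2nd node reaches level 1, every 4th level 2, and so on); that rule, and the names `SkipNode`, `build`, and `search`, are assumptions of this example only.

```python
class SkipNode:
    def __init__(self, value, height):
        self.value = value
        self.forward = [None] * height  # forward[i] is the next node on level i


def build(sorted_values, max_level=4):
    """Build a skip list from already sorted values."""
    head = SkipNode(None, max_level)
    last = [head] * max_level  # rightmost node seen so far on each level
    for i, value in enumerate(sorted_values, start=1):
        height = 1
        while height < max_level and i % (2 ** height) == 0:
            height += 1
        node = SkipNode(value, height)
        for level in range(height):
            last[level].forward[level] = node
            last[level] = node
    return head


def search(head, target):
    """Return (found, number_of_jumps)."""
    node, jumps = head, 0
    # Start on the sparsest level and drop down whenever the next node overshoots.
    for level in reversed(range(len(head.forward))):
        while node.forward[level] is not None and node.forward[level].value <= target:
            node = node.forward[level]
            jumps += 1
    return node.value == target, jumps
```

Searching for 33 among the multiples of 3 below 100 takes a couple of jumps here, whereas a plain linked list would visit all eleven nodes in front of it.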

To sum up, the lookup time complexity of the underlying data structures is:

Data structure type	Time complexity
Hash table	O(1)
Integer array	O(N)
Doubly linked list	O(N)
Compressed list	O(N)
Skip list	O(logN)

The types in Redis's custom object system are the value data types of Redis, and they are implemented on top of the underlying data structures. So what are the data types?

Redis data type

String, List, Hash, Sorted Set, and Set are the more common types. Their correspondence with the underlying data structures is as follows:

Data type	Data structure
String	SDS (Simple Dynamic String)
List	Doubly linked list, compressed list
Hash	Compressed list, hash table
Sorted Set	Compressed list, skip list
Set	Hash table, integer array

The characteristics of each data type follow from the underlying data structure that implements it:

String is implemented on top of SDS and is suitable for simple key-value storage, for implementing distributed locks with setnx key value, for counters (atomic), and for distributed globally unique IDs.
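
The lock and counter uses rely on two simple semantics: set-if-absent and atomic increment. The sketch below models them with a plain dictionary; `InMemoryStore` is a hypothetical stand-in written for this example, not a Redis client, and it ignores concurrency and expiry.

```python
class InMemoryStore:
    """Hypothetical single-process stand-in that mimics SETNX, INCR and DEL."""

    def __init__(self):
        self.data = {}

    def setnx(self, key, value) -> bool:
        # Set only if the key does not exist yet; report whether we set it.
        if key in self.data:
            return False
        self.data[key] = value
        return True

    def incr(self, key) -> int:
        # Read-modify-write as one step, so no increment is lost.
        self.data[key] = int(self.data.get(key, 0)) + 1
        return self.data[key]

    def delete(self, key) -> bool:
        return self.data.pop(key, None) is not None


def try_lock(store, name, owner) -> bool:
    # Whoever manages to create the key holds the lock.
    return store.setnx("lock:" + name, owner)


def unlock(store, name) -> bool:
    return store.delete("lock:" + name)
```

Only the first caller of `try_lock` for a given name succeeds; the others get `False` until the holder calls `unlock`.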

List is ordered by the sequence in which elements enter the List and follows the FIFO (first in, first out) rule. It is generally used for sorting statistics and simple message queues.

Hash is a mapping between string keys and string values and is well suited to representing the information of an object. Its add and delete operations are O(1).

Set is an unordered collection of String type elements. The members of the collection are unique, which means that duplicate data cannot appear in the collection. It is implemented based on a hash table, so the complexity of adding, deleting, and searching is O(1).

Sorted Set is an upgrade of the Set type. The difference is that each element is associated with a score of type double, and ordering the elements by score makes range queries possible.
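
To show why ordering by score enables range queries, here is a sketch that keeps (score, member) pairs in a sorted Python list. It uses `bisect` on a list rather than the compressed list or skip list mentioned earlier, and the class `SortedSetSketch` is a hypothetical name.

```python
from bisect import bisect_left, insort


class SortedSetSketch:
    """Toy sorted set: unique members, each with a score, ordered by score."""

    def __init__(self):
        self.scores = {}   # member -> score, keeps members unique
        self.ordered = []  # sorted list of (score, member)

    def add(self, member, score: float) -> None:
        if member in self.scores:
            # Remove the old position before re-inserting with the new score.
            old = (self.scores[member], member)
            self.ordered.pop(bisect_left(self.ordered, old))
        self.scores[member] = score
        insort(self.ordered, (score, member))

    def range_by_score(self, low: float, high: float):
        # Binary search for the first score >= low, then read sequentially.
        i = bisect_left(self.ordered, (low,))
        result = []
        while i < len(self.ordered) and self.ordered[i][0] <= high:
            result.append(self.ordered[i][1])
            i += 1
        return result
```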

Next, let's look at these data types: Redis Geo, HyperLogLog, and BitMap.

Redis Geo treats the earth as an approximate sphere and uses GeoHash to convert two-dimensional longitude and latitude into strings, which enables location partitioning and queries within a specified distance. It is generally used in location-related applications.
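
The GeoHash idea can be sketched with the textbook base32 encoding: repeatedly halve the longitude and latitude ranges, record one bit per halving, and interleave the bits. This illustrates the general technique only; it is not claimed to be the exact encoding Redis uses internally, and `geohash_encode` is a hypothetical helper.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat: float, lon: float, precision: int = 8) -> str:
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    bits = []
    use_lon = True  # bits alternate: longitude, latitude, longitude, ...
    while len(bits) < precision * 5:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits.append(1)
            rng[0] = mid  # keep the upper half
        else:
            bits.append(0)
            rng[1] = mid  # keep the lower half
        use_lon = not use_lon
    chars = []
    for i in range(0, len(bits), 5):
        index = 0
        for bit in bits[i:i + 5]:
            index = (index << 1) | bit
        chars.append(BASE32[index])  # every 5 bits become one character
    return "".join(chars)
```

Points that fall into the same cell share a prefix, so a longer common prefix generally means the two points are closer together.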

HyperLogLog is a probabilistic data structure that uses probabilistic algorithms to count the approximate cardinality of a set, with an error rate of approximately 0.81%. When the number of set elements is very large, the space required to calculate the cardinality is always fixed and very small, making it suitable for UV statistics.
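
The 0.81% figure can be reproduced with a short calculation. The sketch assumes the usual HyperLogLog standard error of 1.04 / sqrt(m) and m = 16384 registers of 6 bits each; neither number appears in the text above, so treat both as assumptions of this example.

```python
import math

HLL_REGISTERS = 16384  # assumed number of registers (2 ** 14)
BITS_PER_REGISTER = 6  # assumed register width


def hll_standard_error(num_registers: int) -> float:
    # The relative error shrinks with the square root of the register count.
    return 1.04 / math.sqrt(num_registers)


def hll_memory_bytes(num_registers: int, bits_per_register: int) -> int:
    # Memory depends only on the register count, not on how many elements are added.
    return num_registers * bits_per_register // 8


error = hll_standard_error(HLL_REGISTERS)                    # about 0.0081
memory = hll_memory_bytes(HLL_REGISTERS, BITS_PER_REGISTER)  # 12288 bytes
```

Under these assumptions the error is about 0.81% and the structure stays at roughly 12 KB no matter how many elements are counted.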

BitMap uses one bit to represent the state of an element, and there are only two states, 0 and 1, a typical binary state. It is a data type for binary-state statistics implemented with the String type as its underlying data structure. Its advantage is that it saves a great deal of memory, and it can be used in binary-state statistics scenarios.

With the above in mind, let's discuss how to choose a Redis data type for a given application scenario.
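
A bitmap on top of a byte string can be sketched like this. The class `Bitmap` is hypothetical; its methods are named after the SETBIT, GETBIT, and BITCOUNT commands, and offset 0 is taken to be the most significant bit of the first byte.

```python
class Bitmap:
    """Toy bitmap stored in a growable byte string."""

    def __init__(self):
        self.buf = bytearray()

    def setbit(self, offset: int, value: int) -> int:
        byte_index, bit_index = divmod(offset, 8)
        if byte_index >= len(self.buf):
            # Grow with zero bytes so that the target byte exists.
            self.buf.extend(bytes(byte_index + 1 - len(self.buf)))
        mask = 0x80 >> bit_index
        previous = 1 if self.buf[byte_index] & mask else 0
        if value:
            self.buf[byte_index] |= mask
        else:
            self.buf[byte_index] &= ~mask & 0xFF
        return previous  # the bit's old value

    def getbit(self, offset: int) -> int:
        byte_index, bit_index = divmod(offset, 8)
        if byte_index >= len(self.buf):
            return 0  # bits beyond the end read as 0
        return 1 if self.buf[byte_index] & (0x80 >> bit_index) else 0

    def bitcount(self) -> int:
        return sum(bin(byte).count("1") for byte in self.buf)
```

Recording a month of daily check-ins for one user this way takes 4 bytes, one bit per day.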

Choose the appropriate `Redis` data type

In real-world development Redis can be applied to many business scenarios, but how do we choose which data type to store our data in?

The main basis is time/space complexity. In actual development, the following points can be considered:

Amount of data, the size of the data itself
Collection type statistical mode
Supports single point query/range query
Special usage scenarios

Amount of data, the size of the data itself

When the amount of data is large but each item is small, using String greatly increases the extra space used. A hash table is used to save the key-value pairs, and each pair is stored in a dictEntry structure, so saving each key-value pair incurs the overhead of the three additional pointers in dictEntry. The data itself ends up smaller than this extra overhead, and the stored data takes far more space than the original data.

In this case you can use List, Hash, or Sorted Set, which are implemented on integer arrays and compressed lists. Integer arrays and compressed lists allocate a block of contiguous memory and place the elements of the collection one after another in it. This is very compact, and no extra pointers are needed to link the elements, which avoids the space overhead of extra pointers. Moreover, with a collection type one key corresponds to a whole collection of data, so much more data can be saved while only one dictEntry is used, which saves memory.
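
A back-of-the-envelope calculation makes the difference visible. It uses only the numbers given earlier (three 8-byte pointers per dictEntry, 16 bytes per redisObject) and ignores allocator rounding, the keys and values themselves, and the per-element cost inside the compact structure; the function names and the group size of 1000 are hypothetical.

```python
POINTER_SIZE = 8
DICT_ENTRY_SIZE = 3 * POINTER_SIZE  # three pointers: 24 bytes
REDIS_OBJECT_SIZE = 8 + 8           # 8 bytes of metadata + 8-byte pointer


def per_key_overhead(num_pairs: int) -> int:
    # One dictEntry and one redisObject for every single key-value pair.
    return num_pairs * (DICT_ENTRY_SIZE + REDIS_OBJECT_SIZE)


def grouped_overhead(num_pairs: int, pairs_per_collection: int) -> int:
    # One dictEntry and one redisObject per collection, shared by its elements.
    collections = -(-num_pairs // pairs_per_collection)  # ceiling division
    return collections * (DICT_ENTRY_SIZE + REDIS_OBJECT_SIZE)


as_strings = per_key_overhead(1_000_000)            # 40,000,000 bytes
as_collections = grouped_overhead(1_000_000, 1000)  # 40,000 bytes
```

For a million small values, the bookkeeping alone drops from about 40 MB to about 40 KB once they are grouped into collections of 1000.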

Collection type statistical mode

Common collection type statistical modes in Redis include:

Aggregation statistics (intersection, difference, and union): when performing aggregation calculations on multiple sets, choose Set;
Sorting statistics (the collection type must preserve the order of its elements): List and Sorted Set in Redis are ordered collections. List is ordered by the sequence in which elements enter the List, and Sorted Set can be ordered by the weight of its elements;
Binary state statistics (the values of the elements are only 0 and 1): Bitmap is a data type for binary-state statistics implemented with the String type as its underlying data structure. After combining Bitmaps with the BITOP bitwise AND, OR, and XOR operations, use BITCOUNT to count the number of 1s;
Cardinality statistics (counting the number of distinct elements in a set): HyperLogLog is a collection type used for cardinality counting. Its results carry some error, with a standard error rate of 0.81%. If you need exact results, use the Set or Hash type.


The Set type is suitable for aggregation operations on collections of users, friends, follows, fans, or interested people, such as:

Counting the number of new users of a mobile app each day
Finding the common friends of two users
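
The two examples above map directly onto set difference and set intersection. The sketch uses Python's built-in `set` to show the logic; in Redis the corresponding commands would be SDIFF and SINTER. The helper names and user IDs are made up.

```python
def new_users(all_users_so_far: set, users_today: set) -> set:
    # Difference: users seen today who were never seen before.
    return users_today - all_users_so_far


def common_friends(friends_a: set, friends_b: set) -> set:
    # Intersection: members present in both sets.
    return friends_a & friends_b


seen_before = {"u1", "u2", "u3"}
today = {"u2", "u3", "u4", "u5"}
added_today = new_users(seen_before, today)  # the two first-time users
seen_before |= today                         # union: update the cumulative set

alice_friends = {"bob", "carol", "dave"}
bob_friends = {"alice", "carol", "dave", "erin"}
shared = common_friends(alice_friends, bob_friends)
```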

List and Sorted Set in Redis are ordered collections, used to handle requirements for ordering the elements of a collection, such as:

Latest comments list
Ranking

Bitmap binary state statistics suit cases with a large amount of data that can be represented by a binary state, such as:

Sign-in and clock-in, the number of users who checked in on a given day
Weekly active users
User online status

HyperLogLog is a collection type used for cardinality counting, that is, counting the number of distinct elements in a collection. For example:

Counting the UV of a web page, where multiple visits by one user in a day are counted only once.

Supports single point query/range query

In Redis, List and Sorted Set are ordered collections that support range queries, but Hash does not support range queries.

Special usage scenarios

Message queue: when Redis is used to implement a message queue, the basic requirements are to preserve message order, handle duplicate messages, and ensure message reliability. The solutions are as follows:

List-based message queue solution
Streams-based message queue solution

	Based on List	Based on Streams
Message order preservation	Use `LPUSH/RPOP`	Use `XADD/XREAD`
Blocking read	Use `BRPOP`	Use `XREAD block`
Duplicate message processing	Producers implement globally unique IDs by themselves	Streams automatically generates globally unique IDs
Message reliability	Use `BRPOPLPUSH`	Use `PENDING List` to automatically retain messages
Applicable scenarios	The total amount of messages is small	The total amount of messages is large and data needs to be read in the form of consumer groups
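
The List-based column can be sketched with two in-memory deques. This is a non-blocking, single-process model only: `ListQueueSketch` and its method names are hypothetical, the methods imitate what LPUSH, RPOP, and BRPOPLPUSH do to the lists, and the "id:payload" message format is an assumption used to show producer-side IDs.

```python
from collections import deque


class ListQueueSketch:
    """Toy model of a List-based queue with a backup list."""

    def __init__(self):
        self.queue = deque()   # main list
        self.backup = deque()  # backup list for messages being processed
        self.seen_ids = set()  # consumer-side record for duplicate detection

    def lpush(self, message: str) -> None:
        self.queue.appendleft(message)  # producers push on the left

    def rpop(self):
        # Consumers pop on the right, so messages leave in arrival order.
        return self.queue.pop() if self.queue else None

    def rpoplpush(self):
        # Pop and keep a copy in the backup list until it is acknowledged.
        if not self.queue:
            return None
        message = self.queue.pop()
        self.backup.appendleft(message)
        return message

    def ack(self, message: str) -> None:
        self.backup.remove(message)

    def is_duplicate(self, message: str) -> bool:
        # The producer puts a globally unique ID in front of the payload.
        message_id = message.split(":", 1)[0]
        if message_id in self.seen_ids:
            return True
        self.seen_ids.add(message_id)
        return False
```

If a consumer crashes after `rpoplpush` but before `ack`, the message is still in the backup list and can be processed again.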

Location-based services (LBS) are implemented with GEO, a data type specific to Redis. GEO records geographic location as longitude and latitude and is widely used in LBS services, for example when a taxi-hailing app provides service based on location.

Summary

Redis is so fast because it operates on data in memory and uses a hash table as its index, which is efficient and fast. Thanks to the variety of its underlying data structures, it can be applied to many scenarios, and choosing the appropriate data type for each scenario improves query performance.
