


Detailed examples of computing the intersection, union, and difference (complement) of data with Redis
This article covers Redis and focuses on computing the intersection, union, and difference (complement) of data sets. If all of these calculations are performed in JVM memory, it is easy to trigger an OOM (out-of-memory) exception because of insufficient memory. Let's take a look. I hope it will be helpful to everyone.
Scenario description
Today we will simulate the following scenario. We have several text files locally, and each file stores a large number of 32-character strings that serve as unique user identifiers, one user per line. With a very large number of users every day, we may need to compute the intersection, union, or difference of these user sets at work. The simplest way is to use Java collections, for example HashSet. The limitation is that the JVM usually runs with a limited amount of memory: if all calculations are performed in JVM memory, it is easy to cause an OOM exception from insufficient memory. So today we introduce a more scalable way to perform these operations: using Redis to compute the intersection, union, and difference of the data.
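For comparison, the in-JVM approach mentioned above can be sketched as follows. This is only a minimal illustration with java.util.HashSet; the class name InMemorySetOps and the sample IDs are made up for the example. Every element of both inputs and of the result lives on the heap, which is what becomes a problem at large scale.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not part of the article's code.
class InMemorySetOps {

    // Elements present in both a and b.
    static Set<String> intersection(Set<String> a, Set<String> b) {
        Set<String> result = new HashSet<>(a); // copy so the inputs stay untouched
        result.retainAll(b);
        return result;
    }

    // Elements present in a, in b, or in both.
    static Set<String> union(Set<String> a, Set<String> b) {
        Set<String> result = new HashSet<>(a);
        result.addAll(b);
        return result;
    }

    // Elements of a that are not in b; the argument order matters.
    static Set<String> difference(Set<String> a, Set<String> b) {
        Set<String> result = new HashSet<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<String> a = new HashSet<>(List.of("u1", "u2", "u3"));
        Set<String> b = new HashSet<>(List.of("u2", "u3", "u4"));
        System.out.println("intersection size: " + intersection(a, b).size());
        System.out.println("union size: " + union(a, b).size());
        System.out.println("difference size: " + difference(a, b).size());
    }
}
```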
Environment Description
Redis version: Redis 6.0.6
Jedis version: 4.2.2
Tool hutool version: 5.8.0.M3
pom file:
<dependencies>
    <dependency>
        <groupId>redis.clients</groupId>
        <artifactId>jedis</artifactId>
        <version>4.2.2</version>
    </dependency>
    <dependency>
        <groupId>cn.hutool</groupId>
        <artifactId>hutool-all</artifactId>
        <version>5.8.0.M3</version>
    </dependency>
</dependencies>
Intersection, union, and difference calculation
Initialization constants
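import cn.hutool.core.io.FileUtil;
import cn.hutool.crypto.SecureUtil;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.Pipeline;

import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class RedisCalculateUtils {
    // Input files and result files
    static String oneFileString = "/Users/tmp/test-1.txt";
    static String twoFileString = "/Users/tmp/test-2.txt";
    static String diffFileString = "/Users/tmp/diff-test.txt";
    static String interFileString = "/Users/tmp/inter-test.txt";
    static String unionFileString = "/Users/tmp/union-test.txt";

    // Redis keys for the two input sets and the three result sets
    static String oneFileCacheKey = "oneFile";
    static String twoFileCacheKey = "twoFile";
    static String diffFileCacheKey = "diffFile";
    static String interFileCacheKey = "interFile";
    static String unionFileCacheKey = "unionFile";

    // The static methods shown in the following sections belong inside this class.
}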
Initialize data to the specified file
/**
 * Generates the initial data and writes it to the two input files.
 */
public static void writeFile() {
    File oneFile = new File(oneFileString);
    List<String> fs = new ArrayList<>(10000);
    // File one: MD5 of the numbers 10000..14999
    for (int i = 10000; i < 15000; i++) {
        String s = SecureUtil.md5(String.valueOf(i));
        fs.add(s);
    }
    FileUtil.writeUtf8Lines(fs, oneFile);

    File twoFile = new File(twoFileString);
    fs.clear();
    // File two: MD5 of the numbers 12000..19999
    for (int i = 12000; i < 20000; i++) {
        String s = SecureUtil.md5(String.valueOf(i));
        fs.add(s);
    }
    FileUtil.writeUtf8Lines(fs, twoFile);
}
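The two loops above cover overlapping ranges, so the expected result sizes can be worked out before touching Redis. The sketch below does the arithmetic with plain integers standing in for the MD5 strings, on the assumption that distinct numbers produce distinct digests (which holds in practice for this range). The class name ExpectedCounts is hypothetical.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical helper: integers stand in for the MD5 strings written to the files.
class ExpectedCounts {

    static Set<Integer> range(int fromInclusive, int toExclusive) {
        return IntStream.range(fromInclusive, toExclusive).boxed().collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        Set<Integer> one = range(10000, 15000); // IDs behind test-1.txt (5000 lines)
        Set<Integer> two = range(12000, 20000); // IDs behind test-2.txt (8000 lines)

        Set<Integer> inter = new HashSet<>(one);
        inter.retainAll(two); // 12000..14999
        Set<Integer> diff = new HashSet<>(one);
        diff.removeAll(two);  // 10000..11999
        Set<Integer> union = new HashSet<>(one);
        union.addAll(two);    // 10000..19999

        System.out.println("intersection: " + inter.size());          // 3000
        System.out.println("difference (one - two): " + diff.size()); // 2000
        System.out.println("union: " + union.size());                 // 10000
    }
}
```

These are the counts that the diff, inter, and union methods later in the article should print for the data as generated here.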
Write the specified file to Redis
/**
 * Reads the file data and writes it to Redis.
 */
public static void writeCache() {
    try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
        // A pipeline queues the SADD commands and sends them in bulk on sync()
        Pipeline p = jedis.pipelined();
        List<String> oneFileStringList = FileUtil.readLines(oneFileString, "UTF-8");
        for (String s : oneFileStringList) {
            p.sadd(oneFileCacheKey, s);
        }
        p.sync();

        List<String> twoFileStringList = FileUtil.readLines(twoFileString, "UTF-8");
        for (String s : twoFileStringList) {
            p.sadd(twoFileCacheKey, s);
        }
        p.sync();
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}
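FileUtil.readLines loads the whole file into a list, which works against the goal of keeping JVM memory usage small when the files are very large. One possible alternative, sketched below, streams the file line by line and hands fixed-size batches to a callback. The class BatchedLineReader, the method forEachBatch, and the batch size are assumptions made for illustration. With Jedis, the callback could issue one p.sadd(key, batch.toArray(new String[0])) per batch, since SADD accepts several members in one call.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: streams a file and emits batches of at most batchSize lines.
class BatchedLineReader {

    // Returns the total number of non-empty lines passed to the sink.
    static long forEachBatch(Path file, int batchSize, Consumer<List<String>> sink) throws IOException {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        long total = 0;
        List<String> batch = new ArrayList<>(batchSize);
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                batch.add(line);
                total++;
                if (batch.size() == batchSize) {
                    sink.accept(new ArrayList<>(batch)); // hand over a copy, then reuse the buffer
                    batch.clear();
                }
            }
        }
        if (!batch.isEmpty()) {
            sink.accept(new ArrayList<>(batch)); // final partial batch
        }
        return total;
    }
}
```

Only one batch is held in memory at a time, so heap usage depends on the batch size rather than on the file size.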
Calculation of difference set
/**
 * Computes the difference between the set at oneKey and the set at twoKey
 * and stores the result in threeKey.
 * @param oneKey   key of the set to subtract from
 * @param twoKey   key of the set to subtract
 * @param threeKey key that receives the difference
 */
public static void diff(String oneKey, String twoKey, String threeKey) {
    try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
        long result = jedis.sdiffstore(threeKey, oneKey, twoKey);
        System.out.println("Number of members in the difference of oneKey and twoKey: " + result);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}
The difference set calculation result is written to the specified file
/**
 * Writes the computed difference to the specified file.
 */
public static void writeDiffToFile() {
    File diffFile = new File(diffFileString);
    try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
        Set<String> result = jedis.smembers(diffFileCacheKey);
        FileUtil.writeUtf8Lines(result, diffFile);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}
Intersection calculation
/**
 * Computes the intersection of multiple keys and stores it in a new key.
 * @param cacheKeyArray  keys of the sets to intersect
 * @param destinationKey key that receives the intersection
 */
public static void inter(String[] cacheKeyArray, String destinationKey) {
    try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
        long result = jedis.sinterstore(destinationKey, cacheKeyArray);
        System.out.println("Number of members in the intersection of cacheKeyArray: " + result);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}
The intersection calculation result is written to the specified file
/**
 * Writes the computed intersection to the specified file.
 */
public static void writeInterToFile() {
    File interFile = new File(interFileString);
    try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
        Set<String> result = jedis.smembers(interFileCacheKey);
        FileUtil.writeUtf8Lines(result, interFile);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}
Union calculation
/**
 * Computes the union of multiple keys and stores it in a new key.
 * @param cacheKeyArray  keys of the sets to unite
 * @param destinationKey key that receives the union
 */
public static void union(String[] cacheKeyArray, String destinationKey) {
    try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
        long result = jedis.sunionstore(destinationKey, cacheKeyArray);
        System.out.println("Number of members in the union of cacheKeyArray: " + result);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}
Write the union calculation result to the specified file
/**
 * Writes the computed union to the specified file.
 */
public static void writeUnionToFile() {
    File unionFile = new File(unionFileString);
    try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
        Set<String> result = jedis.smembers(unionFileCacheKey);
        FileUtil.writeUtf8Lines(result, unionFile);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}
Redis command description
SDIFFSTORE destination key [key …]
Example:
key1 = {a,b,c,d}
key2 = {c}
key3 = {a,c,e}
SDIFF key1 key2 key3 = {b,d}
The SDIFFSTORE command is similar to SDIFF. The difference is that it saves the result in the destination set instead of returning the result set to the client.
If the destination set already exists, it will be overwritten.
- Return value
Number of members in the result set
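With more than two keys, SDIFF subtracts every later set from the first one, so the order of the keys matters. A minimal in-memory model of that rule, using the example sets above, might look like this. The class and method names are hypothetical and no Redis server is involved.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory model of SDIFF semantics; not a Redis client.
class SdiffModel {

    // The first set minus all of the following sets, mirroring SDIFF key [key ...].
    static Set<String> sdiff(List<Set<String>> sets) {
        if (sets.isEmpty()) {
            throw new IllegalArgumentException("at least one set is required");
        }
        Set<String> result = new HashSet<>(sets.get(0));
        for (Set<String> other : sets.subList(1, sets.size())) {
            result.removeAll(other);
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> key1 = Set.of("a", "b", "c", "d");
        Set<String> key2 = Set.of("c");
        Set<String> key3 = Set.of("a", "c", "e");

        // key1 first: {a,b,c,d} minus {c} minus {a,c,e} leaves {b,d}
        System.out.println(sdiff(List.of(key1, key2, key3)).equals(Set.of("b", "d"))); // true
        // key3 first: {a,c,e} minus {a,b,c,d} minus {c} leaves {e}
        System.out.println(sdiff(List.of(key3, key1, key2)).equals(Set.of("e")));      // true
    }
}
```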
SINTERSTORE destination key [key …]
Example:
key1 = {a,b,c,d}
key2 = {c}
key3 = {a,c,e}
SINTER key1 key2 key3 = {c}
The SINTERSTORE command is similar to the SINTER command, except that it does not directly return the result set, but saves the results in the destination collection.
If the destination collection exists, it will be overwritten.
- Return value
Number of members in the result set
SUNIONSTORE destination key [key …]
Example:
key1 = {a,b,c,d}
key2 = {c}
key3 = {a,c,e}
SUNION key1 key2 key3 = {a,b,c,d,e}
The function of the SUNIONSTORE command is similar to that of SUNION. The difference is that the result set is not returned but is stored in the destination.
If destination already exists, it will be overwritten.
- Return value
Number of members in the result set
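The two properties shared by the STORE variants, overwriting the destination and returning a member count, can be illustrated with a small in-memory stand-in for the keyspace. This is a sketch under stated assumptions, not how Redis is implemented: the class StoreModel and its map are hypothetical, and only SUNIONSTORE is modeled.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in for a Redis keyspace holding sets.
class StoreModel {

    final Map<String, Set<String>> keyspace = new HashMap<>();

    // Models SUNIONSTORE destination key [key ...]: returns the size of the stored union.
    long sunionstore(String destination, String... keys) {
        Set<String> result = new HashSet<>();
        for (String key : keys) {
            result.addAll(keyspace.getOrDefault(key, Set.of())); // a missing key acts as an empty set
        }
        if (result.isEmpty()) {
            keyspace.remove(destination); // an empty set is not kept as a key
        } else {
            keyspace.put(destination, result); // any previous value is replaced, not merged
        }
        return result.size();
    }

    public static void main(String[] args) {
        StoreModel redis = new StoreModel();
        redis.keyspace.put("key1", Set.of("a", "b", "c", "d"));
        redis.keyspace.put("key2", Set.of("c"));
        redis.keyspace.put("key3", Set.of("a", "c", "e"));
        redis.keyspace.put("dest", Set.of("stale")); // pre-existing destination

        long count = redis.sunionstore("dest", "key1", "key2", "key3");
        System.out.println(count);                                        // 5
        System.out.println(redis.keyspace.get("dest").contains("stale")); // false
    }
}
```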