Memory Usage Tests of Several Database Design Schemes for Redis

Jun 07, 2016 pm 04:30 PM

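I have recently been working on a project that uses Redis as its database. While designing the data structures I could not tell which implementation was optimal, so I ran some tests.

The test environment:
OS X 10.8.3
Redis 2.6.12
Python 2.7.4
redis-py 2.7.2
hiredis 0.1.1
ujson 1.30
MessagePack 0.3.0
Notes:
  1. Because the tests were run from Python, the results may not fully apply to other languages.
  2. The test data is specific, so the results may not fully apply to smaller or larger data.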

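I won't list the raw results; here are the conclusions.
  1. The worst storage scheme is one hash per entity (that is, per record). It is 1 to 2 times slower than the other schemes and uses a fairly large amount of memory.
    More importantly, fields come back as strings, so you have to convert the types yourself.
    Its only advantage is that a single field can be operated on individually.
  2. Storing each entity as a string is not recommended either, though it is slightly better than the previous scheme. When individual entities are small, the drawback that keys take up a lot of memory becomes apparent.
  3. Using one hash to store all entities of a type (that is, one table) is simple to implement, and memory usage is acceptable.
  4. Using multiple hashes to store all entities of a type (that is, sharded tables) is slightly more complex to implement but uses the least memory.
    When each field value is small (the default limit is 64 bytes) and a single hash holds few fields (the default limit is 512), Redis stores the hash as a hash zipmap, which reduces memory usage significantly.
    The number of fields per hash should preferably be a power of 2, for example 1024. Slightly exceeding that value increases both memory usage and latency.
    Instagram's engineers consider about 1000 fields optimal when using hash zipmap. In my tests, however, speed basically decreased as the number of fields grew, while the change in memory usage from 128 up to 1024 fields was negligible.
  5. Storing entities as JSON is a good choice. For content containing Chinese, setting ensure_ascii=False saves a lot of memory.
    ujson performs much better than json, whose performance drops sharply once ensure_ascii=False is set.
  6. cPickle performs worse than ujson but supports more types (such as datetime).
  7. MessagePack has a slight, barely noticeable performance advantage over ujson, but it loses readability, and unicode values have to be decoded manually when read back.
    The claim that it is 4 times faster than Protocol Buffer can probably be ignored; at least its Python library shows no clear advantage.
  8. Compressing with zlib saves even more memory, but it is 1 to 2 times slower.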
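
The size effects in points 5 and 8 can be checked without a Redis server by comparing payload lengths. The following is a minimal sketch, not the author's benchmark: it uses the standard-library json and zlib modules on Python 3 (the test script itself is Python 2), and the sample record is a hypothetical stand-in for real data.

```python
import json
import zlib

# Hypothetical sample record; the Chinese text is arbitrary filler.
record = {'id': 1, 'title': '内存占用测试', 'content': '数据库设计方案' * 20}

# Default: every non-ASCII character becomes a 6-byte \uXXXX escape.
escaped = json.dumps(record).encode('utf-8')

# ensure_ascii=False: characters are kept and encoded as UTF-8 (3 bytes each here).
raw = json.dumps(record, ensure_ascii=False).encode('utf-8')

# zlib on top of the compact form; repetitive text compresses well.
packed = zlib.compress(raw)

print(len(escaped), len(raw), len(packed))

# Reading back reverses the steps: decompress, decode, parse.
restored = json.loads(zlib.decompress(packed).decode('utf-8'))
```

How much zlib saves depends heavily on the data; the repeated filler text here is far more compressible than typical article content.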
Looking at these results, it feels like just using MongoDB would be less hassle...
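
On point 1 above, the type conversion needed after reading a per-entity hash might look like the sketch below. It assumes the reply has already been decoded to text and that the boolean was stored in its str() form ('True'); the names `to_bool`, `CONVERTERS` and `restore_types` are hypothetical, and the reply is simulated so no Redis server is needed.

```python
def to_bool(text):
    """Turn the string form of a Python bool back into a bool."""
    return text == 'True'


# Hypothetical per-field converters; fields not listed stay strings.
CONVERTERS = {'id': int, 'is_public': to_bool}


def restore_types(fields, converters=CONVERTERS):
    """Convert the all-string mapping read from a hash into typed values."""
    return dict(
        (name, converters.get(name, str)(value))
        for name, value in fields.items()
    )


# Simulated reply for one article; a real client would get this from HGETALL.
reply = {'id': '42', 'time': '2012-09-13 09:23', 'is_public': 'True'}
article = restore_types(reply)
```

A serialized format such as JSON avoids this step because numbers and booleans keep their types.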

Finally, here is the test code:

# -*- coding: utf-8 -*-
# Python 2 script. It needs a Redis server on localhost.
# WARNING: every test calls FLUSHALL, which wipes all data on that server.
import cPickle
import json
import time
import zlib

import msgpack
import redis
import ujson


class Timer(object):
    """Context manager that stores the elapsed wall-clock time in `interval`."""

    def __enter__(self):
        self.start = time.time()
        return self

    def __exit__(self, *args):
        self.end = time.time()
        self.interval = self.end - self.start


def test(function):
    """Decorator: flush Redis, run `function`, then print its time and memory usage."""
    def wrapper(*args, **kwargs):
        args_list = []
        if args:
            args_list.append(','.join(str(arg) for arg in args))
        if kwargs:
            args_list.append(','.join('%s=%s' % (key, value) for key, value in kwargs.iteritems()))
        print 'call %s(%s):' % (function.func_name, ', '.join(args_list))
        redis_client.flushall()
        print 'memory:', redis_client.info()['used_memory_human']
        with Timer() as timer:
            result = function(*args, **kwargs)
        print 'time:', timer.interval
        print 'memory:', redis_client.info()['used_memory_human']
        print
        return result
    return wrapper


redis_client = redis.Redis()
# Non-transactional pipeline: commands are buffered and sent on execute().
pipe = redis_client.pipeline(transaction=False)

# 10,000 articles that differ only in their id.
articles = [{
    'id': i,
    'title': u'团结全世界正义力量痛击日本',

    'content': u'近期日本社会有四种感觉极度高涨,即二战期间日本军国主义扩张战争的惨败在日本右翼势力内心留下的耻辱感;被美国长期占领和控制的压抑感;经济长期停滞不前的焦虑感;对中国快速崛起引发的失落感。为此,日本为了找到一个发泄口,对中国采取了一系列挑衅行为,我们不能听之任之。现在全国13亿人要万众一心,团结起来,拿出决心、意志和能力,果断实施对等反击。在这场反击日本右翼势力的反攻倒算中,中国不是孤立的,我们要团结全世界一切反法西斯战争的正义力量,痛击日本对国际正义的挑战。',

    'source_text': u'环球时报',
    'source_url': 'http://opinion.huanqiu.com/column/mjzl/2012-09/3174337.html',
    'time': '2012-09-13 09:23',
    'is_public': True
} for i in xrange(10000)]


# One hash per entity (one record per hash).
@test
def test_hash():
    for article in articles:
        pipe.hmset('article:%d' % article['id'], article)
    pipe.execute()


# One hash for the whole table; each field holds one entity serialized with json.
@test
def test_json_hash():
    for article in articles:
        pipe.hset('article', article['id'], json.dumps(article))
    pipe.execute()


# Same layout, serialized with ujson.
@test
def test_ujson_hash():
    for article in articles:
        pipe.hset('article', article['id'], ujson.dumps(article))
    pipe.execute()


# One string key per entity, serialized with ujson.
@test
def test_ujson_string():
    for article in articles:
        pipe.set('article:%d' % article['id'], ujson.dumps(article))
    pipe.execute()


# One string key per entity, ujson output compressed with zlib.
@test
def test_zlib_ujson_string():
    for article in articles:
        pipe.set('article:%d' % article['id'], zlib.compress(ujson.dumps(article, ensure_ascii=False)))
    pipe.execute()


# One hash for the whole table, serialized with MessagePack.
@test
def test_msgpack():
    for article in articles:
        pipe.hset('article', article['id'], msgpack.packb(article))
    pipe.execute()


# One string key per entity, serialized with cPickle.
@test
def test_pickle_string():
    for article in articles:
        pipe.set('article:%d' % article['id'], cPickle.dumps(article))
    pipe.execute()


# One hash for the whole table; json keeps non-ASCII characters unescaped.
@test
def test_json_without_ensure_ascii():
    for article in articles:
        pipe.hset('article', article['id'], json.dumps(article, ensure_ascii=False))
    pipe.execute()


# One hash for the whole table; ujson keeps non-ASCII characters unescaped.
@test
def test_ujson_without_ensure_ascii():
    for article in articles:
        pipe.hset('article', article['id'], ujson.dumps(article, ensure_ascii=False))
    pipe.execute()


# Sharded hashes: entity ids are split into buckets of at most `size` fields.
def test_ujson_shard_id():
    @test
    def test_ujson_shard_id_of_size(size):
        for article in articles:
            article_id = article['id']
            # Integer division picks the bucket, the remainder is the field name.
            pipe.hset('article:%d' % (article_id // size), article_id % size, ujson.dumps(article, ensure_ascii=False))
        pipe.execute()

    # 8092 is kept from the original; 8192 may have been intended.
    for size in (2, 4, 8, 10, 16, 32, 64, 100, 128, 256, 500, 512, 513, 1000, 1024, 1025, 2048, 4096, 8092):
        test_ujson_shard_id_of_size(size)
    test_ujson_shard_id_of_size(512)


# Run every module-level function whose name starts with 'test_', in name order.
for key, value in sorted(globals().copy().iteritems(), key=lambda x: x[0]):
    if key.startswith('test_'):
        value()