golang map implementation principle
In the process of learning Golang, map is a data structure we often use, which can be used to store key-value pairs. But, have you ever thought about how map is implemented? In this article, we will explore how map is implemented in Golang.
Introduction to the implementation principle of map
map is actually a hash table, and its implementation principle is the hash algorithm. A hash table is a data structure that directly accesses memory storage locations based on keys, that is, it provides a way to map keys to memory addresses. In a hash table, keywords are called "keys" and the memory addresses calculated by the hash function are called "buckets". The buckets store key-value pairs.
In Golang, map is implemented by storing key-value pairs in the bucket memory area. Each bucket has a header structure used to describe the status of the bucket. This header structure is called " bucket".
bucket data structure
Let’s first look at the bucket data structure:
type bmap struct { tophash [bucketCnt]uint8 // data 的类型在 map 被分配空间的时候由 map 的 key 和 value 类型决定 // 如果 key 类型和 value 类型都是未知类型,那么 data 的类型是 byte // 因为 byte 可以用于存储任何类型,也就是说 data 可以存储任何类型的 key 和 value // 在存储 key 和 value 的时候,需要将 key 和 value 转换成 uintptr,然后再转换成 byte 存储到 data 中 // 读取 key 和 value 的时候反向操作即可 // 因此,map 中的 key 和 value 一般都是需要转换成 uintptr 类型的 // 这里我们可以简单地理解为 data 存储了键值对的二进制数据 // 第一个 bucket 中如果存不下某个 key-value,还需要在第二个 bucket 中存储 // 所以,data 的长度可以为 1 或 2 // 当 data 的长度为 1 或 2 时,存储的键值对的个数分别为 1 或 2 // 因此,bucket 中最多只能存储 8 个键值对 // 否则,就需要将键值对存储在 overflow 中 // overflow 是一个指向单链表头部的指针数组 // 如果某个桶存储的键值对数量超过了 8,那么就将多余的键值对存储到 overflow 中 // overflow 中的第一个 bucket 中也是能够存储 8 个键值对的,如果第一个 bucket 中存不下的话 // 那么还需要在第二个 bucket 中继续存储 // 所以,overflow 数组的长度也可以为 1 或 2 // 当 overflow 数组长度为 1 或 2 时,最多存储 8 个键值对的 overflow // 当 overflow 数组长度为 3 时,最多存储 24 个键值对的 overflow // 也就是说,如果 map 中存储的键值对数量很多的话,那么可能会有很多个 overflow 桶 // 而 overflow 数组是动态分配的,因此它的长度是不定的,只有在运行时才能确定 data unsafe.Pointer // overflow 是一个指针数组,用于存储除当前桶以外的其它桶 // 如果某个桶的键值对数量超过了 8,那么它会将多余的键值对存储到 overflow 中 // overflow 数组中的每个元素都是一个 bucket 指针 // 这些 bucket 指针构成了一个链表,也就是所谓的 overflow 链表 overflow *[]*bmap // 这里存储的是这个 bucket 当前存储的键值对的个数 count int16 // flags 存储了 bucket 的标志 // 一个 bucket 可能会有如下三种状态: // 1. empty,表示这个 bucket 没有存储任何键值对 // 2. iterator,表示这个 bucket 正在被访问,不能再被其它操作使用 // 3. deleted,表示这个 bucket 存储的键值对已经被删除了 // 注意,该标志位只有在 GC 达到一定阈值后才会被置为 true,而且并不是立即删除键值对,而是在下一次 GC 的时候才会内存回收 flags uint8 // Bmap 的类型标记 // Bmap 用来标识存储于上文中 data 指针中的数据类型 // 如果 data 中存储的是 int,那么 Bmap 的值为 hmap.BmapInt // 如果 data 中存储的是 string,那么 Bmap 的值为 hmap.BmapString // 如果 data 中存储的是 interface{},那么 Bmap 的值为 hmap.BmapInterface // 如果 data 中存储其他类型,那么 Bmap 的值为 hmap.BmapOther BmapType uint16 // 与该 bucket 相关联的 key 的类型 tophash0 uint8 }
From the above definition, we can see that the bucket stores the following information:
- tophash: used to quickly filter out unmatched keys;
- data: data that stores key-value pairs;
- overflow: a pointer array that stores overflow linked lists;
- count: stores the number of key-value pairs stored in the current bucket, up to 8;
- flags: stores some status information of the bucket;
- BmapType: storage Indicates the type of data stored in data;
- tophash0: The type of key associated with the bucket.
Implementation Principle
Since the data stored in map is a key-value pair, and the key value of the key-value pair is non-repeatable and can be compared, we can use hashing The function converts the key value into a unique hash value, then uses the hash value as the index corresponding to the key value, and stores the value value at the index.
In Golang, each map has a hash table. This hash table is actually a continuous storage space that can store several buckets. Each bucket is a bucket type. structure.
So how is the size of the hash table determined? It is dynamically allocated memory when the map is created, and the default size used when space is first allocated is 8 buckets. If the number of buckets is not enough when adding elements for the first time, the size of the hash table will be expanded and the index position of each element in the hash table will be recalculated.
In Golang, the main function of the hash function is to hash the key value, making it more convenient to locate a bucket in the hash table, thereby speeding up the search for the value value corresponding to the key value. In Golang, the hash function is implemented using the MurmurHash3 algorithm.
Since the hash function maps key to an integer, different key values may be mapped to the same index. This situation is called a hash collision. When a hash conflict occurs, Golang uses the linked list method to resolve it, storing the key-value pairs on the same index in the linked list of the index. These key-value pairs are called overflow.
Summary
Golang’s map implementation principle mainly relies on hash tables and hash functions. A hash function is used to hash the key value, mapping it to a bucket in a hash table, thereby speeding up the operation of looking up the value. The hash table consists of several buckets, each bucket is represented by a bucket type structure, which stores key-value pair information. When there are multiple key-value pairs at the same index position in the hash table, the linked list method is used to store overflow. The map in Golang is actually a dynamic hash table, which can dynamically adjust the size of the hash table. When using maps, care needs to be taken to avoid hash collisions and including types with incomparable sizes as keys.
The above is the detailed content of golang map implementation principle. For more information, please follow other related articles on the PHP Chinese website!

Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

Video Face Swap
Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Hot Tools

Notepad++7.3.1
Easy-to-use and free code editor

SublimeText3 Chinese version
Chinese version, very easy to use

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Hot Topics











Golang is better than Python in terms of performance and scalability. 1) Golang's compilation-type characteristics and efficient concurrency model make it perform well in high concurrency scenarios. 2) Python, as an interpreted language, executes slowly, but can optimize performance through tools such as Cython.

Golang is better than C in concurrency, while C is better than Golang in raw speed. 1) Golang achieves efficient concurrency through goroutine and channel, which is suitable for handling a large number of concurrent tasks. 2)C Through compiler optimization and standard library, it provides high performance close to hardware, suitable for applications that require extreme optimization.

Goisidealforbeginnersandsuitableforcloudandnetworkservicesduetoitssimplicity,efficiency,andconcurrencyfeatures.1)InstallGofromtheofficialwebsiteandverifywith'goversion'.2)Createandrunyourfirstprogramwith'gorunhello.go'.3)Exploreconcurrencyusinggorout

Golang is suitable for rapid development and concurrent scenarios, and C is suitable for scenarios where extreme performance and low-level control are required. 1) Golang improves performance through garbage collection and concurrency mechanisms, and is suitable for high-concurrency Web service development. 2) C achieves the ultimate performance through manual memory management and compiler optimization, and is suitable for embedded system development.

Goimpactsdevelopmentpositivelythroughspeed,efficiency,andsimplicity.1)Speed:Gocompilesquicklyandrunsefficiently,idealforlargeprojects.2)Efficiency:Itscomprehensivestandardlibraryreducesexternaldependencies,enhancingdevelopmentefficiency.3)Simplicity:

Golang and Python each have their own advantages: Golang is suitable for high performance and concurrent programming, while Python is suitable for data science and web development. Golang is known for its concurrency model and efficient performance, while Python is known for its concise syntax and rich library ecosystem.

The performance differences between Golang and C are mainly reflected in memory management, compilation optimization and runtime efficiency. 1) Golang's garbage collection mechanism is convenient but may affect performance, 2) C's manual memory management and compiler optimization are more efficient in recursive computing.

Golang and C each have their own advantages in performance competitions: 1) Golang is suitable for high concurrency and rapid development, and 2) C provides higher performance and fine-grained control. The selection should be based on project requirements and team technology stack.
