Home Backend Development PHP Tutorial Detailed explanation of HashTable structure in PHP_PHP Tutorial

Detailed explanation of HashTable structure in PHP_PHP Tutorial

Jul 21, 2016 pm 03:07 PM
hashtable php zend use engine yes most go deep structure Detailed explanation

HashTable is the most important and widely used data structure in Zend engine. It is used to store almost everything.
1.2.1 Data structure
HashTable data structure is defined as follows:

Copy code Code As follows:

typedef struct bucket {
ulong h; // Store hash
uint nKeyLength;
void *pData; // Point to value, which is a copy of user data
void *pDataPtr;
struct bucket *pListNext; // pListNext and pListLast are composed of
struct bucket *pListLast; // The entire HashTable doubly linked list
struct bucket *pNext; // pNext and pLast are used to form A certain hash corresponds to
struct bucket *pLast; // Double linked list
char arKey[1]; // key
} Bucket;

typedef struct _hashtable {
uint nTableSize;
uint nTableMask;
uint nNumOfElements;
ulong nNextFreeElement;
Bucket *pInternalPointer; /* Used for element traversal */
Bucket *pListHead;
Bucket *pListTail;
Bucket **arBuckets; // hash array
dtor_func_t pDestructor; // Specify when HashTable is initialized and call when destroying Bucket
zend_bool persistent; // Whether to use C Memory allocation routines
unsigned char nApplyCount;
zend_bool bApplyProtection;
#if ZEND_DEBUG
int inconsistent;
#endif
} HashTable;


In general, Zend's HashTable is a linked list hash, which is also optimized for linear traversal, as shown below:


HashTable contains two data structures, a linked list hash and a doubly linked list. The former is used for fast key-value query, and the latter is convenient for linear traversal and sorting. A Bucket exists here at the same time in two data structures.
A few explanations about this data structure:
Why is a doubly linked list used in linked list hashing?
General linked list hashing only needs to be operated by key, and only a single linked list is enough. However, Zend sometimes needs to delete a given Bucket from the linked list hash, which can be achieved very efficiently using a double linked list.
What does nTableMask do?
This value is used to convert the hash value to the arBuckets array index. When initializing a HashTable, Zend first allocates memory of nTableSize size for the arBuckets array. nTableSize is the smallest 2^n that is not less than the user-specified size, which is 10* in binary. nTableMask = nTableSize – 1, which is binary 01*. At this time, h & nTableMask happens to fall in [0, nTableSize – 1], and Zend uses it as the index to access the arBuckets array.
What does pDataPtr do?
Normally, when the user inserts a key-value pair, Zend will make a copy of the value and point pData to the copy of the value. The copy operation requires calling Zend's internal routine emalloc to allocate memory. This is a very time-consuming operation and will consume a memory larger than the value (the extra memory is used to store cookies). If the value is small, it will cause Big waste. Considering that HashTable is mostly used to store pointer values, Zend introduces pDataPtr. When the value is as small as the pointer, Zend directly copies it to pDataPtr and points pData to pDataPtr. This avoids emalloc operations and also helps improve the Cache hit rate.
Why is the arKey size only 1? Why not use pointers to manage keys?
arKey is an array that stores keys, but its size is only 1, which is not enough to hold the key. The following code can be found in the initialization function of HashTable:

Copy code The code is as follows:

p = (Bucket *) pemalloc (sizeof(Bucket) - 1 + nKeyLength, ht->persistent);

可见,Zend为一个Bucket分配了一块足够放下自己和key的内存,上半部分是Bucket,下半部分是key,而arKey“恰好”是Bucket的最后一个元素,于是就可以使用arKey来访问key了。这种手法在内存管理例程中最为常见,当分配内存时,实际上是分配了比指定大小要大的内存,多出的上半部分通常被称为cookie,它存储了这块内存的信息,比如块大小、上一块指针、下一块指针等,baidu的Transmit程序就使用了这种方法。
不用指针管理key,是为了减少一次emalloc操作,同时也可以提高Cache命中率。另一个必需的理由是,key绝大部分情况下是固定不变的,不会因为key变长了而导致重新分配整个Bucket。这同时也解释了为什么不把value也一起作为数组分配了——因为value是可变的。

1.2.2 PHP数组
关于HashTable还有一个疑问没有回答,就是nNextFreeElement是干什么的?
不同于一般的散列,Zend的HashTable允许用户直接指定hash值,而忽略key,甚至可以不指定key(此时,nKeyLength为0)。同时,HashTable也支持append操作,用户连hash值也不用指定,只需要提供value,此时,Zend就用nNextFreeElement作为hash,之后将nNextFreeElement递增。
HashTable的这种行为看起来很奇怪,因为这将无法按key访问value,已经完全不是个散列了。理解问题的关键在于,PHP数组就是使用HashTable实现的——关联数组使用正常的k-v映射将元素加入HashTable,其key为用户指定的字符串;非关联数组则直接使用数组下标作为hash值,不存在key;而当在一个数组中混合使用关联和非关联时,或者使用array_push操作时,就需要用nNextFreeElement了。
再来看value,PHP数组的value直接使用了zval这个通用结构,pData指向的是zval*,按照上一节的介绍,这个zval*将直接存储在pDataPtr里。由于直接使用了zval,数组的元素可以是任意PHP类型。
数组的遍历操作,即foreach、each等,是通过HashTable的双向链表来进行的,pInternalPointer作为游标记录了当前位置。

1.2.3 变量符号表
除了数组,HashTable还被用来存储许多其他数据,比如,PHP函数、变量符号、加载的模块、类成员等。
一个变量符号表就相当于一个关联数组,其key是变量名(可见,使用很长的变量名并不是个好主意),value是zval*。
在任一时刻PHP代码都可以看见两个变量符号表——symbol_table和active_symbol_table——前者用于存储全局变量,称为全局符号表;后者是个指针,指向当前活动的变量符号表,通常情况下就是全局符号表。但是,当每次进入一个PHP函数时(此处指的是用户使用PHP代码创建的函数),Zend都会创建函数局部的变量符号表,并将active_symbol_table指向局部符号表。Zend总是使用active_symbol_table来访问变量,这样就实现了局部变量的作用域控制。
但如果在函数局部访问标记为global的变量,Zend会进行特殊处理——在active_symbol_table中创建symbol_table中同名变量的引用,如果symbol_table中没有同名变量则会先创建。

1.3 Memory and files
The resources owned by the program generally include memory and files. For ordinary programs, these resources are process-oriented. When the process ends, the operation The system or C library will automatically reclaim resources that we do not explicitly release.
However, the PHP program has its own particularity. It is based on pages. When a page is running, it will also apply for resources such as memory or files. However, when the page is finished running, the operating system or C library may not know the need. Carry out resource recycling. For example, we compile php into apache as a module and run apache in prefork or worker mode. In this case, the apache process or thread is reused, and the memory allocated by the php page will remain in the memory until the core is released.
In order to solve this problem, Zend provides a set of memory allocation APIs. Their functions are the same as the corresponding functions in C. The difference is that these functions allocate memory from Zend's own memory pool, and they can implement page-based Automatic recycling. In our module, the memory allocated for the page should use these APIs instead of C routines, otherwise Zend will try to efree our memory at the end of the page, and the result is usually a crush.
emalloc()
efree()
estrdup()
estrndup()
ecalloc()
erealloc()
In addition, Zend also provides a set of functions in the form VCWD_xxx The macros are used to replace the C library and the corresponding file API of the operating system. These macros can support PHP's virtual working directory and should always be used in module code. For the specific definition of macro, please refer to the PHP source code "TSRM/tsrm_virtual_cwd.h". You may notice that the close operation is not provided in all those macros. This is because the object of close is an opened resource and does not involve the file path, so you can use C or operating system routines directly; similarly, read/ Operations such as write also directly use C or operating system routines.

www.bkjia.comtruehttp: //www.bkjia.com/PHPjc/327576.htmlTechArticleHashTable is the most important and widely used data structure in Zend engine. It is used to store almost everything. . 1.2.1 Data structure The HashTable data structure is defined as follows: Copy code...
Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

Video Face Swap

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Hot Topics

Java Tutorial
1658
14
PHP Tutorial
1257
29
C# Tutorial
1231
24
How do you parse and process HTML/XML in PHP? How do you parse and process HTML/XML in PHP? Feb 07, 2025 am 11:57 AM

This tutorial demonstrates how to efficiently process XML documents using PHP. XML (eXtensible Markup Language) is a versatile text-based markup language designed for both human readability and machine parsing. It's commonly used for data storage an

Explain JSON Web Tokens (JWT) and their use case in PHP APIs. Explain JSON Web Tokens (JWT) and their use case in PHP APIs. Apr 05, 2025 am 12:04 AM

JWT is an open standard based on JSON, used to securely transmit information between parties, mainly for identity authentication and information exchange. 1. JWT consists of three parts: Header, Payload and Signature. 2. The working principle of JWT includes three steps: generating JWT, verifying JWT and parsing Payload. 3. When using JWT for authentication in PHP, JWT can be generated and verified, and user role and permission information can be included in advanced usage. 4. Common errors include signature verification failure, token expiration, and payload oversized. Debugging skills include using debugging tools and logging. 5. Performance optimization and best practices include using appropriate signature algorithms, setting validity periods reasonably,

Explain late static binding in PHP (static::). Explain late static binding in PHP (static::). Apr 03, 2025 am 12:04 AM

Static binding (static::) implements late static binding (LSB) in PHP, allowing calling classes to be referenced in static contexts rather than defining classes. 1) The parsing process is performed at runtime, 2) Look up the call class in the inheritance relationship, 3) It may bring performance overhead.

PHP Program to Count Vowels in a String PHP Program to Count Vowels in a String Feb 07, 2025 pm 12:12 PM

A string is a sequence of characters, including letters, numbers, and symbols. This tutorial will learn how to calculate the number of vowels in a given string in PHP using different methods. The vowels in English are a, e, i, o, u, and they can be uppercase or lowercase. What is a vowel? Vowels are alphabetic characters that represent a specific pronunciation. There are five vowels in English, including uppercase and lowercase: a, e, i, o, u Example 1 Input: String = "Tutorialspoint" Output: 6 explain The vowels in the string "Tutorialspoint" are u, o, i, a, o, i. There are 6 yuan in total

What are PHP magic methods (__construct, __destruct, __call, __get, __set, etc.) and provide use cases? What are PHP magic methods (__construct, __destruct, __call, __get, __set, etc.) and provide use cases? Apr 03, 2025 am 12:03 AM

What are the magic methods of PHP? PHP's magic methods include: 1.\_\_construct, used to initialize objects; 2.\_\_destruct, used to clean up resources; 3.\_\_call, handle non-existent method calls; 4.\_\_get, implement dynamic attribute access; 5.\_\_set, implement dynamic attribute settings. These methods are automatically called in certain situations, improving code flexibility and efficiency.

PHP and Python: Comparing Two Popular Programming Languages PHP and Python: Comparing Two Popular Programming Languages Apr 14, 2025 am 12:13 AM

PHP and Python each have their own advantages, and choose according to project requirements. 1.PHP is suitable for web development, especially for rapid development and maintenance of websites. 2. Python is suitable for data science, machine learning and artificial intelligence, with concise syntax and suitable for beginners.

PHP: A Key Language for Web Development PHP: A Key Language for Web Development Apr 13, 2025 am 12:08 AM

PHP is a scripting language widely used on the server side, especially suitable for web development. 1.PHP can embed HTML, process HTTP requests and responses, and supports a variety of databases. 2.PHP is used to generate dynamic web content, process form data, access databases, etc., with strong community support and open source resources. 3. PHP is an interpreted language, and the execution process includes lexical analysis, grammatical analysis, compilation and execution. 4.PHP can be combined with MySQL for advanced applications such as user registration systems. 5. When debugging PHP, you can use functions such as error_reporting() and var_dump(). 6. Optimize PHP code to use caching mechanisms, optimize database queries and use built-in functions. 7

PHP in Action: Real-World Examples and Applications PHP in Action: Real-World Examples and Applications Apr 14, 2025 am 12:19 AM

PHP is widely used in e-commerce, content management systems and API development. 1) E-commerce: used for shopping cart function and payment processing. 2) Content management system: used for dynamic content generation and user management. 3) API development: used for RESTful API development and API security. Through performance optimization and best practices, the efficiency and maintainability of PHP applications are improved.

See all articles