PostgreSQL's journey of exploration-LINUX-php.cn

Home

System Tutorial

LINUX

PostgreSQL's journey of exploration

王林

Jan 17, 2024 am 08:15 AM

linux linux tutorial Red Hat linux system linux command linux certification red hat linux linux video

PostgreSQLs journey of exploration

Postgres has several index types, and every new version seems to add some new index types. Each index type is useful, but which type to use depends on (1) the type of data, sometimes (2) the underlying data in the table and (3) the type of lookup performed. In the following content, we will introduce the index types you can use in Postgres, and when you should use which index type. Before we get started, here's a list of index types we'll walk you through:

B-Tree
Generalized Inverted Index (GIN)
Generalized Inverted Seach Tree (GiST)
Space partitioned GiST (SP-GiST)
Block Range Index (BRIN)
Hash
Now let’s start with indexing.

In Postgres, B-Tree indexes are the most common indexes you use

If you have a computer science degree, then B-Tree indexing may be the first index you learn. B-tree indexes create a tree that always maintains its balance. When it looks for something based on the index, it walks the tree to find the key and returns the data you're looking for. Using an index is significantly faster than a sequential scan because it only needs to read a few pages (when you are returning only a few records) as opposed to sequentially scanning thousands of records.

If you run a standard CREATE INDEX statement, it will create a B-tree index for you. B-tree indexes are valuable on most data types, such as text, numbers, and timestamps. If you're just starting to use indexes in your database, and don't use too many of Postgres' advanced features on your database, using a standard B-Tree index is probably your best option.

GIN index for multi-valued columns

Generalized Inverted Index, generally called GIN, is mostly suitable for data types when a single column contains multiple values.

According to Postgres documentation:

"GIN is designed to handle situations where the entry being indexed is a compound value, and the query processed by the index needs to search for values that occur in the compound entry. For example, this entry might be a document, and the query can search for the value contained in the document. Specify characters."

The most common data types included in this range are:

hStore
Array
Range
JSONB
One of the most satisfying things about GIN indexes is their ability to understand data stored in composite values. However, because a GIN index requires specific knowledge of the data structure for each individual type being added, not all data types are supported by the GIN index.

GiST index, for rows with overlapping values

Generalized Inverted Seach Tree (GiST) index is mostly suitable when your data overlaps with other rows of data in the same column. The best use of GiST indexes is if you declare a geometry data type and you want to know whether two polygons contain some points. In one case a particular point may be contained in a box, while at the same time, other points only exist in a polygon. Common data types indexed using GiST are:

Geometry type
Text type requiring full-text search
There are many fixed limits on the size of GiST indexes, otherwise, GiST indexes may become extremely large. At the cost of this, GiST indexes are lossy (inexact).

According to official documentation:

"GiST indexes are lossy, which means that the index may produce false matches, so it is necessary to check the real table rows to eliminate false matches. (PostgreSQL will automatically perform this action when necessary)"

This does not mean that you will get a false result, it just means that Postgres will do a small extra work to filter these false results before returning the data to you.

Special note: GIN and GiST indexes can often be used on the same data type. Usually one has good performance but takes up a lot of disk space, and vice versa. When it comes to GIN vs. GiST, there is no one-size-fits-all solution that will work in every situation, but the above rules should apply to most common situations.

SP-GiST index for larger data

The spatially partitioned GiST (SP-GiST) index adopts the spatially partitioned tree from Purdue Research. SP-GiST indexes are often used when your data has a natural clustering factor and is not a balanced tree. Phone numbers are a great example (at least US phone numbers are). They have the following format:

3-digit area code
3-digit prefix number (related to old telephone exchanges)
4-digit line number
This means that there is a natural clustering factor at the first three digits of the first set, followed by the second set of three digits, and then the numbers are evenly distributed. However, in some area codes of phone numbers, there is a higher saturation state than in others. The result can be a very unbalanced tree. Because there is a natural aggregation factor at the front and the data is not equally distributed, data like phone numbers might be a good case for SP-GiST.

BRIN index, for larger data

Block range indexes (BRIN) focus on some situations like SP-GiST, they are best used when the data has some natural ordering, and the data volume is often large. If you have a billion records in chronological order, BRIN may come in handy. If you are querying a large set of data that is naturally grouped, such as several zip codes, BRIN can help you ensure that similar zip codes are stored in close locations on disk.

When you have a very large database sorted by date or zip code, the BRIN index allows you to skip or exclude some unnecessary data very quickly. Additionally, BRIN indexes are relatively small compared to the overall data size, so when you have a large data set, BRIN indexes can perform better.

Hash index, finally not afraid of crash

Hash indexes have been present in Postgres for many years, however, until Postgres 10 was released, there was a huge caveat about their use, it was not WAL-logged. This means that if your server crashes and you can't failover to a standby or restore from an archive using something like wal-g, then you will lose that index until you rebuild it. With the release of Postgres 10, they are now WAL-logged, so you may consider using them again, but the real question is, should you?

Hash indexes sometimes provide faster lookups than B-Tree indexes, and are also fast to create. The biggest problem is that they are restricted to only "equality" comparison operations, so you can only use them for exact match lookups. This makes hash indexes much less flexible than commonly used B-Tree indexes, and you shouldn't think of them as a replacement, but as an index for special cases.

Which one should you use?

We have just introduced a lot, and it is normal if you are a little scared. If you know this before, CREATE INDEX will always create an index for you using a B-Tree, and the good news is that Postgres performs very well or very well for most databases. :) If you're considering using more Postgres features, here's a cheat sheet when you use other Postgres index types:

B-Tree - suitable for most data types and queries
GIN - for JSONB/hstore/arrays
GiST - suitable for full-text search and geometric data types
SP-GiST - suitable for large data sets that have natural aggregation factors but are unevenly distributed
BRIN - suitable for really large data sets with sequential order
Hash - good for equality operations, but usually a B-Tree index is still all you need.
If you have any questions or feedback about this article, feel free to join our slack channel.

The above is the detailed content of PostgreSQL's journey of exploration. For more information, please follow other related articles on the PHP Chinese website!

Statement of this Website

The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress images for free

Clothoff.io

AI clothes remover

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Roblox: Grow A Garden - Complete Mutation Guide

3 weeks ago By DDD

Roblox: Bubble Gum Simulator Infinity - How To Get And Use Royal Keys

3 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

How to fix KB5055612 fails to install in Windows 10?

3 weeks ago By DDD

Nordhold: Fusion System, Explained

3 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

Mandragora: Whispers Of The Witch Tree - How To Unlock The Grappling Hook

3 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

Hot Tools

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Hot Topics

Java Tutorial

1665

CakePHP Tutorial

1424

Laravel Tutorial

1322

PHP Tutorial

1269

C# Tutorial

1249

Related knowledge

Linux Architecture: Unveiling the 5 Basic Components Apr 20, 2025 am 12:04 AM

The five basic components of the Linux system are: 1. Kernel, 2. System library, 3. System utilities, 4. Graphical user interface, 5. Applications. The kernel manages hardware resources, the system library provides precompiled functions, system utilities are used for system management, the GUI provides visual interaction, and applications use these components to implement functions.

How to check the warehouse address of git Apr 17, 2025 pm 01:54 PM

To view the Git repository address, perform the following steps: 1. Open the command line and navigate to the repository directory; 2. Run the "git remote -v" command; 3. View the repository name in the output and its corresponding address.

How to run java code in notepad Apr 16, 2025 pm 07:39 PM

Although Notepad cannot run Java code directly, it can be achieved by using other tools: using the command line compiler (javac) to generate a bytecode file (filename.class). Use the Java interpreter (java) to interpret bytecode, execute the code, and output the result.

How to run sublime after writing the code Apr 16, 2025 am 08:51 AM

There are six ways to run code in Sublime: through hotkeys, menus, build systems, command lines, set default build systems, and custom build commands, and run individual files/projects by right-clicking on projects/files. The build system availability depends on the installation of Sublime Text.

What is the main purpose of Linux? Apr 16, 2025 am 12:19 AM

The main uses of Linux include: 1. Server operating system, 2. Embedded system, 3. Desktop operating system, 4. Development and testing environment. Linux excels in these areas, providing stability, security and efficient development tools.

laravel installation code Apr 18, 2025 pm 12:30 PM

To install Laravel, follow these steps in sequence: Install Composer (for macOS/Linux and Windows) Install Laravel Installer Create a new project Start Service Access Application (URL: http://127.0.0.1:8000) Set up the database connection (if required)

git software installation Apr 17, 2025 am 11:57 AM

Installing Git software includes the following steps: Download the installation package and run the installation package to verify the installation configuration Git installation Git Bash (Windows only)

How to use sublime shortcut keys Apr 16, 2025 am 08:57 AM

Sublime Text provides shortcuts to improve development efficiency, including commonly used (save, copy, cut, etc.), editing (indentation, formatting, etc.), navigation (project panel, file browsing, etc.), and finding and replacing shortcuts. Proficiency in using these shortcut keys can significantly improve Sublime's efficiency.

See all articles