How to design a MySQL high-performance table
Good logical design and physical design are the cornerstones of high performance. The schema should be designed according to the query statements that the system will execute, which often requires weighing various factors.
MySQL supports many data types. Choosing the correct data type is crucial to achieving high performance.
Smaller is usually better
Smaller data types are generally faster because they occupy less disk, memory, and CPU cache, and require fewer CPU cycles to process.
Just keep it simple
Operations on simple data types generally require fewer CPU cycles. For example, integer operations are cheaper than character operations because the character set and collation rules (collation) make character comparisons more complex than integer comparisons.
Try to avoid NULL
If the query contains NULL columns, it is more difficult for MySQL to optimize, because NULL columns make indexes, index statistics, and value comparisons more complex. Columns that can be NULL use more storage space and require special handling in MySQL. When NULLable columns are indexed, each index record requires an extra byte, which in MyISAM can even cause a fixed-size index (such as an index with only one integer column) to become a variable-size index.
Of course there are exceptions. For example, InnoDB uses a separate bit to store NULL values, so it has good space efficiency for sparse data.
1.Integer type
There are two types of numbers: whole numbers and real numbers. If you store integers, you can use these integer types: TINYINT, SMALLINT, MEDIUMINT, INT, BIGINT. Use 8, 16, 24, 32, and 64-bit storage space respectively.
The integer type has the optional **UNSIGNED ** attribute, which means that negative values are not allowed, which roughly doubles the upper limit of positive numbers. For example, TINYINT.UNSIGNED can store the range 0 - 255, while the storage range of TINYINT is -128 -127.
Signed and unsigned types use the same storage space and have the same performance, so you can choose the appropriate type according to the actual situation.
Your choice determines how MySQL saves data in memory and disk. However, integer calculations generally use 64-bit BIGINT integers, even in a 32-bit environment. (The exception is some aggregate functions, which use DECIMAL or DOUBLE for calculations).
MySQL can specify the width for integer types, such as INT(11), which is meaningless for most applications: it does not limit the legal range of values, but only specifies some interactive tools of MySQL (such as the MySQL command line client end) is used to display the number of characters. For storage and calculation purposes, INT(1) and INT(20) are the same.
2.Real number type
Real numbers are numbers with a decimal part. However, they are not just for storing decimal parts, DECIMAL can also be used to store integers larger than BIGINT.
FLOAT and DOUBLE types support approximate calculations using standard floating point operations.
DECIMAL type is used to store precise decimals.
Both floating point and DECIMAL types can specify precision. For DECIMAL columns, you can specify the maximum number of digits allowed before and after the decimal point. This affects the column's space consumption.
There are many ways to specify the precision required for floating-point columns, which will cause MySQL to choose a different data type, or to round off the value when storing. These precision definitions are non-standard, so we recommend specifying only the data type and not the precision.
Floating point types usually use less space than DECIMAL when storing values in the same range. FLOAT uses 4 bytes of storage. DOUBLE occupies 8 bytes and has higher precision and larger range than FLOAT. As with integer types, all you can choose is the storage type; MySQL uses DOUBLE as the type for internal floating point calculations.
Because of the additional space and computational overhead required, you should try to use DECIMAL only when performing precise calculations on decimals. But when the data is relatively large, you can consider using BIGINT instead of DECIMAL. Just multiply the currency unit to be stored by the corresponding multiple according to the number of decimal places.
3. String type
VARCHAR
- Used to store variable length strings, the length is supported to 65535
- Need to use 1 or 2 extra bytes to record the length of the string
- Suitable for: the maximum length of the string is much larger than the average length; updates are rare
CHAR
- Fixed length, the length range is 1~255
- Suitable for: storing very short strings, or all values close to the same length; frequently changed
Generosity is unwise
The space overhead of using VARCHAR(5) and VARCHAR(200) to store 'hello' is the same. So are there any advantages to using shorter columns?
It turns out there are great advantages. Longer columns consume more memory because MySQL typically allocates fixed-size blocks of memory to hold internal values. This is especially bad when using in-memory temporary tables for sorting or operations. It's equally bad when sorting using disk temporary tables.
So the best strategy is to allocate only the space you really need.
4.BLOB and TEXT types
BLOB and TEXT are both string data types designed to store large amounts of data, and are stored in binary and character modes respectively.
Unlike other types, MySQL treats each BLOB and TEXT value as an independent object. Storage engines usually do special processing when storing. When the BLOB and TEXT values are too large, InnoDB will use a dedicated "external" storage area for storage. At this time, each value requires 1 - 4 bytes to be stored in the row. The storage area stores the actual value.
The only difference between BLOB and TEXT is that the BLOB type stores binary data without collation rules or character sets, while the TEXT type has character sets and collation rules
5. Date and time types
Most of the time there are no alternatives to the type, so there is no question of what is the best choice. The only problem is what needs to be done when saving the date and time. MySQL provides two similar date types: DATE TIME and TIMESTAMP.
But currently we recommend the method of storing timestamps, so there will no longer be too much explanation on DATE TIME and TIMESTAMP here.
6.Other types
6.1 Select identifier
The smallest data type should be selected on the premise that it can meet the needs of the value range and reserve room for future growth.
- Integer type
Integers are usually the best choice for identity columns because they are fast and can use AUTO_INCREMENT.
- ENUM and SET types
The EMUM and SET types are generally a poor choice for identity columns, although they may be fine for some static "definition tables" that only contain fixed states or types. ENUM and SET columns are suitable for storing fixed information, such as ordered status, product type, and person's gender.
- String type
If possible, you should avoid using string types as identity columns because they are space-consuming and generally slower than numeric types.
You also need to pay more attention to completely "random" strings, such as strings generated by MDS(), SHAl() or UUID(). The new values generated by these functions are arbitrarily distributed over a large space, which can cause INSERT and some SELECT statements to be very slow. If UUID values are stored, the "-" sign should be removed.
6.2 Special type data
Some types of data wells are not directly consistent with built-in types. Timestamps with low kilosecond precision are one example. Another example is a 1Pv4 address. People often use VARCHAR(15) columns to store IP addresses. However, they are actually 32-bit unsigned integers, not strings. The representation of the address divided into four segments using decimal points is just to make it easier for people to read. So IP addresses should be stored as unsigned integers. MySQL provides INET_ATON() and INET_NTOA() functions to convert between these two representation methods.
2. Table structure design1. Paradigm and anti-paradigm
There are usually many ways to represent any given data, from fully normalized to fully denormalized, and a compromise between the two. In a normalized database, each fact appears exactly once. In contrast, in a denormalized database, information is redundant and may be stored in multiple places.
Advantages and Disadvantages of Paradigms
When considering performance improvement, it is often recommended to normalize the schema, especially in write-intensive scenarios.
- Normalized update operations are usually faster than denormalized.
- When data is well normalized, there is little or no duplicate data, so less data needs to be modified.
- Normalized tables are usually smaller and fit better in memory, so operations will be performed faster.
- Less redundant data means fewer DISTINCT or GROUP BY statements are needed to retrieve list data.
Advantages and disadvantages of anti-paradigm
Without the need for related tables, the worst case for most queries—even if the table does not use an index—is a full table scan. This can be much faster than associative when the data is larger than memory because random I/0 is avoided.
Individual tables can also use more efficient indexing strategies.
Mixing normalization and denormalization
In practical applications, it is often necessary to mix them, and partially normalized schemas, cache tables, and other techniques may be used.
Add redundant fields appropriately to the table, such as performance priority, but it will increase complexity. Table related queries can be avoided.
Simple and familiar database paradigm
<br>
First normal form (1NF): Field values are atomic and cannot be divided (all relational database systems satisfy the first normal form);<br>
For example: name field, where the surname and the first name are a whole. If the surname and the first name are distinguished, two independent fields must be set up;
Second Normal Form (2NF): A table must have a primary key, that is, each row of data can be uniquely distinguished;
Note: The first normal form must be satisfied first;
Third Normal Form (3NF): A table cannot contain information about non-key fields in other related tables, that is, the data table cannot have redundant fields;
Note: The second normal form must be satisfied first;
2. Table fields are less precise
- I/O efficient
- Separate fields are easy to maintain
- Single table 1G volume 500W evaluation
- A single line shall not exceed 200Byte
- No more than 50 INT fields in a single table
- No more than 20 CHAR(10) fields in a single table
- It is recommended that the number of fields in a single table be controlled within 20
- Split TEXT/BLOB, the TEXT type processing performance is much lower than VARCHAR, forcing the generation of hard disk temporary tables wastes more space.
The above is the detailed content of How to design a MySQL high-performance table. For more information, please follow other related articles on the PHP Chinese website!

Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

Video Face Swap
Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Hot Tools

Notepad++7.3.1
Easy-to-use and free code editor

SublimeText3 Chinese version
Chinese version, very easy to use

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Hot Topics

VS Code system requirements: Operating system: Windows 10 and above, macOS 10.12 and above, Linux distribution processor: minimum 1.6 GHz, recommended 2.0 GHz and above memory: minimum 512 MB, recommended 4 GB and above storage space: minimum 250 MB, recommended 1 GB and above other requirements: stable network connection, Xorg/Wayland (Linux)

The five basic components of the Linux system are: 1. Kernel, 2. System library, 3. System utilities, 4. Graphical user interface, 5. Applications. The kernel manages hardware resources, the system library provides precompiled functions, system utilities are used for system management, the GUI provides visual interaction, and applications use these components to implement functions.

vscode built-in terminal is a development tool that allows running commands and scripts within the editor to simplify the development process. How to use vscode terminal: Open the terminal with the shortcut key (Ctrl/Cmd). Enter a command or run the script. Use hotkeys (such as Ctrl L to clear the terminal). Change the working directory (such as the cd command). Advanced features include debug mode, automatic code snippet completion, and interactive command history.

To view the Git repository address, perform the following steps: 1. Open the command line and navigate to the repository directory; 2. Run the "git remote -v" command; 3. View the repository name in the output and its corresponding address.

Although Notepad cannot run Java code directly, it can be achieved by using other tools: using the command line compiler (javac) to generate a bytecode file (filename.class). Use the Java interpreter (java) to interpret bytecode, execute the code, and output the result.

Writing code in Visual Studio Code (VSCode) is simple and easy to use. Just install VSCode, create a project, select a language, create a file, write code, save and run it. The advantages of VSCode include cross-platform, free and open source, powerful features, rich extensions, and lightweight and fast.

The main uses of Linux include: 1. Server operating system, 2. Embedded system, 3. Desktop operating system, 4. Development and testing environment. Linux excels in these areas, providing stability, security and efficient development tools.

Causes and solutions for the VS Code terminal commands not available: The necessary tools are not installed (Windows: WSL; macOS: Xcode command line tools) Path configuration is wrong (add executable files to PATH environment variables) Permission issues (run VS Code as administrator) Firewall or proxy restrictions (check settings, unrestrictions) Terminal settings are incorrect (enable use of external terminals) VS Code installation is corrupt (reinstall or update) Terminal configuration is incompatible (try different terminal types or commands) Specific environment variables are missing (set necessary environment variables)
