How to use distinct and group by in MySQL
Let’s start with the general conclusion:
- With the same semantics and an index: group by and distinct can both use the index, with the same efficiency.
- With the same semantics and no index: distinct is more efficient than group by. The reason is that both distinct and group by perform grouping operations, but group by may also sort and trigger a filesort, which slows the SQL down.
This raises two questions:
- Why do group by and distinct have the same efficiency when the semantics are the same and there is an index?
- Under what circumstances does group by perform a sorting operation?
To answer these two questions, let’s first look at the basic usage of distinct and group by.
Usage of distinct
SELECT DISTINCT columns FROM table_name WHERE where_conditions;
For example:
mysql> select distinct age from student;
+------+
| age  |
+------+
|   10 |
|   12 |
|   11 |
| NULL |
+------+
4 rows in set (0.01 sec)
The distinct keyword returns only unique values. It is placed before the first field in the query and applies to all columns in the select list. If a column contains NULL values and you apply distinct to it, MySQL keeps one NULL and removes the others, because distinct treats all NULL values as the same value.
Multi-column deduplication with distinct
Multi-column deduplication works on the specified columns together: a row counts as a duplicate only if all of the specified columns are the same.
SELECT DISTINCT column1,column2 FROM table_name WHERE where_conditions;
mysql> select distinct sex,age from student;
+--------+------+
| sex    | age  |
+--------+------+
| male   |   10 |
| female |   12 |
| male   |   11 |
| male   | NULL |
| female |   11 |
+--------+------+
5 rows in set (0.02 sec)
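The article never shows how the student table is defined, so the following is only a sketch for trying the queries yourself. The column types, the name column, and the sample rows are assumptions, chosen so that the table has four distinct age values (counting NULL once) and five distinct (sex, age) pairs, as in the results above.

```sql
-- Hypothetical schema and data; the article does not show the real table.
CREATE TABLE student (
    id   INT PRIMARY KEY AUTO_INCREMENT,
    name VARCHAR(32) NOT NULL,
    sex  VARCHAR(8)  NOT NULL,
    age  INT NULL
);

INSERT INTO student (name, sex, age) VALUES
    ('a', 'male',   10),
    ('b', 'female', 12),
    ('c', 'male',   11),
    ('d', 'male',   NULL),
    ('e', 'female', 11),
    ('f', 'male',   10),    -- duplicate of the (male, 10) pair
    ('g', 'male',   NULL);  -- second NULL age, collapsed by DISTINCT

-- One row per distinct age; both NULL ages collapse into a single NULL row.
SELECT DISTINCT age FROM student;

-- One row per distinct (sex, age) pair.
SELECT DISTINCT sex, age FROM student;
```

Without an ORDER BY clause, the order of the returned rows is not guaranteed and may differ from the listings in the article.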
Usage of group by
For basic deduplication, group by is used in much the same way as distinct.
Single column deduplication
Syntax:
SELECT columns FROM table_name WHERE where_conditions GROUP BY columns;
Execution:
mysql> select age from student group by age;
+------+
| age  |
+------+
|   10 |
|   12 |
|   11 |
| NULL |
+------+
4 rows in set (0.02 sec)
Multiple column deduplication
Syntax:
SELECT columns FROM table_name WHERE where_conditions GROUP BY columns;
Execution:
mysql> select sex,age from student group by sex,age;
+--------+------+
| sex    | age  |
+--------+------+
| male   |   10 |
| female |   12 |
| male   |   11 |
| male   | NULL |
| female |   11 |
+--------+------+
5 rows in set (0.03 sec)
Difference example
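The syntactic difference between the two is that group by can deduplicate on a single column while other columns are still selected. group by first groups the rows (and, before MySQL 8.0, sorts them), then returns one row per group; deduplication is based on the fields that follow group by. Note that a query like the one below only runs when the ONLY_FULL_GROUP_BY SQL mode is disabled (it is enabled by default since MySQL 5.7.5), and that MySQL does not guarantee which age value is returned for each group, even though in practice it is often the first row encountered. For example: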
mysql> select sex,age from student group by sex;
+--------+-----+
| sex    | age |
+--------+-----+
| male   |  10 |
| female |  12 |
+--------+-----+
2 rows in set (0.03 sec)
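A query that selects a column that is neither grouped nor aggregated, like the one above, is rejected when the ONLY_FULL_GROUP_BY SQL mode is enabled, and when the mode is disabled the server is free to pick any value from each group. The sketch below, which assumes the same student table, shows two ways to make the intent explicit instead of relying on that behavior.

```sql
-- Check whether ONLY_FULL_GROUP_BY is part of the current SQL mode.
SELECT @@sql_mode;

-- Deterministic: state which age you want per group with an aggregate.
SELECT sex, MIN(age) AS min_age
FROM student
GROUP BY sex;

-- Explicitly accept an arbitrary age per group (MySQL 5.7.5 and later).
SELECT sex, ANY_VALUE(age) AS some_age
FROM student
GROUP BY sex;
```

MIN() is only one possible choice here; which aggregate is appropriate depends on what the query is meant to report.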
distinct and group by principle
In most cases, DISTINCT can be regarded as a special case of GROUP BY. Both are implemented as grouping operations, and both can be executed through a loose index scan or a tight index scan (index scans will be covered in detail in other articles, so they are not discussed further here).
DISTINCT and GROUP BY can both use an index to scan and search. For example, look at the Extra column at the end of the EXPLAIN output for the following two statements: both show Using index for group-by, which means both use a loose index scan. So in general, DISTINCT and GROUP BY statements with the same semantics can be optimized with the same index optimization techniques.
mysql> explain select int1_index from test_distinct_groupby group by int1_index;
+----+-------------+-----------------------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------------------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
| 1 | SIMPLE | test_distinct_groupby | NULL | range | index_1 | index_1 | 5 | NULL | 955 | 100.00 | Using index for group-by |
+----+-------------+-----------------------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
1 row in set (0.05 sec)
mysql> explain select distinct int1_index from test_distinct_groupby;
+----+-------------+-----------------------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------------------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
| 1 | SIMPLE | test_distinct_groupby | NULL | range | index_1 | index_1 | 5 | NULL | 955 | 100.00 | Using index for group-by |
+----+-------------+-----------------------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
1 row in set (0.05 sec)
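To reproduce this comparison, you need a table with an indexed column. The article does not show the definition of test_distinct_groupby, so the schema below is an assumption: only the table name, the two column names, and the index name index_1 come from the EXPLAIN output, and the nullable INT type is a guess based on the key_len of 5.

```sql
-- Hypothetical definition; only the table, column, and index names
-- appear in the article.
CREATE TABLE test_distinct_groupby (
    id                 INT PRIMARY KEY AUTO_INCREMENT,
    int1_index         INT NULL,
    int6_bigger_random INT NULL,
    KEY index_1 (int1_index)
);

-- Compare the plans of the two semantically equivalent statements.
EXPLAIN SELECT DISTINCT int1_index FROM test_distinct_groupby;
EXPLAIN SELECT int1_index FROM test_distinct_groupby GROUP BY int1_index;
```

The plan the optimizer picks depends on the data volume and index statistics, so on an empty or very small table the Extra column may not match the output shown in the article.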
For GROUP BY, however, MySQL versions before 8.0 implicitly sort by the grouping fields by default. As the following statement shows, the query uses a temporary table and also performs a filesort.
mysql> explain select int6_bigger_random from test_distinct_groupby GROUP BY int6_bigger_random;
+----+-------------+-----------------------+------------+------+---------------+------+---------+------+-------+----------+---------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------------------+------------+------+---------------+------+---------+------+-------+----------+---------------------------------+
| 1 | SIMPLE | test_distinct_groupby | NULL | ALL | NULL | NULL | NULL | NULL | 97402 | 100.00 | Using temporary; Using filesort |
+----+-------------+-----------------------+------------+------+---------------+------+---------+------+-------+----------+---------------------------------+
1 row in set (0.04 sec)
Implicit sorting
For implicit sorting, we can refer to MySQL’s official explanation:
https://dev.mysql.com/doc/refman/5.7/en/order-by-optimization.html
GROUP BY implicitly sorts by default (that is, in the absence of ASC or DESC designators for GROUP BY columns). However, relying on implicit GROUP BY sorting (that is, sorting in the absence of ASC or DESC designators) or explicit sorting for GROUP BY (that is, by using explicit ASC or DESC designators for GROUP BY columns) is deprecated. To produce a given sort order, provide an ORDER BY clause.
In short:
GROUP BY sorts implicitly by default, that is, even when the GROUP BY columns carry no ASC or DESC designator. However, relying on either implicit or explicit GROUP BY sorting is deprecated. To produce a given sort order, provide an ORDER BY clause.
So, before MySQL 8.0, GROUP BY by default sorts the result by the grouping fields (the fields that follow GROUP BY). When an index can be used, GROUP BY needs no extra sorting operation; but when no index can supply the order, the MySQL optimizer has to use a temporary table and then sort it. And when the result set exceeds the temporary table size configured for the system, MySQL copies the temporary table data to disk before continuing, and the statement becomes extremely slow. This is why MySQL chose to deprecate implicit sorting.
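Whether a grouping query spilled its temporary table to disk can be checked from the session status counters. The sketch below is one way to do this, not the author's method. It reuses the article's test table, and which size variables actually apply depends on the MySQL version and the storage engine used for internal temporary tables, so treat the two variables shown as a starting point.

```sql
-- Size limits that commonly govern in-memory internal temporary tables.
SHOW VARIABLES LIKE 'tmp_table_size';
SHOW VARIABLES LIKE 'max_heap_table_size';

-- Run the grouping query under test (table from the article).
SELECT int6_bigger_random
FROM test_distinct_groupby
GROUP BY int6_bigger_random;

-- Counters for internal temporary tables created in this session;
-- compare the values before and after running the query.
SHOW SESSION STATUS LIKE 'Created_tmp_tables';
SHOW SESSION STATUS LIKE 'Created_tmp_disk_tables';
```

If Created_tmp_disk_tables increases after the query runs, an on-disk temporary table was created during that interval.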
For these reasons, MySQL changed this behavior in 8.0:
https://dev.mysql.com/doc/refman/8.0/en/order-by-optimization.html
Previously (MySQL 5.7 and lower), GROUP BY sorted implicitly under certain conditions. In MySQL 8.0, that no longer occurs, so specifying ORDER BY NULL at the end to suppress implicit sorting (as was done previously) is no longer necessary. However, query results may differ from previous MySQL versions. To produce a given sort order, provide an ORDER BY clause.
In short:
In the past (MySQL 5.7 and earlier), GROUP BY sorted implicitly under certain conditions. In MySQL 8.0 this behavior has been removed, so adding order by null to suppress implicit sorting is no longer necessary. However, query results may differ from previous MySQL versions. To produce results in a given order, specify the sort fields with ORDER BY.
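The two documentation passages quoted above translate into the following query patterns, sketched here against the article's test table. Which one you need depends on the server version you are running.

```sql
-- MySQL 5.7 and earlier: ORDER BY NULL suppresses the implicit sort
-- when the order of the groups does not matter.
EXPLAIN
SELECT int6_bigger_random
FROM test_distinct_groupby
GROUP BY int6_bigger_random
ORDER BY NULL;

-- Any version: when a specific order is required, request it explicitly
-- instead of relying on GROUP BY.
SELECT int6_bigger_random
FROM test_distinct_groupby
GROUP BY int6_bigger_random
ORDER BY int6_bigger_random;
```

Writing the ORDER BY clause explicitly keeps the query's result order the same across MySQL 5.7 and 8.0.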
This gives us our conclusion:
- With the same semantics and an index: group by and distinct can both use the index, with the same efficiency, because the two are nearly equivalent and distinct can be regarded as a special group by.
- With the same semantics and no index: distinct is more efficient than group by. Both perform grouping operations, but before MySQL 8.0 group by also sorts implicitly, which triggers a filesort and slows the SQL down. Starting with MySQL 8.0, implicit sorting has been removed, so with the same semantics and no index, group by and distinct perform almost the same.
Compared with distinct, group by has clearer semantics. And because the distinct keyword applies to all selected fields, group by is more flexible for composite business logic: it can apply more complex processing to each group, such as filtering groups with having or computing values with aggregate functions.
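As a brief illustration of that flexibility, the sketch below groups a student table by sex, computes aggregates per group, and keeps only the larger groups. The table layout is an assumption based on the columns used earlier in the article, and the threshold of one row is arbitrary.

```sql
-- Per-group aggregates plus a filter on the groups themselves;
-- DISTINCT alone cannot express either of these.
SELECT sex,
       COUNT(*) AS student_count,
       AVG(age) AS average_age   -- AVG ignores NULL ages
FROM student
GROUP BY sex
HAVING COUNT(*) > 1;
```

The where clause filters rows before grouping, while having filters the groups after the aggregates have been computed.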