Oracle 11g Statistics Collection: Gathering Multi-Column Statistics
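When writing SQL statements, we sometimes have several conditions in the WHERE clause, that is, rows are filtered on more than one column. By default, Oracle multiplies the selectivities of the individual columns to get the selectivity of the whole WHERE clause. That can make the estimate inaccurate and lead the optimizer to a wrong decision. Car manufacturer and car model, for example, are correlated: once you know the model, you know which manufacturer built the car. Hotel star rating and hotel rate category have a similar relationship. To let the optimizer estimate accurately and produce a good execution plan, Oracle introduced multi-column statistics in the 11g database.
Selectivity: in this example, 1 / number of distinct values.
We have a table BOOKS with two columns, hotel_id and rate_category. Let's look at how the data in these two columns is distributed: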
SQL> select hotel_id,rate_category,count(1) from books
2 group by hotel_id,rate_category
3 order by hotel_id;
HOTEL_ID RATE_CATEGORY COUNT(1)
---------- ------------- ----------
10 11 19943
10 12 39385
10 13 20036
20 21 5106
20 22 10041
20 23 5039
6 rows selected.
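The 1 / distinct-values selectivity mentioned above can be checked against the table itself. The query below is a minimal sketch that assumes only the BOOKS table and the two columns shown; the column aliases are illustrative names, not anything from the original session.

```sql
-- Number of distinct values per column, and for the column pair, in BOOKS.
select c.hotel_id_ndv,
       c.rate_category_ndv,
       p.pair_ndv
from  (select count(distinct hotel_id)      as hotel_id_ndv,
              count(distinct rate_category) as rate_category_ndv
       from   books) c,
      (select count(*) as pair_ndv
       from  (select distinct hotel_id, rate_category from books)) p;
```

With the six groups listed above, the three counts are 2, 6, and 6. Treating the columns as independent implies 2 * 6 = 12 possible combinations, while only 6 actually occur.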
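Look closely at the data: for hotel_id 10 the rate_category column contains only 11, 12, and 13, while for hotel_id 20 it contains only 21, 22, and 23 (and none of 11, 12, or 13). Why? The reason may be related to the hotel's star rating. Hotel 20 is a higher-priced hotel, and rate categories 11, 12, and 13 are the lower ones, so they do not apply to an expensive hotel. Likewise, 21, 22, and 23 are the higher rate categories, so they do not apply to an economy hotel such as hotel 10. Also, hotel 10 has more room bookings than hotel 20.
Create an index on each of the two columns of table books, then gather statistics on the table.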
SQL> create index book_idx1 on books(hotel_id);
Index created.
SQL> create index book_idx2 on books(rate_category);
Index created.
SQL> analyze table books compute statistics;
Table analyzed.
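ANALYZE ... COMPUTE STATISTICS works here, but Oracle's documentation recommends the DBMS_STATS package for gathering optimizer statistics. A rough equivalent might look like the sketch below. It assumes the table is owned by SCOTT (as in the later calls in this article) and uses SIZE 1 so that no histograms are built. The plan shown next came from the ANALYZE run, so estimates after this call could differ.

```sql
-- Sketch: gather table and index statistics with DBMS_STATS instead of ANALYZE.
-- Assumption: schema SCOTT; SIZE 1 means "no histograms" on any column.
begin
  dbms_stats.gather_table_stats(
    ownname          => 'SCOTT',
    tabname          => 'BOOKS',
    estimate_percent => 100,
    method_opt       => 'FOR ALL COLUMNS SIZE 1',
    cascade          => TRUE   -- also gather statistics on BOOK_IDX1 and BOOK_IDX2
  );
end;
/
```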
If we look for the rows where hotel 20 has rate category 21, what will the execution plan look like?
SQL> set autotrace trace exp
SQL> select hotel_id,rate_category from books where hotel_id=20 and rate_category=21;
Execution Plan
----------------------------------------------------------
Plan hash value: 2688610195
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 8296 | 33184 | 47 (3)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| BOOKS | 8296 | 33184 | 47 (3)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("RATE_CATEGORY"=21 AND "HOTEL_ID"=20)
SQL> set autotrace off
SQL> select count(1) from books;
COUNT(1)
----------
99550
SQL> select 99550/8296 from dual;
99550/8296
----------
11.9997589
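As the example shows, Oracle chose a full table scan and estimated 8296 rows, while the table actually holds 5106 matching rows. Against the 99550 rows in the whole table, an index should be usable here. Oracle did not use one because it considers the two columns separately: the selectivity is 1/2 for hotel_id and 1/6 for rate_category, so the selectivity of the statement comes out as 1/12. That is why the execution plan shows 8296 rows (99550 * 1/12).
To let Oracle produce an accurate execution plan, we can use either of two methods: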
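The arithmetic behind the 8296-row estimate can be reproduced with a query against the table. This is a sketch of the independence assumption described above, not the optimizer's actual code path; the aliases are illustrative.

```sql
-- Estimate under the independence assumption versus the real row count.
select count(*) as total_rows,
       -- total_rows * (1 / NDV(hotel_id)) * (1 / NDV(rate_category))
       round(count(*) / (count(distinct hotel_id) * count(distinct rate_category)))
         as independent_estimate,
       -- rows that really match hotel_id = 20 and rate_category = 21
       sum(case when hotel_id = 20 and rate_category = 21 then 1 else 0 end)
         as actual_rows
from   books;
```

With the data shown earlier, the three values work out to 99550, 8296, and 5106.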
1. Use the new function create_extended_stats in package dbms_stats to create a virtual column, then gather statistics on the table.
Roughly as follows:
dbms_stats.create_extended_stats('SCOTT', 'BOOKS','(HOTEL_ID, RATE_CATEGORY)')
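The next time table statistics are gathered, the multi-column statistics for your column group are gathered automatically.
2. Specify method_opt directly in package dbms_stats so that, when statistics are gathered, the column group is treated as a single column.
Here we use the second method.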
SQL> begin
2 dbms_stats.gather_table_stats (
3 ownname => 'SCOTT',
4 tabname => 'BOOKS',
5 estimate_percent=> 100,
6 method_opt => 'FOR ALL COLUMNS SIZE SKEWONLY FOR COLUMNS (HOTEL_ID,RATE_CATEGORY)',
7 cascade => TRUE
8 );
9 end;
10 /
PL/SQL procedure successfully completed.
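For comparison, the first method, which this session does not run, could be sketched as follows. create_extended_stats is a function that returns the system-generated extension name, so it is called from a SELECT here. The gather call reuses the settings from the session above and is an assumption about what fits this table. Only one of the two methods is needed.

```sql
-- Method 1 sketch: define the column group explicitly.
-- The function returns the system-generated name of the extension.
select dbms_stats.create_extended_stats(
         'SCOTT', 'BOOKS', '(HOTEL_ID,RATE_CATEGORY)') as extension_name
from   dual;

-- Regather table statistics so the column group gets statistics as well.
begin
  dbms_stats.gather_table_stats(
    ownname          => 'SCOTT',
    tabname          => 'BOOKS',
    estimate_percent => 100,
    method_opt       => 'FOR ALL COLUMNS SIZE SKEWONLY',
    cascade          => TRUE
  );
end;
/
```

If the column group is no longer wanted, dbms_stats.drop_extended_stats takes the same three arguments and removes it.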
With the column group statistics gathered, look at the execution plan of the statement again:
SQL> set autotrace trace exp
SQL> select hotel_id,rate_category from books where hotel_id=20 and rate_category=21;
Execution Plan
----------------------------------------------------------
Plan hash value: 1484887743
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5106 | 30636 | 19 (0)| 00:00:01 |
|* 1 | TABLE ACCESS BY INDEX ROWID| BOOKS | 5106 | 30636 | 19 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | BOOK_IDX2 | 5106 | | 11 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("HOTEL_ID"=20)
2 - access("RATE_CATEGORY"=21)
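The output clearly shows that index BOOK_IDX2 was used. Why is the index used now? Note the value under the "Rows" column (5106). The optimizer correctly estimated the number of rows for the combination of values, rather than estimating from each value separately.
Of course, Oracle can now estimate other predicate values accurately as well: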
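The column group behind this estimate can be inspected in the data dictionary. The queries below are a sketch that assumes the session is connected as the table owner (SCOTT); no output is shown because the generated names vary by system.

```sql
-- Column groups (extensions) defined on BOOKS.
select extension_name, extension
from   user_stat_extensions
where  table_name = 'BOOKS';

-- Column-level statistics, including the hidden virtual column
-- that represents the column group.
select column_name, num_distinct, num_buckets, histogram
from   user_tab_col_statistics
where  table_name = 'BOOKS'
order  by column_name;
```

The column group typically appears under a system-generated name beginning with SYS_STU. Its NUM_DISTINCT value reflects the combinations that actually occur rather than the product of the two columns' distinct counts.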
SQL> set autotrace trace exp
SQL> select hotel_id,rate_category from books where hotel_id=10 and rate_category=12;
Execution Plan
----------------------------------------------------------
Plan hash value: 2688610195
