内存列式存储 vs Buffer Cache
Oracle DB 12c的In-Memory选项(DBIM)将表中列的所有行的数据载入内存,为何不能像Buffer Cache那样只把频繁访问的数据块置入内存中呢? 内存列式存储和Buffer Cache的访问模式 原因是两者支持的访问模式不同,对于Buffer Cache,支持的是OLTP应用,访问模式为
Oracle DB 12c的In-Memory选项(DBIM)将表中列的所有行的数据载入内存,为何不能像Buffer Cache那样只把频繁访问的数据块置入内存中呢?
内存列式存储和Buffer Cache的访问模式
原因是两者支持的访问模式不同,对于Buffer Cache,支持的是OLTP应用,访问模式为non-uniform access patterns,也就是说表中的某些行访问比其它行频繁,因此才能通过只缓存10%的数据,就可以涵盖95%的数据访问。可以假设缓存10%的数据就可以得到20倍的性能提升。
而内存列式存储支持的是分析型应用,访问的是少数列,但却需要扫描表中所有行的数据。缓存部分行的数据意义不大,例如如果内存列式存储可以得到100倍性能提升,如果只缓存表中10%的数据只能得到1.1倍的性能提升,而不是100倍。所以在DBIM的设置中,你可以指定全表,表中的部分列,部分分区,表空间,但你无法用where条件指定只缓存列中的部分行。
所以对于分析性的应用,之所以内存列式存储比行式存储(即使是通过alter table tablename cache 全部缓存到内存)快,最重要的原因就是列存储的格式非常适合分析性应用。
列存储格式
以下的图很好的说明了为何列式存储适合分析。
如果采用传统的行式存储来进行分析,例如查询第4列,这时需要逐行访问,需要查询第1到第3列这些无关数据。
如果是采用列式存储,则只需要访问第4列即可,避免了无效I/O,效率自然提升了。
看一下Oracle在Open World 2013上发布的测试结果:
行式和列式都是在内存中,DBIM快了近800倍,单个core每1/6秒处理30亿行数据,难以置信?!
SIMD
这时以前在高性能计算和图像处理中使用的技术,即Single Instruction Multiple Data,其实就是对于数据的批处理,只不过其非常适合列式数据。
storage index
storage index其实在Exadata中就有了,其实就是将列分区为IMCU,预先计算和实时维护好每一个IMCU中的最大和最小值,查询时匹配where条件,就可以跳过许多无关的IMCU,从而节省I/O和时间。原理上和分区是类似的。
不过数据库重启后需要重新计算。
压缩
列式存储通常都会压缩,因为其中的数据重复值较多,DBIM中压缩是缺省的选项。
压缩不仅可以在内存中缓存更多的数据,而且还可以减少I/O。不过考虑到如果有较多OLTP的访问,这时不要选取压缩比较高的压缩方式,以免压缩和解压时消耗过多的资源。
内存中Join和Aggregation的优化
通过Bloom Filter将Join转换为列扫描可以加快Join速度,在内存中更是如此。
而通过key vector,原理与Bloom Filter类似,也可以在线构建聚集表的结果,具体原理看白皮书。
参考
In-Memory Column Store versus the Buffer Cache

Hot AI Tools

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Undress AI Tool
Undress images for free

Clothoff.io
AI clothes remover

Video Face Swap
Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Hot Tools

Notepad++7.3.1
Easy-to-use and free code editor

SublimeText3 Chinese version
Chinese version, very easy to use

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Hot Topics

Solutions to Oracle cannot be opened include: 1. Start the database service; 2. Start the listener; 3. Check port conflicts; 4. Set environment variables correctly; 5. Make sure the firewall or antivirus software does not block the connection; 6. Check whether the server is closed; 7. Use RMAN to recover corrupt files; 8. Check whether the TNS service name is correct; 9. Check network connection; 10. Reinstall Oracle software.

The method to solve the Oracle cursor closure problem includes: explicitly closing the cursor using the CLOSE statement. Declare the cursor in the FOR UPDATE clause so that it automatically closes after the scope is ended. Declare the cursor in the USING clause so that it automatically closes when the associated PL/SQL variable is closed. Use exception handling to ensure that the cursor is closed in any exception situation. Use the connection pool to automatically close the cursor. Disable automatic submission and delay cursor closing.

In Oracle, the FOR LOOP loop can create cursors dynamically. The steps are: 1. Define the cursor type; 2. Create the loop; 3. Create the cursor dynamically; 4. Execute the cursor; 5. Close the cursor. Example: A cursor can be created cycle-by-circuit to display the names and salaries of the top 10 employees.

Oracle database paging uses ROWNUM pseudo-columns or FETCH statements to implement: ROWNUM pseudo-columns are used to filter results by row numbers and are suitable for complex queries. The FETCH statement is used to get the specified number of first rows and is suitable for simple queries.

To stop an Oracle database, perform the following steps: 1. Connect to the database; 2. Shutdown immediately; 3. Shutdown abort completely.

Building a Hadoop Distributed File System (HDFS) on a CentOS system requires multiple steps. This article provides a brief configuration guide. 1. Prepare to install JDK in the early stage: Install JavaDevelopmentKit (JDK) on all nodes, and the version must be compatible with Hadoop. The installation package can be downloaded from the Oracle official website. Environment variable configuration: Edit /etc/profile file, set Java and Hadoop environment variables, so that the system can find the installation path of JDK and Hadoop. 2. Security configuration: SSH password-free login to generate SSH key: Use the ssh-keygen command on each node

SQL statements can be created and executed based on runtime input by using Oracle's dynamic SQL. The steps include: preparing an empty string variable to store dynamically generated SQL statements. Use the EXECUTE IMMEDIATE or PREPARE statement to compile and execute dynamic SQL statements. Use bind variable to pass user input or other dynamic values to dynamic SQL. Use EXECUTE IMMEDIATE or EXECUTE to execute dynamic SQL statements.

Oracle garbled problems can be solved by checking the database character set to ensure they match the data. Set the client character set to match the database. Convert data or modify column character sets to match database character sets. Use Unicode character sets and avoid multibyte character sets. Check that the language settings of the database and client are correct.
