Home Database Mysql Tutorial MySQL批量插入遇上唯一索引避免方法_MySQL

MySQL批量插入遇上唯一索引避免方法_MySQL

Jun 01, 2016 pm 01:23 PM

bitsCN.com

一、背景

以前使用SQL Server进行表分区的时候就碰到很多关于唯一索引的问题:Step8:SQL Server 当表分区遇上唯一约束,没想到在MySQL的分区中一样会遇到这样的问题:MySQL表分区实战。

今天我们来了解MySQL唯一索引的一些知识:包括如何创建,如何批量插入,还有一些技巧上SQL;

这些问题的根源在什么地方?有什么共同点?MySQL中也有分区对齐的概念?唯一索引是在很多系统中都会出现的要求,有什么办法可以避免?它对性能的影响有多大?

二、过程

(一) 导入差异数据,忽略重复数据,IGNORE INTO的使用

在MySQL创建表的时候,我们通常创建一个表的时候是以一个自增ID值作为主键,那么MySQL就会以PRIMARY KEY作为聚集索引键和主键,既然是主键,那当然是唯一的了,所以重复执行下面的插入语句会报1062错误:如Figure1所示;

-- 创建测试表
CREATE TABLE `testtable` (
`Id` INT(11) UNSIGNED NOT NULL AUTO_INCREMENT,
`UserId` INT(11) DEFAULT NULL,
`UserName` VARCHAR(10) DEFAULT NULL,
`UserType` INT(11) DEFAULT NULL,
PRIMARY KEY (`Id`)
) ENGINE=INNODB DEFAULT CHARSET=utf8;

-- 插入测试数据
INSERT INTO testtable(Id,UserId,UserName,UserType)
VALUES(1,101,'aa',1),(2,102,'bbb',2),(3,103,'ccc',3);

u1_1062

(Figure1:Duplicate entry '1' for key 'PRIMARY')

但是在实际的生产环境中,需求往往是需要在UserId键值中设置唯一索引,今天我就以这个作为示例,进行唯一索引的测试:

-- 创建测试表1
CREATE TABLE `testtable1` (
`Id` INT(11) UNSIGNED NOT NULL AUTO_INCREMENT,
`UserId` INT(11) DEFAULT NULL,
`UserName` VARCHAR(10) DEFAULT NULL,
`UserType` INT(11) DEFAULT NULL,
PRIMARY KEY (`Id`),
UNIQUE KEY `IX_UserId` (`UserId`)
) ENGINE=INNODB DEFAULT CHARSET=utf8;

-- 创建测试表2
CREATE TABLE `testtable2` (
`Id` INT(11) UNSIGNED NOT NULL AUTO_INCREMENT,
`UserId` INT(11) DEFAULT NULL,
`UserName` VARCHAR(10) DEFAULT NULL,
`UserType` INT(11) DEFAULT NULL,
PRIMARY KEY (`Id`),
UNIQUE KEY `IX_UserId` (`UserId`)
) ENGINE=INNODB DEFAULT CHARSET=utf8;

-- 插入测试数据1
INSERT INTO testtable1(Id,UserId,UserName,UserType)
VALUES(1,101,'aa',1),(2,102,'bbb',2),(3,103,'ccc',3);

-- 插入测试数据2
INSERT INTO testtable2(Id,UserId,UserName,UserType)
VALUES(1,201,'aaa',1),(2,202,'bbb',2),(3,203,'ccc',3),(4,101,'xxxx',5);

u2_table1

(Figure2:testtable1记录)

u3_table2

(Figure3:testtable2记录)

通过执行上面的SQL脚本,我们在testtable1和testtable2都创建了唯一索引:UNIQUE KEY `IX_UserId` (`UserId`),这就说明UserId在testtable1和testtable2表中都是唯一的,如果把testtable2的数据批量导入到testtable1,如果执行下面【导入1】的SQL,就会出现1062的错误,导致整个过程会回滚,没有达到导入差异数据的目的。

INSERT INTO testtable1(UserId,UserName,UserType)
SELECT UserId,UserName,UserType FROM testtable2;

u4_unique

(Figure4:Duplicate entry '101' for key 'IX_UserId')

MySQL提供一个关键字:IGNORE,这个关键字判断每条记录是否存在,是否违反饿了表中的唯一索引,如果存在就不插入,而不存在的记录就会插入。

-- 导入2
INSERT IGNORE INTO testtable1(UserId,UserName,UserType)
SELECT UserId,UserName,UserType FROM testtable2;

所以执行完【导入2】,就会产生Figure5的结果,这已经达到了我们的目的了,但是你有没发现自增的ID值跳过了一些值,这是因为我们之前执行【导入1】失败造成的,虽然我们的事务回滚了,但是自增ID会出现断层。在SQL Server中也会有这样的问题。扩展阅读:简单实用SQL脚本Part:查找SQL Server 自增ID值不连续记录

u5_效果

(Figure5:IGNORE效果)

(二) 导入并覆盖重复数据,REPLACE INTO 的使用

1. 把testtable1和testtable2分别回滚到Figure2和Figure3的状态(使用TRUNCATE TABLE命名再执行Insert语句),这个时候再执行下面的SQL,看有什么效果:

-- 导入3
REPLACE INTO testtable1(UserId,UserName)
SELECT UserId,UserName FROM testtable2;

u6_rep

(Figure6:REPLACE效果)

从上图Figure6中,我们可以看到:UserId为101的记录发生了改变,不单UserName修改了,而且UserType也变为NULL了。

所以,如果导入中发现了重复的,先删除再插入,如果记录有多个字段,在插入的时候如果有的字段没有赋值,那么新插入的记录这些字段为空(新插入记录的UserType都为NULL)。

需要注意的是,当你replace的时候,如果被插入的表如果没有指定列,会用NULL表示,而不是这个表原来的内容。如果插入的内容列和被插入的表列一样,则不会出现NULL。

2. 如果我们表结构UserType字段不允许为空,而且没有默认值的情况,执行【导入3】会发生什么事情呢?

u7_not null

(Figure7:返回警告信息)

u8_0

(Figure8:UserType被设置为0)

通过Figure7和Figure8,我们知道数据记录还是插入了,只是返回Field 'UserType' doesn't have a default value的警告,插入记录的UserType字段都被设置为0('UserType' 为int数据类型)。

3. 如果我们希望导入的时候一起更新UserType字段的值,这自然很简单了,使用下面的SQL脚本就可以解决:

-- 导入4
REPLACE INTO testtable1(UserId,UserName,UserType)
SELECT UserId,UserName,UserType FROM testtable2;

u9_rep

(Figure9:一起更新UserType)

(三) 导入保留重复数据未指定字段,INSERT INTO ON DUPLICATE KEY UPDATE 的使用

把testtable1和testtable2分别回滚到Figure2和Figure3的状态(使用TRUNCATE TABLE命名再执行Insert语句),这个时候再执行下面的SQL,看有什么效果:

-- 导入5
INSERT INTO testtable1(UserId,UserName)
SELECT UserId,UserName FROM testtable2
ON DUPLICATE KEY UPDATE
testtable1.UserName = testtable2.UserName;

u10_update

(Figure10:保留UserType值)

对比Figure2、Figure3与Figure10,UserId为101的记录:更新了UserName的值,保留了UserType的值;但是由于【导入5】中没有指定UserType,所以新插入记录的UserType是为NULL的。

-- 导入6
INSERT INTO testtable1(UserId,UserName,UserType)
SELECT UserId,UserName,UserType FROM testtable2
ON DUPLICATE KEY UPDATE
testtable1.UserName = testtable2.UserName;

u11_update

(Figure11:保留UserType值)

对比Figure2、Figure3与Figure11,只插入testtable2表的UserId,UserName字段,但是保留testtable1表的UserType字段。如果发现有重复的记录,做更新操作;在原有记录基础上,更新指定字段内容,其它字段内容保留。

(四) 总结

当在一个UNIQUE键上插入包含重复值的记录时,默认的insert会报1062错误,MYSQL可以通过以上三种不同的方式和你的业务逻辑进行处理。

三、参考文献

MYSQL插入处理重复键值的几种方法

bitsCN.com
Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

Video Face Swap

Video Face Swap

Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Roblox: Bubble Gum Simulator Infinity - How To Get And Use Royal Keys
3 weeks ago By 尊渡假赌尊渡假赌尊渡假赌
Nordhold: Fusion System, Explained
3 weeks ago By 尊渡假赌尊渡假赌尊渡假赌

Hot Tools

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 Chinese version

SublimeText3 Chinese version

Chinese version, very easy to use

Zend Studio 13.0.1

Zend Studio 13.0.1

Powerful PHP integrated development environment

Dreamweaver CS6

Dreamweaver CS6

Visual web development tools

SublimeText3 Mac version

SublimeText3 Mac version

God-level code editing software (SublimeText3)

Hot Topics

Java Tutorial
1664
14
PHP Tutorial
1269
29
C# Tutorial
1249
24
MySQL's Role: Databases in Web Applications MySQL's Role: Databases in Web Applications Apr 17, 2025 am 12:23 AM

The main role of MySQL in web applications is to store and manage data. 1.MySQL efficiently processes user information, product catalogs, transaction records and other data. 2. Through SQL query, developers can extract information from the database to generate dynamic content. 3.MySQL works based on the client-server model to ensure acceptable query speed.

Explain the role of InnoDB redo logs and undo logs. Explain the role of InnoDB redo logs and undo logs. Apr 15, 2025 am 12:16 AM

InnoDB uses redologs and undologs to ensure data consistency and reliability. 1.redologs record data page modification to ensure crash recovery and transaction persistence. 2.undologs records the original data value and supports transaction rollback and MVCC.

MySQL: An Introduction to the World's Most Popular Database MySQL: An Introduction to the World's Most Popular Database Apr 12, 2025 am 12:18 AM

MySQL is an open source relational database management system, mainly used to store and retrieve data quickly and reliably. Its working principle includes client requests, query resolution, execution of queries and return results. Examples of usage include creating tables, inserting and querying data, and advanced features such as JOIN operations. Common errors involve SQL syntax, data types, and permissions, and optimization suggestions include the use of indexes, optimized queries, and partitioning of tables.

MySQL's Place: Databases and Programming MySQL's Place: Databases and Programming Apr 13, 2025 am 12:18 AM

MySQL's position in databases and programming is very important. It is an open source relational database management system that is widely used in various application scenarios. 1) MySQL provides efficient data storage, organization and retrieval functions, supporting Web, mobile and enterprise-level systems. 2) It uses a client-server architecture, supports multiple storage engines and index optimization. 3) Basic usages include creating tables and inserting data, and advanced usages involve multi-table JOINs and complex queries. 4) Frequently asked questions such as SQL syntax errors and performance issues can be debugged through the EXPLAIN command and slow query log. 5) Performance optimization methods include rational use of indexes, optimized query and use of caches. Best practices include using transactions and PreparedStatemen

Why Use MySQL? Benefits and Advantages Why Use MySQL? Benefits and Advantages Apr 12, 2025 am 12:17 AM

MySQL is chosen for its performance, reliability, ease of use, and community support. 1.MySQL provides efficient data storage and retrieval functions, supporting multiple data types and advanced query operations. 2. Adopt client-server architecture and multiple storage engines to support transaction and query optimization. 3. Easy to use, supports a variety of operating systems and programming languages. 4. Have strong community support and provide rich resources and solutions.

MySQL vs. Other Programming Languages: A Comparison MySQL vs. Other Programming Languages: A Comparison Apr 19, 2025 am 12:22 AM

Compared with other programming languages, MySQL is mainly used to store and manage data, while other languages ​​such as Python, Java, and C are used for logical processing and application development. MySQL is known for its high performance, scalability and cross-platform support, suitable for data management needs, while other languages ​​have advantages in their respective fields such as data analytics, enterprise applications, and system programming.

MySQL: From Small Businesses to Large Enterprises MySQL: From Small Businesses to Large Enterprises Apr 13, 2025 am 12:17 AM

MySQL is suitable for small and large enterprises. 1) Small businesses can use MySQL for basic data management, such as storing customer information. 2) Large enterprises can use MySQL to process massive data and complex business logic to optimize query performance and transaction processing.

How does MySQL index cardinality affect query performance? How does MySQL index cardinality affect query performance? Apr 14, 2025 am 12:18 AM

MySQL index cardinality has a significant impact on query performance: 1. High cardinality index can more effectively narrow the data range and improve query efficiency; 2. Low cardinality index may lead to full table scanning and reduce query performance; 3. In joint index, high cardinality sequences should be placed in front to optimize query.

See all articles