首页 数据库 mysql教程 如何识别和解决MySQL复制从属滞后_MySQL

如何识别和解决MySQL复制从属滞后_MySQL

Jun 01, 2016 pm 01:16 PM

How to identify and cure MySQL replication slave lag在 Percona MySQL 支持团队中,我们经常看到客户抱怨复制延迟的问题 - 很多时候该问题最终与 MySQL 复制从属延迟有关。这对于 MySQL 用户来说当然不是什么新鲜事,多年来我们在 MySQL 性能博客上发布了一些关于此主题的帖子(过去两篇特别受欢迎的帖子是:“MySQL 复制延迟的原因”和“使用 MySQL 复制管理从属设备滞后”,均由 Percona 首席执行官 Peter Zaitsev 撰写)

然而,在今天的帖子中,我将分享一些识别复制延迟的新方法 - 包括从属设备滞后的可能原因– 以及如何解决这个问题。

如何识别复制延迟
MySQL 复制使用两个线程,IO_THREAD 和 SQL_THREAD。 IO_THREAD 连接到主服务器,在二进制日志事件进入时从主服务器读取二进制日志事件,然后将它们复制到名为中继日志的本地日志文件。另一方面,SQL_THREAD 从复制从属本地存储的中继日志(IO 线程写入的文件)中读取事件,然后尽快应用它们。每当复制延迟时,首先要确定它是在从属 IO_THREAD 上延迟还是在从属 SQL_THREAD 上延迟。

通常,I/O 线程不会导致巨大的复制延迟,因为它只是从主服务器读取二进制日志。然而,这取决于网络连接、网络延迟……服务器之间的速度有多快。由于带宽使用率较高,从属 I/O 线程可能会很慢。通常,当从属 IO_THREAD 能够足够快地读取二进制日志时,它会复制并堆积从属上的中继日志 - 这表明从属 IO_THREAD 不是从属滞后的罪魁祸首。

另一方面,当从属 SQL_THREAD 是复制延迟的根源时,可能是因为来自复制流的查询在从属上执行的时间太长。有时这是因为主/从之间的硬件不同、模式索引不同、工作负载不同。此外,从OLTP工作负载有时会因为锁定而导致复制延迟。例如,如果针对 MyISAM 表的长时间运行读取会阻塞 SQL 线程,或者针对 InnoDB 表的任何事务都会创建 IX 锁并阻塞 SQL 线程中的 DDL。另外,请考虑到 MySQL 5.6 之前的从站是单线程的,这将是从站 SQL_THREAD 延迟的另一个原因。

让我通过主站状态/从站状态示例向您展示,以识别从站是否滞后在从属 IO_THREAD 或从属 SQL_THREAD 上。

mysql-master> SHOW MASTER STATUS;+------------------+--------------+------------------+------------------------------------------------------------------+| File | Position| Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set|+------------------+--------------+------------------+------------------------------------------------------------------+| mysql-bin.018196 | 15818564 || bb11b389-d2a7-11e3-b82b-5cf3fcfc8f58:1-2331947 |+------------------+--------------+------------------+------------------------------------------------------------------+mysql-slave> SHOW SLAVE STATUS/G*************************** 1. row ***************************Slave_IO_State: Queueing master event to the relay logMaster_Host: master.example.comMaster_User: replMaster_Port: 3306Connect_Retry: 60Master_Log_File: mysql-bin.018192Read_Master_Log_Pos: 10050480Relay_Log_File: mysql-relay-bin.001796Relay_Log_Pos: 157090Relay_Master_Log_File: mysql-bin.018192Slave_IO_Running: YesSlave_SQL_Running: YesReplicate_Do_DB:Replicate_Ignore_DB:Replicate_Do_Table:Replicate_Ignore_Table:Replicate_Wild_Do_Table:Replicate_Wild_Ignore_Table:Last_Errno: 0Last_Error:Skip_Counter: 0Exec_Master_Log_Pos: 5395871Relay_Log_Space: 10056139Until_Condition: NoneUntil_Log_File:Until_Log_Pos: 0Master_SSL_Allowed: NoMaster_SSL_CA_File:Master_SSL_CA_Path:Master_SSL_Cert:Master_SSL_Cipher:Master_SSL_Key:Seconds_Behind_Master: 230775Master_SSL_Verify_Server_Cert: NoLast_IO_Errno: 0Last_IO_Error:Last_SQL_Errno: 0Last_SQL_Error:Replicate_Ignore_Server_Ids:Master_Server_Id: 2Master_UUID: bb11b389-d2a7-11e3-b82b-5cf3fcfc8f58:2-973166Master_Info_File: /var/lib/mysql/i1/data/master.infoSQL_Delay: 0SQL_Remaining_Delay: NULLSlave_SQL_Running_State: Reading event from the relay logMaster_Retry_Count: 86400Master_Bind:Last_IO_Error_Timestamp:Last_SQL_Error_Timestamp:Master_SSL_Crl:Master_SSL_Crlpath:Retrieved_Gtid_Set: bb11b389-d2a7-11e3-b82b-5cf3fcfc8f58:2-973166Executed_Gtid_Set: bb11b389-d2a7-11e3-b82b-5cf3fcfc8f58:2-973166,ea75c885-c2c5-11e3-b8ee-5cf3fcfc9640:1-1370Auto_Position: 1
登录后复制

mysql-master>SHOWMASTERSTATUS;

------------- ----- -------------- ------------------ ------------- -------------------------------------------------- ---

|文件|位置  |Binlog_Do_DB|Binlog_Ignore_DB|Executed_Gtid_Set                                                |

------------------ ------ -------- ------------------ ------------------------ ------------------------------------------

|mysql -bin.018196|15818564    |                  |bb11b389-d2a7-11e3-b82b-5cf3fcfc8f58:1-2331947                  |

------------------ ----------- --- ------------------ ------------------------------------------ -------------------------------------------------

mysql-slave>SHOWSLAVESTATUS/G

****************************1.行**************** *************

Slave_IO_State:Queueingmastereventtotherelaylog

Master_Host:master.example.com

Master_User:repl

Master_Port:3306

Connect_Retry:60

Master_Log_File:mysql-bin.018192

Read_Master_Log_Pos:10050480

Relay_Log_File:mysql-relay-bin.001796

Relay_Log_Pos:157090

Relay_Master_Log_File:mysql-bin.018192

Slave_IO_Running:是

Slave_SQL_Running:是

Replicate_Do_DB:

Replicate_Ignore_DB:

Replicate_Do_Table:

Replicate_Ignore_Table:

Replicate_Wild_Do_Table:

Replicate_Wild_Ignore_Table:

Last_Errno:0Last _错误:跳过_计数器:0

Exec_Master_Log_Pos:5395871

Relay_Log_Space:10056139

Until_Condition:无

Until_Log_File:

Until_Log_Pos:0

Master_SSL_允许:否

Master_SSL_CA_文件:

Master_SSL_CA_路径:

Master_SSL_Cert:

Master_SSL_Cipher:

Master_SSL_Key:

Seconds_Behind_Master:230775

Master_SSL_Verify_Server_Cert:否

Last_IO_Errno:0

Last_IO_Error:

Last_SQL_Errno:0

Last_SQL_Error:

Replicate_Ignore_Server_Ids :

Master_Server_Id:2

Master_UUID:bb11b389-d2a7-11e3-b82b-5cf3fcfc8f58:2-973166

Master_Info_File:/var/lib/mysql/i1/data/ master.info

SQL_Delay:0

SQL_Remaining_Delay:NULL

Slave_SQL_Running_State:Readingeventfromtherelaylog

Master_Retry_Count:86400

Master_Bind:

Last_IO_Error_Timestamp:

Last_SQL_Error_Timestamp :

Master_SSL_Crl:

Master_SSL_Crlpath:

Retrieved_Gtid_Set:bb11b389-d2a7-11e3-b82b-5cf3fcfc8f58:2-973166

已执行_G tid_Set:bb11b389-d2a7- 11e3-b82b-5cf3fcfc8f58:2-973166,

ea75c885-c2c5-11e3-b8ee-5cf3fcfc9640:1-1370

自动位置:1

This clearly suggests that the slave IO_THREAD is lagging and obviously because of that the slave SQL_THREAD is lagging, too, and it yields replication delays. As you can see the Master log file ismysql-bin.018196(File parameter from master status)and slave IO_THREAD is on mysql-bin.018192(Master_Log_File from slave status) which indicates slave IO_THREAD is reading from that file, while on master it’s writing onmysql-bin.018196, so the slave IO_THREAD is behind by4binlogs. Meanwhile, the slave SQL_THREAD is reading from same file i.e.mysql-bin.018192(Relay_Master_Log_File from slave status)This indicates that the slave SQL_THREAD is applying events fast enough, but it’s lagging too, which can be observed from the difference betweenRead_Master_Log_PosExec_Master_Log_Posfrom show slave status output.

You can calculate slave SQL_THREAD lag fromRead_Master_Log_PosExec_Master_Log_Posin general as long asMaster_Log_Fileparameter output from show slave status and Relay_Master_Log_Fileparameter from show slave status output are the same. This will give you rough idea how fast slave SQL_THREAD is applying events. As I mentioned above, the slave IO_THREAD is lagging as in this example then off course slave SQL_THREAD is behind too. You can read detailed description of show slave status output fieldshere.

Also, theSeconds_Behind_Masterparameter shows a huge delay in seconds. However, this can be misleading, because it only measures the difference between the timestamps of the relay log most recently executed, versus the relay log entry most recently downloaded by the IO_THREAD. If there are more binlogs on the master, the slave doesn’t figure them into the calculation of Seconds_behind_master.  You can get a more accurate measure of slave lag usingpt-heartbeatfromPercona Toolkit.So, we learned how to check replication delays – either it’s slave IO_THREAD or slave SQL_THREAD. Now, let me provide some tips and suggestions for what exactly causing this delay.

Tips and Suggestions What Causing Replication Delay & Possible Fixes
Usually, the slave IO_THREAD is behind because of slow network between master/slave. Most of the time, enabling slave_compressed_protocolhelps to mitigate slave IO_THREAD lag. One other suggestion is to disable binary logging on slave as it’s IO intensive too unless you required it for point in time recovery.

To minimize slave SQL_THREAD lag, focus on query optimization. My recommendation is to enable the configuration optionlog_slow_slave_statementsso that the queries executed by slave that take more thanlong_query_timewill be logged to the slow log. To gather more information about query performance, I would also recommend setting the configuration optionlog_slow_verbosityto “full”.

This way we can see if there are queries executed by slave SQL_thread that are taking long time to complete. You can follow my previous post about how to enable slow query log for specific time period with mentioned optionshere.  And as a reminder,  log_slow_slave_statements as variable were first introduced inPercona Server5.1 which is now part of vanilla MySQL from version 5.6.11 In upstream version of MySQL Server log_slow_slave_statements were introduced as command line option. Details can be foundherewhile log_slow_verbosity is Percona Server specific feature.

One another reason for delay on slave SQL_THREAD if you use row based binlog format is that if your any database table missing primary key or unique key then it will scan all rows of the table for DML on slave and causes replication delays so make sure all your tables should have primary key or unique key. Check this bug report for detailshttp://bugs.mysql.com/bug.php?id=53375 You can use below query on slave to identify which of database tables missing primary or unique key.

mysql> SELECT t.table_schema,t.table_name,engineFROM information_schema.tables t INNER JOIN information_schema .columns con t.table_schema=c.table_schema and t.table_name=c.table_nameGROUP BY t.table_schema,t.table_nameHAVING sum(if(column_key in ('PRI','UNI'), 1,0)) =0;
登录后复制

mysql>SELECTt.table_schema,t.table_name,engine

FROMinformation_schema.tablestINNERJOINinformation_schema.columnsc

ont.table_schema=c.table_schemaandt.table_name=c.table_name

GROUPBYt.table_schema,t.table_name

HAVINGsum(if(column_keyin('PRI','UNI'),1,0))=0;

One improvement is made for this case in MySQL 5.6, where in memory hash is used  slave_rows_search_algorithmscomes to the rescue.

Note that Seconds_Behind_Master is not updated while we read huge RBR event, So, “lagging” may be related to just that – we had not completed reading of the event. For example, in row based replication huge transactions may cause delay on slave side e.g. if you have 10 million rows table and you do “DELETE FROM table WHERE id < 5000000″5Mrows will be sent to slave, each row separately which will be painfully slow. So, if you have to delete oldest rows time to time from huge table using Partitioning might be good alternative for this for some kind of workloads where instead using DELETE use DROP old partition may be good and only statement is replicated because it will be DDL operation.

To explain it better, let suppose you havepartition1holding rows of ID’s from 1to1000000,partition2– ID’s from1000001to2000000and so on so instead of deleting via statement “DELETE FROM table WHERE ID<=1000000;” you can do “ALTER TABLE DROP partition1;” instead. For alter partitioning operations checkmanual– Check this wonderful post too from my colleague Roman explaining possible grounds for replication delayshere

pt-stalkis one of finest tool fromPercona Toolkitwhich collects diagnostics data when problems occur. You can setup pt-stalk as follows sowhenever there is a slave lag it can log diagnostic information which we can be later analyze to check to see what exactly causing the lag.

Here is how you can setup pt-stalk so that it captures diagnostic data when there is slave lag:

------- pt-plug.sh contents#!/bin/bashtrg_plugin() {mysqladmin $EXT_ARGV ping &> /dev/nullmysqld_alive=$?if [[ $mysqld_alive == 0 ]]thenseconds_behind_master=$(mysql $EXT_ARGV -e "show slave status" --vertical | grep Seconds_Behind_Master | awk '{print $2}')echo $seconds_behind_masterelseecho 1fi}# Uncomment below to test that trg_plugin function works as expected#trg_plugin--------- That's the pt-plug.sh file you would need to create and then use it as below with pt-stalk:$ /usr/bin/pt-stalk --function=/root/pt-plug.sh --variable=seconds_behind_master --threshold=300 --cycles=60 --notify-by-email=muhammad@example.com --log=/root/pt-stalk.log --pid=/root/pt-stalk.pid --daemonize

-------pt-plug.shcontents

#!/bin/bashtrg_plugin(){

mysqladmin$EXT_ARGVping&>/dev/null

mysqld_alive=$?

if[[$mysqld_alive==0]]

then

seconds_behind_master=$(mysql$EXT_ARGV-e"show slave status"--vertical|grepSeconds_Behind_Master|awk'{print $2}')

echo$seconds_behind_master

elseecho1fi}

# Uncomment below to test that trg_plugin function works as expected

#trg_plugin-------

--That'sthept-plug.shfileyouwouldneedtocreateandthenuseitasbelowwithpt-stalk:

$/usr/bin/pt-stalk--function=/root/pt-plug.sh--variable=seconds_behind_master--threshold=300--cycles=60--notify-by-email=muhammad@example.com--log=/root/pt-stalk.log--pid=/root/pt-stalk.pid--daemonize

You can adjust the threshold, currently its300seconds, combining that with –cycles, it means that if seconds_behind_master value is >=300for60seconds or more then pt-stalk will start capturing data. Adding–notify-by-emailoption will notify via email when pt-stalk captures data. You can adjust the pt-stalk thresholds accordingly so that’s how it triggers to collect diagnostic data during problem.

Conclusion
A lagging slave is a tricky problem but a common issue in MySQL replication. I’ve tried to cover most aspects of replication delays in this post. Please let me know in the comments section if you know of any other reasons for replication delay.

本站声明
本文内容由网友自发贡献,版权归原作者所有,本站不承担相应法律责任。如您发现有涉嫌抄袭侵权的内容,请联系admin@php.cn

热AI工具

Undresser.AI Undress

Undresser.AI Undress

人工智能驱动的应用程序,用于创建逼真的裸体照片

AI Clothes Remover

AI Clothes Remover

用于从照片中去除衣服的在线人工智能工具。

Undress AI Tool

Undress AI Tool

免费脱衣服图片

Clothoff.io

Clothoff.io

AI脱衣机

Video Face Swap

Video Face Swap

使用我们完全免费的人工智能换脸工具轻松在任何视频中换脸!

热门文章

<🎜>:泡泡胶模拟器无穷大 - 如何获取和使用皇家钥匙
3 周前 By 尊渡假赌尊渡假赌尊渡假赌
Mandragora:巫婆树的耳语 - 如何解锁抓钩
3 周前 By 尊渡假赌尊渡假赌尊渡假赌
北端:融合系统,解释
3 周前 By 尊渡假赌尊渡假赌尊渡假赌

热工具

记事本++7.3.1

记事本++7.3.1

好用且免费的代码编辑器

SublimeText3汉化版

SublimeText3汉化版

中文版,非常好用

禅工作室 13.0.1

禅工作室 13.0.1

功能强大的PHP集成开发环境

Dreamweaver CS6

Dreamweaver CS6

视觉化网页开发工具

SublimeText3 Mac版

SublimeText3 Mac版

神级代码编辑软件(SublimeText3)

热门话题

Java教程
1667
14
CakePHP 教程
1426
52
Laravel 教程
1328
25
PHP教程
1273
29
C# 教程
1255
24
MySQL的角色:Web应用程序中的数据库 MySQL的角色:Web应用程序中的数据库 Apr 17, 2025 am 12:23 AM

MySQL在Web应用中的主要作用是存储和管理数据。1.MySQL高效处理用户信息、产品目录和交易记录等数据。2.通过SQL查询,开发者能从数据库提取信息生成动态内容。3.MySQL基于客户端-服务器模型工作,确保查询速度可接受。

说明InnoDB重做日志和撤消日志的作用。 说明InnoDB重做日志和撤消日志的作用。 Apr 15, 2025 am 12:16 AM

InnoDB使用redologs和undologs确保数据一致性和可靠性。1.redologs记录数据页修改,确保崩溃恢复和事务持久性。2.undologs记录数据原始值,支持事务回滚和MVCC。

MySQL的位置:数据库和编程 MySQL的位置:数据库和编程 Apr 13, 2025 am 12:18 AM

MySQL在数据库和编程中的地位非常重要,它是一个开源的关系型数据库管理系统,广泛应用于各种应用场景。1)MySQL提供高效的数据存储、组织和检索功能,支持Web、移动和企业级系统。2)它使用客户端-服务器架构,支持多种存储引擎和索引优化。3)基本用法包括创建表和插入数据,高级用法涉及多表JOIN和复杂查询。4)常见问题如SQL语法错误和性能问题可以通过EXPLAIN命令和慢查询日志调试。5)性能优化方法包括合理使用索引、优化查询和使用缓存,最佳实践包括使用事务和PreparedStatemen

MySQL与其他编程语言:一种比较 MySQL与其他编程语言:一种比较 Apr 19, 2025 am 12:22 AM

MySQL与其他编程语言相比,主要用于存储和管理数据,而其他语言如Python、Java、C 则用于逻辑处理和应用开发。 MySQL以其高性能、可扩展性和跨平台支持着称,适合数据管理需求,而其他语言在各自领域如数据分析、企业应用和系统编程中各有优势。

MySQL:从小型企业到大型企业 MySQL:从小型企业到大型企业 Apr 13, 2025 am 12:17 AM

MySQL适合小型和大型企业。1)小型企业可使用MySQL进行基本数据管理,如存储客户信息。2)大型企业可利用MySQL处理海量数据和复杂业务逻辑,优化查询性能和事务处理。

MySQL索引基数如何影响查询性能? MySQL索引基数如何影响查询性能? Apr 14, 2025 am 12:18 AM

MySQL索引基数对查询性能有显着影响:1.高基数索引能更有效地缩小数据范围,提高查询效率;2.低基数索引可能导致全表扫描,降低查询性能;3.在联合索引中,应将高基数列放在前面以优化查询。

初学者的MySQL:开始数据库管理 初学者的MySQL:开始数据库管理 Apr 18, 2025 am 12:10 AM

MySQL的基本操作包括创建数据库、表格,及使用SQL进行数据的CRUD操作。1.创建数据库:CREATEDATABASEmy_first_db;2.创建表格:CREATETABLEbooks(idINTAUTO_INCREMENTPRIMARYKEY,titleVARCHAR(100)NOTNULL,authorVARCHAR(100)NOTNULL,published_yearINT);3.插入数据:INSERTINTObooks(title,author,published_year)VA

MySQL与其他数据库:比较选项 MySQL与其他数据库:比较选项 Apr 15, 2025 am 12:08 AM

MySQL适合Web应用和内容管理系统,因其开源、高性能和易用性而受欢迎。1)与PostgreSQL相比,MySQL在简单查询和高并发读操作上表现更好。2)相较Oracle,MySQL因开源和低成本更受中小企业青睐。3)对比MicrosoftSQLServer,MySQL更适合跨平台应用。4)与MongoDB不同,MySQL更适用于结构化数据和事务处理。

See all articles