
Deadlock Case Study 2

有赞coder 2019-02-15

By 杨一 (Yang Yi), Operations


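1. Introduction

Deadlocks are an interesting and challenging technical problem, and probably every DBA runs into them at some point. I plan to keep writing a series of deadlock case analyses, in the hope that they help readers who want to understand deadlocks. This article comes from one of our production cases: a deadlock caused by concurrent requests for a gap lock. Unlike the earlier Deadlock Case Study 1, here the SQL statements in two transactions running under RR can both acquire the same gap lock, so each transaction's insert ends up waiting for the other and the two deadlock.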

2. Case analysis

2.1 Test environment setup

Percona Server 5.6.24, transaction isolation level RR (REPEATABLE READ).

 
   
   
 
    -- Test table; the unique key uniq_kid_aid_biz_rid is the index the deadlock occurs on.
    CREATE TABLE `t4` (
      `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
      `kdt_id` int(11) unsigned NOT NULL,
      `admin_id` int(11) unsigned NOT NULL,
      `biz` varchar(20) NOT NULL DEFAULT '1',
      `role_id` int(11) unsigned NOT NULL,
      `shop_id` int(11) unsigned NOT NULL DEFAULT '0',
      `operator` varchar(20) NOT NULL DEFAULT '0',
      `operator_id` int(11) NOT NULL DEFAULT '0',
      `create_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'creation time',
      `update_time` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'update time',
      PRIMARY KEY (`id`),
      UNIQUE KEY `uniq_kid_aid_biz_rid` (`kdt_id`,`admin_id`,`role_id`,`biz`)
    ) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;

    -- Seed rows with kdt_id 10, 20, 30, 40, 50.
    INSERT INTO `t4` (`id`, `kdt_id`, `admin_id`, `biz`, `role_id`, `shop_id`, `operator`, `operator_id`, `create_time`, `update_time`)
    VALUES
    (1,10,1,'retail',1,0,'0',0,'2017-05-09 15:55:26','2017-05-09 15:55:26'),
    (2,20,1,'retail',1,0,'0',0,'2017-05-09 15:55:40','2017-05-09 15:55:40'),
    (3,30,1,'retail',1,0,'0',0,'2017-05-09 15:55:55','2017-05-09 15:55:55'),
    (4,40,1,'retail',1,0,'0',0,'2017-05-09 15:56:06','2017-05-09 15:56:06'),
    (5,50,1,'retail',1,0,'0',0,'2017-05-09 15:56:16','2017-05-09 15:56:16');

2.2 Test scenario: two transactions each delete a row that does not exist, then insert a record
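The step-by-step table for this scenario did not survive in this copy, so the following is only a minimal reproduction sketch, not the original test script. T2's DELETE and both INSERT statements are taken from the analysis and the deadlock log in this article. T1's DELETE predicate does not appear anywhere in the source and is an assumption, as is the exact interleaving, which is inferred from the ACTIVE times in the log.

```sql
-- Run in two separate client sessions against the t4 table from section 2.1.

-- Step 1, session T2: delete a row that does not exist.
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN;
DELETE FROM t4
 WHERE kdt_id = 15 AND admin_id = 1 AND biz = 'retail' AND role_id = '1';
-- No row matches; T2 now holds a gap lock on uniq_kid_aid_biz_rid.

-- Step 2, session T1: same pattern (this predicate is ASSUMED, not from the article).
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN;
DELETE FROM t4
 WHERE kdt_id = 18 AND admin_id = 2 AND biz = 'retail' AND role_id = '2';
-- No row matches; gap locks do not block each other, so T1 is granted one too.

-- Step 3, session T1: this INSERT blocks, waiting for T2's gap lock.
INSERT INTO t4 (`kdt_id`, `admin_id`, `biz`, `role_id`, `shop_id`, `operator`, `operator_id`, `create_time`, `update_time`)
VALUES ('18', '2', 'retail', '2', '0', '0', '0', CURRENT_TIMESTAMP, CURRENT_TIMESTAMP);

-- Step 4, session T2: this INSERT closes the wait cycle; InnoDB detects the
-- deadlock and, in the log shown in this article, rolls back T2.
INSERT INTO t4 (`kdt_id`, `admin_id`, `biz`, `role_id`, `shop_id`, `operator`, `operator_id`, `create_time`, `update_time`)
VALUES ('15', '1', 'retail', '2', '0', '0', '0', CURRENT_TIMESTAMP, CURRENT_TIMESTAMP);

-- Cleanup, session T1: discard the test insert.
ROLLBACK;
```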

2.3 Deadlock log

 
   
   
 
    ------------------------
    LATEST DETECTED DEADLOCK
    ------------------------
    2017-09-11 14:51:03 7f78eaf25700
    *** (1) TRANSACTION:
    TRANSACTION 462308535, ACTIVE 20 sec inserting
    mysql tables in use 1, locked 1
    LOCK WAIT 3 lock struct(s), heap size 360, 2 row lock(s), undo log entries 1
    MySQL thread id 3584515, OS thread handle 0x7f78ea5f5700, query id 780258123 localhost root update
    insert into t4(`kdt_id`, `admin_id`, `biz`, `role_id`, `shop_id`, `operator`, `operator_id`, `create_time`, `update_time`)
    VALUES('18', '2', 'retail', '2', '0', '0', '0', CURRENT_TIMESTAMP, CURRENT_TIMESTAMP)
    *** (1) WAITING FOR THIS LOCK TO BE GRANTED:
    RECORD LOCKS space id 225 page no 4 n bits 72 index `uniq_kid_aid_biz_rid` of table `test`.`t4` trx id 462308535 lock_mode X locks gap before rec insert intention waiting
    *** (2) TRANSACTION:
    TRANSACTION 462308534, ACTIVE 29 sec inserting, thread declared inside InnoDB 5000
    mysql tables in use 1, locked 1
    3 lock struct(s), heap size 360, 2 row lock(s), undo log entries 1
    MySQL thread id 3584572, OS thread handle 0x7f78eaf25700, query id 780258153 localhost root update
    INSERT INTO t4(`kdt_id`, `admin_id`, `biz`, `role_id`, `shop_id`, `operator`, `operator_id`, `create_time`, `update_time`)
    VALUES ('15', '1', 'retail', '2', '0', '0', '0', CURRENT_TIMESTAMP, CURRENT_TIMESTAMP)
    *** (2) HOLDS THE LOCK(S):
    RECORD LOCKS space id 225 page no 4 n bits 72 index `uniq_kid_aid_biz_rid` of table `test`.`t4` trx id 462308534 lock_mode X locks gap before rec
    *** (2) WAITING FOR THIS LOCK TO BE GRANTED:
    RECORD LOCKS space id 225 page no 4 n bits 72 index `uniq_kid_aid_biz_rid` of table `test`.`t4` trx id 462308534 lock_mode X locks gap before rec insert intention waiting
    *** WE ROLL BACK TRANSACTION (2)
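A log like the one above is part of InnoDB's status output. The statements below are a small sketch of how it can be retrieved on MySQL or Percona Server 5.6. Whether to leave `innodb_print_all_deadlocks` enabled on a production server is a judgment call that this article does not cover.

```sql
-- Print InnoDB status; the LATEST DETECTED DEADLOCK section keeps only
-- the most recent deadlock.
SHOW ENGINE INNODB STATUS;

-- Optionally record every deadlock in the server error log, so earlier
-- deadlocks are not lost when a new one occurs.
SET GLOBAL innodb_print_all_deadlocks = ON;
```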

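2.4 Deadlock log analysis

First, as stressed in "Deadlock Case Study 1" and in 《一个最不可思议的 MySQL 死锁分析》 ("A Most Incredible MySQL Deadlock Analysis"), deleting a record that does not exist takes a gap lock, which the deadlock log shows as lock_mode X locks gap before rec.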

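1. T2 runs delete from t4 where kdt_id = 15 and admin_id = 1 and biz = 'retail' and role_id = '1'; No record matches, so T2 is the first to hold a gap lock (lock_mode X locks gap before rec). Because kdt_id = 15 sorts between the existing values 10 and 20, the locked interval is the gap between (1,10,1,'retail',1,0) and (2,20,1,'retail',1,0), which keeps a matching record from being inserted.

2. T1's delete behaves like T2's: it also requests a gap lock (lock_mode X locks gap before rec) on the same gap between (1,10,1,'retail',1,0) and (2,20,1,'retail',1,0), and the lock is granted.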

It is also worth noting here that conflicting locks can be held on a gap by different transactions. For example, transaction A can hold a shared gap lock (gap S-lock) on a gap while transaction B holds an exclusive gap lock (gap X-lock) on the same gap. The reason conflicting gap locks are allowed is that if a record is purged from an index, the gap locks held on the record by different transactions must be merged.

3. T1's insert requests an insert intention lock, but this conflicts with the X gap lock (lock_mode X locks gap before rec) held by T2, so T1 waits for T2 to release its gap lock.

Gap locks in InnoDB are “purely inhibitive”, which means they only stop other transactions from inserting to the gap. They do not prevent different transactions from taking gap locks on the same gap. Thus, a gap X-lock has the same effect as a gap S-lock.

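4. T2's insert requests an insert intention lock, but this conflicts with the X gap lock (lock_mode X locks gap before rec) held by T1, so T2 waits for T1 to release its gap lock.

T1 (INSERT) waits for T2 (DELETE) and T2 (INSERT) waits for T1 (DELETE), so the waits form a cycle and a deadlock occurs. Interested readers can try the scenario where the deleted record does exist.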
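To watch the wait from step 3 build up before the deadlock fires, one option on a 5.6 server is to query the InnoDB tables in `information_schema` from a third session. This is a sketch and was not part of the original investigation. These tables exist in MySQL 5.6 and 5.7 but were removed in 8.0, where `performance_schema.data_locks` and `performance_schema.data_lock_waits` take their place.

```sql
-- Run from a third session while T1's INSERT is blocked and before T2 inserts.

-- Who is waiting for whom.
SELECT r.trx_id    AS waiting_trx_id,
       r.trx_query AS waiting_query,
       b.trx_id    AS blocking_trx_id,
       b.trx_query AS blocking_query
  FROM information_schema.INNODB_LOCK_WAITS w
  JOIN information_schema.INNODB_TRX r ON r.trx_id = w.requesting_trx_id
  JOIN information_schema.INNODB_TRX b ON b.trx_id = w.blocking_trx_id;

-- The locks involved in that wait (requested locks and the locks blocking them).
SELECT lock_trx_id, lock_mode, lock_type, lock_index, lock_data
  FROM information_schema.INNODB_LOCKS;
```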

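2.5 How to fix it

  1. Run a SELECT first to check whether the row exists, and delete only if it does. Two or more sessions can still run the same SELECT ... WHERE condition concurrently, so developers need to handle that case in the application.

  2. Use the INSERT INTO ... ON DUPLICATE KEY UPDATE syntax to insert the row when it does not exist, instead of deleting first and then inserting.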
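The two options above can be sketched as follows. The values come from T2's statements in this case. The columns refreshed in the UPDATE clause are an assumption chosen for illustration, because the article does not say which columns the business logic needs to overwrite.

```sql
-- Option 1: check first with a plain SELECT, which is a non-locking read
-- under RR and takes no gap lock.
SELECT id
  FROM t4
 WHERE kdt_id = 15 AND admin_id = 1 AND biz = 'retail' AND role_id = 1;
-- The application issues the DELETE only if the SELECT returned a row.

-- Option 2: upsert on the unique key instead of DELETE followed by INSERT.
INSERT INTO t4 (kdt_id, admin_id, biz, role_id, shop_id, operator, operator_id)
VALUES (15, 1, 'retail', 2, 0, '0', 0)
ON DUPLICATE KEY UPDATE
    shop_id     = VALUES(shop_id),       -- assumed: refresh the non-key columns
    operator    = VALUES(operator),
    operator_id = VALUES(operator_id),
    update_time = CURRENT_TIMESTAMP;
```

With option 2 no DELETE ever touches a missing key, so the two transactions never take the overlapping gap locks from the delete step.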

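3. Summary

The RR isolation level and gap locks are a common cause of deadlocks, but poorly designed business logic can trigger them as well. In this case the deadlock was ultimately resolved by changing the business logic.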


