测试“奇怪的select” 造成的行锁等待
个会话
SQL> select userenv('sid') from dual;
USERENV('SID')
--------------
70
delete from cat where id=101 此时事务不结束
第二个会话
SQL> select * from cat;
ID NAME
---------- ----------------------------------------
101 cat1
102 cat2
103 cat3
SQL> select userenv('sid') from dual;
USERENV('SID')
--------------
58
SQL> update cat set name='testtest' where id=101;
这个语句被阻塞了
验证阻塞关系
SQL> select session_id,sql_id,event,blocking_session,count(*) from v$active_session_history where blocking_session is not null and event like '%TX%' group by session_id,sql_id,event,blocking_session
2 ;
SESSION_ID SQL_ID EVENT BLOCKING_SESSION COUNT(*)
---------- -------------------------- -------------------------------------------------- ---------------- ----------
58 06ppp344arcg7 enq: TX - row lock contention 70 575
分析:70号会话阻塞了58号会话,58号会话正在执行SQL:06ppp344arcg7
SQL> l
1* select * from table(dbms_xplan.display_cursor('06ppp344arcg7'))
SQL> /
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
SQL_ID 06ppp344arcg7, child number 0
-------------------------------------
update cat set name='testtest' where id=101
Plan hash value: 165090673
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------
| 0 | UPDATE STATEMENT | | | | 1 (100)| |
| 1 | UPDATE | CAT | | | | |
|* 2 | INDEX UNIQUE SCAN| SYS_C0010408 | 1 | 25 | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("ID"=101)
19 rows selected.
被阻塞的会话正在执行update语句,而造成阻塞的会话显然是70好会话的delete语句,该语句事务没有结束,我们肯能从v$open_cursor抓出该语句,但是我们暂时不考虑这个问题,
感兴趣可以参考我之前的测试,下面我们在70号会话执行一个select * from all_objects;如果该SQL没有结束,我们从造成阻塞的70号会话会抓出该SQL,结合阻塞关系,我们会突然意识到
在70号会话正在执行的select语句,竟然阻塞了update语句,并且二者的表根本没有关系,这是个十分容易迷糊人的地方。
SQL> select session_id,sql_id,event,blocking_session,count(*) from v$active_session_history where blocking_session is not null and event like '%TX%' group by session_id,sql_id,event,blocking_session;
SESSION_ID SQL_ID EVENT BLOCKING_SESSION COUNT(*)
---------- -------------------------- -------------------------------------------------- ---------------- ----------
58 06ppp344arcg7 enq: TX - row lock contention 70 1068
SQL> select sql_id from v$session where sid=70;
SQL_ID
--------------------------
25u1mbkcr9rnu
SQL> select * from table(dbms_xplan.display_cursor('25u1mbkcr9rnu',null,'allstats last'));
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
SQL_ID 25u1mbkcr9rnu, child number 0
-------------------------------------
select * from all_objects
Plan hash value: 2378945577
上面我们从70号会话找出正在执行的SQL 是:select * from all_objects,而该语句无论如何也无法说服我们它会造成行锁等待,当然从原理上说,肯定是其他DML操作这里是Delete操作导致的。
下面我们从事务角度查询,因为该阻塞意味着造成阻塞的会话的事务没有结束。
SQL> l
1* select s.sid,s.status,program,(select substr(sql_text,1,80) from v$open_cursor where sql_id=s.sql_id and s.status='ACTIVE') sql_text,t.start_time,t.used_ublk from gv$session s,gv$transaction t where s.inst_id=t.inst_id and s.taddr=t.addr and s.sid=70
SQL> /
SID STATUS PROGRAM SQL_TEXT START_TIME USED_UBLK
---------- ---------------- ---------------------------------------- ------------------------------ -------------------- ----------
70 INACTIVE sqlplus@test1 (TNS V1-V3) 04/09/21 23:50:08 1
确实这里有start_time
下面视图查询会话70,事务没有结束的SQL,它会依然阻塞会话58的update操作。
SQL> select s.sid,s.program,s.event,TO_CHAR(S.LOGON_TIME,'YYYY-MM-DD HH24:MI:SS') as sess_LOGON_TIME,TO_CHAR(T.START_DATE,'YYYY-MM-DD HH24:MI:SS') as trans_start_date,s.blocking_session,s.status,(select q.sql_text from v$sql q where q.last_active_time=t.start_date and rownum<=1) as sql_text from v$session s,v$transaction t where s.saddr=t.ses_addr;
SID PROGRAM EVENT SESS_LOGON_TIME TRANS_START_DATE BLOCKING_SESSION
---------- ---------------------------------------- -------------------------------------------------- -------------------------------------- -------------------------------------- ----------------
STATUS SQL_TEXT
---------------- ------------------------------
70 sqlplus@test1 (TNS V1-V3) SQL*Net message from client 2021-04-09 23:46:12 2021-04-09 23:50:08
INACTIVE
不幸我们没有找到这个事务没有结束的SQL。
这里我们做个总结,核心问题是造成阻塞的会话执行完delete语句后,没有提交(具体为什么需要跟业务沟通了),后续select语句执行时间比较长,如果终用户在执行完select语句后,70号会话的事务提交。
会话70;
SQL> commit;
Commit complete.
会话58,阻塞消除
SQL> update cat set name='testtest' where id=101;
0 rows updated.
继续插锁等待
SQL> l
1* select session_id,sql_id,event,blocking_session,count(*) from v$active_session_history where blocking_session is not null and event like '%TX%' group by session_id,sql_id,event,blocking_session
SQL> /
SESSION_ID SQL_ID EVENT BLOCKING_SESSION COUNT(*)
---------- -------------------------- -------------------------------------------------- ---------------- ----------
58 06ppp344arcg7 enq: TX - row lock contention 70 1991
SQL> /
SESSION_ID SQL_ID EVENT BLOCKING_SESSION COUNT(*)
---------- -------------------------- -------------------------------------------------- ---------------- ----------
58 06ppp344arcg7 enq: TX - row lock contention 70 1991
SQL> /
SESSION_ID SQL_ID EVENT BLOCKING_SESSION COUNT(*)
---------- -------------------------- -------------------------------------------------- ---------------- ----------
58 06ppp344arcg7 enq: TX - row lock contention 70 1991
从等待时间看,已经不再增加,当然也可以通过v$session看
SQL> select sid,event from v$session where sid=58;
SID EVENT
---------- --------------------------------------------------
58 SQL*Net message from client
此时由于70号会话的事务提交,58号会话的update得以继续执行。
从这里看出出于各种原因,只要70号会话的事务执行时间增长,并且有类似并发的操作,那么update语句的行锁等待就会被放大。
解决方法:70号会话中的delete及时提交,或者从业务逻辑考虑是否去掉该SQL,此时迷惑我们的select语句造成行锁等待就水落石出了,当然找打问题解决方案也就应用而生。