Isolation level UR the beauty and the beast in one
Written by Klaas Brant Tuesday, 06 October 2009 01:00
DB2 for z/OS uses locking in order to guarantee that you will never get incorrect or invalid data. The locking mechanism will protect you both from logical and physical corrupt data. Part of the locking is the isolation level which can be set either at bind time or at each individual SQL statement. Other databases also support isolation levels; but there is not a standard for isolations. Different implementations use different names for the same thing. This can be confusing. Via the isolation level you instruct DB2 how deal with locking. Many people never really studied locking. They simply use a parameter which others, in the past, have made to their company’s default. The default of DB2 for z/OS is Repeatable Read (RR). The default of DB2 Linux/Unix/Windows is Cursor Stability (CS). The default off RR will in all cases guarantee the correct result but also creates the most overhead in the database. CS uses the least resources. This might have been the reason to make it the default on a smaller platform.
It is surprising how even experienced DB2 professionals often do not really understand the difference between isolations levels RR, RS, CS and UR. But even when people do not completely understand the isolation levels, they will tell you that isolation Uncommitted Read (UR) is dangerous one. Is this really true? There are a few cases where isolation UR can be a good thing and in most cases isolation UR is indeed dangerous choice. It can result in incorrect results returned by the database. DB2 z/OS version 9.1 has new options to skip locked rows during query. This is very different from isolation UR. This process is very questionable and should be left to special application that can deal with this in their logic. This article will focus on isolation UR and review when it can and cannot be used.
With isolation UR you read all data regardless whether the data is locked or not. This is why is has nicknames like “dirty read” or “read through locks”. When we are going through a large set of data and are doing calculations like average or sum the isolation UR can be useful. Your statement will perform quicker because there is no overhead of locking. Even obtaining S locks while reading takes (CPU) time. Also, maybe more important, you do not interfere with other applications running when omitting the locks.
Often we process large amounts of data in order to make calculations or use aggregation functions like average or sum. When the data is constantly update the result of these calculations would (slightly) vary each time you would repeat the query. If this is not a problem, then isolation UR can be a way to process data with less resource consumption and less interference with other applications. In some cases the variance in the result is not acceptable especially when people are querying consolidated data. Due to its nature this data will never be updated anymore. No matter what type of querying you do, when querying read-only data then isolation UR can save resources. A third case where data can, maybe should, be queried with isolation UR is the DB2 catalog. These tables are updated although not often. Typical updaters of the catalog are DDL, utilities but also SQL (think of identity columns and sequences). These update processes have an extreme high priority and you should not interfere with your query locks. So many DBA's use isolation UR when querying the catalog; even if this means that every now and then the result could be incorrect. So these three cases were examples were isolation can be a good thing.
But isolation UR can also be bad. When isolation level UR is used it can even return rows that don’t qualify according to your predicates! Since Version 5 (remember, type 2 indexes!) the indexes do not get locked anymore. The lock on the data will protect the application from incorrect data even if the access path is via an index. When DB2 has to update a row then it will lock the page of the row with an X lock. Then it will update the indexes using latch/unlatch logic. Finally when done with all the indexes it will update the page. So first update the indexes, then the page itself. The indexes are protected via the page lock. The physical update process of the index is done with the latches. SQL will always obey latches and wait for the object to be unlatched. It has to, because during the latch the object being updated is in an inconsistent state. Here is a small example of isolation UR leading to incorrect results: You have a column which contains a status flag: it can contain the values A or B. The table which has this column has a many indexes. One of the indexes has the status flag as the first column in the index. CICS transactions constantly switch the flag from A to B and back again. Your SQL predicate asks for rows with status = B and you use isolation UR. DB2 can use a matching index scan to obtain the rows. There is a small possibility that indexes are already updates (say status =B) but the row on the data page is not yet updated. Via the RID in the index you access the row on the data page. This row still shows status=A which is the wrong value in the column according to your predicate. The row is still selected because the predicate was applied in the index. Applications can go wild on these logic errors! This is a typical case where isolation UR should never ever be used!
Conclusion in some cases like read-only data, aggregation functions on large amounts of data or where update priority is more important than correct results (e.g. catalog query) then isolation UR can be good. In all other cases don’t even think about it!