When starting to work on this assignment, the only information i had was that up from a point (after some million transactions) our database was getting slow. I needed to know why a database like Oracle was behaving so badly after less than 100 million transactions in the database. Clearly, it was a problem of contention, a problem of waiting. I needed to know why that was, so the first thing i did was to ask our DBA for his Oracle Database Architecture book, and dig in.
After getting a better view on the logical (and physical) structure of the Oracle DB, i had another discussion with the DBA. It was still unclear to me how, despite the fact that our queries were using the indexes, we not only had problems in reading, but in writing also. Together we found out that the size of the (all) indexes on some of our columns were at some points bigger (if not double) the size of the data. Oracle was starting to spend a lot of time internally on managing the indexes.
We started to do an analysis on the biggest tables, and their indexes, in order to find out:
- who were the biggest consumers (storage)
- which were the indexes that were summing up the storage
- which indexes did we really need, and which could we drop
Of course, one would say that at least the last point here should have been done before creating the indexes. Unfortunately we fell into the trap of creating the indexes based on gut or on daily operations (or performance tests), not really following the business scenarios.
Like i already mentioned, we also had some very long response times when doing a read operation on the database (a normal business scenario, where a user would retrieve about 1000 records from the database) for specific queries involving the same tables. I run a test on one of those queries, using Oracle’s trace function to see where did Oracle spend that much time. I was amazed by the results:
- Running the query on a table containing 2.5 million records – 30 seconds
- Running the query on a table containing 30 million records – timeout!!!
Now, how was that possible? Let us dive into the real problem
Displaying storage and structure statistics on Oracle tables and indexes
Without having access to the Oracle Statistics ( we use TOAD for that )The first thing that i did, was to design myself a query that would satisfy the following requirements:
- show the number of rows / table
- show the number of blocks / table
- show the total size in bytes /table
- show the indexes and their size / table
- show the indexes clustering factor
- show the level of each index
- show the number of leaf blocks for each index
I needed to display only the big consumers, so i limited that to all tables having more than 1 million records. After playing a little with oracle’s tables, i came up with the following snippet:
select a.table_name, b.blocks, b.num_rows, (b.blocks*4)/1024 as table_storage_mb, a.index_name, sum(ue.BYTES)/(1024*1024) as index_storage_mb, a.clustering_factor, a.BLEVEL, a.LEAF_BLOCKS from user_indexes a inner join user_tables b on a.table_name=b.table_name inner join user_extents ue on a.index_name=ue.segment_name where b.num_rows is not null and b.num_rows > 1000000 GROUP BY a.table_name, b.num_rows, (b.blocks*4)/1024, a.index_name, a.clustering_factor, a.BLEVEL, a.LEAF_BLOCKS ORDER BY table_storage_mb desc, a.table_name, index_storage_mb desc, a.index_name
This returned the following results (i had to shorten the name of the tables, indexes, and the table headers)
The table headers have the following meaning:
TBL – Table; TROWS – Rows in table; TBLKS – Table Blocks; T-BM – Table storage in MBytes; INAME – Index name; IMB – Index storage in MBytes; ICL_FACT – Index clustering factor; LF_BLKS – Index Leaf Blocks
Oracle Index Clustering Factor – Performance issues
The Oracle Reference Manual definition on the index clustering factor looks like this:
It indicates the amount of order of the rows in the table based on the values of the index:
- If the value is near the number of blocks, then the table is very well ordered.
In this case, the index entries in a single leaf block tend to point to rows that are stored in the same data blocks
- If the value is near the number of rows, then the table is very randomly ordered.
In this case, it is unlikely that the index entries in the same leaf block will point to rows in the same data blocks
I like referring to Tom Kyte’s simplified definition of this factor. In his book ( Expert Oracle Database Architecture) Tom says that:
we could also view the clustering factor as a number that represents the number of logical i/o’s against the table, that would be performed to read the entire table via the index.
Let’s have an example, out of the resultset of the query above:
First, i will do a select count(id) on the table, which will use the primary key (the indexes with the ‘S’ prefix are system generated indexes on primary keys)
select count(id) from TBL2 – bad clustering factor ( near to the number of rows)
This will do an Index Full Scan, using the S_12887 index, with the following results:
index full scan – cost 14998, card 1638820, cr_buf_gets: 14923 – 83 sec
Second, i will do a select on the id, using the GSTAT index, which has a good clustering factor (near to the number of blocks)
select /*+ INDEX (TBL2GSTAT ) */ count(id) from TBL2
This will also do an Index Full Scan, this time using the index that i pointed out (GSTAT), with the following results:
index full scan – cost 9283, card 1638820, cr_buf_gets – 10647 – 13 sec
As you can see, running the same query on an index with a good clustering factor was 5 times better!
I need to present a graphical representation of how oracle will distribute the data into blocks, when dealing first with a well organised index, and then with a badly organised index (high clustering factor) I will not try to reinvent the wheel ( ) , so i will just use the graphical representation that Mohammed Mehraj Hussain did in his post
As you can see, index values in the same leaf block tend to point to rows that are distributed in the same blocks.
In this case, index values in the same leaf block tend to point to rows that are spread over multiple blocks.
Exactly, so what? Oracle is a super power database, so it really shouldn’t matter where the data is located. A block is a block, and Oracle will read it anyway right?
And this is the moment where i have to point out the reason why this is so important.
When going for the data, ORACLE does an index range scan, and gets a list of blocks. If he notices that the next row in the index is located on the same database block as the row it has just read, Oracle will not performa another I/O operation to retrieve the table block from the buffer cache, because it already has a handle to the block, so it will just use it. If the row is not on the same block, then it will release the block, and perform another I/O into the buffer cache to retrieve the next block to be processed (see Tom Kyte’s book – Expert Oracle Database Architecture 9i and 10g – Apress)
So let’s suppose our database is configured to support storing 100 rows of data into one block of data. If we had 1000 rows of data, and 100 index values in a leaf node, then, in a well organized index structure (good clustering factor), in order to read all data from the database, we will need exactly 10 I/O’s ( additional to the 3 I/O’s to get to the leaf node) Otherwise, in the most pessimistic scenario, when dealing with a badly organized index, we could use as much as 1000 I/O operations against the table.
Now, see the difference? And this was just an example for a small set of data. Imagine querying tables containing millions of rows, like the ones above…
Using sequences for generating index values and effects on performance
When using sequences for generating your index values, you have two options:
- you either use one sequence for all tables: in this case you will definitely have a problem with the index clustering values, because the scattering factor in the values of the index of one table will be very high
- you use one sequence / table: in this case you are (more or less) sure that the index values are near to each other, in an ascending order
Let’s discuss the two options for a second.
Using one sequence for all tables: let us suppose a transaction writes data into ten tables, and that each of the ten tables has an index whose values are generated by the same sequence. After 10 transactions, the index for the first table would look like:
This will scatter the way Oracle distributes the data, so that, in the worse case scenario, for 10 transactions writing in the first table, we will need 10 blocks of data (instead of two, like in a well ordered index) Doing a query to retrieve the data based on this index, will generate therefore an I/O for each row to be retrieved…costly, isn’t it?
Using a separate sequence for each of the tables: in this case the values in the index will be ascending, near to each other, so that the index above will look totally different:
and the data will be distributed evenly, in two blocks of data, reaching the scope, a well ordered index, with a good clustering factor, drastically reducing the number of I/O’s on the table
Sounds like a great solution. True, as long as you are running in a single application server environment. But what happens when you have multiple application servers, each of them retrieving a set of numbers from the sequence? Let me detail that for a minute
Using sequences for generating index values in a clustered application server environment
So, supposing that using a sequence / table is the right implementation, let us imagine we want to have a distributed system, running on four application servers, and one database. What will happen is that each application server will call the sequence, which will return the configured number of sequences. Let’s say that we have configured the sequence to generate 1000 values / call. A typical workflow would look like this:
- First application server gets the values 1 – 1000
- Second application server gets the values 10001 – 2000
- Third application server: 2001 – 3000
- Fourth application server: 3001 – 4000
Four users, landing each on a different application server (let’s suppose they land on the servers above ascending), running the same transaction, will generate the following values in the index:
(1, 1001, 2001, 3001, 2, 1002, 2002, 3002, 4002, ….)
So we are back to the initial problem…, a scattered index, therefore a bad clustering factor. Solutions? Don’t know yet, but still looking for them. The problem is quite complex, and i am no DBA …
Will get back here once i know what to do, and once the solution that me and my team will decide for will prove itself.
Until then, this was probably the longest post (in time spent, and content) after the WELD memory leak ( which was also a lot of fun) I have read and learned a lot while working on this subject, and i hope i can be of help to people out there having the same, or similar problems.
For the final words, i will quote Tom, like i usually do in the last days :
you can treat a database like a black box, and just stick data into it, or you can understand how it works and exploit it fully (…) If you choose to understand exactly how Oracle database platform should be used, then you will find that there are few information management problems that you cannot solve quickly and elegant