Writing Optimal SQL
By Jonathan Lewis
Embarcadero Technologies
This white paper is a companion piece to Embarcadero’s live webcast event on June 10th, 2010.
The slides for that event will be delivered to all LIVE attendees after the event. Much of the
content contained in this white paper was first presented at the UKOUG Conference 2009 and is
based on material from the “Writing Optimal SQL” full-day course that Jonathan Lewis has been
presenting for the last five years.
INTRODUCTION
Before you can write efficient SQL you need to know what your data looks like and
whether the storage mechanisms that you (or the database architect) have chosen
for the data will allow you to find an efficient access path for the queries you want to
write. It’s also important to know whether the access paths that you think make sense
are paths that the optimizer is actually able to choose.
Critically you have to know how much data you are going to collect from the
database and whereabouts in the database you can find it. This means you need to
be able to analyse the data by volume and by distribution – which is what the
optimizer is trying to do, of course, when you use the dbms_stats package to collect
statistics about the data.
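For reference, a typical call to gather those statistics might look like the following sketch
(the table name and option values here are purely illustrative):

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          => 'ORDERS',                   -- hypothetical table
                estimate_percent => dbms_stats.auto_sample_size,
                method_opt       => 'for all columns size 1',   -- basic column stats, no histograms
                cascade          => true                        -- include the table's indexes
        );
end;
/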
Unfortunately the optimizer isn’t always able to get a good picture of the data. It
may work out (nearly correctly) that the “average” customer in your sales system
places about 40 orders per year; on the other hand you may be aware that you
don’t have a “genuinely average” customer because 20 of your customers are
responsible for 80% of your sales, 1,000 of your customers are responsible for the next
15% of your sales, and the remaining 80% of your customers are responsible for the
last 5% of your sales.
Many of the bad plans generated by Oracle’s query optimizer are the result of a
difference of opinion between your knowledge of the data and Oracle’s opinion of
what the data looks like. (We won’t pursue this point in this presentation, but
sometimes the best way to get a good execution plan is to tell the optimizer what
your data really looks like by writing little programs to create some “artificial but
truthful” statistics, rather than using the dbms_stats package to gather statistics.)
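As a simple illustration of that idea (the table, column and numbers are invented for this
sketch), the “set” procedures in dbms_stats let you write statistics directly rather than
gathering them:

begin
        dbms_stats.set_table_stats(
                ownname => user,
                tabname => 'SALES',             -- hypothetical table
                numrows => 10000000,
                numblks => 250000,
                avgrlen => 180
        );

        dbms_stats.set_column_stats(
                ownname => user,
                tabname => 'SALES',
                colname => 'CUSTOMER_ID',       -- hypothetical column
                distcnt => 50000,
                density => 1/50000,
                nullcnt => 0
        );
end;
/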
GOOD INDEXING
If you know the data and understand what a given query is supposed to achieve you
need to be sure that the optimizer can find an efficient path to your data. Typically
this means ensuring you have a suitable set of indexes but designing a good set of
indexes for each table is not a trivial problem; it requires a lot of knowledge of how
the data will be used. (Indexing is not the only way to make sure that you can get at
your data efficiently, of course, but after a system has gone into production it may be
the only viable option for fixing performance problems.)
In the example in slide 9 we see a very simple query to list the definitions of all the
indexes in the database. You could expand this query to include far more
information, of course, such as the sizes (rows and blocks) of the table and its indexes,
the number of distinct values and nulls in each column, and the expressions used in
any function-based indexes.
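As a sketch of one possible expansion – joining dba_indexes and dba_tab_columns to the
basic dba_ind_columns query for the extra statistics – you might write something like:

select
        ic.table_owner,
        ic.table_name,
        ic.index_name,
        ic.column_name,
        i.leaf_blocks,                  -- index size
        i.distinct_keys,
        i.clustering_factor,
        tc.num_distinct,                -- column-level statistics
        tc.num_nulls
from
        dba_ind_columns ic,
        dba_indexes     i,
        dba_tab_columns tc
where
        i.owner         = ic.index_owner
and     i.index_name    = ic.index_name
and     tc.owner        = ic.table_owner
and     tc.table_name   = ic.table_name
and     tc.column_name  = ic.column_name
order by
        ic.table_owner, ic.table_name, ic.index_name, ic.column_position
;

(For function-based indexes the column_name reported is a system-generated virtual column,
so you would also want to bring in dba_ind_expressions to see the underlying expression.)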
One of the commonest reasons for indexing a column is that it is a foreign key: if a
session updates or deletes a parent key while the child table’s foreign key columns are
unindexed, Oracle has to take a (share) lock on the whole child table. If you have a
foreign key that needs protection from this locking problem then the index need only
start with the relevant columns; and if you create an index to protect a foreign key
there’s a fair chance that the leading column(s) of the index are very repetitive, so you
need to consider using the “compress” option on the index to reduce its size. (Index
compression may increase the CPU usage slightly, but the benefit of reducing the size of
the index – hence the amount of I/O relating to the index – may be well worth the
trade-off.)
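If you want an estimate of the likely benefit before rebuilding an index as compressed, one
approach (a sketch only, using one of the index names from slide 9; note that validate
structure without the online option blocks concurrent DML against the table) is:

analyze index ap_grp_role_i validate structure;

select
        opt_cmpr_count,         -- number of leading columns worth compressing
        opt_cmpr_pctsave        -- estimated percentage space saving
from
        index_stats
;

alter index ap_grp_role_i rebuild compress 2;   -- or compress 1, depending on the result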
The site that supplied the example on slide 9 had a convention that foreign key index
names include the letters FK (with PK for primary key and UK for unique key); so we
see the “foreign key index” on (grp_id) is technically redundant because there is an
index on (grp_id, role_id) that starts with the same column; similarly the indexes on
(org_id) and (per_id) are technically redundant because there are indexes on
(org_id, ap_id) and (per_id, ap_id) respectively. We do have to be just a little
cautious about dropping the indexes though – looking at the on-site data I could see
that two of them were hugely repetitious and so could probably be dropped very
safely, but there was a possibility that the third (per_id) might be used in some cases
where the optimizer would decide not to use the (per_id, ap_id) index because the
larger leaf-block count or clustering_factor made it seem too expensive. There are
cases where you might need to check the relative sizes of such statistics before
dropping an index and then write a little script that adjusts the clustering_factor (in
particular) of the larger index each time you collect stats on it.
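A sketch of that sort of post-gathering adjustment (the index name and scaling factor are
purely illustrative), using the get/set procedures in dbms_stats:

declare
        m_numrows       number;
        m_numlblks      number;
        m_numdist       number;
        m_avglblk       number;
        m_avgdblk       number;
        m_clstfct       number;
        m_indlevel      number;
begin
        dbms_stats.get_index_stats(
                ownname  => user,
                indname  => 'AP_PER_AP_I',
                numrows  => m_numrows,
                numlblks => m_numlblks,
                numdist  => m_numdist,
                avglblk  => m_avglblk,
                avgdblk  => m_avgdblk,
                clstfct  => m_clstfct,
                indlevel => m_indlevel
        );

        dbms_stats.set_index_stats(
                ownname  => user,
                indname  => 'AP_PER_AP_I',
                numrows  => m_numrows,
                numlblks => m_numlblks,
                numdist  => m_numdist,
                avglblk  => m_avglblk,
                avgdblk  => m_avgdblk,
                clstfct  => round(m_clstfct / 4),       -- assumed correction factor
                indlevel => m_indlevel
        );
end;
/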
We can’t say that the index on (role_id) is redundant, because it isn’t matched by
another index and we don’t know if there are any locking problems to worry about;
however, we might know (or we can discover) that the role_id is very repetitive, so
we can recreate the index as compressed. Revisiting the other indexes that made
the “foreign key indexes” redundant we can also decide that they can be
compressed – two of them on just the first column (the ap_id is unique), and the third
on both columns.
The last index to look at is a function-based index on the “last update date”, truncated
to the day. There are several thoughts you might have about this index. First, any one
date might identify a huge amount of data scattered widely across the table – is the
index even used, and is it useful? Second, since any date identifies a lot of rows it’s the
type of index that would benefit from compression. Third, as you update data, rows
are going to be deleted from the “low” end of the index and inserted at the “high”
end – this index is probably going to be fairly empty for a huge fraction of its range
and hold tightly packed data only in the last few days, so it’s probably going to be worth
using the coalesce command on this index every few days. (For more thoughts on
this particular class of index, and the problems associated with it, see the items on
“Index Explosion” on my blog.)
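A sketch of the corresponding maintenance commands, using the index name from slide 9:

alter index ap_ud_i rebuild compress;   -- one-off: compress the very repetitive trunc(upd_date) keys
alter index ap_ud_i coalesce;           -- repeated every few days: repack the sparsely used leaf blocks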
Let’s face it: the roles of both the database developer and the database administrator
within the application lifecycle are expanding. Database tools that enable collaboration and
communication with management, QA, development and partners can help
everyone succeed within this connected environment. For example, are you able to
pinpoint and communicate problems in a way that can quickly get management,
development and QA on the same page? How quickly can you generate a report
you’ll need to prove an SLA? Can you place projects under a central version control
system with just a few mouse clicks? How quickly can you compare and
communicate ongoing schema changes between development and production?
How about reverse engineering an application that a line of business drops on you
unexpectedly? When evaluating database tools, consider how each tool will help
you collaborate and communicate with internal and external stakeholders.
Slide 10 is an example of a query to find out how many rows there are for each value
in a given column – with examples of the various sample clauses you could use on a
large table. This is immediately helpful for a column with just a few values (up to 254, if
you’re thinking of histograms), but not much use in my little demonstration case
where there are 10,000 distinct values.
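A sketch of the kind of query slide 10 shows, using the table and column from my
demonstration case (the sample clause is optional, and can be row-based or block-based):

select
        colX, count(*)
from
        t1 sample block (1)     -- or sample (1), or no sample clause at all
group by
        colX
order by
        count(*) desc
;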
But we can refine the query – it tells us, for example, that the values 2 and 3 appear
12 times so we might ask how many other values appear 12 times, and that’s what
the query on slide 11 tells us. With the aggregated results we can see that the worst
case example is a value with 22 rows, and perhaps we ought to optimize our query on
the basis of those 22 rows (or at least write our query to protect against the potential
threat of picking that specific value).
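For reference, the double-aggregation query from slide 11 (reproduced with its output in the
slide notes at the end of this paper) takes this form:

select
        ct, count(*)
from    (
        select
                colX, count(*) ct
        from
                t1
        group by
                colX
        )
group by
        ct
order by
        ct
;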
But the volume of data is not the only thing you need to know – we can refine our
query again to determine something about how scattered the data is. Instead of
counting the number of rows per distinct value and then aggregating, we count the
number of blocks per value and aggregate. In this case we see that the worst “scattering”
means that there are two values where I have to visit 19 different blocks to collect the
data. (The index() hint is a valid way of optimising the query shown because colX is
part of the primary key t1_pk.)
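A sketch of that refinement, using dbms_rowid to extract the block number from each rowid
(the exact query is my assumption rather than the one on the slide, but the hint and names
match the earlier example):

select
        blocks_per_value, count(*)
from    (
        select  /*+ index(t1 t1_pk) */
                colX,
                count(distinct dbms_rowid.rowid_block_number(rowid)) blocks_per_value
        from
                t1
        group by
                colX
        )
group by
        blocks_per_value
order by
        blocks_per_value
;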
We can do similar analysis with indexes. Slide 13 shows the query used by the
dbms_stats package to collect index statistics for a simple B-tree index, and I’ve
extracted the sys_op_lbid() function (lbid = leaf block id) to write a query that uses
the same “double aggregation” method I applied to the column data – first we
count the number of index entries in each leaf block, then aggregate blocks with the
same number of entries to get a picture of the internal storage efficiency of the index
– which gives us an idea of how many leaf blocks we will have to read if we use that
index. (It also gives us some clues about the “health” of the index, especially if we
know how big the index keys are and, therefore, how many rows there should be per
block if we allow for the typical 70% utilisation with binomial distribution that you get
from “random arrival” B-tree indexes. The “smashed” index in slide 15, for example,
should give us a clue that something about our processing is leaving a lot of empty
space in the index, and prompt us to think about why that might be happening and
whether we need to do anything about it.)
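Pulling the sys_op_lbid() call out of the slide 13 query (it is an internal, undocumented
function; 49721 is the object_id of the index in my test case, and the hint and predicates
force the query to walk that index), the double aggregation looks something like this sketch:

select
        keys_per_leaf, count(*) blocks
from    (
        select  /*+ index(t,"T1_I1") */
                sys_op_lbid(49721,'L',t.rowid)  leaf_block,
                count(*)                        keys_per_leaf
        from
                "TEST_USER"."T1" t
        where
                "DATE_ORD" is not null
        or      "SEQ_ORD" is not null
        group by
                sys_op_lbid(49721,'L',t.rowid)
        )
group by
        keys_per_leaf
order by
        keys_per_leaf
;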
My demonstration query joins five tables with a three-table subquery. The requirement
of the query is a realistic and potentially awkward one to meet. To turn this into a
picture I have simply stepped through each table in the from clause, keeping an eye
on the where clause at the same time, and drawing a box to represent each table,
with a line representing a join between tables. For the subquery I’ve drawn a box with
a dashed outline and created a picture of the subquery inside it, joining tables in the
outer query to the tables in the subquery where there are correlation predicates. I’ve
used “crow’s feet” to represent the cardinality (one to one, one to many) of the
relationships between tables, and I’ve drawn and labelled an incoming arrow for
each constant predicate on a table.
When complete, this diagram will also include details about any relevant indexes
that exist (slide 18), the statistics about those indexes, and the statistics about the
probable data volume and scatter (slide 19), and any operational information you
might have (e.g. likely caching benefits) – but that’s too much information for one
slide. In many cases it’s also more information than you will actually need to write out,
especially if you are already very familiar with the data.
Once you’ve got all this information in a tidy graphical presentation you can pick a
table (more or less at random, if you want to) and ask the obvious questions: how much
data will I collect from this table, how much work will it take to collect it, and which
table would it make sense to visit next?
The ideal is to find a starting point which returns a small amount of data very cheaply
and then visit the rest of the tables in an order that makes each visit cheap and
keeps the result set small. To demonstrate the approach, let’s consider the picture of
the query and start on the suppliers table from the main query (ignoring the path
presented on slide 21).
We start at suppliers because we know that we will collect just a few suppliers from
Leeds and we can do it efficiently because we have a suitable index on the table.
The next obvious place to go from suppliers is to the products they supply – there is
an index to do this, and possibly the Leeds suppliers don’t supply many different
products in total, so the intermediate result set is still small.
We now have a choice: we could go to the order_lines table either through the
foreign key index or by hash join and tablescan – but either choice would have to acquire
a lot of rows that are randomly and widely scattered through a very large table, so
we decide to move into the subquery instead as this will (eventually) reduce the
number of products before we go to the order lines. (Strategy note – given the
choice, a join that reduces the volume of the data set will usually be better than a
join that increases the volume of the data set, unless the join happens to be very
expensive to operate at this step of the query).
So we go to the product_match table (which may increase or decrease the result set
– maybe there are only a few products with alternative sources, maybe every
product has a couple of alternatives), then to the subquery’s appearance of the products
table, which grows the length of the rows in the intermediate result but not their
number, and then to the subquery’s suppliers table, which (probably) allows us to keep
only the rows where the matching supplier is not from Leeds – discarding, perhaps, most
of the data we have collected so far.
Then we’re back to the point where we have to visit the order_lines table. It’s possible, of
course, that our knowledge of the data tells us that there are virtually no products
which match the Leeds/Not Leeds requirements, and that hardly anyone ever buys
them; so it’s possible that the random walk through the order_lines table would be
cheap enough to keep us happy at this point – but in a generic sales system we
probably wouldn’t want to do this join.
Having reviewed (a limited set of) the effects of starting with the main suppliers table
let’s consider the path shown in slide 21. We want all the orders in the last week from
customers in London. In our system recent orders are the ones that are most likely to
stay cached, and we have an index into orders by date so, even though we will pick
up far more data than we really want, starting with orders will probably be quite
efficient. We then need to discard redundant data as soon as possible, so we next
join to customers to eliminate all but the London customers. From there we have to
go to order_lines, but because we are coming from “recent orders” the order lines
we want are also likely to be well cached and well clustered so, even though this
may be quite a lot of data, we should still be able to acquire it efficiently. The
argument about the ordering of the rest of the tables is left as an exercise to the
reader.
It’s important to note two variations here. First, our knowledge of the caching and
clustering of the orders and order_lines tables may be better than Oracle’s, and it is
this difference of opinion that may make Oracle choose a brute-force path rather
than trying for the precision path we have identified; so we may have to hint the
code, although we may find that it is sufficient (and generally more sensible) to
correct the clustering_factor on the orders and order_lines indexes. Secondly, if we
had an existing index (id_customer, date_ordered) on the orders table we might
have chosen to start at the customers table and follow that index into the orders
table because of the increased precision it would have given us; so this is a case
where we might raise a flag to review the indexing strategy.
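If that review concluded that the extra index was worthwhile, the change itself is a
one-liner (the index name is invented, and the compression is only a suggestion that
depends on how repetitive id_customer is):

create index ord_cus_dat_i on orders(id_customer, date_ordered) compress 1;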
CASE STUDY
The example starting at slide 22 is a query (with camouflage) that came from an
OLTP system that needed some performance improvements with minimum structural
or code changes. It’s an odd looking query, until you realise that its purpose is to
provide a drop-down menu of contracts that a person is allowed to see as they tab
across a screen into a particular field.
Checking the AWR report for the system, I found that the optimizer had generated 11
different execution plans for this query over the previous seven days – none of them
particularly efficient. (The main problem in almost all cases was the tablescan of the
transactions table.)
After creating the sketch of the query and filling in relevant information a couple of
critical points stand out. There were 240 offices of different sizes – which is reflected in
the variation in number of contracts per office – so Oracle’s automatic stats
collection job had generated a frequency histogram on the id_office column of the
contracts table; thanks to bind variable peeking this means the plan could vary
dramatically from day to day depending on who happened to be the first person to
run the query in the morning.
The second critical piece of information (which was not available to the optimizer, of
course) was that the standard query was for “what happened since yesterday
morning” – which means the data that people want from the transactions table is
only the most recent slice, which will end up being very well cached. So even if we
start by visiting a lot of redundant transaction data, we won’t be doing a lot of disk
I/O to find it.
Another point to consider (and again this is not information that is available to the
optimizer) is that the number of contracts that an office has handled will always
increase with time – but the number of transactions per day is (virtually) unchanging
because older contracts stop generating transactions.
So our first solution is: get rid of the histogram that’s destabilising the system, and take
the path transactions -> contracts -> transaction_types. This starts with far more data
than most users need and does an unreasonable amount of logical I/O, but it runs
sufficiently swiftly and consistently to keep everyone happy.
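A sketch of what that first solution might look like in practice (object names follow the
camouflaged example; the hint shown is indicative rather than the client’s actual code):

begin
        -- regather the column statistics without a histogram
        dbms_stats.gather_table_stats(
                ownname    => user,
                tabname    => 'CONTRACTS',
                method_opt => 'for columns id_office size 1'
        );
        -- on 11g you could also set a table preference so that the overnight
        -- stats job does not put the histogram back, e.g.
        -- dbms_stats.set_table_prefs(user, 'CONTRACTS', 'METHOD_OPT',
        --         'for all columns size 1');
end;
/

-- and in the query itself, a hint along the lines of
--      /*+ leading(transactions contracts transaction_types) */
-- to fix the join order transactions -> contracts -> transaction_types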
There was a lot of resistance to changing indexes, but various changes in indexing
could make this path more efficient, and could even introduce a better path.
Option 1 was to add the office_id column to the existing (primary key) index on the
contracts table (id). This would give us an index-only path to collect the contracts
information. This change would be relatively low risk (compared to most index changes)
because the (id) index was unique and single column – but it still needed careful
testing for unexpected side effects in other queries.
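A sketch of how Option 1 might be implemented (constraint and index names are invented, and
any foreign keys referencing the primary key would have to be handled first – part of the
reason such a change still needs careful testing):

create index con_pk_new on contracts(id, office_id);

alter table contracts drop primary key drop index;

alter table contracts
        add constraint con_pk primary key (id)
        using index con_pk_new
;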
Option 2 was to extend the (created) index on transactions to add into the index all
the columns from the table that we needed so that we didn’t have to visit the table
at all – a change that would make a dramatic difference to buffer visits – but which
might have more impact on other queries.
At the end of the day, the client chose the safest course – no index changes, just hint
the code and drop the histogram. It wasn’t the fastest option, but it was fast enough
and a stable solution because it depended on the (totally consistent) number of
transactions per day, rather than a variable, and growing, quantity like the number of
transactions per office.
CONCLUSION
When the optimizer fails to find a good execution plan for a query it’s usually
because its model of your data doesn’t match reality. To solve such problems you
need to do the job the optimizer is trying to do, and you need a mechanism that lets
you do it easily. A graphical approach makes it easy to grasp the “shape” of a query
and also allows you to present a large amount of information about the query in a
very readable form.
If you can present the query, statistics and indexes in a readily comprehensible
fashion it becomes much easier to recognise the best available execution path and
decide if there are any structural changes (such as new or modified indexes) that
should be introduced to allow a better path to become available.
Jonathan Lewis
www.jlcomp.demon.co.uk
jonathanlewis.wordpress.com
APPENDIX: SLIDES

Who am I?
Independent Consultant
25+ years in IT
22+ using Oracle
jonathanlewis.wordpress.com
www.jlcomp.demon.co.uk
Highlights
Where is it?
Knowing the data – conflict
Strategy Choice
Common Outcomes

                          Oracle thinks the task is:
                          Small           Big
You think the    Small    Good Plan       Bad Plan
task is:         Big      Bad Plan        Good Plan

Optimizer problems
Correlated columns
Non-equality joins
Bind variables
Know the metadata (slide 9)

select
        table_owner,
        table_name,
        index_name,
        column_name
from
        dba_ind_columns
order by
        table_owner,
        table_name,
        index_name,
        column_position
;

AP_GRP_FK_I     GRP_ID
AP_GRP_ROLE_I   GRP_ID, ROLE_ID         (compress)
AP_ORG_AP_I     ORG_ID, AP_ID           (compress 1)
AP_ORG_FK_I     ORG_ID
AP_PER_AP_I     PER_ID, AP_ID           (compress 1)
AP_PER_FK_I     PER_ID
AP_PK           AP_ID
AP_ROLE_FK_I    ROLE_ID                 (compress)
AP_UD_I         TRUNC(UPD_DATE)         (drop, compress, coalesce)
Know the data (b)

select
        ct, count(*)
from    (
        select
                colX,
                count(*) ct
        from
                t1
        group by
                colX
        )
group by
        ct
order by
        ct
;

        CT   COUNT(*)
         1       9001
       ...
         6         37
         7         78
         8         94
         9        117
        10        112
        11        126
        12         99          <- there are 99 values that appear 12 times
        13         97
        14         86
        15         49
        16         32
       ...
        22          1
Know the data (d)

select  /*+ index(t,"T1_I1") */
        count(*) nrw,
        count(distinct sys_op_lbid(49721,'L',t.rowid)) nlb,
        count(distinct hextoraw(
                sys_op_descend("DATE_ORD") ||
                sys_op_descend("SEQ_ORD")
        )) ndk,
        sys_op_countchg(substrb(t.rowid,1,15),1) clf
from
        "TEST_USER"."T1" t
where
        "DATE_ORD" is not null
or      "SEQ_ORD" is not null
;

www.apress.com/resource/bookfile/2410
Know the data (f) (slide 15)

Smashed Index                           Normal Index
KEYS_PER_LEAF   BLOCKS                  KEYS_PER_LEAF   BLOCKS
            3      114                             17      206
            4       39                             18      373
            6       38                             19      516
            7       39                             20      678
           13       37                             21      830
           14        1                             22      979
           21        1                             23    1,094
           27       15                             24    1,178
           28        3                             25    1,201
           39        1                             26    1,274
           54        6                             27    1,252
           55        3                             28    1,120
          244        1                             29    1,077
          281        1                             30      980
          326        8                             31      934
                                                   32      893
                                                   33      809
                                                   34      751
                                                   35      640
                                                   36      738
                                                   37      625
                                                   38      570
                                                   39      539
                                                   40      489
Draw the query – outline

Orders in the last week where:
        the customer is in London
        the supplier is from Leeds
        there is a supplier elsewhere

[Diagram: boxes for Orders, Order_lines, Products, Suppliers, Customers and Product_match,
with the “exists” subquery (Product_match, Products, Suppliers) drawn as a dashed box; the
joins are labelled with their PK/FK columns and the date predicate is an arrow into Orders.]
Draw the query – statistics

[Diagram: the same sketch annotated with volume and scatter information – Order_lines:
huge, good clustering, 1:10; Orders: big, good clustering, Date 1:2,500; Customers: small.]
Sketch in paths – analysis

[Diagram: the query sketch with the tables numbered 1 to 8 in the order they would be
visited – starting at Orders, then Customers, then Order_lines, as described in the text.]
Case Study (b)

| Id | Operation                   | Name         | Rows | Bytes | Cost  |
|  0 | SELECT STATEMENT            |              |      |       | 14976 |
|  1 |  SORT UNIQUE                |              | 7791 |  304K | 14976 |
|  2 |   FILTER                    |              |      |       |       |
|  3 |    HASH JOIN                |              | 7798 |  304K | 14974 |
|  4 |     VIEW                    |              | 9819 | 98190 |  1599 |
|  5 |      HASH JOIN              |              |      |       |       |
|  6 |       INDEX RANGE SCAN      | CON_OFF_FK   | 9819 | 98190 |    35 |
|  7 |       INDEX FAST FULL SCAN  | CON_PK       | 9819 | 98190 |  1558 |
|  8 |     HASH JOIN               |              | 7798 |  228K | 13374 |
|  9 |      TABLE ACCESS FULL      | TRANS_TYPES  |  105 |   945 |     3 |
| 10 |      TABLE ACCESS FULL      | TRANSACTIONS | 7856 |  161K | 13370 |

Transaction_types: qxt != ‘NONE’ – 100 rows, no exclusions
Case Study (d)

Get rid of the histogram on office_id!

Modified indexes (option 1):
        contracts(id, office_id)

Expected plan:
        hash join
                table access full transaction_types
                nested loop
                        table access by rowid transactions
                                index range scan transactions_idx
                        index range scan contracts_idx

Modified indexes (option 2):
        transactions(created, con_id, error, id_ttype)
        contracts(id, office_id)

Expected plan:
        hash join
                table access full transaction_types
                nested loop
                        index range scan transactions_idx
                        index range scan contracts_idx
Case Study (f)

Modified indexes:
        contracts(id_office, id)
        transactions(id_contract, created)

Expected plan:
        hash join
                table access full transaction_types
                nested loop
                        index range scan contracts_idx
                        table access by rowid transactions
                                index range scan transactions_idx
Summary
Optimizer deficiencies
Structure the query with hints