CHAPTER TWO

2. Query Processing and Optimization

Learning Objectives: After completing this chapter, you will be familiar with the
following concepts:
• Query Processing
• Query Optimization
• Pipelining
2.1. Overview of Query Processing and Optimization
Prerequisite concepts for this chapter: Components of the DBMS Environment, SQL and
Relational Algebra, and File Structures and Indexing.

Query processing: The activities involved in retrieving data from the database are called query
processing. They comprise parsing, validating, optimizing, and executing a query. The aim of
query processing is to transform a query written in a high-level language (SQL) into a low-level
language (one that implements the relational algebra).
Query optimization: The activity of choosing an efficient execution strategy for processing a
query is called query optimization. Its aim is to choose the strategy that minimizes resource usage.
A DBMS uses different techniques to process, optimize, and execute high-level queries. A query
expressed in a high-level query language must first be scanned, parsed, and validated.
The scanner identifies the language components (tokens) in the text of the query, while the parser
checks the correctness of the query syntax. The query is also validated, by accessing the system
catalog, to check that the attribute names and relation names are valid. An internal representation
(tree or graph) of the query is then created. Finally, the optimizer generates alternative plans and
chooses the plan with the least estimated cost.
2.2. Query Processing
The aim of query processing is to find information in one or more databases and deliver it to the
user quickly and efficiently. Traditional techniques work well for databases with standard, single-
site relational structures, but databases containing more complex and diverse types of data demand
new query processing and optimization techniques.

2.2.1. Query Processing Phases
Query processing can be divided into four main phases: decomposition (consisting of parsing and
validation), optimization, code generation, and execution, as illustrated in Figure 2.1.

[Figure: Query in high-level language (SQL) → Query decomposition (uses the database catalog) →
Relational algebra expression → Query optimization (uses database statistics) → Execution plan →
Code generation → Generated code → Runtime query execution (uses the main database) → Query output]

Figure 2-1: Typical phases when processing a high-level query.

Basic Steps in Query Processing:

Step 1: Parsing and translation: the system checks the syntax of the query.
• Creates a parse-tree representation of the query.
• Translates the query into a relational-algebra expression.
• The parser checks the syntax and verifies that the relations exist.
Step 2: Optimization: finding the cheapest evaluation plan for a query.
• A query optimizer must know the cost of each operation.
• Each relational-algebra operation can be executed by one of several different algorithms.

Step 3: Evaluation: The query-execution engine takes a query-evaluation plan, executes that plan,
and returns the answers to the query.
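
The scanning and validation steps can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular DBMS does it: the `CATALOG` dictionary, the token categories, and the restriction to a single relation named after FROM are all assumptions made for the example.

```python
import re

# Hypothetical system catalog: relation name -> set of attribute names.
CATALOG = {"Staff": {"staffNo", "fName", "lName", "position", "salary", "branchNo"}}

KEYWORDS = {"SELECT", "FROM", "WHERE", "AND", "OR"}
TOKEN_RE = re.compile(r"\s*(?:(\d+)|('[^']*')|([A-Za-z_]\w*)|(>=|<=|<>|[=<>,*();]))")


def scan(query):
    """Scanner: split the query text into (kind, text) tokens."""
    tokens, pos = [], 0
    query = query.strip()
    while pos < len(query):
        m = TOKEN_RE.match(query, pos)
        if not m:
            raise SyntaxError(f"unexpected character at position {pos}")
        number, string, word, symbol = m.groups()
        if number:
            tokens.append(("NUMBER", number))
        elif string:
            tokens.append(("STRING", string))
        elif word:
            kind = "KEYWORD" if word.upper() in KEYWORDS else "IDENT"
            tokens.append((kind, word))
        else:
            tokens.append(("SYMBOL", symbol))
        pos = m.end()
    return tokens


def validate(tokens, catalog=CATALOG):
    """Validation: check relation and attribute names against the catalog.

    Simplification (assumption): exactly one relation, named right after FROM.
    """
    upper = [text.upper() for _, text in tokens]
    if "FROM" not in upper or upper.index("FROM") + 1 >= len(tokens):
        raise SyntaxError("missing FROM clause")
    rel_idx = upper.index("FROM") + 1
    relation = tokens[rel_idx][1]
    if relation not in catalog:
        raise NameError(f"unknown relation: {relation}")
    for i, (kind, text) in enumerate(tokens):
        if kind == "IDENT" and i != rel_idx and text not in catalog[relation]:
            raise NameError(f"unknown attribute: {text}")
    return relation


tokens = scan("SELECT fName, salary FROM Staff WHERE salary > 20000")
print(validate(tokens))  # the query refers only to names found in the catalog
```

A query that mentions an attribute missing from the catalog, such as `SELECT age FROM Staff`, makes `validate` raise `NameError`, which corresponds to rejecting the query before optimization.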


2.2.1.1. Query Decomposition

Query decomposition is the first phase of query processing. The aims of query decomposition are
to transform a high-level query into a relational algebra query, and to check that the query is
syntactically and semantically correct. Query decomposition thus covers parsing and validation.
The typical stages of query decomposition are analysis, normalization, semantic analysis,
simplification, and query restructuring:
1. Analysis: The query is lexically and syntactically analyzed for correctness. In this stage, the
high-level query is transformed into an internal representation that is more suitable for
processing. The internal form that is typically chosen is some kind of query tree, which is
constructed from leaf nodes, non-leaf nodes, and a root. The sequence of operations is directed
from the leaves to the root.
2. Normalization: The normalization stage of query processing converts the query into a
normalized form that can be more easily manipulated. The WHERE predicate is converted to
conjunctive normal form (conjuncts joined by ∧) or disjunctive normal form (disjuncts joined by ∨).


• Conjunctive normal form: A sequence of conjuncts that are connected with the ∧ (AND)
operator. Each conjunct contains one or more terms connected by the ∨ (OR) operator. For
example: (position = ‘Manager’ ∨ salary > 20000) ∧ branchNo = ‘B003’. A conjunctive
selection contains only those tuples that satisfy all conjuncts.

• Disjunctive normal form: A sequence of disjuncts that are connected with the ∨ (OR)
operator. Each disjunct contains one or more terms connected by the ∧ (AND) operator. For
example, we could rewrite the above conjunctive normal form as: (position =‘Manager’ ∧
branchNo =‘B003’ ) ∨(salary >20000 ∧ branchNo =‘B003’). A disjunctive selection contains
those tuples formed by the union of all tuples that satisfy the disjuncts.
3. Semantic Analysis: The objective of semantic analysis is to reject normalized queries that are
incorrectly formulated or contradictory. A query is incorrectly formulated if components do
not contribute to the generation of the result, which may happen if some join specifications are
missing. A query is contradictory if its predicate cannot be satisfied by any tuple. For example,
the predicate (position = ‘Manager’ ∧ position = ‘Assistant’) on the Staff relation is
contradictory, as a member of staff cannot be both a Manager and an Assistant simultaneously.
However, the predicate ((position = ‘Manager’ ∧ position = ‘Assistant’) ∨ salary > 20000)
could be simplified to (salary > 20000) by interpreting the contradictory clause as the boolean
value FALSE. Unfortunately, the handling of contradictory clauses is not consistent between
DBMSs. Two algorithms for checking correctness are:
• Construct a relation connection graph, in which edges represent joins and the sources of
projection operations. If the graph is not connected, the query is incorrectly formulated.
• Construct a normalized attribute connection graph, in which edges represent joins and
selection operations. If the graph has a cycle for which the valuation sum is negative, the
query is contradictory.
4. Simplification: The objectives of the simplification stage are to detect redundant
qualifications, eliminate common subexpressions, and transform the query to a semantically
equivalent but more easily and efficiently computed form. Typically, access restrictions, view
definitions, and integrity constraints are considered at this stage. If the user does not have the
appropriate access to all the components of the query, the query must be rejected. For example,
a view definition considered at this stage might be:
CREATE VIEW Staff3 AS SELECT staffNo, fName, lName, salary, branchNo
FROM Staff WHERE branchNo = ‘B003’ AND salary > 20000;
5. Query Restructuring: In the final stage of query decomposition, the query is restructured to
provide a more efficient implementation. More than one translation is possible, so
transformation rules are used to move between equivalent forms.
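
The normalization and semantic-analysis examples above can be checked mechanically. The following sketch, using made-up sample tuples, tests that the conjunctive and disjunctive forms of the Staff predicate select the same tuples, and that the predicate with the contradictory clause behaves like (salary > 20000). Agreement on a sample is only a sanity check, not a proof of equivalence.

```python
from itertools import product


def cnf(t):
    # (position = 'Manager' OR salary > 20000) AND branchNo = 'B003'
    return (t["position"] == "Manager" or t["salary"] > 20000) and t["branchNo"] == "B003"


def dnf(t):
    # (position = 'Manager' AND branchNo = 'B003') OR (salary > 20000 AND branchNo = 'B003')
    return (t["position"] == "Manager" and t["branchNo"] == "B003") or (
        t["salary"] > 20000 and t["branchNo"] == "B003"
    )


def with_contradiction(t):
    # (position = 'Manager' AND position = 'Assistant') OR salary > 20000
    return (t["position"] == "Manager" and t["position"] == "Assistant") or t["salary"] > 20000


def simplified(t):
    # The contradictory clause is treated as FALSE, leaving only salary > 20000
    return t["salary"] > 20000


# Hypothetical sample: every combination of two positions, two salaries, two branches.
SAMPLE = [
    {"position": p, "salary": s, "branchNo": b}
    for p, s, b in product(["Manager", "Assistant"], [15000, 25000], ["B003", "B005"])
]


def equivalent(f, g, tuples):
    """True if both predicates accept exactly the same tuples of the sample."""
    return all(f(t) == g(t) for t in tuples)


print(equivalent(cnf, dnf, SAMPLE))
print(equivalent(with_contradiction, simplified, SAMPLE))
```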
Most real-world data is not well structured. Today's databases typically contain much non-
structured data such as text, images, video, and audio, often distributed across computer networks.
In this complex environment, efficient and accurate query processing becomes quite challenging,
and many techniques are involved (not only in storage and query processing, but also in
concurrency control, recovery, etc.).
2.3. Query Optimization
The activity of choosing an efficient execution strategy for processing a query is called query
optimization. Everyone wants the performance of their database to be optimal. In particular, there
is often a requirement for a specific query, or an object that is based on a query, to run faster. The
problem of query optimization is to find the sequence of steps that produces the answer to the user
request in the most efficient manner, given the database structure. The performance of a query is
affected by the tables or queries that underlie the query and by the complexity of the query. When
data/workload characteristics change:
• The best navigation strategy changes
• The best way of organizing the data changes
Query optimizers are one of the main means by which modern database systems achieve their
performance advantages. Given a request for data manipulation or retrieval, an optimizer will
choose a plan for evaluating the request from among the many alternative strategies. That is,
there are many ways (access paths) of accessing the desired file or record. The optimizer tries to
select the most efficient (cheapest) access path for accessing the data. The DBMS is responsible
for picking the best execution strategy based on various considerations. Query optimizers are
among the largest and most complex modules of database systems.
Most efficient processing: Least amount of I/O and CPU resources.
Selection of the best method: In a non-procedural language the system does the optimization at
the time of execution. On the other hand, in a procedural language, programmers have some
flexibility in selecting the best method. For optimizing the execution of a query the programmer
must know:
• File organization.
• Record access mechanism and primary or secondary key.
• Data location on disk.
• Data access limitations.

To write correct code, application programmers need to know how data is organized physically
(e.g., which indexes exist). To write efficient code, application programmers also need to worry
about data/workload characteristics.
Two internal representations of a query are the query tree and the query graph.
Query tree: A tree data structure that corresponds to a relational algebra expression. It represents
the input relations of the query as leaf nodes of the tree and represents the relational algebra
operations as internal nodes.
Query graph: A graph data structure that corresponds to a relational calculus expression.
Relations in the query are represented by relation nodes, which are displayed as single circles.
The graph does not indicate the order in which operations are performed. There is only a single
graph corresponding to each query.
Query optimization can be achieved through two techniques:
Using heuristic rules: Reorder the operations in the internal representation of a query (tree or
graph) to improve performance. A heuristic rule works well in MOST cases but it is NOT
GUARANTEED to work in ALL possible cases. For example, performing selections before joins
usually gives better efficiency.
Using cost estimations: Find the costs of the different execution strategies and choose the one
with the lowest cost. This is computationally intensive. Most DBMSs combine both techniques.
2.3.1. Approaches to Query Optimization
2.3.1.1. Heuristics Approach
The heuristic approach to query optimization uses transformation rules to convert one
relational algebra expression into an equivalent form that is known to be more efficient. It
uses knowledge of the characteristics of the relational algebra operations and the relationships
between the operators to optimize the query. Thus, the heuristic approach to optimization
makes use of:
• Properties of individual operators
• Association between operators
• Query Tree: a graphical representation of the operators, relations, attributes and predicates
and processing sequence during query processing. Query tree has three main parts:
o The Leaves: the base relations used for processing the query/ extracting the required
information

o The Root: the final result/relation as an output based on the operation on the relations
used for query processing
o Nodes: intermediate results or relations before reaching the final result.

Execution of the operations in a query tree starts from the leaves, continues through the
intermediate nodes, and ends at the root. The properties of each operation and the association
between operators are analyzed using a set of rules called transformation rules. Applying the
transformation rules transforms the query into a relatively good execution strategy. The process
for heuristic optimization is:
1. The parser of a high-level query generates an initial internal representation.
2. Heuristic rules are applied to optimize the internal representation.
3. A query execution plan is generated to execute groups of operations based on the access paths
available on the files involved in the query.
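
A query tree and its leaves-to-root evaluation can be sketched as a small recursive data structure. The `Node` class, the operator names, and the sample Staff and Branch rows below are assumptions made for illustration (the Branch key is named `bNo` here only to avoid a name clash in the product); a real DBMS uses far richer plan nodes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]


@dataclass
class Node:
    op: str                                            # "relation", "select", "project", "product"
    children: List["Node"] = field(default_factory=list)
    rows: Optional[List[Row]] = None                   # used by leaves only
    predicate: Optional[Callable[[Row], bool]] = None  # used by "select" only
    attrs: Optional[List[str]] = None                  # used by "project" only


def evaluate(node: Node) -> List[Row]:
    """Evaluate the inputs first (down to the leaves), then the node itself."""
    if node.op == "relation":
        return list(node.rows)
    inputs = [evaluate(child) for child in node.children]
    if node.op == "select":
        return [r for r in inputs[0] if node.predicate(r)]
    if node.op == "project":
        out: List[Row] = []
        for r in inputs[0]:
            projected = {a: r[a] for a in node.attrs}
            if projected not in out:                   # projection removes duplicates
                out.append(projected)
        return out
    if node.op == "product":
        return [{**left, **right} for left in inputs[0] for right in inputs[1]]
    raise ValueError(f"unknown operator: {node.op}")


# Leaves: base relations (hypothetical data).
staff = Node("relation", rows=[
    {"staffNo": "SG5", "position": "Manager", "branchNo": "B003"},
    {"staffNo": "SG37", "position": "Assistant", "branchNo": "B003"},
])
branch = Node("relation", rows=[
    {"bNo": "B003", "city": "Glasgow"},
    {"bNo": "B005", "city": "London"},
])

# Root: the projection; intermediate nodes: the selection and the product.
tree = Node("project", attrs=["staffNo", "city"], children=[
    Node("select",
         predicate=lambda r: r["branchNo"] == r["bNo"] and r["position"] == "Manager",
         children=[Node("product", children=[staff, branch])]),
])

print(evaluate(tree))
```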
2.3.2. Transformation Rules for the Relational Algebra Operations
By applying transformation rules, the optimizer can transform one relational algebra expression
into an equivalent expression that is known to be more efficient. Use these rules to restructure the
(canonical) relational algebra tree generated during query decomposition. In listing these rules, we
use three relations R, S, and T, with R defined over the attributes A ={A1, A2, . . . , An}, and S
defined over B ={B1, B2, . . . , Bn}; p, q, and r denote predicates, and L, L1, L2, M, M1, M2, and
N denote sets of attributes.
1. Conjunctive selection operations can cascade into individual selection operations (and vice
versa). This transformation is sometimes referred to as cascade of selection.
σ_{p∧q∧r}(R) = σ_p(σ_q(σ_r(R))) where p, q and r are predicates
Example: σ_{branchNo=‘B003’ ∧ salary>15000}(Staff) = σ_{branchNo=‘B003’}(σ_{salary>15000}(Staff))
2. Commutativity of Selection operations.
σ_p(σ_q(R)) = σ_q(σ_p(R)) where p and q are predicates
Example: σ_{branchNo=‘B003’}(σ_{salary>15000}(Staff)) = σ_{salary>15000}(σ_{branchNo=‘B003’}(Staff))
3. In a sequence of Projection operations, only the last in the sequence is required. This is also
called cascade of projection: Π_L Π_M ... Π_N(R) = Π_L(R)
Example: Π_{lName} Π_{branchNo, lName}(Staff) = Π_{lName}(Staff)
4. Commutativity of Selection and Projection. If the predicate p involves only the attributes in
the projection list, then the Selection and Projection operations commute:
Π_{A1, ..., Am}(σ_p(R)) = σ_p(Π_{A1, ..., Am}(R)) where p ∈ {A1, A2, ..., Am}
Example: Π_{fName, lName}(σ_{lName=‘Beech’}(Staff)) = σ_{lName=‘Beech’}(Π_{fName, lName}(Staff))
5. Commutativity of Theta join and Cartesian product.
Theta join: R ⋈_p S = S ⋈_p R
Cartesian product: R × S = S × R
As the Equijoin and Natural join are special cases of the Theta join, this rule also applies
to these Join operations. For example, using the Equijoin of Staff and Branch:
Staff ⋈_{Staff.branchNo=Branch.branchNo} Branch = Branch ⋈_{Staff.branchNo=Branch.branchNo} Staff
6. Commutativity of Selection and Theta join (or Cartesian product). If the selection predicate
involves only attributes of one of the relations being joined, then the Selection and Join (or
Cartesian product) operations commute:
σ_p(R ⋈_r S) = (σ_p(R)) ⋈_r S
σ_p(R × S) = (σ_p(R)) × S where p ∈ {A1, A2, ..., An}
Alternatively, if the selection predicate is a conjunction (p ∧ q), where p involves only attributes
of R and q involves only attributes of S, then:
σ_{p∧q}(R ⋈_r S) = (σ_p(R)) ⋈_r (σ_q(S))
Example: σ_{position=‘Manager’ ∧ city=‘London’}(Staff ⋈_{Staff.branchNo=Branch.branchNo} Branch) =
(σ_{position=‘Manager’}(Staff)) ⋈_{Staff.branchNo=Branch.branchNo} (σ_{city=‘London’}(Branch))
7. Commutativity of Projection and Theta join (or Cartesian product).
a. If the projection list is of the form L = L1 ∪ L2, where L1 involves only attributes of R,
and L2 involves only attributes of S, then provided the join condition only contains
attributes of L, the Projection and Theta join operations commute as:
Π_{L1 ∪ L2}(R ⋈_r S) = (Π_{L1}(R)) ⋈_r (Π_{L2}(S))
Example: Π_{position, city, branchNo}(Staff ⋈_{Staff.branchNo=Branch.branchNo} Branch) =
(Π_{position, branchNo}(Staff)) ⋈_{Staff.branchNo=Branch.branchNo} (Π_{city, branchNo}(Branch))
b. If the join condition contains additional attributes not in L, say attributes M = M1 ∪ M2
where M1 involves only attributes of R, and M2 involves only attributes of S, then a final
Projection operation is required:
Π_{L1 ∪ L2}(R ⋈_r S) = Π_{L1 ∪ L2}((Π_{L1 ∪ M1}(R)) ⋈_r (Π_{L2 ∪ M2}(S)))
Example: Π_{position, city}(Staff ⋈_{Staff.branchNo=Branch.branchNo} Branch) =
Π_{position, city}((Π_{position, branchNo}(Staff)) ⋈_{Staff.branchNo=Branch.branchNo} (Π_{city, branchNo}(Branch)))


8. Commutativity of Union and Intersection (but not Set difference).
R ∪ S = S ∪ R and R ∩ S = S ∩ R
9. Commutativity of Selection and set operations (Union, Intersection, and Set difference).
σ_p(R ∪ S) = σ_p(S) ∪ σ_p(R)
σ_p(R ∩ S) = σ_p(S) ∩ σ_p(R)
σ_p(R − S) = σ_p(R) − σ_p(S)
10. Commutativity of Projection and Union.
Π_L(R ∪ S) = Π_L(S) ∪ Π_L(R)
11. Associativity of Theta join (and Cartesian product). Cartesian product and Natural join are
always associative:
(R ⋈ S) ⋈ T = R ⋈ (S ⋈ T)
(R × S) × T = R × (S × T)
If the join condition q involves only attributes from the relations S and T, then Theta join is
associative in the following manner: (R ⋈_p S) ⋈_{q ∧ r} T = R ⋈_{p ∧ r} (S ⋈_q T)
Example: (Staff ⋈_{Staff.staffNo=PropertyForRent.staffNo} PropertyForRent)
⋈_{ownerNo=Owner.ownerNo ∧ Staff.lName=Owner.lName} Owner =
Staff ⋈_{Staff.staffNo=PropertyForRent.staffNo ∧ Staff.lName=lName} (PropertyForRent ⋈_{ownerNo} Owner)


12. Associativity of Union and Intersection (but not Set difference).
(R ∪ S) ∪ T = S ∪ (R ∪ T)
(R ∩ S) ∩ T = S ∩ (R ∩ T)
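
Several of these equivalences can be spot-checked on small relations. The sketch below models relations as Python sets of tuples and checks cascade of selection, commutativity of selection, commuting a selection with a join, and distributing a selection over set difference. The sample tuples and the helper names `select` and `join` are illustrative assumptions, and agreement on one data set does not prove a rule in general.

```python
# Relations as sets of tuples (hypothetical data).
# Staff(staffNo, position, salary, branchNo); Branch(branchNo, city)
Staff = {
    ("SG5", "Manager", 24000, "B003"),
    ("SG37", "Assistant", 12000, "B003"),
    ("SL21", "Manager", 30000, "B005"),
    ("SL41", "Assistant", 9000, "B005"),
}
Branch = {("B003", "Glasgow"), ("B005", "London")}
Former = {("SG5", "Manager", 24000, "B003")}  # a subset of Staff, for set difference


def select(pred, rel):
    """Selection: keep the tuples that satisfy the predicate."""
    return {t for t in rel if pred(t)}


def join(pred, r, s):
    """Theta join: concatenate every pair of tuples that satisfies the condition."""
    return {a + b for a in r for b in s if pred(a, b)}


def is_manager(t):
    return t[1] == "Manager"


def high_salary(t):
    return t[2] > 15000


def same_branch(a, b):
    return a[3] == b[0]


# Cascade of selection: one conjunctive selection equals nested selections.
cascade_ok = (select(lambda t: is_manager(t) and high_salary(t), Staff)
              == select(is_manager, select(high_salary, Staff)))

# Commutativity of selection: the nesting order does not matter.
commute_ok = (select(is_manager, select(high_salary, Staff))
              == select(high_salary, select(is_manager, Staff)))

# Selection and join: a predicate on Staff alone can be applied before the join.
push_ok = (select(is_manager, join(same_branch, Staff, Branch))
           == join(same_branch, select(is_manager, Staff), Branch))

# Selection and set difference: operands keep their order, since R - S != S - R.
diff_ok = (select(is_manager, Staff - Former)
           == select(is_manager, Staff) - select(is_manager, Former))

print(cascade_ok, commute_ok, push_ok, diff_ok)
```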
As an example, consider the following query: for prospective renters who are looking for flats,
find the properties that match their requirements and are owned by owner CO93. We can write
this query in SQL as:
SELECT p.propertyNo, p.street
FROM Client c, Viewing v, PropertyForRent p
WHERE c.prefType = ‘Flat’ AND c.clientNo = v.clientNo AND v.propertyNo = p.propertyNo AND
c.maxRent >= p.rent AND c.prefType = p.type AND p.ownerNo = ‘CO93’;
Converting the SQL to relational algebra, we have:
Π_{p.propertyNo, p.street}(σ_{c.prefType=‘Flat’ ∧ c.clientNo=v.clientNo ∧ v.propertyNo=p.propertyNo ∧
c.maxRent>=p.rent ∧ c.prefType=p.type ∧ p.ownerNo=‘CO93’}((c × v) × p))
The heuristic approach is implemented by applying the above transformation rules in the
following sequence of steps.
Sequence for Applying Transformation Rules:
1. Step 1: Use
• Rule-1: Cascade of Selection
2. Step 2: Use
• Rule-2: Commutativity of Selection
• Rule-4: Commuting Selection with Projection
• Rule-6: Commuting Selection with Join and Cartesian product
• Rule-9: Commuting Selection with Set operations
3. Step 3: Use
• Rule-11 and Rule-12: Associativity of binary operations (JOIN, CARTESIAN PRODUCT,
UNION and INTERSECTION). Rearrange nodes so that the most restrictive operations
are performed first (moving them as far down the tree as possible)
4. Step 4:
• Combine each Cartesian product with the subsequent Selection operation into a Join
5. Step 5: Use
• Rule-3: Cascade of Projection
• Rule-4: Commuting Projection with Selection
• Rule-7: Commuting Projection with Join and Cartesian product
• Rule-10: Commuting Projection with Union
Main Heuristic
The main heuristic is to first apply the operations that reduce the size of intermediate results,
e.g., apply SELECT and PROJECT operations before applying the JOIN or other binary
operations. General heuristic optimization algorithm:
1- Push selections down
2- Apply more restrictive selections first (selectivity is estimated by the DBMS)
3- Combine cross products and selections to become joins
4- Push projections down
Here the size of an intermediate relation means its cardinality and/or its degree. That is:
• Perform SELECTION as early as possible: this reduces the cardinality (number of tuples)
of the relation.
• Perform PROJECTION as early as possible: this reduces the degree (number of attributes)
of the relation. Both of the above are accomplished by placing the SELECT and PROJECT
operations as far down the tree as possible.
• SELECT and JOIN operations with the most restrictive conditions, which produce the smallest
absolute size, should be executed before other similar operations. This is achieved by
reordering the JOIN nodes.
Example: consider the following schemas and the query, where the EMPLOYEE and the
PROJECT relations are related by the WORKS_ON relation.
• EMPLOYEE (EEmpID, FName, LName, Salary, Dept, Sex, DoB)
• PROJECT (PProjID, PName, PLocation, PFund, PManagerID)
• WORKS_ON (WEmpID, WProjID)
WEmpID (employee identification) and WProjID (project identification) are foreign keys in the
WORKS_ON relation that reference the EMPLOYEE and PROJECT relations respectively.
Query: The manager of the company working on road construction would like to view the names
of employees born before January 1, 1965 who are working on the project named Ring Road. The
relational algebra representation of the query will be:

Π_{FName, LName}(σ_{DoB<Jan 1 1965 ∧ WEmpID=EEmpID ∧ PProjID=WProjID ∧ PName=‘Ring Road’}
(EMPLOYEE × WORKS_ON × PROJECT))

The SQL equivalent of the above query will be:
SELECT FName, LName
FROM EMPLOYEE, WORKS_ON, PROJECT
WHERE DoB < ‘1965-01-01’ AND EEmpID = WEmpID AND WProjID = PProjID AND
PName = ‘Ring Road’;

The initial query tree will be (each operator is listed above its indented inputs):

Π FName, LName
  σ (DoB<Jan 1 1965) ∧ (WEmpID=EEmpID) ∧ (PProjID=WProjID) ∧ (PName=‘Ring Road’)
    ×
      ×
        EMPLOYEE
        WORKS_ON
      PROJECT

By applying the first step (cascading the selection) we obtain the following expression:

σ_{DoB<Jan 1 1965}(σ_{WEmpID=EEmpID}(σ_{PProjID=WProjID}(σ_{PName=‘Ring Road’}(EMPLOYEE ×
WORKS_ON × PROJECT))))
By applying the second step, we see that some conditions involve attributes that belong to a
single relation (DoB belongs to EMPLOYEE and PName belongs to PROJECT), so these selection
operations can be commuted with the Cartesian product. Then, since the condition
WEmpID=EEmpID involves only the EMPLOYEE and WORKS_ON relations, the selection with this
condition can be moved down to the product of these two relations.

σ_{PProjID=WProjID}(σ_{PName=‘Ring Road’}(PROJECT) × σ_{WEmpID=EEmpID}(WORKS_ON ×
σ_{DoB<Jan 1 1965}(EMPLOYEE)))

The query tree after this modification will be:

Π FName, LName
  σ PProjID=WProjID
    ×
      σ WEmpID=EEmpID
        ×
          σ DoB<Jan 1 1965
            EMPLOYEE
          WORKS_ON
      σ PName=‘Ring Road’
        PROJECT

Using the third step, perform the most restrictive operations first. From the query given we can
see that the selection on PROJECT is more restrictive than the selection on EMPLOYEE. Thus, it is
better to perform the selection on PROJECT before the selection on EMPLOYEE. Rearrange the
nodes to achieve this.

Π FName, LName
  σ WEmpID=EEmpID
    ×
      σ PProjID=WProjID
        ×
          σ PName=‘Ring Road’
            PROJECT
          WORKS_ON
      σ DoB<Jan 1 1965
        EMPLOYEE

Using the fourth step, combine each Cartesian product with the subsequent Selection operation
into a Join.

Π FName, LName
  ⋈ WEmpID=EEmpID
    ⋈ PProjID=WProjID
      σ PName=‘Ring Road’
        PROJECT
      WORKS_ON
    σ DoB<Jan 1 1965
      EMPLOYEE

Using the fifth step, perform the projections as early as possible.

Π FName, LName
  ⋈ WEmpID=EEmpID
    Π FName, LName, EEmpID
      σ DoB<Jan 1 1965
        EMPLOYEE
    Π WEmpID
      ⋈ PProjID=WProjID
        WORKS_ON
        Π PProjID
          σ PName=‘Ring Road’
            PROJECT
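
To see the effect of the heuristic steps, the unoptimized and the final plans can be run side by side on a tiny data set. The rows below are invented for illustration, and the joins in the optimized plan are implemented as set-membership lookups, which is just one possible join method and an assumption of this sketch. Both plans return the same answer, but the largest intermediate result differs.

```python
from datetime import date
from itertools import product

# Hypothetical sample data for the three relations.
EMPLOYEE = [
    {"EEmpID": 1, "FName": "Abebe", "LName": "Kebede", "DoB": date(1960, 5, 1)},
    {"EEmpID": 2, "FName": "Sara", "LName": "Tesfaye", "DoB": date(1970, 3, 9)},
    {"EEmpID": 3, "FName": "Musa", "LName": "Ali", "DoB": date(1962, 11, 20)},
]
PROJECT = [
    {"PProjID": 10, "PName": "Ring Road"},
    {"PProjID": 20, "PName": "Bridge"},
]
WORKS_ON = [
    {"WEmpID": 1, "WProjID": 10},
    {"WEmpID": 2, "WProjID": 10},
    {"WEmpID": 3, "WProjID": 20},
]
CUTOFF = date(1965, 1, 1)


def naive_plan():
    """Initial tree: full Cartesian product, then one big selection, then projection."""
    big = [{**e, **w, **p} for e, w, p in product(EMPLOYEE, WORKS_ON, PROJECT)]
    rows = [
        r for r in big
        if r["DoB"] < CUTOFF and r["WEmpID"] == r["EEmpID"]
        and r["PProjID"] == r["WProjID"] and r["PName"] == "Ring Road"
    ]
    return {(r["FName"], r["LName"]) for r in rows}, len(big)


def optimized_plan():
    """Final tree: selections and projections pushed down, products turned into joins."""
    proj_ids = {p["PProjID"] for p in PROJECT if p["PName"] == "Ring Road"}
    emp_ids = {w["WEmpID"] for w in WORKS_ON if w["WProjID"] in proj_ids}
    emps = [(e["EEmpID"], e["FName"], e["LName"]) for e in EMPLOYEE if e["DoB"] < CUTOFF]
    result = {(fname, lname) for emp_id, fname, lname in emps if emp_id in emp_ids}
    largest = max(len(proj_ids), len(emp_ids), len(emps))
    return result, largest


naive_result, naive_size = naive_plan()
opt_result, opt_size = optimized_plan()
print(naive_result == opt_result, naive_size, opt_size)
```

With three employees, three assignments, and two projects, the initial plan builds an 18-tuple product, while the largest intermediate result of the optimized plan holds 2 tuples.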

2.3.3. Cost Estimation Approach to Query Optimization

The main idea is to minimize the cost of processing a query. The cost function is comprised of:
• I/O cost + CPU processing cost + communication cost + storage cost
These components might have different weights in different processing environments. The DBMS
will use information stored in the system catalogue for the purpose of estimating cost. A main
target of query optimization is to minimize the size of the intermediate relations. The size affects
the cost of:
• Disk access
• Data transportation
• Storage space in primary memory
• Writing to disk
The statistics in the system catalogue used for cost estimation purposes are:
• Cardinality of a relation (r): the number of tuples currently contained in the relation
• Degree of a relation: the number of attributes of the relation
• Number of tuples of a relation that can be stored in one block of memory
• Total number of blocks used by a relation
• Number of distinct values of an attribute (d)
• Selection cardinality of an attribute (S): the average number of records that will satisfy
an equality condition on the attribute, S = r/d
By using the above information one can calculate the cost of executing a query under each
strategy and select the best strategy, that is, the one with the minimum cost of processing.
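
As a small worked example of these statistics, the sketch below computes the selection cardinality S = r/d and the number of blocks occupied by a relation. The figures (10,000 tuples, 50 distinct values, 40 tuples per block) and the function names are made up for illustration. The final estimate additionally assumes that matching tuples are stored contiguously, as with a clustering index, which is an assumption of this sketch and not a claim from the text.

```python
import math


def selection_cardinality(r, d):
    """S = r / d: average number of tuples that satisfy an equality condition."""
    return r / d


def blocks_needed(tuples, tuples_per_block):
    """Number of blocks needed to hold the given number of tuples."""
    return math.ceil(tuples / tuples_per_block)


# Hypothetical catalogue statistics for one relation and one attribute.
r = 10_000             # cardinality of the relation
d = 50                 # distinct values of the attribute
tuples_per_block = 40  # tuples that fit in one block

S = selection_cardinality(r, d)
total_blocks = blocks_needed(r, tuples_per_block)

# Full scan reads every block; under the contiguity assumption an equality
# selection touches only the blocks that hold the S matching tuples.
full_scan_blocks = total_blocks
clustered_blocks = blocks_needed(S, tuples_per_block)

print(S, total_blocks, clustered_blocks)
```

With these numbers, S = 200 tuples, the relation occupies 250 blocks, and the matching tuples fit in 5 blocks, which is the kind of difference a cost-based optimizer weighs when choosing between access paths.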

2.3.3.1. Cost Components for Query Optimization

The cost of query execution can be calculated for the following major activities that take place
during processing.
1. Access Cost of Secondary Storage: Data is accessed from secondary storage, as a query needs
some part of the data stored in the database. The disk access cost can in turn be analyzed in
terms of searching, reading, and writing the data blocks used to store some portion of a
relation. The disk access cost will vary depending on the file organization used and the access
method implemented for the file organization. In addition to the file organization, the data
allocation scheme, that is, whether the data is stored contiguously or in a scattered manner,
will affect the disk access cost.
2. Storage Cost: As a query is composed of many database operations, there can be one or more
intermediate results before the final output is reached. These intermediate results should be
stored in primary memory for further processing. The bigger the intermediate relation, the
larger the memory requirement, which has an impact on the limited available space. This is
considered as the cost of storage.
3. Computation Cost: A query is composed of many operations. The operations could be database
operations like reading and writing to a disk, or mathematical and other operations like
searching, sorting, merging, and computation on field values.
4. Communication Cost: In most database systems the database resides at one station and
various queries originate from different terminals. This has an impact on the performance
of the system, adding cost to query processing. Thus, the cost of transporting data between
the database site and the terminal from which the query originates should be analyzed.
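
The way these components combine, and the way their weights can change the chosen plan, can be illustrated with a simple weighted sum. The `PlanCost` class, the two plans, and all numbers below are hypothetical; real optimizers use much more detailed cost formulas.

```python
from dataclasses import dataclass


@dataclass
class PlanCost:
    io: float
    cpu: float
    communication: float
    storage: float


def total_cost(cost, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the four cost components (I/O, CPU, communication, storage)."""
    w_io, w_cpu, w_comm, w_storage = weights
    return (w_io * cost.io + w_cpu * cost.cpu
            + w_comm * cost.communication + w_storage * cost.storage)


def cheapest(plans, weights=(1.0, 1.0, 1.0, 1.0)):
    """Name of the plan with the lowest weighted cost."""
    return min(plans, key=lambda name: total_cost(plans[name], weights))


# Two hypothetical plans: plan_a does little I/O but ships a lot of data,
# plan_b does more local I/O but transfers very little.
PLANS = {
    "plan_a": PlanCost(io=100, cpu=10, communication=50, storage=5),
    "plan_b": PlanCost(io=200, cpu=10, communication=5, storage=5),
}

centralized = cheapest(PLANS)                              # all components weighted equally
distributed = cheapest(PLANS, weights=(1.0, 1.0, 10.0, 1.0))  # communication is expensive

print(centralized, distributed)
```

With equal weights plan_a costs 165 against 220 for plan_b, but when communication is weighted ten times more heavily plan_a costs 615 against 265, so the preferred plan flips.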
2.4. Pipelining
Pipelining is another method used for query optimization. It is used to improve the performance
of queries, and is sometimes known as stream-based processing or on-the-fly processing of
queries. Whereas query optimization in general tries to reduce the size of the intermediate
results, pipelining goes further by applying successive conditions to the tuples of a single
intermediate result as they are produced. The technique thus reduces the number of intermediate
relations in query execution. Pipelining performs multiple operations on a single relation in a
pipeline.
Generally, a pipeline is implemented as a separate process or thread within the DBMS. Each
pipeline takes a stream of tuples from its inputs and creates a stream of tuples as its output. A
buffer is created for each pair of adjacent operations to hold the tuples being passed from the first
operation to the second one. One drawback of pipelining is that the inputs to operations are not
necessarily available all at once for processing. This can restrict the choice of algorithms.
Example: suppose we have a relation with the schema Employee(ID, FName, LName, DoB, Salary,
Position, Dept). If a query extracts supervisors with a salary greater than 2000, the relational
algebra representation of the query will be:

σ_{(Salary>2000) ∧ (Position=‘Supervisor’)}(Employee)
After reading the relation from memory, the system could perform the operation by cascading
the SELECT operations.

1. Approach One: σ_{Salary>2000}(σ_{Position=‘Supervisor’}(Employee))

Using this approach, an intermediate relation is created before the final result:
• Relation created by the first operation: R1 = σ_{Position=‘Supervisor’}(Employee)
• The resulting relation of the second operation: R2 = σ_{Salary>2000}(R1)
2. Approach Two: One can select a single tuple from the relation Employee, perform both
tests on it in a pipeline, and create the final relation at once. This is what is called PIPELINING.
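
The two approaches can be contrasted in Python, where generators give a natural model of a stream of tuples. This is only a sketch under assumed data: the Employee rows and the helper names `scan`, `select`, `materialized`, and `pipelined` are made up for the example, and a real DBMS pipelines operators through buffers rather than language-level generators.

```python
# Hypothetical Employee tuples (only the attributes used by the query).
EMPLOYEE = [
    {"ID": 1, "Position": "Supervisor", "Salary": 2500},
    {"ID": 2, "Position": "Supervisor", "Salary": 1800},
    {"ID": 3, "Position": "Clerk", "Salary": 3000},
    {"ID": 4, "Position": "Supervisor", "Salary": 4000},
]


def materialized(employees):
    """Approach One: build the intermediate relation R1, then derive R2 from it."""
    r1 = [e for e in employees if e["Position"] == "Supervisor"]
    r2 = [e for e in r1 if e["Salary"] > 2000]
    return r2, len(r1)


def scan(relation):
    """Produce the tuples of a stored relation one at a time."""
    for row in relation:
        yield row


def select(pred, stream):
    """Pass on each incoming tuple that satisfies the predicate."""
    for row in stream:
        if pred(row):
            yield row


def pipelined(employees):
    """Approach Two: each tuple flows through both tests; R1 is never stored."""
    supervisors = select(lambda e: e["Position"] == "Supervisor", scan(employees))
    well_paid = select(lambda e: e["Salary"] > 2000, supervisors)
    return list(well_paid)


result_one, r1_size = materialized(EMPLOYEE)
result_two = pipelined(EMPLOYEE)
print([e["ID"] for e in result_two], r1_size)
```

Both approaches return the same tuples, but the first stores a three-tuple intermediate relation R1 while the pipelined version holds only the tuple currently being tested.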

Synopsis: Leave Management Is An Intranet Based Application That Can Be Accessed
26 pages
DataBase System CH - 1 and 2
No ratings yet
DataBase System CH - 1 and 2
65 pages
VNR Vignana Jyothi Institute of Engineering and Technology Hyderabad B.Tech. Ii Year Computer Science and Engineering (Data Science) Iii Semester R22
No ratings yet
VNR Vignana Jyothi Institute of Engineering and Technology Hyderabad B.Tech. Ii Year Computer Science and Engineering (Data Science) Iii Semester R22
42 pages
Department of Mca
No ratings yet
Department of Mca
32 pages
Exam Ref Dp900 Microsoft Azure Data Fundamentals 2nd Edition Nicola Farquharson pdf download
No ratings yet
Exam Ref Dp900 Microsoft Azure Data Fundamentals 2nd Edition Nicola Farquharson pdf download
44 pages
Introduction to Databases, Mysql, Ms Access, Pharmacy Drug Database
No ratings yet
Introduction to Databases, Mysql, Ms Access, Pharmacy Drug Database
59 pages
Dbms PDF
100% (1)
Dbms PDF
4 pages
BSNL It Tool 2011
No ratings yet
BSNL It Tool 2011
45 pages
Extended_Student_Record_Management_System
No ratings yet
Extended_Student_Record_Management_System
7 pages
SQL and PLSQL
No ratings yet
SQL and PLSQL
129 pages
Mastering phpMyAdmin 3 4 for Effective MySQL Management 1st Edition Delisle Marc all chapter instant download
100% (5)
Mastering phpMyAdmin 3 4 for Effective MySQL Management 1st Edition Delisle Marc all chapter instant download
65 pages
2online Organ and Blood Donation Management System
No ratings yet
2online Organ and Blood Donation Management System
122 pages
Oracle Syllabus
No ratings yet
Oracle Syllabus
15 pages
Lab 3
No ratings yet
Lab 3
7 pages
RDBMS Important 5^010 Marks Unit Wise
No ratings yet
RDBMS Important 5^010 Marks Unit Wise
45 pages
dbms lab
No ratings yet
dbms lab
53 pages
SQL Assignment 1
50% (2)
SQL Assignment 1
4 pages
BCA Curriculum PDF
No ratings yet
BCA Curriculum PDF
48 pages
BDA.Unit-2
No ratings yet
BDA.Unit-2
30 pages
ER Modelling: Introduction To Modeling
No ratings yet
ER Modelling: Introduction To Modeling
19 pages
Basic SQL
No ratings yet
Basic SQL
12 pages
Assaignment Data Analysis
No ratings yet
Assaignment Data Analysis
114 pages
Mandar Patil: DBMS Basic Interview Questions
No ratings yet
Mandar Patil: DBMS Basic Interview Questions
6 pages
SQL for Data Science
No ratings yet
SQL for Data Science
107 pages
1.8.2 Database
No ratings yet
1.8.2 Database
15 pages
Database Systems All in One PDF
No ratings yet
Database Systems All in One PDF
395 pages
One Button Automating Feature Engineering
No ratings yet
One Button Automating Feature Engineering
9 pages
Complete MTE Syllabus II Year 2024-2025 15112024
No ratings yet
Complete MTE Syllabus II Year 2024-2025 15112024
2 pages
Assignment_1907719e62f0122a894e5e20d3e25a0a
No ratings yet
Assignment_1907719e62f0122a894e5e20d3e25a0a
3 pages


Query in high-level language (SQL)

Figure 2-1: Typical phases when processing a high-level query.

2.2.1.1. Query Decomposition

converted to Conjunctive (∧) or Disjunctive (∨) Normal form.

σ p∧q∧r(R) = σ p(σ q(σ r(R))), where p, q and r are predicates
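The cascade-of-selection rule above can be checked on a small example. The following is a minimal sketch, assuming a toy relation R held as a list of dictionaries; the rows, the attribute names and the helper `select` are illustrative and are not taken from the chapter.

```python
# Hypothetical toy relation R (illustrative data, not from the chapter).
R = [
    {"staffNo": "SL21", "position": "Manager",    "salary": 30000, "branchNo": "B005"},
    {"staffNo": "SG37", "position": "Assistant",  "salary": 12000, "branchNo": "B003"},
    {"staffNo": "SG14", "position": "Supervisor", "salary": 18000, "branchNo": "B003"},
    {"staffNo": "SA9",  "position": "Assistant",  "salary": 9000,  "branchNo": "B007"},
    {"staffNo": "SG5",  "position": "Manager",    "salary": 24000, "branchNo": "B003"},
    {"staffNo": "SL41", "position": "Assistant",  "salary": 9000,  "branchNo": "B005"},
]


def select(predicate, relation):
    """Relational selection: keep the rows for which predicate(row) is true."""
    return [row for row in relation if predicate(row)]


# Three simple predicates p, q and r (assumed for illustration).
p = lambda t: t["salary"] > 10000
q = lambda t: t["branchNo"] == "B003"
r = lambda t: t["position"] == "Assistant"

# Left-hand side: one selection with the conjunctive predicate p AND q AND r.
conjunctive = select(lambda t: p(t) and q(t) and r(t), R)

# Right-hand side: a cascade of three individual selections.
cascaded = select(p, select(q, select(r, R)))

print(conjunctive == cascaded)
print([t["staffNo"] for t in cascaded])
```

Both forms return the same rows. What the cascade changes is the size of the intermediate results, which is why an optimizer is free to reorder the individual selections.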

Example: Π fName, lName(σ lName=‘Beech’(Staff)) =σ lName=‘Beech’(Π fName, lName(Staff))
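The example above can be reproduced on a small instance. The following is a minimal sketch, assuming a hypothetical Staff relation with three rows; the helpers `select` and `project` are illustrative names, not part of any DBMS API. The two sides agree because the selection predicate uses only attributes that survive the projection.

```python
def select(predicate, relation):
    """Relational selection: keep the rows for which predicate(row) is true."""
    return [row for row in relation if predicate(row)]


def project(attrs, relation):
    """Relational projection with duplicate elimination (first-seen order)."""
    seen, out = set(), []
    for row in relation:
        key = tuple(row[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(attrs, key)))
    return out


# Hypothetical Staff rows (illustrative data, not from the chapter).
staff = [
    {"staffNo": "SL21", "fName": "John", "lName": "White"},
    {"staffNo": "SG37", "fName": "Ann",  "lName": "Beech"},
    {"staffNo": "SA9",  "fName": "Mary", "lName": "Howe"},
]

is_beech = lambda t: t["lName"] == "Beech"
attrs = ("fName", "lName")

# Select first, then project.
left = project(attrs, select(is_beech, staff))
# Project first, then select; valid because lName is kept by the projection.
right = select(is_beech, project(attrs, staff))

print(left == right)
```

If the predicate referred to an attribute that the projection drops (for example staffNo), the right-hand form would not be well defined, so the rule applies only when the predicate's attributes are all in the projection list.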

branchNo(Staff)) ⋈ Staff.branchNo=Branch.branchNo (Π city, branchNo (Branch)))

Staff.lName=Owner.lName Owner = Staff ⋈Staff.staffNo=PropertyForRent.staffNo ∧ Staff.lName=lName

(PropertyForRent ⋈ownerNo Owner)

c.clientNo=v.clientNo ∧ v.propertyNo=p.propertyNo ∧ c.maxRent>=p.rent ∧ c.prefType=p.type ∧ p.ownerNo=‘CO93’((c ×

Π<FName, LName>(σ<DoB<Jan1 1965 ∧ WEmpID=EEmpID ∧ PProjID=WProjID ∧ PName=’Ring Road’>(EMPLOYEE × WORKS_ON × PROJECT))

(DoB<Jan1 1965) (WEmpID=EEmpID) (PProjID=WProjID)(PName=’Ring Road’)

σ(DoB<Jan1 1965)(σ(WEmpID=EEmpID)(σ(PProjID=WProjID)(σ(PName=’Ring Road’)(EMPLOYEE ×

(PProjID=WProjID) (DoB<Jan1 1965)

(PName=’Ring Road’) EMPLOYEE

Using the fifth step, perform the projection as early as possible.

2.3.3. Cost Estimation Approach to Query Optimization

2.3.3.1. Cost Components for Query Optimization

1. Approach One: σ(Salary>2000)(σ(Position=Supervisor)(Employee))
