• 검색 결과가 없습니다.

CHAPTER 13

N/A
N/A
Protected

Academic year: 2022

Share "CHAPTER 13"

Copied!
35
0
0

로드 중.... (전체 텍스트 보기)

전체 글

(1)

Intro to DB

CHAPTER 13

QUERY PROCESSING

(2)

Chapter 13: Query Processing

ƒ Overview

ƒ Measures of Query Cost

ƒ Selection Operation

ƒ Sorting

ƒ Join Operation

ƒ Other Operations

ƒ Evaluation of Expressions

(3)

Basic Steps in Query Processing

1. Parsing and translation 2. Optimization

3. Evaluation

(4)

Basic Steps

ƒ Parsing and translation

l h i i l f ( l i l l b )

à translate the query into an internal form (eg., relational algebra) à Parser checks syntax, verifies relations

ƒ Evaluation

ƒ Evaluation

à The query-execution engine takes a query-evaluation plan, executes that plan, and returns the answers to the query.

ƒ More than one way to evaluate a query

à equivalence of expressions:

SELECT balance FROM account WHERE balance>2500

Πbalancebalance>2500(account)) OR σbalance>2500balance(account))

à Different strategies and algorithms

σ (account) :

sequential scan OR index scan on balance

σ (account) :

sequential scan OR index scan on balance

(5)

Query Plan

ƒ Evaluation primitive

à a relational algebra expression annotated with instructions on how to evaluate ita relational algebra expression annotated with instructions on how to evaluate it

‚ σbalance>2500(account): use index 1

‚ σbalance>2500(account): use table scan

ƒ Query evaluation (execution) plan

à a sequence of primitive operations that can be used to evaluate a query

ƒ Example:

ƒ Example:

à SELECT balance FROM account WHERE balance>2500

Π balance σbalance>2500: seq. scan

Π

σ balance>2500: use index 1 Π balance

account account

(6)

Query Optimization

ƒ Equivalence of Expressions

Gi DB h S Q S i i l h Q

Given a DB schema S, a query Q on S is equivalent to another query Qon S, if the answer sets of Q and Qare the same in any instances of the DB.

Π (σ (customer ⋈ depositor ⋈ branch)) vs Πb_name, assetc_city=“PC”(customer ⋈ depositor ⋈ branch)) vs

Πb_name, asset((σc_city=“PC”(customer)) ⋈ depositor ⋈ branch)

ƒ Query optimization is the process of selecting the most efficient query

l ti pl f i

evaluation plan for a given query

à Most efficient? How do we know?

(7)

Measures of Query Cost

ƒ Cost is generally measured as total elapsed time for answering q er

query

à Many factors contribute to time cost

‚ disk accesses, CPU, or even network communication, ,

ƒ In a centralized system, disk access is the predominant cost

à average-seek-cost, average-block-read-cost, average-block-write-costg g g

ƒ For simplicity we use number of block transfers from/to disk as the cost measure

à ignore the difference in cost between sequential and random I/O à Ignore the difference in cost between disk reads and disk writes

i CPU

à ignore CPU costs

à ignore writing final output to disk

ƒ Real systems take all of these measures into account

ƒ Real systems take all of these measures into account

(8)

Selection Operation

ƒ File scan

à locate and retrieve records that fulfill a selection condition à locate and retrieve records that fulfill a selection condition à Full file scan: retrieve all records of a file (relation)

ƒ A1: linear search

à Scan each file block and test all records on the selection condition

b

à Cost estimate (number of disk blocks scanned) =

b

r

‚ # of blocks containing records from relation r à If selection is on a key attribute, cost = (If selection is on a key attribute, cost (

b b

r /2)/2)

‚ stop on finding record

à Linear search can be applied regardless of

l i di i

‚ selection condition or

‚ ordering of records in the file, or

‚ availability of indices

(9)

Selection Operation (Cont.)

ƒ A2: binary search

A li bl if l i i li i h ib hi h

à Applicable if selection is an equality comparison on the attribute on which file is ordered

à Assume that the blocks of a relation are stored contiguously g y à Cost estimate (number of disk blocks to be scanned):

‚ ⎡log2(br)⎤ — cost of locating the first tuple by a binary search on the blocks

Pl b f bl k i i d h i f l i di i

‚ Plus number of blocks containing records that satisfy selection condition

ƒ A3: primary index on candidate key, equality A3: primary index on candidate key, equality

à Retrieve a single record that satisfies the corresponding equality condition à Cost = HTi + 1

(10)

Selection Operation (Cont.)

ƒ A4: primary index on nonkey, equality

R i l i l d

à Retrieve multiple records

à Records will be on consecutive blocks

à Cost = HTCost HTii + number of blocks containing retrieved records number of blocks containing retrieved records

ƒ A5: equality on search-key of secondary index

à Retrieve a single record if the search-key is a candidate key

‚ Cost = HTi + 1

R i l i l d if h k i did k

à Retrieve multiple records if search-key is not a candidate key

‚ Cost = HTi + number of records retrieved

 Can be very expensive!y p

‚ each record may be on a different block

 one block access for each retrieved record

(11)

Selections Involving Comparisons

ƒ Can implement selections of the form σA≤V (r) or σA ≥ V(r) by using

à a linear file scan or binary search,y ,

à or by using indices in the following ways:

ƒ A6: primary index comparison

ƒ A6: primary index, comparison

‚ (Relation is sorted on A)

‚ For σA V(r) use index to find first tuple ≥ v and scan relation sequentially from there

from there

‚ For σA≤V (r) just scan relation sequentially till first tuple > v; do not use index

ƒ A7: secondary index comparison

ƒ A7: secondary index, comparison

‚ For σA V(r) use index to find first index entry ≥ v and scan index sequentially from there, to find pointers to records.

‚ For σA≤VA≤V (r) just scan leaf pages of index finding pointers to records, till first ( ) j p g d d g p d , entry > v

‚ In either case,

 Retrieving records that are pointed to requires an I/O for each record

 Linear file scan may be cheaper if many records are to be fetched!

(12)

Sorting

ƒ Main memory based algorithms

B bbl i k

à Bubblesort, quicksort, etc.

à Rely on frequent exchange of elements

à Not good for relations that don’t fit in memoryNot good for relations that don t fit in memory

ƒ Use index on sort key

à If one exists, simply access records in order, p y

à Even if one does not exist, we may build one, and then use the index to read the relation in sorted order

I d b ildi + i d i l

‚ Index building cost + index sequential access cost à May be quite effective

ƒ In general external sort-merge is a good choice

ƒ In general external sort-merge is a good choice

(13)

External Sort-Merge

ƒ Let M be the memory (buffer) size in blocks 1 Create sorted runs

1. Create sorted runs.

Let i = 1

repeat until all blocks of relation have been read:

(a) Read next M blocks of relation into memory (a) Read next M blocks of relation into memory (b) Sort the in-memory blocks

(c) Write sorted data to run Ri; increment i.

Let the total number of runs be N

2. Merge the runs (N-way merge). (assume for now that N < M)

1. Use N blocks of memory to buffer input runs, and 1 block to buffer output. Read the first block of each run into its buffer pagep g

2. repeat

1. Select the first record (in sort order) among all buffer pages

2. Write the record to the output buffer. If the output buffer is full write it to disk.p p 3. Delete the record from its input buffer page.

If the buffer page becomes empty then

read the next block (if any) of the run into the buffer.

3 il ll i b ff

3. until all input buffer pages are empty:

(14)

External Sort-Merge (Cont.)

ƒ If N ≥ M, several merge passes are required.

I h i f M 1 d

à In each pass, contiguous groups of M - 1 runs are merged.

à A pass reduces the number of runs by a factor of M -1, and creates runs longer by the same factor. g y

‚ E.g. If M=11, and there are 90 runs, one pass reduces the number of runs to 9 à Repeated passes are performed till all runs have been merged into one.

(15)

External Sort-Merge – Example 1

M=3 M 3

(16)

External Sort-Merge – Example 2

M=3

32 12 47 25 29 63 52 2 30 55 65 5 9 20 15 31 8 50 22 45 10 1 11 24 3 33 17 58 40 26 36 16 4

5 8 9 15 20 31 50 55 65

2 12 25 29 30 32 47 52 63 1 3 10 11 17 22 24 33 45 4 16 26 36 40 58

2 5 8 9 12 15 20 25 29 30 31 32 47 50 52 55 63 65 1 3 4 10 11 16 17 22 24 26 33 36 40 45 58

1 2 3 4 5 8 9 10 11 12 15 16 17 20 22 24 25 26 29 30 31 32 33 36 40 45 47 50 52 55 58 63 65

(17)

External Sort-Merge (Cont.)

ƒ Cost analysis:

T l b f i d ⎡l (b /M)⎤

à Total number of merge passes required: ⎡logM–1(br/M)⎤.

à Disk accesses for initial run creation (as well as in each pass) is 2br

‚ for final pass, we don’t count write cost p ,

 we ignore final write cost for all operations since the output of an operation may be sent to the parent operation without being written to disk

Thus total number of disk accesses for external sorting:

2br + 2br ⎡logM–1(br /M)⎤ - br

/ )⎤

= br( 2 ⎡logM–1(br /M)⎤ + 1)

(18)

Join Operation

ƒ Several different algorithms to implement joins

N d l j i

à Nested-loop join

à Block nested-loop join à Indexed nested-loop joinIndexed nested loop join à Merge-join

à Hash-join

ƒ Choice based on cost estimate

ƒ Running Example

à Number of records of customer: 10,000 depositor: 5000 à Number of blocks of customer: 400 depositor: 100

(19)

Nested-Loop Join

ƒ To compute the theta join r

θ

s

for each tuple tr in r do for each tuple tp ss in s do

if pair (tr, ts) satisfy the join condition θ add trts to the result.

à r is called the outer relation à s the inner relation

ƒ Requires no indices and can be used with any kind of join condition.

ƒ Expensive since it examines every pair of tuples in the two

relations.

(20)

Nested-Loop Join (Cont.)

ƒ Worst case

h i l h ld bl k f h l i

à there is memory only to hold one block of each relation

n

r

∗ b

s

+ b

r

disk accesses.

ƒ Best case

à the smaller relation fits entirely in memory: use it as the inner relation br + bs

ƒ Example

à Worst case

‚ 5,000 ∗ 400 + 100 = 2,000,100 (depositor as outer relation)

‚ 1,000 ∗ 100 + 400 = 1,000,400 (customer as the outer relation), , , ( ) à If smaller relation (depositor) fits entirely in memory

‚ 100 + 400 = 500

(21)

Block Nested-Loop Join

ƒ Variant of nested-loop join

bl k f i l i i i d i h bl k f l i

à every block of inner relation is paired with every block of outer relation

for each block B

r

of r do

f h bl k B f d

for each block B

s

of s do for each tuple t

r

in B

r

do

f h l i B d

for each tuple t

s

in B

s

do

if (t

r

, t

s

) satisfy the join condition

h dd h l

then add t

r

t

s

to the result

ƒ Worst case estimate

à Each block in the inner relation s is read once for each block in the outer relation

à brr ∗ bss + brr

(22)

Block Nested-Loop Join (Cont.)

ƒ Best case

à b

r

+ b

s

block accesses.

ƒ Improvements

à Use M-2 disk blocks as blocking unit for outer relations, and remaining two blocks to buffer inner relation and output

‚ Cost =

⎡b

r

/ (M-2)⎤

b

s

+ b

r

à If equi-join attribute forms a key of inner relation, stop inner q j y , p loop on first match

à Scan inner loop forward and backward alternately, to make use

of the blocks remaining in buffer (with LRU replacement)

(23)

Block Nested-Loop Join - Example

5 q 7 r

M=4

r s

5 q 7 r

5 q 7 r

5 q 7 r

5 q 7 r

5 q 7 r

5 q 7 r

5 q 7 r

A 3 B 10 C 8

5 q 7 r 6 s

6 s 1 t A 3

6 s 1 t C 8

6 s 1 t E 1

6 s 1 t G 8

6 s 1 t I 7 J 2

6 s 1 t K 4

6 s 1 t M 3

6 s 1 t O 9 C 8

D 6 E 1 F 9

6 s 1 t 6 u 1 v

B 10 D 6

D 6 s

F 9 D 6 s E 1 t

H 2 J 2

I 7 r

L 5 I 7 r L 5 q

N 6 N 6 s

P 1 N 6 s P 1 t G 8

H 2 I 7 J 2

2 w 3 x 4 y 8 z

6 u 1 v

6 u 1 v

6 u 1 v

6 u 1 v

6 u 1 v

6 u 1 v

6 u 1 v

6 u 1 v J 2

K 4 L 5 M 3

8 z

2 w 3 x A 3

2 w 3 x C 8

2 w 3 x E 1

2 w 3 x G 8

2 w 3 x I 7

2 w 3 x K 4

2 w 3 x M 3

2 w 3 x O 9 N 6

O 9 P 1

B 10 A 3 x

D 6 A 3 x D 6 u

F 9 E 1 v

H 2 E 1 v H 2 w

J 2 J 2 w

L 5 J 2 w

N 6 J 2 w M 3 x

P 1 N 6 u P 1 v

* Red blocks are flushed

(24)

Block Nested-Loop Join – Example (cont.)

4 y 4 y 4 y 4 y 4 y

8

4 y 4 y 4 y

M=4

r s

8 z 8 z 8 z 8 z 8 z

I 7

8 z 8 z 8 z

A 3 B 10 C 8

5 q 7 r

6 A 3

B 10

C 8 D 6 C 8 z

E 1 F 9 C 8 z

G 8 H 2 C 8 z G 8 z

I 7 J 2

K 4 L 5 K 4 y

M 3 N 6 K 4 y

O 9 P 1 K 4 y C 8

D 6 E 1 F 9

6 s 1 t 6 u

1 v G 8 z

* Red blocks are flushed

G 8 H 2 I 7 J 2

2 w 3 x 4 y 8

D 6 s I 7 r N 6 s A 3 x E 1 v J 2 w N 6 u C 8 z K 4 y

Final result

J 2 K 4 L 5 M 3

8 z

E 1 t L 5 q P 1 t D 6 u H 2 w M 3 x P 1 v G 8 z

y N 6

O 9 P 1

(25)

Hash-Join

ƒ Applicable for equi-joins and natural joins.

ƒ A hash function h is used to partition tuples of both relations

ƒ h maps JoinAttrs values to {0, 1, ..., n}, where JoinAttrs denotes the

ib f d d i h l j i

common attributes of r and s used in the natural join.

à r0, r1, . . ., rn : partitions of r

‚ Each tuple tEach tuple tr ∈∈ r is put in partition rr is put in partition rii where i = h(t [JoinAttrs])where i h(tr [JoinAttrs]).

à s0, s1, . . ., sn : partitions of s

ƒ r tuples in r p

ii

need to be compared with s tuples in s p p

ii

only y

(Note: In the textbook, ri is denoted as Hri si is denoted as Hsi and n is denoted

( , i ri, i si,

as nh. )

(26)

Hash-Join - Example

(27)

Hash-Join Algorithm

1. Partition the relation s using hashing function h.

Wh i i i l i bl k f i d h

à When partitioning a relation, one block of memory is reserved as the output buffer for each partition.

2. Partition r similarly.

2. Partition r similarly.

3. For each i:

(a) Load si into memory (a) Load si into memory

‚ build an in-memory hash index on it using the join attribute (using a different hash function than the earlier one h)

(b) R d th t l i f th di k b

(b) Read the tuples in ri from the disk one by one

‚ For each tuple tr locate each matching tuple ts in si using the in-memory hash index. Output the concatenation of their attributes.

ƒ Relation s is called the build input and

r is called the probe input

(28)

Hash-Join algorithm (Cont.)

ƒ The value n and the hash function h is chosen such that each s

i

sho ld fit in memor

should fit in memory.

ƒ Hash-table overflow occurs in partition s

i

if s

i

does not fit in memory

memory.

à Many tuples in s with same value for join attributes or bad hash function à Overflow resolution can be done in build phasep

‚ Partition si is further partitioned using different hash function.

‚ Partition ri must be similarly partitioned.

F il i h l b f d li

à Fails with large numbers of duplicates

‚ Fallback option: use block nested loops join on overflowed partitions

(29)

Cost of Hash-Join

ƒ Cost of hash join (without recursive partitioning) 3(b + b ) + 4n (4n can be ignored)

3(br bs) 4n (4n can be ignored)

à If the entire build input can be kept in main memory, n can be set to 0 and the algorithm does not partition the relations into temporary files. Cost estimate goes down to b + b

down to br + bs.

ƒ Example:

à customer ⋈ depositor

à memory size: 20 blocks; bdepositor= 100 and bcustomer = 400.

à depositor is build input

à Partition it into 6 partitions each of size less than 20 blocks

à Partition it into 6 partitions, each of size less than 20 blocks

à Similarly, partition customer into 6 partitions each of size about 70

à Therefore total cost:

3(100 + 400) + 4*6 = 1,524 block transfers

à ignores cost of writing partially filled blocks

(30)

Evaluation of Expressions

ƒ So far, we have seen algorithms for individual operations

ƒ How do you evaluate an entire expression tree?

ƒ Materialization

à generate results of an expression whose inputs are relations or are already computed, materialize (store) it on disk. Repeat.

ƒ Pipelining

ƒ Pipelining

à pass on tuples to parent operations even as an operation is being executed

(31)

Materialization

ƒ Evaluate one operation at a time, starting at the lowest-level.

ƒ Use intermedi te res lts t ri liz d into t por ry r l tions to e l te ne t le el

ƒ Use intermediate results materialized into temporary relations to evaluate next-level operations.

ƒ E.g., in figure below

)

g g (

à compute and store

à then compute and store its join with customer

à and finally compute the projections on customer name )

2500(account

balance<

σ

à and finally compute the projections on customer-name.

(32)

Materialization (Cont.)

ƒ Materialized evaluation is always applicable

ƒ Cost of writing results to disk and reading them back can be quite high

à O r cost form las for operations ignore cost of writing res lts to disk à Our cost formulas for operations ignore cost of writing results to disk à Overall cost = Sum of costs of individual operations +

cost of writing intermediate results to disk

(33)

Pipelining

ƒ Evaluate several operations simultaneously

simultaneously

à passing the results of one operation on to the next.

ƒ E.g., in expression tree

à don’t store result of selection à instead, pass tuples directly to

the join

à Similarly don’t store result of à Similarly, don t store result of

join but pass tuples directly to projection

(34)

Pipelining (cont.)

ƒ Much cheaper than materialization

d l i di k

à no need to store a temporary relation to disk.

ƒ Pipelining may not always be possible

à e g sort hash join: m st wait for entire inp t to materialize à e.g., sort, hash-join: must wait for entire input to materialize à Very difficult to achieve a lengthy chain of pipeline

ƒ For pipelining to be effective For pipelining to be effective

à use evaluation algorithms that generate output tuples even as tuples are received for inputs to the operation

(35)

END OF CHAPTER 13

참조

관련 문서

지금 공장이 포화상태로 현재 공장으로는 도저히 제품을 생 산하는데 한계가 있어, 회사 옆 부지에 새로운 제2공장 을

You can use Cost Explorer to see patterns in how much you spend on AWS resources over time, identify areas that need further inquiry, provide recommendations for

The comparison reveals cost saving by 19.2% compared with conventional distribution cost for one head of Ansim Farm Hanu (Korean cattle). Therefore, the estimated distribution

생산요소에 대한 차별적 접근- 접근비용이 낮은 경우 (Differential low cost access to productive inputs) -생산요소(productive input): 생산 및 기업활동을 위해 필요한

Interagency Working Group on Social Cost of Carbon(2015), Technical Support Document:- Technical Update of the Social Cost of Carbon for Regulatory

[18] performed a similar study on support material and observed that the high loading of Pt-Ru (80 wt%) alloy electrocatalyst sup- ported on acetylene black exhibited

government’s assessment of the project’s technical feasibility, time for completion, and cost.. 401.649 Cost Planning for Construction Projects 10.  Scope of work at

– Crew cost: varies by composition of junior and senior members as well as size; also equipment. – Time (duration): by