ADVANCED DATABASES

36  Download (0)

전체 글

(1)

4541.564; Spring 2008

P f S L

ADVANCED DATABASES

Prof. Sang-goo Lee

(14:30pm: Mon & Wed: Room 302-208)

ADVANCED DATABASES

(2)

Syllabus

ƒ Text Books ƒ Exams (tentative dates)

y

Database System Concepts, 5th Edition, A.

Silberschatz, H. F. Korth, and S. Sudarshan, McGraw Hill 2006

( )

‚ Exam 1: 4/7 (Mon)

‚ Exam 2: 5/19 (Mon)

‚ Exam 3: 6/14 (Sat 14:30) McGraw Hill, 2006.

Relational Database Theory, Atzeni & De Antonellis, Benjamin/Cummings, 1993.

‚ Exam 3: 6/14 (Sat, 14:30)

ƒ Term Project/Paper

‚ 2~3 programming assignments (Chap 2)

Database Systems, Atzeni, et al, McGraw Hill, 2000. (Chap 6~7)

p g g g

‚ To be announced later

ƒ Grades

2000. (Chap 6 7)

ƒ Lecture Notes

à will be posted before class at

‚ Exams 1, 2, & 3: 20% each

‚ Term Project/Paper: 30% total

‚ Quizzes, Assignments, http://europa.snu.ac.kr

à username & password required

à Please use only for personal use

Q , g ,

and others: 10%

** A score of 0 in f th

y p

• any one of the exams,

• any one of your term project, term paper, or

• more than 50% of your

Advanced DB (2008-1)

y

assignments/quizzes

will result in F.

(3)

REVIEWS

REVIEWS

(4)

Data, Database ,

ƒ Data

à A formal description of

‚ an entity, event, phenomena, or idea

‚ that is worth recording

ƒ Database

à An integrated collection of

à persistent data

à representing the information of interest

à representing the information of interest

à for various programs that compose the computerized information system of an organization.

à Data are separated from the programs that use them

Advanced DB (2008-1)

(5)

DBMS

ƒ Database Management System g y

à Collection of interrelated data and

à a set of programs to access those data

ƒ Information System

à DB + DBMS + Application programs + utilities pp p g

ƒ File System

à Part of OS Part of OS

à Stores programs, data, documents, or anything

à (in disk) (in disk)

(6)

Data Independence p

ƒ ability to modify a schema in one level without affecting a y y g schema definition in the next higher level

ƒ physical data independence: physical data independence:

à physical level - conceptual level

ƒ logical data independence: logical data independence:

à conceptual level - view level

View 1 ‥‥ View n

Customer Account

Conceptual Level

Customer Account

i t t bl ?

Physical Level

pointer or table?

Advanced DB (2008-1)

(7)

Instances and Schemas

ƒ Similar to types and variables in programming languages yp p g g g g

ƒ Schema – the logical structure of the database

à e.g., the database consists of information about a set of customers and accounts g , and the relationship between them)

à Analogous to type information of a variable in a program

à Physical schema: database design at the physical level

à Logical schema: database design at the logical level

I h l f h d b i l

ƒ Instance – the actual content of the database at a particular point in time

A l h l f i bl

à Analogous to the value of a variable

(8)

Data Models

ƒ The underlying structure of a database y g

ƒ Collection of conceptual tools for describing

à data data

à data relationships

à data semantics

à consistency constraints

ƒ Entity Relationship Model Entity Relationship Model

ƒ Relational Model

Obj O i d M d l

ƒ Object Oriented Model

Advanced DB (2008-1)

(9)

Database Languages

ƒ Data Definition Language (DDL)

g g

g g ( )

à Specifies the DB Schema

‚ create table

‚ drop column

ƒ Data Manipulation Language (DML)

à Operate on the contents of the DB

‚ retrieve, insert, delete, change, etc.

ƒ Query

à a statement requesting the retrieval of information

à query language: part of DML

à data model dependent

(10)

Storage Management g g

ƒ DBMS must effectively and efficiently manage storage (disk) space

ƒƒ Storage manager Storage manager

à a program module that

à provides the interface between

à the low-level data stored in the database and

à th ppli ti n pr r m nd q ri bmitt d t th t m

à the application programs and queries submitted to the system

ƒ Physical Storage

à Physical storage media hierarchy

à Physical storage media hierarchy

à RAID

à Storage Access g

à File Organization

à Storage Structures for Object-Oriented Databases

Advanced DB (2008-1)

(11)

DB users

ƒ DBA (DB Administrator)

à schema definition

à storage structure, access method definition h & h i l i i

à schema & physical organization

à security & authorization

à backup and recovery backup and recovery

ƒ Application programmers

ƒ Sophisticated users Sophisticated users

à use DML

ƒ Naïve users Naïve users

à Use interfaces provided by application programs

(12)

Overall Application interface

Application programs

query Database

scheme

Overall

System DML

il

DDL i t t

Structure

Application programs

Embedded DML precompiler

compiler interpreter

Query object code

Query evaluation

engine

processor

Transaction Buffer

Transaction manager

Buffer manager

Storage File

manager

manager

Indices Statistical data

Advanced DB (2008-1)

Data files

Data dictionary

(13)

Entity Relationship Model

ƒ entity

y p

y

à entity set vs. entity (instance)

à weak entity sets

ƒ relationship

à relationship cardinality p y

à binary, ternary, n-ary

ƒ attribute attribute

à multivalued attributes, derived attributes

ƒ generalization/specialization

ƒ generalization/specialization

à total-partial, exclusive-overlap

i

ƒ aggregation

(14)

ER Diagram g

Advanced DB (2008-1)

(15)

Relational Model

ƒ E. F. Codd, "a Relational Model of Data for Large Shared Data g Banks," Communications of the ACM, June 1970, pp.377-387.

ƒ Uses a single structure called relation Uses a single structure called relation

ƒ Set (& math) oriented model

Ph i l d i d d

ƒ Physical data independence

ƒ Definition of a relation R:

Let D 1 , ..., D n be domains, then R ⊂ D 1 , X ….. X D n

R = { <d 1 , ..., d n > | d 1D 1 , ..., d nD n }

(16)

Relations and Tables

ƒ a tuple in a relation represents relationship among set of values p p p g

ƒ Implemented as tables

Relation R = { <a 1 , b 1 , c 1 > <a 2 , b 2 , c 2 > <a , b , c > } Relation R { <a 1 , b 1 , c 1 >, <a 2 , b 2 , c 2 >, , <a n , b n , c n > }

=> Table R

Name Address Telephone

column (field, attribute)

row (record, tuple)

HS Kim Suwon 323-3232

KS Lee Busan 323-5454

Name Address Telephone

ow ( eco d, p e)

KS Lee Busan 323 5454

MH Choi Seoul 553-3235

KH Na Yongin 545-5488

Advanced DB (2008-1)

(17)

Relational Database

ƒ A relational database

à a set of relations

à a collection of tables

ƒ Keys y

à superkey, candidate key, and primary key

à keys are constraints on allowable relation instances for a given schema y g

(18)

Relational Algebra

ƒ Query language

g

Q y g g

à Network, Hierarchical : navigational language

à Relational

‚ relational algebra

‚ relational calculus

‚ SQL

‚ QUEL

à Algebra : operators and operands

à Relational algebra

‚ operands : relations

‚ operators : fundamental operators + additional operators

Advanced DB (2008-1)

(19)

Formal Definition

ƒ A basic expression in the relational algebra consists of either one of the following:

à A relation in the database

A l

à A constant relation

ƒ Let E 1 and E 2 be relational-algebra expressions; the following are ll l ti l l b p i

all relational-algebra expressions:

à E 1 ∪ E 2

à E 1 – E 2

à E 1 x E 2

à σ p (E 1 ), P is a predicate on attributes in E 1

à ∏ s (E 1 ), S is a list consisting of some of the attributes in E 1

ρ (E ), N is the new name for the result of E

(20)

Additional Operations p

We define additional operations that do not add any power to the p y p relational algebra, but that simplify common queries.

ƒ Set intersection

ƒ Natural join Natural join

ƒ Division A i

ƒ Assignment

Advanced DB (2008-1)

(21)

SQL Q

ƒ Structured Query Language Structured Query Language

ƒ IBM's System R project : Sequel

ƒ RC-based: SQL is declarative

D DD

ƒ DML & DDL

à SELECT / UNION / INTERSECT / EXCEPT

à INSERT / DELETE / UPDATE

à CREATE / DROP / ADD

à CREATE / DROP / ADD

(22)

Integrity Constraints g y

ƒ Integrity Constraints (IC) are rules that the data in the DB must abide by

ƒ IC defines the semantics of the DB

ƒ Domain Constraints

à restricts the values of a column

ƒ Referential Integrity

à Foreign Key Constraint

à Let r1(R1), r2(R2) be relations with primary keys k1 & k2, respectively. α⊆R2 is a foreign key referencing k1 if it is required that for every t2∈r2 there must be a foreign key referencing k1 if it is required that for every t2∈r2, there must be a tuple t1∈r1 such that t1[k1] = t2[α]

à Π α (r2) ⊆ Π k1 (r2)

Advanced DB (2008-1)

(23)

Integrity Constraints g y

ƒ Assertion

à a general constraint expressed as ∀x P(x)

‚ but in SQL as ∃x (¬P(x)) Q ( ( ))

CREATE ASSERTION sum-constraint CHECK (NOT EXISTS (SELECT * FROM branch

WHERE (SELECT SUM( t) FROM l WHERE (SELECT SUM(amount) FROM loan

WHERE loan.b_name=branch.b_name)

>= (SELECT SUM(amount) FROM account WHERE account.b_name=branch.b_name)))

ƒ Trigger

à Action tied to a DB event (insert/delete/update)

DEFINE TRIGGER overdraft ON UPDATE OF account T b

(IF NEW T.balance < 0

(24)

Functional Dependencies p

ƒ Basic Concept

R: relation scheme. Let α⊆R, β⊆ R Functional dependency α→β holds on R,

if i l l l i (R) f ll i f l 1 d 2 i

if in any legal relation r(R), for all pairs of tuples t1 and t2 in r, if t1[α]=t2[α] then t1[β]=t2[β]

ƒ Keys

ƒ Keys

ƒ Trivial FD

ƒ I f R l

ƒ Inference Rules

à Reflexivity, Transitivity, Augmentation

ƒ Closure & Cover

ƒ Closure & Cover

Advanced DB (2008-1)

(25)

Enforcing IC g

ƒ DB states

à DB state = DB instance

à DB is in a valid state if it satisfies all Integrity constraints

à All assertions are tested when created

à Modification to db is only allowed if it does not cause violation

DB Update DB

state 1 state 2

t1

(26)

Object-Oriented Data Model j

ƒ Object Structure Object Structure

à an object corresponds to an entity in ER model

à an object encapsulates data and behavior j p

à all interactions to an object are made via messages.

An object has

à variables : contain data for object (attribute in ER model)

à messages : interaction with rest of the world

à methods : procedures that implement the messages

Advanced DB (2008-1)

(27)

Classes & Inheritance

ƒ Object Class

à a group of similar objects

à same messages, methods, variables

⇒ l bj (i ) i i (i )

⇒ class : object (instance) = entity set : entity (instance)

ƒ Inheritance

Person

Employee Customer

Officer Teller Secretary

ƒ Complex (Composite) Objects

à Objects that contain other objects

à Aggregation (containment) hierarch

à Aggregation (containment) hierarchy

(28)

Object Identity & Persistence

ƒ Object Identity

j y

j y

à identity in relational model - key (by value)

à identity in file system - file name (by name)

à identity in OO system - file generated id (built-in)

‚ An object retains its identity even if some or all variables or definitions of

h d h i

methods change over time

ƒ Storage and access of persistent objects

à How do you store methods?

à How do you find objects in a database?

H d i bj ?

à How do you store a composite object?

Advanced DB (2008-1)

(29)

Object-Relational Database j

ƒ Extended relational model to support pp

à Nested relations

à Complex types

à Object orientation

ƒ Most commercial DBMS claim to be OR

à Oracle, DB2, Informix, …

(30)

Nested Relation

ƒ Relational model

à First Normal Form: all attributes have atomic domains

ƒ Nested relational model Nested relational model

à Domains may be atomic or relation valued

‚ tuple (complex structure) p ( p )

‚ set (multiset)

‚ list (special set)

Advanced DB (2008-1)

(31)

Querying with Complex Types

Q y g p yp

ƒ Relation valued attributes

à An expression evaluating to a relation

à can appear anywhere that a relation name my appear

Select B.name, Y.name

from docs as B B author list as Y from docs as B, B.author_list as Y

ƒ Path expression (dot notation) is used

à composite attributes / references

create table phd_students

( d i f( l )) d l

(advisor ref(people)) under people

select phd_students.advisor.name

from phd_students

(32)

OO vs OR

ƒ OR

à declarative and limited power of (extended) SQL (compared to PL)

à data protection and good optimization

d h R l d l k d li d i i

à extends the Rel. model to make modeling and querying easier

ƒ OO

à ffi i nt in mpl m in m m p r ti n f p r i t nt d t

à efficient in complex main mem. operations of persistant data

à susceptible to data corruption

ƒ Relational Relational

à simple data types, good query language, high protection

Advanced DB (2008-1)

(33)

Indexingg

ƒ Ordered Indices

ƒ B+-Tree Index Files

ƒ B Tree Index Files

ƒ B-Tree Index Files

ƒ Static Hashing

ƒ Dynamic Hashing

ƒ Comparison of Ordered Indexing and Hashing p g g

ƒ Index Definition in SQL

ƒ Multiple Key Access

ƒ Multiple-Key Access

(34)

Query Processing & Optimization

Q y g p

ƒ Measures of Query Cost Q y

ƒ Evaluation of individual operations

à Select sort join Select, sort, join

ƒ Evaluation of Expressions

E i l f i

ƒ Equivalence of expressions

ƒ Cost based optimization

Advanced DB (2008-1)

(35)

Transactions

ƒ Transaction

à a collection of operations that performs a single logical function in a database application

à programmer is responsible for writing “correct” transactions

ƒ DBMS must ensure the atomicity and durability of each transaction (at least)

à Atomicity: all-or-nothing

à Consistency: should not introduce inconsistencies

à Isolation: should not interfere with each other

à Durability: effect should be persistent

(36)

Concurrency Control & Recovery y y

ƒ Serializabilityy

ƒ Lock-based protocols

ƒ Time stamp based protocols

ƒ Time-stamp based protocols

ƒ Multiple granularity

ƒ Log-Based Recovery

ƒ Shadow Paging g g

Advanced DB (2008-1)

수치

Updating...

참조

관련 주제 :