• 검색 결과가 없습니다.

Chapter 5

N/A
N/A
Protected

Academic year: 2021

Share "Chapter 5"

Copied!
12
0
0

로드 중.... (전체 텍스트 보기)

전체 글

(1)

Data and File Structures Chapter 5

Managing Files of Records

(2)

Outline

• Record Access

• More About Record Structures

• File Access and File Organization

(3)

Record Access: Keys

• When looking for an individual record, it is convenient to identify the record with a key based on the record’s content.

• e.g., the Ames record.

• Keys should uniquely define a record and be unchanging.

• Primary key is the key that is used identify a record uniquely.

• Records can also be searched based on a secondary key.

• Generally, Secondary keys do not uniquely identify a record.

(4)

Sequential Search

• Sequential Search

– Look at records sequentially until matching record is found.

– Time is in O(n) for n records.

• Improving Sequential Search Performance with Record Blocking.

– We learned in chapter 3 that the major cost of a disk

access is a seeking time to the right location on the disk.

– We can improve the performance by reading in a block of several records  record blocking.

– But, the cost of searching is still O(n)

(5)

Sequential Search – cont’d

• Sequential access has two advantages.

– It is extremely easy to program.

– It requires the simplest of file structures.

• When is Sequential Search Useful?

– ASCII files in which you are searching for some pattern.

e.g., grep.

– Files with few records (e.g., 10 records).

– Files that hardly ever need to be searched (e.g., tape files).

– Files in which you want all records with a certain

secondary key values, where a large number of matches is expected

(6)

Direct Access

• With direct access, we can seek directly to the beginning of the record.

– Time is in O(1) for n records.

• How do we know where the beginning of the required record is?

– It may be in an Index.

• discussed in a different lecture.

– We know the relative record number (RRN).

• E.g., first record has RRN 0, the next has RRN 1, and so forth

(7)

Direct Access by RRN

• RRN is not useful when working with variable length-records.

– The access is still sequential.

– To get the record we want, we have to read

sequentially through the file, counting records as we go.

– Time is in O(n)

(8)

Direct Access by RRN – cont’d

• However, with fixed-length records, RRN is useful.

– If the records are all the same length, we can use a record’s RRN to calculate the byte offset of the

start of the record relative to the start of the file.

– E.g., RRN = 30, record length = 100 bytes:

• Byte offset = 30 * 100 = 3000

(9)

Record Structure

• For direct access, we have to fix the length of records.

• There are 2 approaches for a fixed-length record.

– Fixed-length Fields in a fixed-length record.

• simple but problematic.

– Variable-length Field in a fixed-length record.

• more efficient use of a fixed amount of space.

(10)

2 approaches to field structure

within a fixed length record

(11)

Record Structure – cont’d

• Header Records

– often used at the beginning of the file to hold

some general information about a file to assist in future use of the file.

– Useful to include information such as the length of records, the date and time of the file’s most recent update, the name of the file, and so on.

– Help make a file a self-describing object

(12)

File Access and File Organization: A Summary

• File organization depends on what use you want to make of the file.

• Since using a file implies accessing it, file access and file organization are intimately linked.

• Example:

– though using fixed-length records makes direct access easier,

– if the documents have very variable lengths, fixed-length records is not a good solution:

– the application determines our choice of both access and organization.

참조

관련 문서

the records have a unique identifier field or field combination called the primary

If the source of If the source of wave moves, the wave moves, the wave length of the wave length of the moving source will moving source will appear shorter or

It has been observed that different kinds of images have varying texture and structure complexions Therefore, the fixed patch size may not provide equally good results.

– even if task phase never make critical instant – execution times are smaller than the given values – inter-release time is longer than the given

Operation of gear, vane, piston pumps Flow rate delivered by pumps.. Flow rate

9 The BJT is a device containing three adjoining, alternately doped regions, with the middle region being very narrow compared to the diffusion length.. heavy

If two or more longest chains of the same length, the parent is the chain with the greatest number of subs..

§ Null-path-length balancing: comparing the null-path-length of each of the two sub-trees (the length to the closest null sub- tree/empty node). § Weight balancing: