Course Notes for Comp 419 - Relational Database Design

Relational Database Design

(read 7.1 - 7.3)

Why Some Designs Are Bad

Suppose that instead of relations customer, purchased and flyrod, the flyshop uses a database design with only one relation called purchases.

Purchases-scheme = (name, address, cust-num, flyrod-stock-num, date, manufacturer, length, line-weight)

The relation purchases is equivalent to: customer purchased flyrod

Problems associated with this design:

repitition of information
- customer names and addresses are repeated for each purchase by the same customer. This wastes storage and makes updates more difficult - there is more chance for error, and updates take longer.
- similarly, information for each flyrod model must be repeated each time one is purchased
- note that customer and flyrod information is independent. That is, cust-num {name, address}, but there are no functional dependencies between cust-num and any attributes from flyrod. Similarly, there are no functional dependencies between flyrod-stock-num and any attributes of customer.
inability to store information.
Suppose a customer hasn't purchased a flyrod (yet). Options:
- insert the customer information into purchases, with null values for all non-customer attributes. Problems:
  - null values cause problems in querying
  - won't work in SQL, because some primary key attributes would be null
- wait for the customer to purchase a flyrod, then insert the customer information into purchases. Problems:
  - the customer may never purchase a flyrod
  - we may lose the piece of paper that the customer information is written on.

Appropriate decomposition often helps solve problems associated with repitition of information and inability to store information.

Return to Table of Contents

Decomposition

Is more decomposition helpful always helpful?

Decompose flyrod into relations make and model:

make =
model =

If the flyrod relation is:

flyrod
manufacturer length line-weight flyrod-stock-num

G. Loomis 8.5 6 1

Orvis 9.5 9 2

Thomas and Thomas 9.5 3 4

flyrod
manufacturer	length	line-weight	flyrod-stock-num
G. Loomis	8.5	6	1
Orvis	9.5	9	2
Thomas and Thomas	9.5	3	4

then:

make
flyrod-stock-num manufacturer length

1 G. Loomis 8.5

2 Orvis 9.5

4 Thomas and Thomas 9.5

make
flyrod-stock-num	manufacturer	length
1	G. Loomis	8.5
2	Orvis	9.5
4	Thomas and Thomas	9.5

model
length line-weight

8.5 6

9.5 9

9.5 3

model
length	line-weight
8.5	6
9.5	9
9.5	3

If we want to know the length of the flyrod make with flyrod-stock-num of 3, we need to compute: make model

flyrod-stock-num manufacturer length line-weight

1 G. Loomis 8.5 6

2 Orvis 9.5 9

2 Orvis 9.5 3

4 Thomas and Thomas 9.5 9

4 Thomas and Thomas 9.5 3

flyrod-stock-num	manufacturer	length	line-weight
1	G. Loomis	8.5	6
2	Orvis	9.5	9
2	Orvis	9.5	3
4	Thomas and Thomas	9.5	9
4	Thomas and Thomas	9.5	3

Now flyrod-stock-num is no longer a superkey!

Note that flyrod make model, but make model has tuples that are not in flyrod. We no longer know if the flyshop stocks one or two flyrod models. Even though make model has more tuples than flyrod, we've actually lost information that was contained in the flyrod relation!

A decomposition that results in loss of information is a lossy decomposition (or lossy-join decomposition). A decomposition that does not result in loss of information is a lossless-join decomposition.

Clearly, a database design should not have a lossy decomposition.

Note that if length line-weight or length {flyrod-stock-num, manufacturer}, then this decomposition would not be lossy (because all tuples in make model result from tuples of make and manufacturer that agree on length).

Formally, a set of relation schemes is a decomposition of a relation scheme R if:

R =

(all attributes of R appear in at least one )

If r(R), it is always true that:

...

because for any :

...

The decomposition of R is a lossless-join decomposition if for all relations r(R) (that is, the relations r satisfy all of the functional dependencies on R),

...

We can test for lossy decompositions as follows. Let be a decomposition of a relation scheme R with set of functional dependencies F. This decomposition is a lossless-join decomposition if at least one of:

is logically implied by F (i.e., in

Consider a relation scheme:

Cust-purchases-scheme = (name, address, cust-num, flyrod-stock-num, date)

with F = {cust-num name, address}

If we decompose Cust-purchases-scheme into:

Customer-scheme = (name, address, cust-num)
Purchased-scheme = (cust-num, flyrod-stock-num, date)

then Customer-scheme Purchased-scheme = {cust-num}

To show that this is a lossless-join decomposition, we need to show cust-num {name, address, cust-num} or cust-num {cust-num, flyrod-stock-num, date}

Proof (of cust-num {name, address, cust-num}):

cust-num name, adress (given)
cust-num cust-num (reflexivity)
cust-num name, address, cust-num (1, 2, union rule)

Thus, this decomposition is a lossless-join decomposition.

Return to Table of Contents