Suppose that instead of relations customer, purchased and flyrod, the flyshop uses a database design with only one relation called purchases.
Purchases-scheme = (name, address, cust-num, flyrod-stock-num, date, manufacturer, length, line-weight)
The relation purchases is equivalent to: customer purchased flyrod
Problems associated with this design:
Suppose a customer hasn't purchased a flyrod (yet). Options:
Appropriate decomposition often helps solve problems associated with repitition of information and inability to store information.
Is more decomposition helpful always helpful?
Decompose flyrod into relations make and model:
make =
model =
If the flyrod relation is:
manufacturer | length | line-weight | flyrod-stock-num |
---|---|---|---|
G. Loomis | 8.5 | 6 | 1 |
Orvis | 9.5 | 9 | 2 |
Thomas and Thomas | 9.5 | 3 | 4 |
then:
flyrod-stock-num | manufacturer | length |
---|---|---|
1 | G. Loomis | 8.5 |
2 | Orvis | 9.5 |
4 | Thomas and Thomas | 9.5 |
length | line-weight |
---|---|
8.5 | 6 |
9.5 | 9 |
9.5 | 3 |
If we want to know the length of the flyrod make with flyrod-stock-num of 3, we need to compute: make model
flyrod-stock-num | manufacturer | length | line-weight |
---|---|---|---|
1 | G. Loomis | 8.5 | 6 |
2 | Orvis | 9.5 | 9 |
2 | Orvis | 9.5 | 3 |
4 | Thomas and Thomas | 9.5 | 9 |
4 | Thomas and Thomas | 9.5 | 3 |
Now flyrod-stock-num is no longer a superkey!
Note that flyrod make model, but make model has tuples that are not in flyrod. We no longer know if the flyshop stocks one or two flyrod models. Even though make model has more tuples than flyrod, we've actually lost information that was contained in the flyrod relation!
A decomposition that results in loss of information is a lossy decomposition (or lossy-join decomposition). A decomposition that does not result in loss of information is a lossless-join decomposition.
Clearly, a database design should not have a lossy decomposition.
Note that if length line-weight or length {flyrod-stock-num, manufacturer}, then this decomposition would not be lossy (because all tuples in make model result from tuples of make and manufacturer that agree on length).
Formally, a set of relation schemes is a decomposition of a relation scheme R if:
R =
(all attributes of R appear in at least one )
If r(R), it is always true that:
...
because for any :
...
The decomposition of R is a lossless-join decomposition if for all relations r(R) (that is, the relations r satisfy all of the functional dependencies on R),
...
We can test for lossy decompositions as follows. Let be a decomposition of a relation scheme R with set of functional dependencies F. This decomposition is a lossless-join decomposition if at least one of:
Consider a relation scheme:
Cust-purchases-scheme = (name, address, cust-num, flyrod-stock-num, date)
with F = {cust-num name, address}
If we decompose Cust-purchases-scheme into:
Customer-scheme = (name, address, cust-num)
Purchased-scheme = (cust-num, flyrod-stock-num, date)
then Customer-scheme Purchased-scheme = {cust-num}
To show that this is a lossless-join decomposition, we need to show cust-num {name, address, cust-num} or cust-num {cust-num, flyrod-stock-num, date}
Proof (of cust-num {name, address, cust-num}):
Thus, this decomposition is a lossless-join decomposition.