Functional dependencies are useful for finding normal forms, which are criteria for good database designs. The process of putting a database design into a normal form is called normalization.
Boyce-Codd normal form (BCNF) is a very desirable form for relational database schemes because a relational database scheme in BCNF has minimal redundancy.
A relation scheme R is in BCNF with respect to a set of functional dependencies F if for all functional dependencies in where R and R (i.e. all functional dependencies in the restriction of to R), at least one of the following holds:
A database design is in BCNF if each relation scheme of the design is in BCNF.
Examples:
Consider Customer-scheme, Purchased-scheme, and Flyrod-scheme from our running example. F = {cust-num {name, address}, flyrod-stock-num {manufacturer, length, line-weight}}. To check that this design is in BCNF:
Thus, the database design is in BCNF.
Another example:
Let Customer-scheme = {cust-num, name, house-num, street, city, state, zip}, and F = {cust-num {name, house-num, street, city, state, zip}, {house-num, street, city, state} zip, zip state}.
Is this scheme in BCNF?
No, because {house-num, street, city, state} is not a superkey for Customer-scheme, and {house-num, street, city, state} zip is not trivial. A similar situation exists for zip state.
This design has the potential for repition, because zip will always be the same if the rest of the address is the same, and the state will always be the same if the zip is the same. One way to remove this redundancy is to decompose relation schemes that are not in BCNF.
The BCNF Decomposition Algorithm:
inputs: a relation scheme R and a set of functional dependencies F output: a lossless-join decomposition of R that is in BCNF
result := {R} compute while there is a relation scheme in result that is not in BCNF do choose such that is on and = {} and is not a superkey for result := (result - ) () () end
To see that result is a lossless-join decomposition, note that and ( ) ( ) is . Clearly, ( ).
To use this algorithm to decompose the modified Customer-scheme:
initially, result := {cust-num, name, house-num, street, city, state, zip}
choose zip state
The resulting decomposition is
{{cust-num, name, house-num, street, city, zip}, {zip, state}}. Let
= {cust-num, name, house-num, street, city, zip} and
= {zip, state}
Hence, this decomposition is in BCNF.
If we choose to apply {house-num, street, city, state} zip first, the first decomposition we consider is:
{{cust-num, name, house-num, street, city, state}, {house-num, street, city, state, zip}}.
Because zip state is still on the second scheme in the decomposition, it is not in BCNF. Hence, we decompose it again to produce:
{{cust-num, name, house-num, street, city, state}, {house-num, street, city, zip}, {zip, state}}.
This scheme is in BCNF. In this case, we would probably prefer the first decomposition (or we may even decide the price of avoiding the minor redundancy present here is too high and abandon BCNF).
In both of these decompositions, the functional dependency cust-num {name, house-num, street, city, state, zip} is not in the restriction of F to the decomposition (F'). Are these decompositions dependency preserving?
Proof: (that the first decomposition is not dependency preserving)
We need to show:
F' = {{zip state}}, and so in F' is just {cust-num}, while in F is {cust-num, name, house-num, street, city, state, zip}. Hence, .
In fact, neither decomposition is dependency preserving.
In relational database design, the usual goal is to achieve all of:
Clearly, this goal can not always be acheived, because some relation schemes can not be put into BCNF without giving up dependency preservation. In such a situation, we often prefer a weaker normal form that keeps dependency preservation.
A relation scheme R is in third normal form (3NF) with respect to a set of functional dependencies F if for all functional dependencies in that are in the restriction of to R, at least one of the following holds:
A relational database scheme is in 3NF when all relation schemes in the database scheme are in 3NF.
Note that the first two conditions are the same as for BCNF. Hence, every relation scheme that is in BCNF is also in 3NF. The reverse is not true, because the third condition permits some schemes that are not in BCNF.
Example: consider the decomposition:
{{cust-num, name, house-num, street, city, state}, {house-num, street, city, state, zip}}
with F = {cust-num {name, house-num, street, city, state, zip}, {house-num, street, city, state} zip, zip state}.
Is this decomposition in 3NF?
Characteristics of this decomposition:
{cust-num, name, house-num, street, city, state} {house-num, street, city, state, zip} = {house-num, street, city, state},
and {house-num, street, city, state} {house-num, street, city, state, zip}
The 3NF Decomposition Algorithm:
inputs: A set of functional dependencies F and a relation scheme R
output: a lossless join, dependency preserving decomposition in 3NF
Example: consider the scheme {cust-num, name, house-num, street, city, state, zip} with F = {cust-num {name, house-num, street, city, state, zip}, {house-num, street, city, state} zip, zip state}.
One canonical cover is = {cust-num {name, house-num, street, city, state}, {house-num, street, city, state} zip, zip state}.
All attributes in zip state occur in {house-num, street, city, state} zip, so the decomposition found in step two is: {{cust-num, name, house-num, street, city, state}, {house-num, street, city, state, zip}}
The first scheme in the decomposition contains a candidate key (cust-num), and the schemes in the decomposition have all of the attributes of the original relation. Hence, the algorithm is done. Note that this is much easier than checking that a decomposition is in 3NF directly.
For this example, another canonical cover = {cust-num {name, house-num, street, city, zip}, {house-num, street, city, state} zip, zip state}.
Applying the algorithm with this canonical cover yields the decomposition: {{cust-num, name, house-num, street, city, zip}, {house-num, street, city, state, zip}}
Hence, decomposition into 3NF is not unique.
Summary: the preferred database design is all of:
If this can not be achieved (because BCNF decompositions do not always preserve dependencies), the usual choice is: