Functional dependencies are useful for finding **normal forms**, which
are criteria for good database designs. The process of putting a
database design into a normal form is called **normalization**.

Boyce-Codd normal form (BCNF) is a very desirable form for relational database schemes because a relational database scheme in BCNF has minimal redundancy.

A relation scheme R is in BCNF with respect to a set of functional
dependencies *F* if for all functional dependencies
in where R and
R (i.e. all functional dependencies in the restriction
of to R), at least one of the following holds:

- is a trivial functional dependency (i.e. )
- is a superkey for scheme R

A database design is in BCNF if each relation scheme of the design is in BCNF.

Examples:

Consider Customer-scheme, Purchased-scheme, and Flyrod-scheme from our
running example. *F* = {cust-num {name, address},
flyrod-stock-num {manufacturer, length, line-weight}}.
To check that this design is in BCNF:

- For scheme Customer-scheme, the only nontrivial functional dependencies in that are in the restriction to Customer-scheme have cust-num as a member of the left hand side of the . Since cust-num is clearly a superkey for Customer-scheme, this scheme is in BCNF.
- For scheme Purchased-scheme, only trivial functional in are in the restriction to Purchased-scheme. Hence, this scheme is in BCNF.
- For scheme Flyrod-scheme, the only nontrivial functional dependencies in that are in the restriction to Flyrod-scheme have flyrod-stock-num as a member of the left hand side of the . Since flyrod-stock-num is clearly a superkey for Flyrod-scheme, this scheme is in BCNF.

Thus, the database design is in BCNF.

Another example:

Let Customer-scheme = {cust-num, name, house-num, street, city, state, zip},
and
*F* = {cust-num {name, house-num, street, city, state, zip},
{house-num, street, city, state} zip,
zip state}.

Is this scheme in BCNF?

No, because {house-num, street, city, state} is not a superkey for Customer-scheme, and {house-num, street, city, state} zip is not trivial. A similar situation exists for zip state.

This design has the potential for repition, because zip will always be the same if the rest of the address is the same, and the state will always be the same if the zip is the same. One way to remove this redundancy is to decompose relation schemes that are not in BCNF.

**The BCNF Decomposition Algorithm:**

inputs: a relation scheme R and a set of functional dependenciesFoutput: a lossless-join decomposition of R that is in BCNF

result := {R} compute while there is a relation scheme in result that is not in BCNF do choose such that is on and = {} and is not a superkey for result := (result - ) () () end

To see that result is a lossless-join decomposition, note that and ( ) ( ) is . Clearly, ( ).

To use this algorithm to decompose the modified Customer-scheme:

initially, result := {cust-num, name, house-num, street, city, state, zip}

choose zip state

The resulting decomposition is
{{cust-num, name, house-num, street, city, zip}, {zip, state}}. Let
= {cust-num, name, house-num, street, city, zip} and
= {zip, state}

- For , the only nontrivial functional dependencies in that are in the restriction to have cust-num as a member of the left hand side. Since cust-num is a superkey for , this scheme is in BCNF.
- For , the only nontrivial functional dependencies in that are in the restriction to have zip as a member of the left hand side. Since zip is a superkey for , this scheme is in BCNF.

Hence, this decomposition is in BCNF.

If we choose to apply {house-num, street, city, state} zip first, the first decomposition we consider is:

{{cust-num, name, house-num, street, city, state}, {house-num, street, city, state, zip}}.

Because zip state is still on the second scheme in the decomposition, it is not in BCNF. Hence, we decompose it again to produce:

{{cust-num, name, house-num, street, city, state}, {house-num, street, city, zip}, {zip, state}}.

This scheme is in BCNF. In this case, we would probably prefer the first decomposition (or we may even decide the price of avoiding the minor redundancy present here is too high and abandon BCNF).

In both of these decompositions, the functional dependency
cust-num {name, house-num, street, city, state, zip} is not
in the restriction of *F* to the decomposition (*F*'). Are these
decompositions dependency preserving?

Proof: (that the first decomposition is not dependency preserving)

We need to show:

*F*' = {{zip state}}, and so in
*F*' is just {cust-num}, while in *F* is
{cust-num, name, house-num, street, city, state, zip}.
Hence, .

In fact, neither decomposition is dependency preserving.

In relational database design, the usual goal is to achieve all of:

- BCNF
- lossless join
- dependency preservation

Clearly, this goal can not always be acheived, because some relation schemes can not be put into BCNF without giving up dependency preservation. In such a situation, we often prefer a weaker normal form that keeps dependency preservation.

A relation scheme R is in third normal form (3NF) with respect to a set
of functional dependencies *F* if for all functional dependencies
in that are in the restriction of
to R, at least one of the following holds:

- is trivial
- is a superkey for R
- each attribute A is contained in a candidate key for R

A relational database scheme is in 3NF when all relation schemes in the database scheme are in 3NF.

Note that the first two conditions are the same as for BCNF. Hence, every relation scheme that is in BCNF is also in 3NF. The reverse is not true, because the third condition permits some schemes that are not in BCNF.

Example: consider the decomposition:

{{cust-num, name, house-num, street, city, state}, {house-num, street, city, state, zip}}

with
*F* = {cust-num {name, house-num, street, city, state, zip},
{house-num, street, city, state} zip,
zip state}.

Is this decomposition in 3NF?

- For {cust-num, name, house-num, street, city, state}, the only nontrivial functional dependences in that are on this scheme are those with cust-num as a member of the left hand side of the . As cust-num is a superkey for this scheme, these functional dependencies satisfy the second condition for 3NF.
- For {house-num, street, city, state, zip}, there are two kinds of
nontrivial functional dependencies in that are on this scheme:
- those with {house-num, street, city, state} as a subset of the left hand side of the . As {house-num, street, city, state} is a superkey for this scheme, such functional dependencies satisfy the second condition for 3NF.
- functional dependencies of the form {zip} {state}, where . For any such functional dependency, ( {state}) - ( {zip}) = {state}. Because state {house-num, street, city, state} which is a candidate key for this scheme, such functional dependencies satisfy the third condition for 3NF.

Characteristics of this decomposition:

- it is in 3NF.
- it is not in BCNF (we tested this earlier).
- it has more redundancy than our previous decomposition into BCNF
- it is a lossless join decomposition, because:
{cust-num, name, house-num, street, city, state} {house-num, street, city, state, zip} = {house-num, street, city, state},

and {house-num, street, city, state} {house-num, street, city, state, zip}

- it is not dependency preserving (proof?)

**The 3NF Decomposition Algorithm:**

inputs: A set of functional dependencies *F* and a relation scheme R

output: a lossless join, dependency preserving decomposition in 3NF

- compute (a canonical cover of
*F*) - for each functional dependency in , if for some other functional dependency in , then is a scheme in the decomposition
- if none of the schemes found in step 2 contains a candidate key for
R, choose any candidate key
*K*and add scheme*K*to the decomposition - for any attributes A R not contained in any scheme in the decomposition thus far computed, add {A} to the decomposition

Example: consider the scheme
{cust-num, name, house-num, street, city, state, zip}
with
*F* = {cust-num {name, house-num, street, city, state, zip},
{house-num, street, city, state} zip,
zip state}.

One canonical cover is = {cust-num {name, house-num, street, city, state}, {house-num, street, city, state} zip, zip state}.

All attributes in zip state occur in {house-num, street, city, state} zip, so the decomposition found in step two is: {{cust-num, name, house-num, street, city, state}, {house-num, street, city, state, zip}}

The first scheme in the decomposition contains a candidate key (cust-num), and the schemes in the decomposition have all of the attributes of the original relation. Hence, the algorithm is done. Note that this is much easier than checking that a decomposition is in 3NF directly.

For this example, another canonical cover = {cust-num {name, house-num, street, city, zip}, {house-num, street, city, state} zip, zip state}.

Applying the algorithm with this canonical cover yields the decomposition: {{cust-num, name, house-num, street, city, zip}, {house-num, street, city, state, zip}}

Hence, decomposition into 3NF is not unique.

Summary: the preferred database design is all of:

- BCNF
- lossless join
- dependency preservation

If this can not be achieved (because BCNF decompositions do not always preserve dependencies), the usual choice is:

- 3NF
- lossless join
- dependency preservation