Integrity Constraints

(read 6.1, 6.2, 6.5, 6.6)

An integrity constraint is an arbitrary predicate pertaining to a database. We've already seen two forms of integrity constraints:

As testing integrity constraints can be time consuming, care must be taken when deciding what kinds of integrity constraints to enforce. Many older database systems explicitly limit the constraints that can be enforced, but SQL-92 does not.

Domain Constraints

In the relational model, each attribute of a relation has an associated domain. A domain constraint simply asserts that each attribute value of an inserted tuple is an element of that attribute's domain (type). This kind of constraint is easily tested each time a tuple is inserted.

Domain information also allows queries to be checked for sensibility. For example, the query:

select *
from customer, flyrod
where cust-num = manufacturer

compares a string and a number. Such an error should be found when the query is entered, not when it is executed.

SQL allows some further domain constraints to be expressed in the create table command:

Examples using check:

A domain for manufacturers, assuming that the flyshop only stocks flyrods by particular manufacturers. Note that the check clause can be used in a create domain command, as well as in a create table command.

create domain manufacturer-type char(30)
check(value in
("Orvis", "G. Loomis", "Winston", "Sage"))

The keyword constraint can be used to give a name to such a constraint. This allows the DBMS to report which constraint was violated (by name) when an error occurs.

create domain manufacturer-type char(30)
constraint legal-manufacturers check(value in
("Orvis", "G. Loomis", "Winston", "Sage"))

In either case, the domain manufacturer-type would then be used as the domain of manufacturer in the flyrod relation.

Example: put bounds on flyrod lengths.

create domain flentype real
constraint flyrod-len check(value > 0 and value <= 20.0)

Example: make sure the flyrod-stock-num can't be null.

create domain fsntype int
constraint fsnkey check(value not null and value >= 0)

This kind of constraint is useful for primary key attributes.

Return to Table of Contents

Referential Integrity

In the relations purchased and flyrod, we expect that any flyrod-stock-num appearing in a tuple of purchased also appears in a tuple of flyrod. In relational algebra:

tex2html_wrap_inline87

On the other hand, there may well be flyrod models that no one has purchased. That is, we do not require that:

tex2html_wrap_inline89

Note that if:

tex2html_wrap_inline91

then some tuples of flyrod or purchased (or both) will not participate in flyrod tex2html_wrap_inline93 tex2html_wrap_inline95 purchased.

A tuple that does not participate in a natural join is called a dangling tuple. Dangling tuples may indicate a consistency problem in the database.

Referential integrity constraints specify exactly when dangling tuples indicate a problem. To define referential integrity, we need the concept of a foreign key.

Let tex2html_wrap_inline97 and tex2html_wrap_inline99 , with tex2html_wrap_inline101 the primary key of tex2html_wrap_inline103 and tex2html_wrap_inline105 . The set of attributes tex2html_wrap_inline107 is a foreign key referencing tex2html_wrap_inline101 if for every tuple tex2html_wrap_inline111 , there is a tuple tex2html_wrap_inline113 such that:

tex2html_wrap_inline115

Equivalently: tex2html_wrap_inline117

A constraint of this form is called a referential integrity constraint (or a subset dependency). Clearly, either tex2html_wrap_inline119 or tex2html_wrap_inline101 and tex2html_wrap_inline107 are compatible.

Example: attribute flyrod-stock-num of purchased is a foreign key referencing the flyrod-stock-num attribute of flyrod, and:

tex2html_wrap_inline87

is a referential integrity constraint.

Example: attribute cust-num of purchased is a foreign key referencing the cust-num attribute of customer, and:

tex2html_wrap_inline127

is a referential integrity constraint.

Example: Any relation derived from an n-ary relationship set (E-R model) contains n foreign keys.

Example: attribute flyrod-stock-num of flyrod is NOT a foreign key referencing the flyrod-stock-num attribute of purchased, because flyrod-stock-num is not the primary key of purchased.

Return to Table of Contents

Testing Referential Integrity Constraints

To ensure that referential integrity constraints are preserved, the constraints must be tested each time the database is modified.

Consider a referential integrity constraint of the form:

tex2html_wrap_inline117

Return to Table of Contents

Referential Integrity in SQL

When a table is created in SQL, the following constraints can be specified as part of the create table statement:

Example: creating the purchased relation:

create table purchased (
cust-num int not null,
flyrod-stock-num int not null,
date char(20) not null,
primary key (cust-num, flyrod-stock-num, date),
foreign key (cust-num) references customer,
foreign key (flyrod-stock-num) references flyrod)

The implied referential integrity constraints are:

tex2html_wrap_inline127

tex2html_wrap_inline87

If the attribute names are different in the foreign relation, they can be listed explicitly in the foreign key clause.

When a referential integrity constraint is violated, the default action is for the system to reject the modification that caused the violation. In SQL-92, a foreign key clause can specify that other actions are taken to resolve the violation. These actions are modifications to one or more relations.

Example: suppose that a deletion from the flyrod relation violates one of the referential integrity constraints for purchased. Rather than having the system report an error, we can specify in the create table statement that the offending tuples in purchased are deleted:

create table purchased (
...
foreign key (flyrod-stock-num) references flyrod
on delete cascade)

Now, deleting a tuple in flyrod causes all tuples in purchased that mention the deleted flyrod to be deleted as well.

Other possible actions when a referential integrity constraint is violated:

When a candidate key is specified using unique, the system will check that any tuple inserted does not duplicate an existing tuple on those attributes. Unless specifically stated using not null, attributes listed in a unique clause may be null.

Return to Table of Contents

Functional Dependencies

Functional dependencies are a class of integrity constraints that will be important in finding good database designs.

Let R be a relation scheme, and tex2html_wrap_inline177 , tex2html_wrap_inline179 . The functional dependency tex2html_wrap_inline181 holds on R if for all legal relations r(R), for all tuples tex2html_wrap_inline187 , tex2html_wrap_inline189 , if tex2html_wrap_inline191 , then tex2html_wrap_inline193 .

Example: for relations on Customer-scheme, each cust-num is associated with at most one name: cust-num tex2html_wrap_inline195 name

Example: if K is a superkey for relations on scheme R, then for any tex2html_wrap_inline179 , tex2html_wrap_inline203 . In the customer relation:

cust-num tex2html_wrap_inline195 {name, address, cust-num}

cust-num tex2html_wrap_inline195 {name, address}

and so on.

Example: if Customer-scheme had attributes house-num, street, city, state and zip instead of address:

{house-num, street, city, state} tex2html_wrap_inline195 zip

because each house is in only one zip code.

Example:

A B C
1 5 3
2 6 4
3 7 4
1 4 3

In this relation, does A tex2html_wrap_inline195 B hold?

From top to bottom, label the tuples tex2html_wrap_inline139 through tex2html_wrap_inline215 . Then, tex2html_wrap_inline139 [A] = tex2html_wrap_inline215 [A], but tex2html_wrap_inline139 [B] tex2html_wrap_inline223 tex2html_wrap_inline215 [B]. In other words, the answer is no, because the first and fourth tuples have the same value of A, but different values of B.

Does A tex2html_wrap_inline195 C hold?

Yes, because the only tuples that agree on A are tex2html_wrap_inline139 and tex2html_wrap_inline215 , and tex2html_wrap_inline139 [C] = tex2html_wrap_inline235 .

Does AB tex2html_wrap_inline195 C hold? (AB is shorthand for {A, B}.)

Yes, trivially, because tex2html_wrap_inline239 [AB] tex2html_wrap_inline223 tex2html_wrap_inline243 [AB] for tex2html_wrap_inline245 .

A trivial functional dependency tex2html_wrap_inline181 is a functional dependency in which tex2html_wrap_inline249 . Trivial f.d.s always hold, because if two tuples agree on all attributes in tex2html_wrap_inline107 , they can not disagree on a subset of tex2html_wrap_inline107 .

Functional dependencies are associated with a relation scheme, rather than with an instance. We can NOT assume that tex2html_wrap_inline181 is a f.d. on R just because it holds for some r(R). A f.d. must hold for all r(R).

Functional dependencies can be specified in SQL-92 (using the check clause), but this is not how they are usually used. Rather, f.d.s are usually used only in designing the schemes of the relations in the database.

Return to Table of Contents

Closure of a Set of Functional Dependencies

A given set of f.d.s may imply that other f.d.s hold. For example, if A tex2html_wrap_inline195 B and B tex2html_wrap_inline195 C, does A tex2html_wrap_inline195 C hold?

proof of A tex2html_wrap_inline195 B tex2html_wrap_inline271 B tex2html_wrap_inline195 C tex2html_wrap_inline275 A tex2html_wrap_inline195 C

Let tex2html_wrap_inline139 and tex2html_wrap_inline131 be any two tuples s.t. tex2html_wrap_inline139 [A] = tex2html_wrap_inline131 [A]. (If there are no such tuples, then A tex2html_wrap_inline195 C holds by default.)
Since tex2html_wrap_inline139 [A] = tex2html_wrap_inline131 [A], tex2html_wrap_inline139 [B] = tex2html_wrap_inline131 [B] by definition of A tex2html_wrap_inline195 B
Since tex2html_wrap_inline139 [B] = tex2html_wrap_inline131 [B], tex2html_wrap_inline139 [C] = tex2html_wrap_inline131 [C] by definition of B tex2html_wrap_inline195 C
Thus, if tex2html_wrap_inline139 [A] = tex2html_wrap_inline131 [A], then tex2html_wrap_inline139 [C] = tex2html_wrap_inline131 [C]
A tex2html_wrap_inline195 C by definition of functional dependency

The closure of a set of f.d.s F (denoted F tex2html_wrap_inline319 ) is the set of all f.d.s logically implied by the f.d.s of F.

Given a set of f.d.s F, F tex2html_wrap_inline319 can be computed using Armstrong's axioms. For sets of attributes tex2html_wrap_inline107 , tex2html_wrap_inline325 , tex2html_wrap_inline327

Theorem: Armstrong's axioms are sound and complete.

Soundness: if we conclude that tex2html_wrap_inline181 from a set of f.d.s F using Armstrong's axioms (correctly!), then F implies tex2html_wrap_inline181

Completeness: using only Armstrong's axioms, it is possible to compute all of F tex2html_wrap_inline319 from a set of f.d.s F. That is, we can find all f.d.s logically implied by F using only Armstrong's axioms.

Proof of soundness:

We've already argued that transitivity and reflexivity are sound.

Proof of Augmentation:

Assume tex2html_wrap_inline181 and tex2html_wrap_inline353 for arbitrary tuples tex2html_wrap_inline139 and tex2html_wrap_inline131 . To show: tex2html_wrap_inline359

From tex2html_wrap_inline353 , tex2html_wrap_inline191 and tex2html_wrap_inline365 by reflexivity.

By tex2html_wrap_inline181 and tex2html_wrap_inline191 , tex2html_wrap_inline193 .

From tex2html_wrap_inline193 and tex2html_wrap_inline365 , tex2html_wrap_inline359 by definition of [ ].

Thus, tex2html_wrap_inline337 by definition of f.d.

Proof of completeness: omitted.

To ease computing F tex2html_wrap_inline319 , a number of additional rules are useful. The soundness of these rules can be proved using Armstrong's axioms.

Proof of soundness: union rule. Given tex2html_wrap_inline181 and tex2html_wrap_inline343 , show tex2html_wrap_inline387

  1. tex2html_wrap_inline343 (given)
  2. tex2html_wrap_inline409 (1, augmentation)
  3. tex2html_wrap_inline411 (2, definition of a set)
  4. tex2html_wrap_inline181 (given)
  5. tex2html_wrap_inline415 (4, augmentation)
  6. tex2html_wrap_inline387 (3, 5, transitivity)

Example: Given R = {A, B, C, D, E, G} and F = {AB tex2html_wrap_inline195 C, BC tex2html_wrap_inline195 D, D tex2html_wrap_inline195 EG, BE tex2html_wrap_inline195 C}, is AB tex2html_wrap_inline195 EG in F tex2html_wrap_inline319 ?

  1. AB tex2html_wrap_inline195 C (given)
  2. BC tex2html_wrap_inline195 D (given)
  3. ABB tex2html_wrap_inline195 D (1, 2, psuedotransitivity with tex2html_wrap_inline107 = AB, tex2html_wrap_inline325 = C, tex2html_wrap_inline327 = B, tex2html_wrap_inline443 = D)
  4. AB tex2html_wrap_inline195 D (3, definition of set)
  5. D tex2html_wrap_inline195 EG (given)
  6. AB tex2html_wrap_inline195 EG (4, 5, transitivity)

Simpler examples:

by D tex2html_wrap_inline195 EG and decomposition, D tex2html_wrap_inline195 E and D tex2html_wrap_inline195 G

by D tex2html_wrap_inline195 E, BE tex2html_wrap_inline195 C and psuedotransitivity, BD tex2html_wrap_inline195 C

Return to Table of Contents

Closure of Attribute Sets

Given a set tex2html_wrap_inline107 of attributes of R and a set of f.d.s F, we need a way to find all of the attributes of R that are functionally determined by tex2html_wrap_inline107 . This set of attributes is called the closure of tex2html_wrap_inline107 under F and is denoted tex2html_wrap_inline469 . Finding tex2html_wrap_inline469 is useful because:

An algorithm for computing tex2html_wrap_inline469 :

 
result :=  tex2html_wrap_inline107 

repeat
temp := result
for each f.d. tex2html_wrap_inline341 in F do
if tex2html_wrap_inline489 result then
result := result tex2html_wrap_inline491
until temp = result

Example: Given R = {A, B, C, D, E, G} and F = {AB tex2html_wrap_inline195 C, BC tex2html_wrap_inline195 D, D tex2html_wrap_inline195 EG, BE tex2html_wrap_inline195 C}, compute AB tex2html_wrap_inline319

stage result using
1 AB
2 ABC AB -> C
3 ABCD BC -> D
4 ABCDEG D -> EG
5 ABCDEG

Thus, AB is a superkey for R, because AB tex2html_wrap_inline195 ABCDEG

Correctness of the algorithm:
Need to show soundness (result tex2html_wrap_inline505 ) and completeness ( tex2html_wrap_inline507 result)

Proof of soundness: by induction on iterations of the loop
basis: tex2html_wrap_inline509 by reflexivity
induction hypothesis: result at stage i, denoted resulttex2html_wrap_inline513 , satisfies resulttex2html_wrap_inline515
induction step: Consider a f.d. tex2html_wrap_inline341 where tex2html_wrap_inline489 resulttex2html_wrap_inline513 . Then resulttex2html_wrap_inline523 by reflexivity, and so resulttex2html_wrap_inline525 by transitivity. Hence, resulttex2html_wrap_inline525 resulttex2html_wrap_inline513 by the union rule. Since resulttex2html_wrap_inline515 by the I.H., tex2html_wrap_inline327 resulttex2html_wrap_inline515 by definition. As tex2html_wrap_inline327 resulttex2html_wrap_inline513 = resulttex2html_wrap_inline541 , resulttex2html_wrap_inline543 .

Proof of completeness ( tex2html_wrap_inline507 result)
(by contradiction) Let A tex2html_wrap_inline547 R be such that A tex2html_wrap_inline549 and A tex2html_wrap_inline551 result. Construct a relation r(R) with 2 tuples. The tuples agree on all attributes of result, but disagree on all other attributes of R.

lemma: r satisfies F
proof: (by contradiction) Assume r does not satisfy F. Then r must not satisfy some f.d. tex2html_wrap_inline341 in F. This can only happen if both tuples in r agree on all attributes of tex2html_wrap_inline325 and disagree on some attribute of tex2html_wrap_inline327 . If both agree on all attributes of tex2html_wrap_inline325 , then tex2html_wrap_inline489 result by the construction of r. But, if tex2html_wrap_inline489 result and tex2html_wrap_inline341 , then tex2html_wrap_inline567 result by the definition of the algorithm. However, the tuples disagree on some attribute of tex2html_wrap_inline327 , so tex2html_wrap_inline571 result by the construction of r. This is a contradiction, so the assumption that r does not satisfy F must be false, i.e. r satisfies F.
end of lemma

We have A tex2html_wrap_inline549 , so tex2html_wrap_inline575 A is logically implied by F. Thus, any relation that satisfies F satisfies tex2html_wrap_inline575 A.
Thus, r satisfies tex2html_wrap_inline575 A by the lemma.
Hence, the tuples of r agree on A by transitivity (since the tuples in r must certainly agree on tex2html_wrap_inline107 ).
Thus, A tex2html_wrap_inline547 result by the construction of r. This contradicts the original assumption, which must then be false. Hence, tex2html_wrap_inline507 result.

Correctness of the algorithm:
by soundness and completeness, result tex2html_wrap_inline587 result
Hence, result = tex2html_wrap_inline469

Return to Table of Contents

Canonical Cover of a Set of Functional Dependencies

If a relation must satisfy a set of f.d.s F, then we need to test that each f.d. is satisfied each time the relation is updated or tuples are inserted. For efficiency, we want the smallest set of f.d.s F tex2html_wrap_inline591 possible such that F tex2html_wrap_inline319 = F tex2html_wrap_inline595 .

A canonical cover F tex2html_wrap_inline591 of a set of f.d.s F is a set of f.d.s such that F tex2html_wrap_inline595 = F tex2html_wrap_inline319 and for each f.d. tex2html_wrap_inline181 in F tex2html_wrap_inline591

  1. removing any attribute A from tex2html_wrap_inline107 changes F tex2html_wrap_inline595 (no extraneous attributes in tex2html_wrap_inline107 )
  2. removing any attribute B from tex2html_wrap_inline325 changes F tex2html_wrap_inline595 (no extraneous attributes in tex2html_wrap_inline325 )
  3. there is no additional f.d. tex2html_wrap_inline619 in F tex2html_wrap_inline591 such that tex2html_wrap_inline107 = tex2html_wrap_inline625

Point 3 can be addressed using the union rule - combine tex2html_wrap_inline181 and tex2html_wrap_inline619 into tex2html_wrap_inline631

Points 1 and 2 must be checked directly. This is done by checking each f.d. in F for extraneous attributes, removing any such attributes found, combining f.d.s as in point 3 above where possible, and continuing this process until no more attributes can be removed.

Example: Given F = {CE tex2html_wrap_inline195 A, C tex2html_wrap_inline195 A, C tex2html_wrap_inline195 B, AB tex2html_wrap_inline195 C, A tex2html_wrap_inline195 B, B tex2html_wrap_inline195 A}, find F tex2html_wrap_inline591 .

Use the union rule to produce: {CE tex2html_wrap_inline195 A, C tex2html_wrap_inline195 AB, AB tex2html_wrap_inline195 C, A tex2html_wrap_inline195 B, B tex2html_wrap_inline195 A}

To check if E is extraneous in CE tex2html_wrap_inline195 A, we check that C tex2html_wrap_inline195 A in F tex2html_wrap_inline319 , and that if we replace CE tex2html_wrap_inline195 A by C tex2html_wrap_inline195 A, CE tex2html_wrap_inline195 A is still in the closure.

To check if C tex2html_wrap_inline195 A is in F tex2html_wrap_inline319 , we compute C tex2html_wrap_inline319 .

CAB - no need to go further as we have A

To check if CE tex2html_wrap_inline195 A is still in the closure, we compute CE tex2html_wrap_inline319 in the set of f.d.s: {C tex2html_wrap_inline195 A, C tex2html_wrap_inline195 AB, AB tex2html_wrap_inline195 C, A tex2html_wrap_inline195 B, B tex2html_wrap_inline195 A}

CEA - no need to go further as we have A

Hence, E is extraneous. The current set of f.d.s is: {C tex2html_wrap_inline195 A, C tex2html_wrap_inline195 AB, AB tex2html_wrap_inline195 C, A tex2html_wrap_inline195 B, B tex2html_wrap_inline195 A}

We can use the union rule again, producing: {C tex2html_wrap_inline195 AB, AB tex2html_wrap_inline195 C, A tex2html_wrap_inline195 B, B tex2html_wrap_inline195 A}

To check if the A is extraneous in C tex2html_wrap_inline195 AB, we check if C tex2html_wrap_inline195 B is in F tex2html_wrap_inline319 , and if C tex2html_wrap_inline195 AB is in F tex2html_wrap_inline319 if we replace it by C tex2html_wrap_inline195 B

Computing C tex2html_wrap_inline319 : CAB - no need to go further as we have B

To see if C tex2html_wrap_inline195 AB is in the closure of: {C tex2html_wrap_inline195 B, AB tex2html_wrap_inline195 C, A tex2html_wrap_inline195 B, B tex2html_wrap_inline195 A}

CBA - done

Hence, the A is extraneous, and the current set of f.d.s is: {C tex2html_wrap_inline195 B, AB tex2html_wrap_inline195 C, A tex2html_wrap_inline195 B, B tex2html_wrap_inline195 A}

To check if A is extraneous in AB tex2html_wrap_inline195 C, we check if B tex2html_wrap_inline195 C is in F tex2html_wrap_inline319

BAC - done

and if AB tex2html_wrap_inline195 C is in the closure of: {C tex2html_wrap_inline195 B, B tex2html_wrap_inline195 C, A tex2html_wrap_inline195 B, B tex2html_wrap_inline195 A}

ABC - done

Hence, A is extraneous, and the current set of f.d.s is: {C tex2html_wrap_inline195 B, B tex2html_wrap_inline195 C, A tex2html_wrap_inline195 B, B tex2html_wrap_inline195 A}

Apply the union rule: {C tex2html_wrap_inline195 B, A tex2html_wrap_inline195 B, B tex2html_wrap_inline195 AC}

To check if A is extraneous in B tex2html_wrap_inline195 AC:

Is B tex2html_wrap_inline195 C in F tex2html_wrap_inline319 ?

BAC - done

Is B tex2html_wrap_inline195 AC in the closure of: {C tex2html_wrap_inline195 B, A tex2html_wrap_inline195 B, B tex2html_wrap_inline195 C}?

BC - done

No, so A is NOT extraneous:

To check if C is extraneous in B tex2html_wrap_inline195 AC:

Is B tex2html_wrap_inline195 A in F tex2html_wrap_inline319 ?

BAC - yes!

Is B tex2html_wrap_inline195 AC in the closure of: {C tex2html_wrap_inline195 B, A tex2html_wrap_inline195 B, B tex2html_wrap_inline195 A}?

BA - done

No, so C is NOT extraneous. Hence,

F tex2html_wrap_inline591 = {C tex2html_wrap_inline195 B, A tex2html_wrap_inline195 B, B tex2html_wrap_inline195 AC}

Note that the canonical cover is NOT unique. Other possible values for F tex2html_wrap_inline591 for this particular F are:

{A tex2html_wrap_inline195 BC, B tex2html_wrap_inline195 A, C tex2html_wrap_inline195 A}

{C tex2html_wrap_inline195 B, A tex2html_wrap_inline195 C, B tex2html_wrap_inline195 A}

Any of these sets of f.d.s is much more efficient to test than the original F.

Return to Table of Contents