SQL-92 provides various kinds of joins (including natural join) that can
be used in the from
clause of a query.
Example - compute the natural join of customer and purchased:
select *
from
(customer natural inner join
purchased)
An inner join is a join in which tuples from one or both of the argument relations may not participate in the result. The notion of natural join from relational algebra is a kind of inner join because, for example, any customer tuple of a customer who has not purchased a flyrod does not appear in any tuple of the result of the above query.
Example: Suppose we have relations customer2 and purchased2 as follows:
name | address | cust-num |
---|---|---|
Tim Wahls | E258 | 1 |
Linda Null | E258 | 3 |
Elvis Presley | Graceland | 2 |
cust-num | flyrod-stock-num | date |
---|---|---|
1 | 3 | 9/4/1994 |
6 | 2 | 9/10/1994 |
3 | 3 | 9/3/1994 |
5 | 2 | 8/8/1992 |
3 | 2 | 9/2/1994 |
3 | 1 | 9/1/1994 |
The result of:
select *
from
(customer2 natural inner join
purchased2)
is the usual natural join, except that attributes are reordered:
cust-num | name | address | flyrod-stock-num | date |
---|---|---|---|---|
1 | Tim Wahls | E258 | 3 | 9/4/1994 |
3 | Linda Null | E258 | 3 | 9/3/1994 |
3 | Linda Null | E258 | 2 | 9/2/1994 |
3 | Linda Null | E258 | 1 | 9/1/1994 |
Order of attributes:
An outer join is a join that guarantees that all tuples from 1 (or both) of the argument relations appear in the result of the join. This is accomplished by:
In a left outer join, tuples from the left relation (first argument) that do not participate in the join are padded as above and added to the result. The right outer join is symmetric. In a full outer join, all tuples from both argument relations that did not participate in the join are padded and added to the result.
The result of:
select *
from
(customer2 natural right outer join
purchased2)
is:
cust-num | name | address | flyrod-stock-num | date |
---|---|---|---|---|
1 | Tim Wahls | E258 | 3 | 9/4/1994 |
3 | Linda Null | E258 | 3 | 9/3/1994 |
3 | Linda Null | E258 | 2 | 9/2/1994 |
3 | Linda Null | E258 | 1 | 9/1/1994 |
6 | null | null | 2 | 9/10/1994 |
5 | null | null | 2 | 8/8/1992 |
The result of:
select *
from
(customer2 natural full outer join
purchased2)
is:
cust-num | name | address | flyrod-stock-num | date |
---|---|---|---|---|
1 | Tim Wahls | E258 | 3 | 9/4/1994 |
3 | Linda Null | E258 | 3 | 9/3/1994 |
3 | Linda Null | E258 | 2 | 9/2/1994 |
3 | Linda Null | E258 | 1 | 9/1/1994 |
2 | Elvis Presley | Graceland | null | null |
6 | null | null | 2 | 9/10/1994 |
5 | null | null | 2 | 8/8/1992 |
Note that tuples from the inner join appear first, then "left over" tuples from the left relation, and finally "left over" tuples from the right relation.
Example: find the names and addresses of all customers who have not purchased a flyrod.
select
name, address
from
(customer2 natural left outer join
purchased2)
where
flyrod-stock-num is
null
The full set of join types in SQL-92 is:
inner join
left outer join
right outer join
full outer join
cross join
union join
The cross join
is equivalent to cartesian product. The
union join
is computed by adding n nulls to the end of
each tuple of the left relation, where n is the number of attributes
in the right relation, adding m nulls to the beginning of each tuple
of the right relation, where m is the number of attributes in the
left relation, and unioning the result.
SQL-92 provides several kinds of joins in addition to natural joins. The possible join conditions are:
natural
on
P
where P is a predicate
using
(A1, A2, ..., An)
where A1, A2, ..., An are attributes common to the relations being joined
A join condition must be used with an outer join, but is optional with an inner join.
The using
condition provides a join similar to a natural
join, except that joining occurs only on the attributes listed.
Example:
select *
from
(customer2 inner join
purchased2
using
(cust-num))
is equivalent to:
select *
from
(customer2 natural inner join
purchased2)
Example:
select *
from
(customer inner join
employee
using
(name))
joins tuples of customer and purchased where the name attributes match, but the address attributes may be different. The scheme of the result of this query is:
(name, address, cust-num, address, social-security-num, salary)
In other words, attributes mentioned in the using
clause appear
only once in the result.
The on
condition provides a theta join - only tuples
that satisfy the specified condition are joined.
Example: Find the natural join of customer and purchased.
select
name, address, customer.cust-num, flyrod-stock-num, date
from
(customer inner join
purchased
on
customer.cust-num = purchased.cust-num)
Listing attributes in the select
clause is necessary because
the scheme of the result of a theta join is the concatenation of the schemes
of the argument relation. In this case, the result would have had duplicate
cust-num attributes.
Note that the previous query is equivalent to:
select
name, address, customer.cust-num, flyrod-stock-num, date
from
customer, purchased
where
customer.cust-num = purchased.cust-num
Example: Find the manufacturer and length of all flyrods except the shortest flyrod (or flyrods if there is a tie).
select distinct
f1.manufacturer, f1.length
from
flyrod as
f1 inner join
flyrod
as
f2 on
f1.length > f2.length
If the join condition for an inner join is omitted, the result is the cartesian product. Hence, the following are all equivalent:
select *
from
r, s
select *
from
(r inner join
s)
select *
from
r cross join
s
The union join
can also be expressed using a theta join:
select *
from
r union join
s
is equivalent to:
select *
from
r full outer join
s on
false
The ability to use SQL commands in general (C, Cobol) programs is necessary because:
Embedded SQL refers to SQL commands that appear in host language programs, where the host language may be C, Cobol, etc. Special syntax is used to mark embedded SQL statements. Before the program can be compiled, it must be fed through a preprocessor that replaces the embedded SQL statements with system commands.
Information is transferred between normal host language commands and embedded SQL commands through host language variables. Any variable used this way must be declared in a special program section. The following syntax is typical when C is the host language:
EXEC SQL begin declare section; float len, clen; char manuf[40], n[40], addr[50]; EXEC SQL end declare section;
In the embedded SQL statements, these variables are preceded by : to distinguish them from SQL variables or constants.
If the embedded SQL is a select
statement, the host program
must declare a cursor that can be used to step through the result one
tuple at a time. For example, if we want to find the names and addresses of
everyone who has purchased a flyrod over a particular length, along with the
length of that flyrod, we declare a
cursor and associate it with the appropriate query:
printf("Enter the length: "); scanf("%f", &clen); EXEC SQL declare c cursor for select name, address, length from customer, purchased, flyrod where customer.cust-num = purchased.cust-num and purchased.flyrod-stock-num = flyrod.flyrod-stock-num and length > :clen;
The ; after clen
ends the EXEC SQL
section.
To use this query, we use an open
statement to initialize the
cursor, a fetch
statement to get the next tuple, and a
close
statement to close the query, so that it could be
reopened if the query were repeated. A whenever
statement
can be used to check for the error condition of no tuples remaining in
the result. The complete program follows:
#include <stdio.h> main() { int count = 0; double sum = 0.0; EXEC SQL begin declare section; float len, clen; char n[40], addr[50]; EXEC SQL end declare section; printf("Enter the length: "); scanf("%f", &clen); EXEC SQL declare c cursor for select name, address, length from customer, purchased, flyrod where customer.cust-num = purchased.cust-num and purchased.flyrod-stock-num = flyrod.flyrod-stock-num and length > :clen; printf("Printing names and addresses,\n"); printf("computing average flyrod length."); EXEC SQL open c; EXEC SQL whenever NOTFOUND goto alldone; while (1) { /* infinite loop */ EXEC SQL fetch c into :n, :addr, :len; printf("name: %s; address: %s\n", n, addr); count++; sum += len; } alldone: EXEC SQL close c; printf("The average length is: %lf\n", sum/count); }
SQL commands that do not return a result (i.e. update
,
insert
, delete
) do not require a cursor, and so
can simply be executed.
Example - an insertion into flyrod:
EXEC SQL begin declare section; int fsn, lw; float len; char manuf[40]; EXEC SQL end declare section; printf("Enter the manufacturer: "); gets(manuf); printf("Enter the length: "); scanf("%f", &len); printf("Enter the line-weight: "); scanf("%d", &lw); printf("Enter the stock-num: "); scanf("%d", &fsn); EXEC SQL insert into flyrod values (:manuf, :len, :lw, :fsn);
SQL-92 extensions: