Advanced SQL

Joined Relations

Join Types

SQL-92 provides various kinds of joins (including natural join) that can be used in the from clause of a query.

Example - compute the natural join of customer and purchased:

select *
from (customer natural inner join purchased)

An inner join is a join in which tuples from one or both of the argument relations may not participate in the result. The notion of natural join from relational algebra is a kind of inner join because, for example, any customer tuple of a customer who has not purchased a flyrod does not appear in any tuple of the result of the above query.

Example: Suppose we have relations customer2 and purchased2 as follows:

customer2
name address cust-num
Tim Wahls E258 1
Linda Null E258 3
Elvis Presley Graceland 2

purchased2
cust-num flyrod-stock-num date
1 3 9/4/1994
6 2 9/10/1994
3 3 9/3/1994
5 2 8/8/1992
3 2 9/2/1994
3 1 9/1/1994

The result of:

select *
from (customer2 natural inner join purchased2)

is the usual natural join, except that attributes are reordered:

cust-num name address flyrod-stock-num date
1 Tim Wahls E258 3 9/4/1994
3 Linda Null E258 3 9/3/1994
3 Linda Null E258 2 9/2/1994
3 Linda Null E258 1 9/1/1994

Order of attributes:

An outer join is a join that guarantees that all tuples from 1 (or both) of the argument relations appear in the result of the join. This is accomplished by:

  1. computing the inner join
  2. padding tuples that did not participate with nulls appropriately
  3. adding the padded tuples to the result

In a left outer join, tuples from the left relation (first argument) that do not participate in the join are padded as above and added to the result. The right outer join is symmetric. In a full outer join, all tuples from both argument relations that did not participate in the join are padded and added to the result.

The result of:

select *
from (customer2 natural right outer join purchased2)

is:

cust-num name address flyrod-stock-num date
1 Tim Wahls E258 3 9/4/1994
3 Linda Null E258 3 9/3/1994
3 Linda Null E258 2 9/2/1994
3 Linda Null E258 1 9/1/1994
6 null null 2 9/10/1994
5 null null 2 8/8/1992

The result of:

select *
from (customer2 natural full outer join purchased2)

is:

cust-num name address flyrod-stock-num date
1 Tim Wahls E258 3 9/4/1994
3 Linda Null E258 3 9/3/1994
3 Linda Null E258 2 9/2/1994
3 Linda Null E258 1 9/1/1994
2 Elvis Presley Graceland null null
6 null null 2 9/10/1994
5 null null 2 8/8/1992

Note that tuples from the inner join appear first, then "left over" tuples from the left relation, and finally "left over" tuples from the right relation.

Example: find the names and addresses of all customers who have not purchased a flyrod.

select name, address
from (customer2 natural left outer join purchased2)
where flyrod-stock-num is null

The full set of join types in SQL-92 is:

The cross join is equivalent to cartesian product. The union join is computed by adding n nulls to the end of each tuple of the left relation, where n is the number of attributes in the right relation, adding m nulls to the beginning of each tuple of the right relation, where m is the number of attributes in the left relation, and unioning the result.

Return to Table of Contents

Join Conditions

SQL-92 provides several kinds of joins in addition to natural joins. The possible join conditions are:

A join condition must be used with an outer join, but is optional with an inner join.

The using condition provides a join similar to a natural join, except that joining occurs only on the attributes listed.

Example:

select *
from (customer2 inner join purchased2 using (cust-num))

is equivalent to:

select *
from (customer2 natural inner join purchased2)

Example:

select *
from (customer inner join employee using (name))

joins tuples of customer and purchased where the name attributes match, but the address attributes may be different. The scheme of the result of this query is:

(name, address, cust-num, address, social-security-num, salary)

In other words, attributes mentioned in the using clause appear only once in the result.

The on condition provides a theta join - only tuples that satisfy the specified condition are joined.

Example: Find the natural join of customer and purchased.

select name, address, customer.cust-num, flyrod-stock-num, date
from (customer inner join purchased on customer.cust-num = purchased.cust-num)

Listing attributes in the select clause is necessary because the scheme of the result of a theta join is the concatenation of the schemes of the argument relation. In this case, the result would have had duplicate cust-num attributes.

Note that the previous query is equivalent to:

select name, address, customer.cust-num, flyrod-stock-num, date
from customer, purchased
where customer.cust-num = purchased.cust-num

Example: Find the manufacturer and length of all flyrods except the shortest flyrod (or flyrods if there is a tie).

select distinct f1.manufacturer, f1.length
from flyrod as f1 inner join flyrod as f2 on f1.length > f2.length

If the join condition for an inner join is omitted, the result is the cartesian product. Hence, the following are all equivalent:

select *
from r, s

select *
from (r inner join s)

select *
from r cross join s

The union join can also be expressed using a theta join:

select *
from r union join s

is equivalent to:

select *
from r full outer join s on false

Return to Table of Contents

Embedded SQL

The ability to use SQL commands in general (C, Cobol) programs is necessary because:

Embedded SQL refers to SQL commands that appear in host language programs, where the host language may be C, Cobol, etc. Special syntax is used to mark embedded SQL statements. Before the program can be compiled, it must be fed through a preprocessor that replaces the embedded SQL statements with system commands.

Information is transferred between normal host language commands and embedded SQL commands through host language variables. Any variable used this way must be declared in a special program section. The following syntax is typical when C is the host language:

EXEC SQL begin declare section;
  float len, clen;
  char manuf[40], n[40], addr[50];
EXEC SQL end declare section;

In the embedded SQL statements, these variables are preceded by : to distinguish them from SQL variables or constants.

If the embedded SQL is a select statement, the host program must declare a cursor that can be used to step through the result one tuple at a time. For example, if we want to find the names and addresses of everyone who has purchased a flyrod over a particular length, along with the length of that flyrod, we declare a cursor and associate it with the appropriate query:

  printf("Enter the length: ");
  scanf("%f", &clen);
EXEC SQL 
  declare c cursor for
  select name, address, length
  from customer, purchased, flyrod
  where customer.cust-num = purchased.cust-num 
    and purchased.flyrod-stock-num = flyrod.flyrod-stock-num
    and length > :clen;

The ; after clen ends the EXEC SQL section.

To use this query, we use an open statement to initialize the cursor, a fetch statement to get the next tuple, and a close statement to close the query, so that it could be reopened if the query were repeated. A whenever statement can be used to check for the error condition of no tuples remaining in the result. The complete program follows:

#include <stdio.h>

main() {
  int count = 0; double sum = 0.0;

EXEC SQL begin declare section;
  float len, clen;
  char n[40], addr[50];
EXEC SQL end declare section;

  printf("Enter the length: ");
  scanf("%f", &clen);

EXEC SQL 
  declare c cursor for
  select name, address, length
  from customer, purchased, flyrod
  where customer.cust-num = purchased.cust-num 
    and purchased.flyrod-stock-num = flyrod.flyrod-stock-num
    and length > :clen;

  printf("Printing names and addresses,\n");
  printf("computing average flyrod length.");
  
EXEC SQL open c; 
EXEC SQL whenever NOTFOUND goto alldone;

  while (1) { /* infinite loop */
EXEC SQL fetch c into :n, :addr, :len;
    printf("name: %s; address: %s\n", n, addr);
    count++;
    sum += len;
  }

alldone:
  EXEC SQL close c;

  printf("The average length is: %lf\n", sum/count);
}

SQL commands that do not return a result (i.e. update, insert, delete) do not require a cursor, and so can simply be executed.

Example - an insertion into flyrod:

EXEC SQL begin declare section;
  int fsn, lw; 
  float len;
  char manuf[40];
EXEC SQL end declare section;

  printf("Enter the manufacturer: ");
  gets(manuf);
  printf("Enter the length: ");
  scanf("%f", &len);
  printf("Enter the line-weight: ");
  scanf("%d", &lw);
  printf("Enter the stock-num: ");
  scanf("%d", &fsn);

EXEC SQL insert into flyrod
         values (:manuf, :len, :lw, :fsn);

SQL-92 extensions:

Return to Table of Contents