Comp 419
Database Design
Exam 1 (100 pts)

  1. (8 pts) What is data abstraction? What is the relationship between data abstraction and conceptual data models?
  2. (5 pts) What is the difference between the conceptual level and the view level?
  3. (8 pts) What is a data definition language used for? At what level of (data) abstraction is it used?
  4. (5 pts) What is the difference between a query language and a data manipulation language?
  5. (6 pts) Give an example of a consistency constraint on a database (in English -- you don't need to invent any notation) and a reasonable action that might be taken when that constraint is violated.
  6. Consider the following E-R diagram for a genealogy database:

    1. (6 pts) Find all candidate keys for each entity set and designate a primary key for each. Document any assumptions about entities that affect the choice of candidate keys.
    2. (8 pts) Compute a superkey for each relationship set using the algorithm given in class. Find all candidate keys that are subsets of this superkey. Document any assumptions that affect the choice of candidate keys.
    3. (5 pts) Give the relation schemes of the relations that these entity and relationship sets would be reduced to.
    4. (8 pts) What does the arrowhead on the link between child and marriage mean? What does the presence of the arrowhead say about the relationship between children and marriages?
  7. You are a technical expert employed by the local police department. The department has just found the secret hide-out of a gang of bank robbers. The leader of the gang has been using a relational database to store information about banks, robberies, and the members of the gang. You are called in to extract information from the database that will help in catching the robbers.

    The database has relations bank-robbed (which stores information about banks that the gang has already robbed), bank- to-rob (which stores information about banks that the gang plans to rob), member (which stores information about gang members), and has-robbed (which stores information about which gang members have participated in robberies of what banks). Note that the gang may be planning to rob a bank that they have already robbed one or more times in the past. The schemes of these relations are:

    Bank-robbed-scheme = (name: string, address:string, number-of-depositors:integer)
    Bank-to-rob-scheme = (name: string, address:string, number-of-depositors:integer)
    Member-scheme = (name:string, address:string, years-in-prison:integer, banks-at:string, skill1:skilltype, skill2:skilltype)
    Has-robbed-scheme = (member-name:string, bank-name:string, date:string, amount:real)

    where attribute banks-at of member is the name of the bank where the gang member does his or her banking, and skilltype is an enumerated type consisting of {lock-picking, shooting, driving, counting-money, public-relations}.

    1. Give relational algebra and SQL expressions for the following queries:
      1. (5 pts) Find the names and addresses of all gang members who have spent more than 10 years in prison.
      2. (7 pts) Find the names and addresses of all banks that the gang has never robbed, but plans to rob someday.
      3. (7 pts) Find all skills of all members of the gang. The resulting relation should have only one attribute (column).
      4. (9 pts) Find the name and address of each bank robber who has robbed his or her own bank (i.e. the bank he or she banks at).
    2. (5 pts) If {name} is a candidate key for member and we have tuple variables t and s such that t member and s member and t[name] = s[name], what must the relationship between tuple t and tuple s be?
    3. (8 pts) Comment on the design of this relational database. (Is the gang leader a competent database designer?) In your answer, consider design issues such as the potential for duplication of information, how well the design permits storage of information that appears to be of interest, if null values would be required very often, and any other relevant design issues. (I'll give extra credit for especially good answers.)
  8. (3 pts) If relation tex2html_wrap_inline16 contains 10 tuples and relation tex2html_wrap_inline18 contains 5 tuples, how many tuples does tex2html_wrap_inline20 contain?
  9. (5 pts) Why isn't set intersection a fundamental operation in relational algebra? (You don't need to prove that it isn't fundamental - just explain why it isn't.)
  10. (8 pts) What is the difference between a data definition language (DDL) and a data manipulation language (DML)? Is the create table command part of SQL's DDL, DML, both, or neither?
  11. Consider a relational database containing relations with the following schemes:
        Customer-scheme = (name, address, cust-num)
        Flyrod-scheme = (manufacture, length, line-weight, flyrod-stock-num)
        Purchased-scheme = (cust-num, flyrod-stock-num, date)
    1. (3 pts) Express the following SQL query in relational algebra:
        select manufacturer, line-weight, date
        from customer, purchased, flyrod
        where customer.cust-num = purchased.cust-num
              and purchased.flyrod-stock-num = flyrod.flyrod-stock-num
              and name = "Elvis Presley"
    2. (6 pts) Write a relational algebra query that produces a relation containing just the cust-nums of every customer who has purchased at least one flyrod made by each of the manufacturers represented in the flyrod relation. (Hint: first write a query that returns a relation associating a customer's cust-num with all of the manufacturers of flyrods that the customer has purchased.)
    3. Give the SQL command (or sequence of commands) needed to:
      1. (6 pts) Create the flyrod relation. Use any reasonable domains for the attributes. Your answer should specify that flyrod-stock-num is the primary key for relations on Flyrod-scheme, and should also specify the domain constraint that is always associated with any primary key in SQL.
      2. (4 pts) Delete all flyrod models that no one has purchased from the flyrod relation.
  12. A company that manufactures lava lamps uses a relational database to keep track of what parts it has in stock, suppliers of parts, and which suppliers supply what parts. The schemes of these relations are:
      Part-scheme = (part-num, description, groovyness, quantity)
      Supplier-scheme = (supplier-name, address, phone-number)
      Supplies-scheme = (part-num, supplier-name, address, cost)
    The primary key for relations on Part-scheme is part-num, and the primary key for relations on Supplier-scheme is {supplier-name, address}.
    1. The management of the lava lamp company wants to study the relationship between the groovyness and the cost of parts.
      1. (5 pts) Use SQL to define a view that associates the groovyness of each part with its cost. Note that the same part can cause multiple tuples to be part of this view if that part has multiple suppliers.
      2. (5 pts) Should database modifications be allowed through this view? Why or why not?
      3. (5 pts) Assume that groovyness is expressed as an integer between 1 and 10. Use SQL to write a query that returns the average cost of a part at each level of groovyness, and the groovyness associated with that cost. (Hint: the view you defined should be useful for doing this.)

    2. Now assume that the management of the lava lamp company decides that each supplier can supply at most one kind of part - i.e. that the company will buy at most one kind of part from each supplier, even if that supplier has other kinds of parts. Thus, the company will not have two different kinds of parts in stock that were supplied by the same supplier, and will not even store the information that a supplier has other kinds of parts and could potentially supply these parts.
    3. (5 pts) Under this assumption, if we view supplies as a relationship set from supplier to part (E-R model), what is the mapping cardinality of this relationship set?
    4. (5 pts) Given your mapping cardinality from the previous part, what should the primary key of relations on Supplies-scheme be?