Creating a table in SQL - some examples
create table contact(
ID integer,
LName varchar,
FName varChar,
Relationship varChar,
HomeID integer
);
Filling the table, with SQL commands
insert into contact values (1,'Harris', 'Andy', 'me', 1);
insert into contact values (2,'Harris', 'Heather', 'wife', 1);
insert into contact values (3,'Harris', 'Elizabeth', 'daughter',
1);
insert into contact values (4,'Harris', 'Matthew', 'son', 1);
insert into contact values (5,'Moon', 'Judy', 'Mother in law',
2);
insert into contact values (6,'Moon', 'Chris', 'Brother in law',
2);
Querying the table
SELECT * FROM contact;
grabbing a subset of fields
SELECT Fname, relationship FROM contact;
grabbing a subset of records
SELECT * from contact WHERE LName = 'Harris';
combining field, record selection
SELECT FName, LName FROM contact
WHERE LName = 'Harris';
Using Query by Example to get same result:
The flat file database approach is easy to understand, and it is handy for simple databases, but it does not really reflect the way larger datasets work in the real world. The flat file model reflects a world where everything is of the same type, and there's no interaction between things that are not of the same record definition.
Here's what our contact database OUGHT to look like:
Notice that all the Harris' live at one address, and the Moons live at another address. This of course could be done in a flat file database, but it would have some problems. There are several people at each address, so there is a lot of data being duplicated. In addition, the typing load on the person entering the data is filled with repetition of the same data. There's lots of room for error here. If the Harris family moved, we will have to find and change four different addresses, and type the new address correctly.
Many databases contain this kind of information, that is very repetitive.
States, account types, and lots of similar kinds of information could be
pre-defined. Take a look at our original database again, and see
how the address is configured:
The home address is stored as an INTEGER!?!?!
There's only two addresses, so it would be much better if we entered each address in the system only ONE time, then determined a way to incorporate it into the tables automatically. This concept is called Normalization.
Let's build a new table of Homes. It looks like this:
The table is very simple. It contains only an ID, which is of course the key field, for this table, and a text field containing the actual address. This kind of index - value table is often called a lookup table or a data dictionary. It gives the database designer a way to store data that will be repeated and preserver referential integrity.
Now you can see the relationship between the tables. In the contact table, there is a homeID Field. This field contains nothing more than an index to the value of the address in the Home table. Contact.homeID is an example of a foreign key. A foreign key is a field who's purpose is to contain values that are the keys of other tables. In order for the rule of referential integrity to be upheld, we must provide an element in the Home table for every value that appears in the HOMEID field of the contact database. Each table in a relational database will have a primary key. Some will also have a foreign key (or maybe several) which will point to the primary keys of OTHER tables in the system.
If you were to look back at our goal:
You see that it looks a lot like the contact database, except that we
have replaced the HomeID field with the appropriate field from the Home
table.
It looks promising. Let's see what it produces:
Yikes!
There's something really wrong here.
If you look closely, you will notice that there are two entries for
each element in the contact table, or 6 entries for each element in the
home table. In other words, if there are n elements in one
table, and m elements in another, we will have m x n
possible values here.
That kind of join is incredibly crude. It is called a union, because it works like a union operation from mathematics. Remember in set theory, a union of two sets includes all the elements of both sets. That is not what we want here. What we are looking for is more like an intersection.
In database terms, we are looking for something called an inner join.
Here's the syntax in SQL:
SELECT contact.FName, contact.LName, contact.relationship, Home.address
FROM contact INNER JOIN home ON contact.homeID = home.id
ORDER BY contact.ID;
The main thing that has changed is the FROM line. Rather than simply telling the system that we want to grab from two tables, we need to inform it that the tables are related. (thus, it is a relational database!) We need to give it some help figuring out how the relationship works.
contact INNER JOIN home informs us of the general type
of relationship. If these two tables had a field with the same name,
the rest of the relationship might be automatically generated, but you
should not rely on it. The rest of the line:
ON contact.homeID = home.id
tells us how to pair up the records. It is an inner join because
we are looking only for the records where there is a matching value in
these two fields.
Once we generate an inner join relationship, the relationship will often be invisible to the end user.
Generating relationships in GUI DBMS environments
In Access and similar modern DBMS environments, there is often a tool
which allows you to set up relationships between tables.
Here's how it looks in access with the database we've been playing with:
What we want to do is define a relationship between contact.homeID and
home.ID. What you do is select one of the fields, then drag until
you are on the other. I'll drag from contact.homeID to home.ID.
Here's the results:
Now you will see a little line between the HomeID field of contact
and the ID field of home. This is a standard notation for a relationship.
In fact, this is an example of a one to many relationship.
Each ID element in home will potentially have many HomeIDs in contact referring
to it.This is often denoted with an infinity symbol near the 'many' field
in the relationship.
Once you have defined a join, things are a little different in the query
engine.
Notice that the tables are now shown with the join, and, when we run
the query, we automatically get an inner join. Just for fun, let's
look in SQL view and see what we get...
SELECT contact.FName, home.address
FROM contact INNER JOIN home ON contact.HomeID = home.ID;
Haw!! Just like when we did it by hand. When GUI interfaces allow you to play with relationships, they are just designing joins.
Had we done exactly the same QBE on the tables without the relationship,
we would have gotten the union of the tables, rather than the join.
You can also generate the appropriate select statement by hand in your code. In our example, if they ask for the address, you will need to generate the results as a join query. If not, no join will be needed. Once you have generated the query, you attach it to the recordsource of the data control.
In both of these cases, the results will be READ ONLY. A query
is a VIRTUAL table. You cannot write to it directly, only read from
it. If they want to edit the values, we will need some other controls.
We'll look at them next time.