Login | Register
Nerd ParadiseArtisanal tutorials since 1999
In a simple SELECT query, you gather information from one table. However, this is rarely enough information for doing anything useful. If you have a table with a list of users and a table with a list of forum posts, surely you'd like to see the username next to each post. One way of doing this is selecting all the posts you want to display on a page with one query, looping through the results with a for loop, and for each row, making another SELECT query to find the user name for the user ID that's associated with the post.

If you do this, your database administrator will hunt you down in the middle of the night and bludgeon you to death with a sack full of blown-out CPU fans from the MySQL server that he had to replace numerous times because of your poorly written code.

And that is why we use SQL Joins.

A join will join data from one table to the data in another table in one query. All nice and neat.

However there are several kinds of joins. Which do you use? What the hell is a left join? Which join will get you fired? This article is an answer to all those questions and more. There's also a lovely example at the end involving dirt bikes and elves.

Cross Join

If you are using a Cross Join, chances are you are tackling your problem in the wrong way. A horribly horribly wrong way. Cross joins take each row from one table and joins them onto EVERY OTHER ROW from another table. Basically, you're multiplying one table by the other. If there are 1000 rows in table foo and 1000 rows in table bar, then the following query:
SELECT * FROM foo, bar

will return ONE MILLION rows. Every single combination of couplings between the two tables. Yes, it's probably best you go back to the drawing board and figure out a better way to solve whatever it is you're working on. Off you go, then.

Inner Join

This is probably the most common join. An inner join usually has some sort of clause where you want to join tables from one table to another that share a certain value on a column in each table. For example, you have a list of users (that each have a unique ID#) and a list of blog comments that users make. The comment table will have a column called user_id indicating which user made the post. The user table will most obviously also have this user_id column. If you're writing a PHP page to view comments, you'll want to use a INNER JOIN here.
SELECT
  u.name,
  c.text
FROM comment c
INNER JOIN user u ON (u.user_id = c.user_id)
ORDER BY
  c.timestamp

If the user_id of the comment exists in the user table, then that row will be returned. If for some reason a user gets deleted from the user table, the comment will not be a part of these query results.

If for some reason there are two users with the same ID# (surely this would never happen in this example) then the row on the left table will be duplicated for however many matches there are in the right table.

Left Join

(Also known as a Left Outer Join)
OH NO! For some reason you accidentally deleted a user in the above example who had previously made comments. Must have been some pretty nasty comments. The comment that belonged to that user will no longer be returned. But suppose you wanted to still show the comment? This is a situation where you want to use a LEFT JOIN.
SELECT
  u.name,
  c.text
FROM comment c
LEFT JOIN user u ON (u.user_id = c.user_id)
ORDER BY
  c.timestamp

As you join tables onto other tables, imagine joining them from left to right. The first table is at the far left and you stick the results from the next joined table to the right of it. In a LEFT JOIN, you KEEP all the results from the table on the left no matter if the entry exists in the table on the right. So if you deleted a user, the comment will still appear. However, you'll just have null results. In the PHP code, you could display this entry as a "deleted user" but you'll still have the comment information.

Right Join

(Also known as a Right Outer Join)
Right joins, as the name implies, are mirror images of the left joins. Suppose you have a table B that you RIGHT JOIN onto table A. If there are rows in B that do not have entries in table A (the original table), then those will get included in the query results. Right joins aren't all too common since usually you think of things as a left join where your primary data that your query focuses on is on the far left.

Full Outer Join

Suppose you wrote a query as a left join. Then you rewrote it as a right join. If you were to take the UNION of both the left and right join, it would be a FULL OUTER JOIN. If entries from table a have a match in table b (according to whichever clause you provide), then those entries will be paired together in the result. All other entries from both tables that do not have a match in the other table will also appear individually in the query result. Also not very common.

Other Join Terminology

Natural Join - A natural join is just an inner join. A NATURAL JOIN does not have an ON clause. Instead, the SQL interpreter implies the clause by looking at columns in both tables that have the same name.
SELECT f.foo, b.bar FROM foo f INNER JOIN bar b ON (b.id = f.id)

...is the same as:
SELECT f.foo,b.bar FROM foo f NATURAL JOIN bar b

(assuming that foo and bar both have "id" and don't have any other columns that share a name)

Outer Join - generic name for the category of joins that include the left join, right join, and full outer join. The common characteristic amongst them is that joining a table can produce null columns in the results whereas an inner join will not.

Now let's work with an example.

Santa Join Clause

Santa's all up on this Web 2.0 stuff and has ordered his IT department to build a website where good girls and boys can track their toys and even see which elves are building their toys.

This database has 5 tables.

kid
- kid_ID (INT)
- name (VARCHAR)
- address (VARCHAR)
- is_naughty (BOOL)


request
- kidID (INT)
- toyID (INT)


toy
- toyID (INT)
- name (VARCHAR)
- weight (DOUBLE)
- size (ENUM small|medium|large)


specialization (which elves work on which toys)
- toyID (INT)
- elfID (INT)


elf
- elfID (INT)
- name (VARCHAR)


show_toys_being_produced.php

SELECT
  t.name AS 'toy',
  e.name AS 'elf'
FROM request r
INNER JOIN toy t ON (t.toy_id = r.toy_id)
INNER JOIN specialization s ON (s.toy_id = r.toy_id)
INNER JOIN elf e ON (e.elf_id = s.elf_id)
WHERE
  r.kid_id = $logged_in_user_id

This will return a row for each toy the kid requested N times where N is the number of elves that specialize in creating that toy. Because we're using an INNER JOIN here, if no elves specialize in creating dirt bikes and billy requested a dirt bike, he will not see it on this page. However, if he requested a PS3 and 12 elves build PS3's, then PS3 will be returned by this query 12 times. You can flatten this query on the PHP side and print the results in a hierarchical form so the kids don't get confused and think they're getting 12 PS3's.

view_toys_requested.php

This query is identical to the one above, except I changed the elf queries to LEFT JOINs
SELECT
  t.name AS 'toy',
  e.name AS 'elf'
FROM request r
INNER JOIN toy t ON (t.toy_id = r.toy_id)
LEFT JOIN specialization s ON (s.toy_id = r.toy_id)
LEFT JOIN elf e ON (e.elf_id = s.elf_id)
WHERE
  r.kid_id = $logged_in_user_id

Now Billy will see his dirt bike listed but there will be no elf listed with it. Now he won't think you forgot about the dirt bike. It is more clear that he is simply not getting it because Santa is short staffed because his dirt bike engineer is busy helping the IT team write SQL/PHP.

get_off_your_lazy_elven_asses.php

Suppose Santa wanted an internal page for himself to track which toys are not being produced and at the same time see which elves are sitting on their asses not producing things. Behold, the power of the FULL OUTER JOIN!
SELECT
  t.name AS 'toy',
  e.name AS 'elf'
FROM toy t
LEFT JOIN specialization s ON (t.toy_id = s.toy_id)
FULL OUTER JOIN elf e ON (e.elf_id = s.elf_id)

The specialization table is LEFT JOIN'd onto the toy table because you want toys that aren't being worked on to appear in the results. However, you also want to join the elf table onto the specialization table. You don't want to use a LEFT JOIN for this since it would omit the elves not working on anything. You don't want to use a RIGHT JOIN because it would omit the toys that aren't being worked on but show all the elves. The FULL OUTER JOIN is the only one that will show both.