Joins: inner, outer, left, right

In (My)SQL, join is a means for combining records from two tables into a single set which can be either returned as is or used in another join. In order to perform the operation a join has to define the relationship between records in either table, as well as the way it will evaluate the relationship. The relationship itself is created through a set of conditions that are part of the join and usually are put inside ON clause. The rest is determined through a join type, which can either be an inner join or an outer join.

The SQL clauses that set the respective join type in a query are [INNER] JOIN and {LEFT | RIGHT} [OUTER] JOIN. As you can see the actual keywords INNER and OUTER are optional and can be omitted, however outer joins require specifying the direction – either left or right (more on that later).

Examples of queries using joins:
SELECT * FROM users JOIN files ON files.owner_id = users.id WHERE users.id = 1;
SELECT * FROM users LEFT JOIN files ON files.owner_id = users.id WHERE users.id = 1;

Inner join, outer join

The primary difference between the two basic types (each has several subtypes) is in making the decision whether joining of two rows was successful or not, which essentially determines whether the combined row can be returned or not.

Inner joins require that a row from the first table has a match in the second table based on the join conditions. In means that the first query from the example above will only return any rows if files table contains at least one record where owner_id is 1 (has to be equal to users.id by the join conditions and users.id is filtered in WHERE to accept only that one value). Otherwise it will return no rows at all, even if users contains a valid user record. Assuming there are two users, but only one has any files:

mysql> SELECT * FROM users JOIN files ON files.owner_id = users.id WHERE users.id = 1;
+----+--------------+---------+----+----------+------------------+
| id | name         | enabled | id | owner_id | filename         |
+----+--------------+---------+----+----------+------------------+
|  1 | Albin Kolano |       1 |  1 |        1 | audit report.doc |
+----+--------------+---------+----+----------+------------------+
1 row in set (0.00 sec)

mysql> SELECT * FROM users JOIN files ON files.owner_id = users.id WHERE users.id = 2;
Empty set (0.01 sec)

Outer joins, on the other hand, consider a join successful even if no records from the second table meet the join conditions (i.e. whether there are any matches or not). In such case outer join sets all values in the missing columns to NULL. The second query from the example will return rows whenever there are matches in users and regardless of the contents of files table.

mysql> SELECT * FROM users LEFT JOIN files ON files.owner_id = users.id WHERE users.id = 1;
+----+--------------+---------+------+----------+------------------+
| id | name         | enabled | id   | owner_id | filename         |
+----+--------------+---------+------+----------+------------------+
|  1 | Albin Kolano |       1 |    1 |        1 | audit report.doc |
+----+--------------+---------+------+----------+------------------+
1 row in set (0.00 sec)

mysql> SELECT * FROM users LEFT JOIN files ON files.owner_id = users.id WHERE users.id = 2;
+----+-------------------+---------+------+----------+----------+
| id | name              | enabled | id   | owner_id | filename |
+----+-------------------+---------+------+----------+----------+
|  2 | Nadzieja Surowiec |       1 | NULL |     NULL | NULL     |
+----+-------------------+---------+------+----------+----------+
1 row in set (0.00 sec)
Left join, right join

Unlike inner joins, outer joins require that the join direction is specified. Inner join is a symmetrical and bi-directional relationship, which means A JOIN B produces the same result as B JOIN A. That is not true for outer joins, because they accept when for a record in A there is no matching record in B and in such case the reverse operation is impossible as it would have to start with the non-existing record in B. This is the reason why setting the direction is necessary. A LEFT JOIN B finds matches for rows from table A in table B, while A RIGHT JOIN B finds matches for records from B in A.

In practice there is very little or even no real purpose for using RIGHT JOIN and in majority of cases everyone just sticks to using LEFT JOIN whenever they need outer join.

When does the join type matter?

Choosing the appropriate type depends on the logic you are trying to implement.

You have to use inner join when mandatory pieces of information are located in both tables and partial information is considered incomplete or even useless. The case of this could be listing user’s files based on the earlier example:

mysql> SELECT *
    -> FROM   users
    ->        JOIN files
    ->        ON     files.owner_id = users.id
    -> WHERE  users.name = 'Nadzieja Surowiec'
    ->        AND users.enabled = 1;
Empty set (0.00 sec)

The query finds the user’s record in users table and verifies that they are allowed to use the service through the value of users.enabled column and then searches for their files in files table. If there are no matches in either table, the query does not return any result, which is the correct behavior. If outer join was used in this case, a useless partial result could be returned or even incorrect result:

mysql> SELECT *
    -> FROM   users
    ->        LEFT JOIN files
    ->        ON     files.owner_id = users.id
    -> WHERE  users.name = 'Nadzieja Surowiec'
    ->        AND users.enabled = 1;
+----+-------------------+---------+------+----------+----------+
| id | name              | enabled | id   | owner_id | filename |
+----+-------------------+---------+------+----------+----------+
|  2 | Nadzieja Surowiec |       1 | NULL |     NULL | NULL     |
+----+-------------------+---------+------+----------+----------+
1 row in set (0.00 sec)

mysql> SELECT COUNT(1)
    -> FROM   users
    ->        LEFT JOIN files
    ->        ON     files.owner_id = users.id
    -> WHERE  users.name = 'Nadzieja Surowiec'
    ->        AND users.enabled = 1;
+----------+
| COUNT(1) |
+----------+
|        1 |
+----------+
1 row in set (0.00 sec)

The application relying on such queries not only would not be able to make anything out of such file information where file data is all set to NULL values, but also it would have to include additional and in fact redundant logic to filter out such results. The row count in this case is correct, as the query returned a single row, but it does not represent how many files the user has, so it is not a valid information that the application could use.

Outer join must be used to perform a join with a table, which holds information that is only optional for the result.

In our example we are working with a query that lists user’s files and we already established that the join between users and files has to be inner join. But let’s give our users the opportunity to choose a custom icon for any file if they want to. The information could be kept in a separate table called file_icon.

Now, for each listed file we also want to see if user has set a custom icon for that file and return the icon name if they have set it. The icon information is entirely optional, so we want the query to return rows regardless of whether there is an entry for the given file in file_icon or not. Therefore we have to use outer join for this particular task.

mysql> SELECT *
    -> FROM   users
    ->        JOIN files
    ->        ON     files.owner_id = users.id
    ->        LEFT JOIN file_icon
    ->        ON     file_icon.file_id = files.id
    -> WHERE  users.name = 'Albin Kolano'
    ->        AND users.enabled = 1;
+----+--------------+---------+----+----------+------------------+---------+------------------+
| id | name         | enabled | id | owner_id | filename         | file_id | icon_image       |
+----+--------------+---------+----+----------+------------------+---------+------------------+
|  1 | Albin Kolano |       1 |  1 |        1 | audit report.doc |       1 | MS-Word-Icon.png |
|  1 | Albin Kolano |       1 |  2 |        1 | stats-201104.xls |    NULL | NULL             |
+----+--------------+---------+----+----------+------------------+---------+------------------+

The outer join allowed us to grab the complete list of user’s files and along with their icons if any were set. If we used inner join instead, the result would be missing the record of stats-201104.xls file.

To Be Continued

More on dealing with practical problems when designing join queries will be covered in a separate post.

[MySQL Health Check]
About Maciej Dobrzanski

A MySQL consultant with the primary focus on systems, databases and application stacks performance and scalability. Expert on open source technologies such as Linux, BSD, Apache, nginx, MySQL, and many more. @linkedin

Comments

  1. emcomments says:

    The first query “SELECT * FROM users JOIN files ON files.owner_id = users.id WHERE users.name = 1;” makes no sense as users.name is a string in all the example results.

  2. Jay says:

    Thanks for sharing :)

Speak Your Mind

*