Execute Program

SQL: Left and Right Joins

Welcome to the Left and Right Joins lesson!

This lesson is shown as static text below. However, it's designed to be used interactively. Click the button below to start!

  • In an earlier lesson, we saw an example where users can register with discount codes. The discount codes were optional: some users register with a code, so they have a foreign key to the discount. Other users register without a code, so the foreign key is null.

  • Suppose that we need a report that lists our users, along with the discount codes that they used. The report should include all users, whether they used a discount code or not.

  • We could write a loop in JavaScript: loop over each user, selecting any discounts for that user. That would cause an N+1 query problem: we'd query all of the users, followed by N individual queries to get each user's discount, if any. (Imagine that N is 10,000,000, to see why this is a problem: too many queries!)

  • As usual, SQL has an answer. In this case, it's a join. However, the type of join that we've seen so far doesn't help us. If we select from users JOIN discounts ON users.id = discounts.user_id, we'll only get results for users who have corresponding discounts. (The users.id = discounts.user_id will never be true for a user who has no corresponding discount row.)

  • That doesn't solve our problem because we want to include all users, even if they have no discount. Here's an example of that solution, even though it doesn't solve our problem:

  • >
    exec(`
    CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    discount_id INTEGER REFERENCES discounts(id) NULL UNIQUE
    );
    CREATE TABLE discounts (id INTEGER PRIMARY KEY, discount_code TEXT NOT NULL);
    -- Amir registered with a discount (both Amir and the discount get ID 1).
    INSERT INTO discounts (discount_code) VALUES ('free-month');
    INSERT INTO users (name, discount_id) VALUES ('Amir', 1);
    -- Betty registered with no discount.
    INSERT INTO users (name, discount_id) VALUES ('Betty', NULL);
    SELECT
    users.name AS name,
    discounts.discount_code AS discount_code
    FROM users JOIN discounts
    ON users.discount_id = discounts.id
    `);
    Result:
    [{name: 'Amir', discount_code: 'free-month'}]Pass Icon
  • Betty existed, but she wasn't in our query's results. To fix that, we can tell the database to include users in the report even when there's no matching discount. This is called a left join: users LEFT JOIN discounts. The "left" means "include rows from the left table (users), even if there's no corresponding row from the right table (discounts)".

  • With a LEFT JOIN, both Amir and Betty will show up in the results. Amir will have his discount_code, but Betty's discount_code will be NULL (which will be represented as JavaScript's null).

  • >
    exec(`
    CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    discount_id INTEGER REFERENCES discounts(id) NULL UNIQUE
    );
    CREATE TABLE discounts (id INTEGER PRIMARY KEY, discount_code TEXT NOT NULL);
    -- Amir registered with a discount (both Amir and the discount get ID 1).
    INSERT INTO discounts (discount_code) VALUES ('free-month');
    INSERT INTO users (name, discount_id) VALUES ('Amir', 1);
    -- Betty registered with no discount.
    INSERT INTO users (name, discount_id) VALUES ('Betty', NULL);
    SELECT
    users.name AS name,
    discounts.discount_code AS discount_code
    FROM users LEFT JOIN discounts
    ON users.discount_id = discounts.id
    `);
    Result:
    [{name: 'Amir', discount_code: 'free-month'}, {name: 'Betty', discount_code: null}]Pass Icon
  • LEFT JOIN adds one small tweak to our mental model of joins. Inner joins (the simple kind that we've seen without LEFT) still work in the same way. But here's how the database executes a left join (the new part is the "If no discounts matched this user" section):

    • For each user:
      • For each discount:
        • If this user's discount_id matches this discount's id:
          • Return a row with all of the user's columns and all of the discount's columns.
      • If no discounts matched this user:
        • Return a row with all of the user's columns, and NULLs in all of the missing discount's columns.
  • As usual, this is only a mental model, but it's a correct one. In reality, the database will use a more efficient method that gives exactly the same result.

  • You might wonder whether LEFT JOIN implies that there's also a RIGHT JOIN. There is! In a RIGHT JOIN, rows from the right table are always included, even if there's no corresponding row from the left table.

  • Left and right joins are much less common than the basic inner joins that we've seen before, where every row must exactly match the ON. But when you need a left or right join, you'll be glad that it's available. They're useful when you want a list of records (like users) along with other corresponding records (like discounts) that may or may not exist.

  • (As is often the case, SQLite is a bit weird here. It supports left joins, but not right joins. Fortunately, we can convert right joins into left joins by flipping the order of the tables.)

  • Like with basic inner joins, left and right joins will produce any combination of rows that match. For example, suppose that we select from users LEFT JOIN comments ON users.id = comments.user_id.

  • Every comment in the database will show up in the results. If Amir has written 100 comments, he'll show up 100 times: once with each of his comments. But because it's a LEFT JOIN, every user is guaranteed to show up at least once. Betty will show up even if she's never written a comment.

  • Here's a code problem:

    Use a join to get a list of usernames and comment texts for every comment in the system. Make it a left join so that it also includes users who have never written a comment. (They'll have a null comment text.) Select only the name and comment_text columns.

    exec(`
    CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
    );
    CREATE TABLE comments (
    user_id INTEGER REFERENCES users(id) NOT NULL,
    comment_text TEXT NOT NULL
    );

    -- Amir has written two comments.
    INSERT INTO users (name) VALUES ('Amir');
    INSERT INTO comments (
    user_id,
    comment_text
    ) VALUES (1, 'Ms. Fluff needs a bath!');
    INSERT INTO comments (
    user_id,
    comment_text
    ) VALUES (1, 'Ms. Fluff strongly dislikes water.');

    -- Betty has written no comments.
    INSERT INTO users (name) VALUES ('Betty');
    SELECT
    users.name AS name,
    comments.comment_text AS comment_text
    FROM users LEFT JOIN comments
    ON users.id = comments.user_id
    `);
    Goal:
    [{name: 'Amir', comment_text: 'Ms. Fluff needs a bath!'}, {name: 'Amir', comment_text: 'Ms. Fluff strongly dislikes water.'}, {name: 'Betty', comment_text: null}]
    Yours:
    [{name: 'Amir', comment_text: 'Ms. Fluff needs a bath!'}, {name: 'Amir', comment_text: 'Ms. Fluff strongly dislikes water.'}, {name: 'Betty', comment_text: null}]Pass Icon
  • You'll sometimes see left and right joins called LEFT OUTER JOINs and RIGHT OUTER JOINs. Fortunately, "outer" doesn't change the meaning; it's just a more precise name for the same idea.

  • In this course, we only address inner ("simple") joins and left joins. However, there's a surprising array of additional join types. For 99% of your practical work, inner joins will be sufficient. Left joins will probably get you through the final 1%.

  • We don't recommend focusing on other join types until you've spent a lot of time using SQL databases on projects. However, if you'd like a taste of how deep the join rabbit hole goes, we think that this article is thorough and understandable without belaboring the point.