SQL: Constraint Analysis
Welcome to the Constraint Analysis lesson!
This lesson is shown as static text below. However, it's designed to be used interactively. Click the button below to start!
What if a foreign key column is null? By default, that's allowed. A cat with a null
owner_idmeans "this cat has no owner".>
exec(`CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT NOT NULL);CREATE TABLE cats (owner_id INTEGER REFERENCES people(id),name TEXT NOT NULL);INSERT INTO cats (owner_id, name) VALUES (null, 'Ms. Fluff');SELECT * FROM cats;`);Result:
The database lets us do this because we might want it! For example, imagine that we're building a site where users can register with discount codes. Some users register with a discount code, but others don't. We want to track which users registered with which discount codes.
This sounds simple, but there are a few subtle decisions to be made. Let's do some analysis to come up with appropriate database constraints.
Our database will have two tables,
discounts (id, discount_code)andusers (id, discount_id, name). We'll analyze the three most common types of constraints: foreign keys, nullability, and uniqueness.- First: should users'
discount_idbe a foreign key todiscounts? Yes! Otherwise users could reference discounts that don't exist, which doesn't make sense.
- First: should users'
- Second: should every user have a
discount_id, or is it OK for some of them to be null? Some users will register without using a discount code. For those users,discount_idshould be null. Sodiscount_idshould be nullable.
- Second: should every user have a
- Third: can two users reference the same discount? This is a more subtle decision.
In some systems, discount codes can be used many times; in others, they can only be used once. We'll decide that discount codes can only be used once. We can guarantee that by making users'
discount_idunique. If no two users can have the samediscount_id, then no discount code can be used twice.That gives us three constraint decisions:
- Users'
discount_idis a foreign key to discounts. - Users'
discount_idis nullable. - Users'
discount_idis unique.
- Users'
>
exec(`CREATE TABLE discounts (id INTEGER PRIMARY KEY, discount_code TEXT NOT NULL);CREATE TABLE users (id INTEGER PRIMARY KEY,name TEXT NOT NULL,discount_id INTEGER REFERENCES discounts(id) NULL UNIQUE);`);Result:
Now let's make sure that these tables work as intended. We'll run four different checks.
(Each of these examples uses the tables above, but the tables always begin empty. If you think that a statement will cause an error, you can type
erroras its output. Remember that executing a successfulINSERTreturns[].)First, users can't reference discounts that don't exist.
- Note: this code example reuses elements (variables, etc.) defined in earlier examples.
>
exec(`INSERT INTO users (name, discount_id) VALUES ('Amir', 1234)`);Result:
Error: FOREIGN KEY constraint failed
Second, users can register with no discount at all (discount_id can be null).
- Note: this code example reuses elements (variables, etc.) defined in earlier examples.
>
exec(`INSERT INTO users (name, discount_id) VALUES ('Amir', null)`);Result:
[]
Third, users can reference discounts that do exist.
- Note: this code example reuses elements (variables, etc.) defined in earlier examples.
>
exec(`INSERT INTO discounts (id, discount_code) VALUES (1, 'one-time-use-a1b2c3');INSERT INTO users (name, discount_id) VALUES ('Amir', 1);`);Result:
[]
Fourth, two users can't reference the same discount. (Amir uses the discount, then shares it with Betty, who tries to use it.)
- Note: this code example reuses elements (variables, etc.) defined in earlier examples.
>
exec(`INSERT INTO discounts (id, discount_code) VALUES (1, 'one-time-use-a1b2c3');INSERT INTO users (name, discount_id)VALUES('Amir', 1),('Betty', 1);`);Result:
Error: UNIQUE constraint failed: users.discount_id
As you gain experience with SQL, you'll learn patterns about when certain constraints are appropriate. For example, users' email addresses are usually
TEXT NOT NULL UNIQUE. Anis_adminflag will usually beBOOLEAN NOT NULL DEFAULT FALSE. And so on.Foreign keys are more tricky. For foreign keys, it's a good idea to always slow down and explicitly ask the questions that we asked above:
- "Is it OK for a row to have no value for this foreign key?"
- "Is it OK for two rows to have the same value for this foreign key?"
Here's another example where we ask the same questions, but get very different answers. Users can post comments. Each comment has a
user_idreferencing the user who posted it. Now, the analysis questions:- First: should
user_idbe a foreign key fromcommentstousers? Yes! We don't want the database to think that a comment was posted by someone who doesn't exist.
- First: should
- Second: should every comment have a
user_id, or is it OK for some of them to have a nulluser_id? Every comment is posted by someone (unless the computer is haunted by ghosts). So comments'user_idshould beNOT NULL.
- Second: should every comment have a
- Third: can two comments reference the same user? Like before, this one is more subtle.
If only one comment can reference a given user, then trying to insert a second comment by the same user will error. A system with that constraint would only allow each user to post exactly one comment, ever. We want users to post multiple comments, so the comments'
user_idforeign key should not be unique.Again, we have three constraint decisions. But when compared to the user/discount example, we've made opposite decisions about nullability and uniqueness:
- Comments'
user_idis a foreign key to users (the same as user/discount). - Comments'
user_idis not nullable (the opposite of user/discount). - Comments'
user_idis not unique (the opposite of user/discount).
- Comments'
>
exec(`CREATE TABLE users (id INTEGER PRIMARY KEY,name TEXT NOT NULL);CREATE TABLE comments (user_id INTEGER REFERENCES users(id) NOT NULL,comment_text TEXT NOT NULL);`);Result:
Now let's check each of those constraints, like we did for the user/discount example. (Each of these examples uses the tables above, but the tables always begin empty.)
First, comments can't reference users that don't exist.
- Note: this code example reuses elements (variables, etc.) defined in earlier examples.
>
exec(`INSERT INTO comments (user_id, comment_text)VALUES (1, 'Ms. Fluff needs a bath!');`);Result:
Error: FOREIGN KEY constraint failed
Second, comments must reference a user (user_id can't be null).
- Note: this code example reuses elements (variables, etc.) defined in earlier examples.
>
exec(`INSERT INTO comments (user_id, comment_text)VALUES (null, 'Ms. Fluff needs a bath!');`);Result:
Error: NOT NULL constraint failed: comments.user_id
Third, comments can reference users that exist.
- Note: this code example reuses elements (variables, etc.) defined in earlier examples.
>
exec(`INSERT INTO users (id, name) VALUES (1, 'Amir');INSERT INTO comments (user_id, comment_text)VALUES (1, 'Ms. Fluff needs a bath!');`);Result:
[]
Fourth, two comments can reference the same user.
- Note: this code example reuses elements (variables, etc.) defined in earlier examples.
>
exec(`INSERT INTO users (id, name) VALUES (1, 'Amir');INSERT INTO comments (user_id, comment_text)VALUES(1, 'Ms. Fluff needs a bath!'),(1, 'Ms. Fluff strongly dislikes water.');`);Result:
[]
Our user/discount example is what's often called a "one-to-one" relationship. Each user relates to zero or one discounts, and each discount relates to zero or one users. A user never has multiple discounts, and a discount is never used by multiple users.
The user/comment example is a "one-to-many" relationship. One user can have many comments, but a comment is only related to one user.
Shorthand terms like one-to-one and one-to-many are useful for talking to other developers, but the database doesn't know what they mean. The database only knows about constraints. Even if you use shorthand with other humans, you have to make decisions about the individual constraints in your tables:
- Should this column be a foreign key?
- Can this column be null?
- Can multiple rows have the same value for this column?