Why You Should Learn SQL
Posted on
The SQL language is old, strange, and important. We'll take those in reverse order:
SQL is important. Unlike many other types of databases, SQL databases can give strong guarantees of data correctness.
If we have a regular, user-facing bug in our application, some users will see an error. When we fix the bug, the users stop seeing the error and everything is fine.
But if we have a bug that inserts incorrect data into the database, then fixing the bug doesn't fix the problem. Even after the fix, the database is full of incorrect data, so we have to do something to clean it up. Here's a concrete example:
We never expected orders to have a null phone_number, but a temporary application bug inserted some nulls there. Now our order history page is rendering those orders incorrectly. The business reports used by the management team throw errors because they don't expect nulls, so the business's cashflow is now invisible. Our analytics continue to run, but they produce incorrect numbers because every order with a null phone_number is defaulting to country code 1 (the USA and Canada).
There's no way to retroactively get correct phone numbers for all of those orders!
We have two possible ways to mitigate this problem once it's happened. First, we can adapt the rest of the system to correctly handle the nulls. That's a lot of work to handle a case that was never supposed to happen; we're only considering this because of a bug we already fixed.
The second option is to make up fake phone numbers for all of the affected orders. Said another way: we can intentionally put incorrect data into the database, complicating attempts to contact those customers.
Neither of those options is desirable. However, there is a third option: tell the database that the phone_number is never allowed to be null.
In a SQL database like PostgreSQL, that non-nullability is guaranteed. If we try to insert a null, the database rejects it. If PostgreSQL allows us to insert a null in that column, it's a bug in PostgreSQL itself. (PostgreSQL is stable and well-tested; you should never expect to find a bug like this.)
OK: SQL is important because our data is important. It's also very strange, but in ways that make it powerful.
Most popular languages are imperative: code starts at the top of the function and executes line by line until the bottom. This is our default for a good reason: it's easy to think about how the code executes, at least within a single function.
SQL isn't like that at all. In SQL, we describe the results that we want declaratively. For example, in an imperative language we might say: "Loop over each cat. If a cat's age is over 5, add it to the oldCats array. Then return the oldCats array." In SQL, we say something more akin to "Give me all of the cats whose age is over 5."
What happens when we execute these two versions of the code? The imperative version says what the computer should do, step by step. It's difficult for a compiler to optimize it because it's so rigid.
The SQL version doesn't say what order things should happen in; it just describes the result. That lets the database optimize it aggressively. The database will slice up the query into its smallest parts, reorder it as needed, throw some parts away if they're not needed, replace them with more efficient versions, etc. This all happens without any input from us as the programmers, and it's guaranteed not to change the results.
In everyday situations, that optimization turns minutes-long queries into single-digit milliseconds. In extreme cases, it turns years-long queries into seconds.
Compilers for imperative languages have optimizers too. But SQL databases' optimizers are much more powerful because SQL lets us say what data we want, rather than how to get it.
OK: SQL is important and strange. It's also old: it first appeared in 1974. That definitely shows, so let's not kid ourselves. In SQL, we say things like "ALTER TABLE users ALTER COLUMN email DROP NOT NULL".
In English, that means "allow users' email addresses to be null from now on". The SHOUTING PARTS are SQL syntax, whereas "users" and "email" are parts of our data. This shouting convention comes from THE OLD TIMES.
You can see that SQL is unusually wordy. JavaScript has 64 keywords: "if", "function", "for", etc. C has 34. SQLite has 140 keywords. But SQLite is "lite", like its name says. PostgreSQL 11 has 760 keywords. That's the exact number of words that you've read in this article so far, not including this sentence!
OK, so SQL is important, which makes it worth learning. But it's also strange and old, which can make it offputting.
That's a great reason for us to teach Execute Program to teach you SQL. We build complex knowledge models of tools like SQL, regular expressions, and TypeScript, which allows Execute Program to teach the topics in thousands of different orders depending on your strengths. Unlike most learning tools, you'll rarely go a full minute without completing an interactive code example. Those examples are then reviewed automatically, on exponentially increasing intervals, so you don't forget them.
You can try our first SQL lesson without creating an account. It starts lightly, at the very beginning, and ramps up from there. Our favorite lesson comes late in the course, when you write a real SQL injection exploit to attack your own database. We hope that you enjoy it!
Gary Bernhardt