Regular Expressions: Basic Character Sets
Welcome to the Basic Character Sets lesson!
With "or" expressions, we can recognize a whole set of characters.
> /^c(a|o|u)t$/.test('cat');
Result: true
This gets tiresome if we need many options in the "or". Fortunately, we can use a character set to simplify it. The set [aou] is equivalent to (a|o|u).
> /^c[aou]t$/.test('cat');
Result: true
> /^c[aou]t$/.test('cot');
Result: true
> /^c[aou]t$/.test('cet');
Result: false
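As a rough check of the equivalence above, the sketch below runs both patterns over a few sample words and compares the answers. The names withAlternation, withSet, words, and agreement are made up for this illustration; only the two patterns come from the lesson.

```javascript
// Hypothetical names; the two patterns are the ones shown in the lesson.
const withAlternation = /^c(a|o|u)t$/;
const withSet = /^c[aou]t$/;

// Sample inputs chosen for illustration.
const words = ['cat', 'cot', 'cut', 'cet', 'caot'];

// For each word, record whether both patterns give the same answer.
const agreement = words.map(
  (word) => withAlternation.test(word) === withSet.test(word)
);

console.log(agreement.every(Boolean)); // true
```

Note that 'caot' fails under both patterns: the set, like the "or", stands for a single character.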
What if we want to allow any lowercase letter? We'd have to write /(a|b|c|d|e| and so on. Instead, we can write another character set.
> /[abcdefghijklmnopqrstuvwxyz]/.test('a');
Result: true
> /[abcdefghijklmnopqrstuvwxyz]/.test('g');
Result: true
That was shorter, but still wordy. We can specify an entire range of characters by using -.
> /[a-z]/.test('g');
Result: true
> /[1-3]/.test('1');
Result: true
> /[1-3]/.test('a');
Result: false
> /[1-3]/.test('2');
Result: true
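A range is shorthand for listing every character between its two endpoints. The sketch below compares [a-z] with the spelled-out set for each lowercase letter. It also goes one small step beyond the lesson, as an extra illustration, by putting two ranges in one set. The variable names are invented for this example.

```javascript
// Hypothetical names; both patterns appear in the lesson.
const spelledOut = /[abcdefghijklmnopqrstuvwxyz]/;
const range = /[a-z]/;

// Check that both patterns accept every lowercase letter.
const letters = 'abcdefghijklmnopqrstuvwxyz'.split('');
const sameForAllLetters = letters.every(
  (ch) => spelledOut.test(ch) && range.test(ch)
);
console.log(sameForAllLetters); // true

// Ranges are case-sensitive: 'G' is outside a-z.
console.log(range.test('G')); // false

// Not from the lesson: a set can hold several ranges side by side.
const letterOrDigit = /[a-z0-9]/;
console.log(letterOrDigit.test('7')); // true
console.log(letterOrDigit.test('_')); // false
```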
As usual, we escape special characters when we want them to be literal. This set contains only one character, an escaped ], written as \].
> /[\]]/.test(']');
Result: true
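The same escaping idea extends to the other characters that have a special role inside brackets. As a small sketch that goes slightly beyond the lesson's example, the set below holds three escaped characters: ], \, and -. The name specials is hypothetical.

```javascript
// One set containing three characters that are special inside brackets:
// ']' closes the set, '\' starts an escape, '-' forms a range.
const specials = /^[\]\\\-]$/;

console.log(specials.test(']'));  // true
console.log(specials.test('\\')); // true (the string holds one backslash)
console.log(specials.test('-'));  // true
console.log(specials.test('a'));  // false
```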
Character sets can be negated to mean "everything not in the set".
We negate with ^, a character that we already saw. Normally it means "beginning of line". But as the first character inside [square brackets], it means "negate the character set". (There are only so many symbols on a keyboard, so some get reused.)
> /[^a]/.test('a');
Result: false
> /[^a]/.test('5');
Result: true
Negation applies to the entire character set. The regex /[^ab]/ means "any character other than a or b".
> /[^ab]/.test('a');
Result: false
> /[^ab]/.test('c');
Result: true
> /[^ab]/.test('_');
Result: true
Negation also applies to ranges.
> /[a-z]/.test('h');
Result: true
> /[^a-z]/.test('h');
Result: false
> /[^a-z]/.test('5');
Result: true
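Because ^ has two meanings, it can appear twice in one regex. The sketch below uses it once as an anchor and once as a negation, then shows two consequences of negation that are easy to miss. The name startsWithNonLowercase is invented for this example.

```javascript
// Outside brackets, ^ anchors to the start; inside, it negates the set.
const startsWithNonLowercase = /^[^a-z]/;

console.log(startsWithNonLowercase.test('5cats')); // true
console.log(startsWithNonLowercase.test('cats5')); // false

// A negated set still has to match one character, so it fails on ''.
console.log(/[^a]/.test('')); // false

// Without anchors, one character outside the set anywhere is enough.
console.log(/[^a]/.test('aab')); // true
console.log(/[^a]/.test('aaa')); // false
```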
Character sets match exactly one character in the string. (This is like character classes, which also match only one character.) To match more than one character, we can use + or *.
> /^[a-z]$/.test('cat');
Result: false
> /^[a-z]+$/.test('cat');
Result: true
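To see how + and * differ when applied to a set, the sketch below tries both on the same inputs. The names oneOrMore and zeroOrMore are made up for this illustration.

```javascript
// Hypothetical names; the + pattern is the one from the lesson.
const oneOrMore = /^[a-z]+$/;
const zeroOrMore = /^[a-z]*$/;

console.log(oneOrMore.test('cat'));  // true
console.log(zeroOrMore.test('cat')); // true

// The two differ only on the empty string.
console.log(oneOrMore.test(''));  // false
console.log(zeroOrMore.test('')); // true

// With both anchors, every character must come from the set.
console.log(oneOrMore.test('cat5')); // false
```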