Execute Program

Python in Detail: Set Operations

Welcome to the Set Operations lesson!

  • We've seen the basics of sets. In this lesson, we'll explore some common set operations.

  • Sometimes we want an empty set. We can't write it as {}, because {} is an empty dict!

  • >
    type({})
    Result:
    dict
  • >
    {} == set()
    Result:
    False
  • Fortunately, set() works.

  • >
    len(set())
    Result:
    0
  • >
    set() == set([])
    Result:
    True
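  • As a small sketch of why an empty set is useful: it's a common starting point for collecting unique values. The names seen and word below are made up for illustration.

```python
# Start with an empty set and collect each distinct word once.
seen = set()
for word in ["a", "b", "a", "c", "b"]:
    seen.add(word)  # adding a value that's already present changes nothing

print(len(seen))                # 3
print(seen == {"a", "b", "c"})  # True
```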
  • There are three different ways to remove set elements, each with a different purpose. First, .remove removes a specific value. (Remember, a set can't have duplicate values, so there's never a question about which occurrence of a value to remove.)

  • >
    numbers = {1, 3, 5}
    numbers.remove(3)
    numbers
    Result:
    {1, 5}
  • If the value isn't part of the set, .remove raises KeyError.

  • >
    numbers = {1, 3, 5}
    numbers.remove(4)
    numbers
    Result:
    KeyError: 4
  • Second, sometimes we want to remove a value, but we don't want the KeyError if it's not in the set. We can use the .discard method for that.

  • >
    numbers = {1, 3, 5}
    numbers.discard(3)
    numbers
    Result:
    {1, 5}
  • >
    numbers = {1, 3, 5}
    numbers.discard(4)
    numbers
    Result:
    {1, 3, 5}
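  • One way to picture the difference: .discard behaves roughly like .remove with the KeyError swallowed. The helper name discard_like below is hypothetical, written only to illustrate that idea. In real code we'd just call .discard.

```python
def discard_like(values, value):
    # Hypothetical helper: mimics .discard by catching .remove's KeyError.
    try:
        values.remove(value)
    except KeyError:
        pass  # the value wasn't in the set, so there's nothing to do


numbers = {1, 3, 5}
discard_like(numbers, 3)  # present: removed
discard_like(numbers, 4)  # absent: silently ignored
print(numbers == {1, 5})  # True
```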
  • Third, sometimes we want to remove an element and get its value. The .pop method does that.

  • >
    numbers = {1, 3, 5}
    total = 0
    while len(numbers) > 0:
        total += numbers.pop()
    total
    Result:
    9
  • Sets are unordered, so the example above can process the elements in any order. It could be 1+3+5, or 1+5+3, or 3+1+5, etc. The total is 9 either way.

  • The pop method is most useful when we need to iteratively process all of the elements in a set, one by one, removing them as we go. But we can't rely on order, because the order is unpredictable.
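  • The "process and remove" pattern might look like the sketch below. The names pending and processed are hypothetical work-queue names. The sketch also shows that calling .pop on an empty set raises KeyError, which is why the loop has to stop once the set is empty.

```python
pending = {"a.txt", "b.txt", "c.txt"}  # hypothetical work items
processed = []

while len(pending) > 0:
    item = pending.pop()    # removes and returns an arbitrary element
    processed.append(item)

# Sort before comparing, because the pop order isn't predictable.
print(sorted(processed) == ["a.txt", "b.txt", "c.txt"])  # True
print(len(pending))                                      # 0

try:
    pending.pop()           # nothing left to pop
except KeyError:
    print("pop on an empty set raises KeyError")
```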

  • Sets are also iterable, like lists and dicts. For example, we can rewrite the code above using a for loop. This way, we don't have to worry about .pop or len(numbers) > 0.

  • >
    numbers = {1, 3, 5}
    total = 0
    for number in numbers:
        total += number
    total
    Result:
    9
  • It's even simpler to use the built-in sum function, which accepts any iterable, including a set.

  • >
    numbers = {1, 3, 5}
    sum(numbers)
    Result:
    9
  • We shouldn't change a set during iteration. A full explanation of the reason is out of scope here. But as a simple summary: the iterator remembers where it is in the iteration process, but that's based on the data that existed when iteration began. Changing the data invalidates the iterator's understanding of where it is.

  • Sometimes, Python can detect the change and raise an error, like in the example below. Unfortunately, there are other cases where Python won't notice. The safest thing is to never mutate any data structure, including sets, while iterating over it.

  • >
    numbers = {1, 3, 5}
    total = 0
    for number in numbers:
        total += number
        numbers.discard(number)
    total
    Result:
    RuntimeError: Set changed size during iteration
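  • When we do need to remove elements based on a loop, two common workarounds are to iterate over a snapshot of the set, or to build a new set instead of mutating the old one. The sketch below shows both. The variable names source and odds are made up for illustration.

```python
numbers = {1, 2, 3, 4, 5}

# Option 1: iterate over a snapshot (a list copy), and mutate the original.
for number in list(numbers):
    if number % 2 == 0:
        numbers.discard(number)

# Option 2: leave the original alone and build a new set.
source = {1, 2, 3, 4, 5}
odds = {number for number in source if number % 2 != 0}

print(numbers == {1, 3, 5})  # True
print(odds == {1, 3, 5})     # True
```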
  • Python supports the common set operations that you may have seen in other languages or in mathematics. For example, set_a | set_b is a union: a new set with all elements that occur in either set_a or in set_b. As always, there are no duplicates in the resulting set.

  • >
    set_a = {1, 4, 6}
    set_b = {1, 2, 3}
    set_a | set_b
    Result:
    {1, 2, 3, 4, 6}
  • set_a & set_b is an intersection. It includes any element that occurs in both sets.

  • >
    set_a = {1, 2, 6}
    set_b = {1, 2, 3}
    set_a & set_b
    Result:
    {1, 2}
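  • As a side note, sets also have .union and .intersection methods that compute the same results as | and &. One difference worth knowing: the operators require both sides to be sets, while the methods accept any iterable. A brief sketch:

```python
set_a = {1, 2, 6}
list_b = [1, 2, 3]  # a list, not a set

# The method forms accept any iterable.
print(set_a.union(list_b) == {1, 2, 3, 6})  # True
print(set_a.intersection(list_b) == {1, 2})  # True

# The operator forms require a set on both sides.
try:
    set_a | list_b
except TypeError:
    print("| between a set and a list raises TypeError")
```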
  • set_a - set_b is a difference: all elements in set_a that are not in set_b. Or, we can think of it as set_a, but with all of set_b's elements removed. When set_b has elements that aren't in set_a, like 4 and 6 in the example below, they don't affect the difference.

  • >
    set_a = {1, 2, 3}
    set_b = {2, 4, 6}
    set_a - set_b
    Result:
    {1, 3}
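  • Unlike union and intersection, difference depends on the order of its operands. The sketch below reuses the same two sets to show both directions, and confirms that the original sets aren't modified.

```python
set_a = {1, 2, 3}
set_b = {2, 4, 6}

a_minus_b = set_a - set_b  # elements of set_a that aren't in set_b
b_minus_a = set_b - set_a  # elements of set_b that aren't in set_a

print(a_minus_b == {1, 3})  # True
print(b_minus_a == {4, 6})  # True

# The operator builds a new set; the operands are unchanged.
print(set_a == {1, 2, 3} and set_b == {2, 4, 6})  # True
```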
  • Finally, set_a ^ set_b is a symmetric set difference: all elements that are in one of the sets, but not both.

  • Symmetric set difference is equivalent to (set_a | set_b) - (set_a & set_b). But that's twice the code!

  • >
    set_a = {1, 2, 3}
    set_b = {0, 2, 4}
    set_a ^ set_b
    Result:
    {0, 1, 3, 4}
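  • We can check the equivalence directly. The sketch below compares ^ against the longer expression from above, and against a second spelling built from two differences. The variable names are made up for illustration.

```python
set_a = {1, 2, 3}
set_b = {0, 2, 4}

short_form = set_a ^ set_b
long_form = (set_a | set_b) - (set_a & set_b)
two_differences = (set_a - set_b) | (set_b - set_a)

print(short_form == long_form)        # True
print(short_form == two_differences)  # True
```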
  • Now that we understand Python's set operations, a future lesson will show how sets are used in practice.