Python in Detail: Frozen Sets
Welcome to the Frozen Sets lesson!
We've seen that sets, lists, and dicts are not hashable because they're mutable. Both dictionary keys and set elements must be hashable, which means we can't use sets as keys, or have sets as set elements. But what if we really want a set of sets, or (less likely) a dictionary where the keys are sets?
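As a quick illustration of the paragraph above, the sketch below probes which built-in containers can be hashed. The helper name is_hashable is made up for this example; it is not part of Python.

```python
def is_hashable(value):
    """Return True if hash(value) succeeds, False if it raises TypeError."""
    try:
        hash(value)
    except TypeError:
        return False
    return True


# Mutable containers cannot be hashed.
print(is_hashable([1, 2]))      # list
print(is_hashable({1, 2}))      # set
print(is_hashable({"a": 1}))    # dict

# An immutable tuple of hashable values can be hashed.
print(is_hashable((1, 2)))

# So a set cannot be an element of another set.
try:
    {{1, 2}, {3}}
except TypeError as error:
    print("TypeError:", error)
```

The last statement fails for the same reason a set can't be a dictionary key: Python needs a hash for every element before it can store it.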
The frozenset built-in lets us do that. It creates an immutable set, which cannot change after it's created. (Remember, passing an iterable to set(...) creates a set with each element.)
>
some_set = frozenset([1, 2, 3])
3 in some_set
Result:
True
There are no methods to change a frozenset, so trying to call one raises an exception.
>
some_set = frozenset([1, 2, 3])
some_set.remove(2)
Result:
AttributeError: 'frozenset' object has no attribute 'remove'
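To see which methods are missing, one option is to compare the public attribute names of the two types. This is only an exploratory sketch, and the variable name mutating_only is illustrative.

```python
# Public names that exist on set but not on frozenset.
mutating_only = sorted(
    name
    for name in set(dir(set)) - set(dir(frozenset))
    if not name.startswith("_")
)
print(mutating_only)

# Read-only operations are shared by both types.
frozen = frozenset([1, 2, 3])
print(frozen.union([4]))              # builds a new frozenset
print(frozen.issubset({1, 2, 3, 4}))
print(frozen)                         # the original is unchanged
```

Every name in the printed list is a method that changes a set in place, which is exactly what a frozenset leaves out.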
Sets are iterable, so we can build a frozenset from an existing set.
>
some_set = frozenset({6, 4, 1})
some_set
Result:
frozenset({1, 4, 6})
frozensets are also iterable, so we can create a regular set from a frozenset.
>
frozen_set = frozenset([1, 2])
some_set = set(frozen_set)
some_set.add(3)
some_set
Result:
{1, 2, 3}
Two Python frozen sets (or regular sets) can be equal, even if they're different objects in memory. For example, we can build a frozen set containing the numbers 1 and 2. Then we can build a second frozen set that also contains the numbers 1 and 2. These sets are not identical, but they are equal.
>
set_1 = frozenset([1, 2])
set_2 = frozenset([2, 1])
- Note: the code examples that follow reuse variables defined in earlier examples.
>
set_1 is set_2
Result:
False
>
set_1 == set_2
Result:
True
>
# This is a regular, mutable set.
set_3 = set([1, 2])
(set_1 is set_3, set_1 == set_3)
Result:
(False, True)
The two sets also hash to the same value, even though they're separate objects. That's required by Python's hash rules: equal values must have the same hash.
>
hash(set_1) == hash(set_2)
Result:
True
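Here is a self-contained sketch of the same rule, using only built-ins and illustrative variable names. Construction order does not affect a frozenset's equality or hash, while a regular set with the same elements refuses to be hashed at all.

```python
first = frozenset(["a", "b", "c"])
second = frozenset(["c", "b", "a"])

# Separate objects with equal values and equal hashes.
print(first is second)
print(first == second)
print(hash(first) == hash(second))

# A regular set with the same elements is equal to the frozenset,
# but it has no hash because it is mutable.
regular = {"a", "b", "c"}
print(regular == first)
try:
    hash(regular)
except TypeError as error:
    print("TypeError:", error)
```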
Because these two frozensets have the same hash and are equal, we can use them interchangeably as set elements or dictionary keys.
>
set_of_sets = {set_1}
set_2 in set_of_sets
Result:
True
>
a_dict = {set_1: 5}
a_dict[set_2]
Result:
5
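One place this interchangeability is handy is deduplicating unordered groups. The sketch below uses made-up data (the pairs list and the names in it are hypothetical) to show frozensets acting as both set elements and dictionary keys.

```python
# Hypothetical data: each tuple is a pair of people who shook hands.
pairs = [("ann", "bob"), ("bob", "ann"), ("cat", "ann")]

# Converting each pair to a frozenset makes ("ann", "bob") and
# ("bob", "ann") the same set element.
unique_pairs = {frozenset(pair) for pair in pairs}
print(len(unique_pairs))

# frozensets also work as dictionary keys, so we can count each pair.
counts = {}
for pair in pairs:
    key = frozenset(pair)
    counts[key] = counts.get(key, 0) + 1
print(counts[frozenset(["bob", "ann"])])
```

A tuple-based version would treat the first two pairs as different keys, because tuples compare element by element in order.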
This may be surprising depending on which programming languages you're familiar with. For example, JavaScript's sets don't work in this way! Neither do JavaScript's maps, which are like Python's dicts. In JavaScript, both sets and maps compare objects (including other sets) by identity, not equality. Neither of the two examples above would work in JavaScript.
This shows us one reason that Python has so many rules around hashability and equality. Learning the rules does require some extra effort. But in return we can think about natural value equality, where {1} equals {1} even if they're different set objects at different locations in memory. Many Python programmers view this as a significant benefit over other languages.
Finally, when using set operators like | and &, we can mix frozenset and set. The result will have the type of the left operand (either set or frozenset).
>
union_set = frozenset([1, 2]) | {1, 2}
type(union_set)
Result:
frozenset
>
union_set = {1, 2} | frozenset([1, 2])
type(union_set)
Result:
set
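The same left-operand rule can be checked for the other binary set operators. The sketch below is a small experiment, not part of the lesson's code; it uses the standard operator module so that |, &, - and ^ can be tried in a loop.

```python
import operator

frozen = frozenset([1, 2, 3])
regular = {2, 3, 4}

# The function forms of |, &, - and ^.
ops = (operator.or_, operator.and_, operator.sub, operator.xor)

# Try each operator with the operands in both orders and
# report the type of each result.
for op in ops:
    left_frozen = op(frozen, regular)
    left_regular = op(regular, frozen)
    print(op.__name__, type(left_frozen).__name__, type(left_regular).__name__)
```

In each row, the first type printed comes from putting the frozenset on the left, and the second from putting the regular set on the left.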
Here's a code problem:
We're building a software system that identifies fruits by their physical properties. For example, a round red fruit is probably an apple, and a long yellow fruit is probably a banana.
The code below builds a dictionary, properties_to_fruit, from the fruit_properties list, mapping fruit properties to the fruits' names. For example, when we see a round red fruit, we'll call identify(("round", "red")).
Currently, there's a problem: the identify function only works when we pass the properties in exactly the right order. Calling identify(("round", "red")) works, but calling identify(("red", "round")) doesn't work.
To fix the bug, use frozensets as the dictionary keys, rather than tuples. Sets don't care about order, so we can look for a {"red", "round"} fruit or a {"round", "red"} fruit.
A hint: you'll need to make small changes to existing code that creates the dictionary keys, and then access them in the identify function. You don't need to add any new lines of code!
With that fix applied, the code looks like this:
fruit_properties = [
    (("round", "red"), "apple"),
    (("yellow", "long"), "banana"),
    (("green", "round"), "kiwi"),
    (("blue", "small", "round"), "blueberry"),
    (("thorny", "controversial"), "durian"),
]

properties_to_fruit = {}
for properties, fruit_name in fruit_properties:
    properties_to_fruit[frozenset(properties)] = fruit_name

def identify(properties):
    return properties_to_fruit[frozenset(properties)]

assert identify(("round", "red")) == "apple"
assert identify(("yellow", "long")) == "banana"
assert identify(("blue", "small", "round")) == "blueberry"
# The ordering of the properties shouldn't matter.
assert identify(("round", "small", "blue")) == "blueberry"
assert identify(("round", "green")) == "kiwi"
# Sometimes we pass in the properties as other iterables (not tuples).
assert identify(["round", "red"]) == "apple"
assert identify({"round", "green"}) == "kiwi"
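For comparison, here is a sketch of what a tuple-keyed version of this lookup might look like. It is a reconstruction from the problem description, not the lesson's original starting code, and the names tuple_keyed, identify_with_tuples, and frozen_keyed are invented for this example.

```python
fruit_properties = [
    (("round", "red"), "apple"),
    (("yellow", "long"), "banana"),
]

# Tuples compare element by element, so order is part of the key.
tuple_keyed = dict(fruit_properties)


def identify_with_tuples(properties):
    return tuple_keyed[tuple(properties)]


print(identify_with_tuples(("round", "red")))

# The same properties in a different order are a different key.
try:
    identify_with_tuples(("red", "round"))
except KeyError as error:
    print("KeyError:", error)

# The same lookup succeeds once both sides use frozenset keys.
frozen_keyed = {frozenset(props): name for props, name in fruit_properties}
print(frozen_keyed[frozenset(("red", "round"))])
```

Converting on both sides matters: the keys must be frozensets when the dictionary is built, and the argument must be converted to a frozenset when it is looked up.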