Sets are, alongside Lists and Tuples, the third element of the “Holy trinity of Python data structures”. Now it is time to learn them, so let’s dive in.
- What are Sets?
- Differences between Set and List
- When should I use sets?
- Creating Sets
- Modifying sets
- Comparing sets
- Union
- Subset and Superset
- Summary
- Challenges
What are Sets?
Sets are just another data structure: we use them to store a collection of items. Every item is unique: a Set never holds two elements that compare as equal.
Most of the time we can use Lists and Sets for the same job, but there are some differences between the two.
Differences between Set and List
Sets and Lists serve a similar purpose: both hold a collection of items. But there are some differences:
- Every element in the Set is unique. Duplicates are removed.
- Every element in the Set must be hashable, which in practice means an immutable value such as a number, a string or a tuple. A List cannot be an element of a Set.
- They are unordered. You can create a Set with Item1, Item2, Item3 and Item4, but when you print it the order may be different. We cannot create a Set storing the days of the week or the months of the year and rely on it keeping the order.
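The three differences above can be checked in a few lines. This is only a minimal sketch: the names `numbers`, `points`, `days` and `error_message` are made up for illustration.

```python
# Duplicates are dropped when the set is built.
numbers = {1, 1, 2, 2, 3}
print(len(numbers))  # 3

# Elements must be hashable: a list is mutable, so it is rejected.
try:
    invalid = {[1, 2], [3, 4]}
    error_message = None
except TypeError as error:
    error_message = str(error)
print(error_message)  # the exact wording depends on the Python version

# A tuple is immutable and hashable, so it is accepted as an element.
points = {(1, 2), (3, 4)}
print(len(points))  # 2

# A set gives no guarantee about order, so sort it when order matters.
days = {'Monday', 'Tuesday', 'Wednesday'}
print(sorted(days))  # ['Monday', 'Tuesday', 'Wednesday']
```

Note that `sorted()` returns a List, so the Set itself stays unordered.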
When should I use sets?
If we have Tuples, Lists and now Sets, when should we use Sets?
We should use them when every element of the collection has to be unique.
Imagine we have the ‘fridge_items’ list and each time we buy a new item, we add it to the list. We would have a list like this:
fridge_items = ['tomato', 'tomato', 'tomato', 'tomato', 'water_bottle', 'water_bottle', ...]
But we want to know which items we have, not how many of each. We can turn that list into a set instead, and here’s what we would have:
fridge_items = {'tomato', 'water_bottle', 'banana', 'apple', 'pulled_pork', 'cheese', ...}
So, when we want a collection of unique elements, where the order is not important, we use Sets.
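As a rough sketch of that idea, the built-in `set()` turns the list into a set. The short list below is a made-up sample, not the full fridge, and `unique_items` is a name chosen for illustration.

```python
# Hypothetical purchase history: one entry per item bought.
fridge_items = ['tomato', 'tomato', 'water_bottle', 'banana', 'water_bottle']

# set() keeps a single copy of each distinct item.
unique_items = set(fridge_items)
print(len(fridge_items))  # 5
print(len(unique_items))  # 3

# Asking "do we have this item?" is a typical use of a set.
print('banana' in unique_items)  # True
print('cheese' in unique_items)  # False
```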
Creating Sets
To create a Set we need a collection of items wrapped in curly braces:
fridge_items = {'tomato', 'water_bottle', 'banana', 'apple', 'pulled_pork', 'cheese'}
Remember when I said that a Set contains unique elements? Let’s create a set with repeated elements to see what happens:
my_crazy_set = {1, 2, 3, 4, 4, 4, 5, 6, '6', 'hello', 'hello'}
print(my_crazy_set)
> {1, 2, 3, 4, 5, 6, '6', 'hello'}
Hm, interesting. But pretty reasonable.
On creation, 4 appears three times, so the Set keeps it only once. 6 also appears twice, but once as an ‘int’ and once as a ‘string’: one is a number and the other is text, so both stay. The string ‘hello’ also appears twice, so Python removes the extra copy.
We can also create Sets from Lists:
my_crazy_set = set([1, 2, 3, 4, 4, 4, 4, 5, 6, '6', 'hello', 'hello'])
print(type(my_crazy_set))
> <class 'set'>
my_list = [1, 2, 3, 4, 5]
print(type(my_list))
> <class 'list'>
my_set = set(my_list)
print(type(my_set))
> <class 'set'>
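Two details about creation are easy to trip over, so here is a small sketch of them. Empty braces do not create a Set, and `set()` accepts any iterable, not only Lists. All variable names are made up for illustration.

```python
# Empty braces create a dict, not a set.
empty_braces = {}
print(type(empty_braces))  # <class 'dict'>

# set() with no arguments creates an empty set.
empty_set = set()
print(type(empty_set))  # <class 'set'>

# Any iterable works as input: a string yields its distinct characters.
letters = set('hello')
print(sorted(letters))  # ['e', 'h', 'l', 'o']

# A tuple or a range works too.
from_tuple = set((1, 1, 2))
from_range = set(range(3))
print(from_tuple == {1, 2})     # True
print(from_range == {0, 1, 2})  # True
```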
We said an element of a Set cannot be changed in place, right? Let’s try that.
my_set = {1, 2, 3, 4, 5}
my_set[2] = 334
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-87-4bd7f38c581f> in <module>
----> 1 my_set[2] = 334

TypeError: 'set' object does not support item assignment
We got an error. I wasn’t lying then 😊
Changing the value of an element is not supported: a Set has no positions, so there is nothing to index. But that does not mean we can’t modify a Set.
Modifying sets
We cannot change the value of an element, but we can add and remove elements thanks to the Set’s methods. The names are self-explanatory, so let’s take a look at them:
my_set = {'bananas', 'coconut', 'cherry', 'oranges'}
my_set.add('lemons')  # Adds 'lemons' to the set (a set has no positions, so it is not added at any particular place)
print(my_set)
> {'bananas', 'cherry', 'coconut', 'lemons', 'oranges'}
my_set.pop()  # Removes an arbitrary element and returns it
print(my_set)
> {'bananas', 'coconut', 'lemons', 'oranges'}
popped_fruit = my_set.pop()  # As pop() returns a value, we can store it in a variable
print(popped_fruit)
> oranges
my_set.discard('coconut')  # Removes the specified element
print(my_set)
> {'bananas', 'lemons'}
my_set.remove('lemons')  # Removes the specified element...again?
print(my_set)
> {'bananas'}
# remove() removes the specified element, like discard(), but if it doesn't find the element, it raises an error.
my_set.remove('lemons')
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-117-1f439aaff308> in <module>
----> 1 my_set.remove('lemons')

KeyError: 'lemons'
The methods that modify a Set are simple. But let’s recap:
- add() adds a new element to the set (at no particular position, since sets are unordered)
- pop() removes an arbitrary element and returns it
- discard(element_to_remove) removes the specified element (without raising an error if it is missing)
- remove(element_to_remove) removes the specified element (raising a KeyError if it is missing)
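The difference between discard() and remove() only shows up when the element is missing. The sketch below exercises that case, plus pop() on an empty set. The fruit 'kiwi' and the flag names are made up for illustration.

```python
my_set = {'bananas', 'coconut'}

# Adding an element that is already there changes nothing.
my_set.add('bananas')
print(len(my_set))  # 2

# discard() stays silent when the element is missing.
my_set.discard('kiwi')

# remove() raises KeyError when the element is missing.
try:
    my_set.remove('kiwi')
    remove_failed = False
except KeyError:
    remove_failed = True
print(remove_failed)  # True

# pop() also raises KeyError when the set is empty.
empty_set = set()
try:
    empty_set.pop()
    pop_failed = False
except KeyError:
    pop_failed = True
print(pop_failed)  # True
```

A reasonable rule of thumb: use discard() when a missing element is fine, and remove() when a missing element means something went wrong.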
Comparing sets
Sets also have methods to compare one set with another. Let’s try them with two sets:
set1 = {'hi', 'hello', 'goodbye', 'bye'}
set2 = {'hi', 'hello', 'welcome', 'see ya!'}
set1.intersection(set2)  # Returns the elements present in both sets
> {'hello', 'hi'}
set1.difference(set2)  # Returns the set1 elements not in set2
> {'bye', 'goodbye'}
set2.difference(set1)
> {'see ya!', 'welcome'}
set1.symmetric_difference(set2)  # Returns the elements that are in exactly one of the sets
> {'bye', 'goodbye', 'see ya!', 'welcome'}
set3 = {1, 2, 3, 4}
set1.isdisjoint(set3)  # Returns True if the sets share no elements, False otherwise
> True
set1.isdisjoint(set2)
> False
Recap:
- intersection() returns the elements common to both sets
- difference() returns the elements of the first set that are not in the second one
- symmetric_difference() returns the elements that are in exactly one of the two sets
- isdisjoint() returns a boolean: True if the sets have no element in common, False if they share at least one
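Python also offers operators for the first three comparisons: `&`, `-` and `^`. The sketch below checks that each operator gives the same result as the matching method. The name `common` is made up for illustration.

```python
set1 = {'hi', 'hello', 'goodbye', 'bye'}
set2 = {'hi', 'hello', 'welcome', 'see ya!'}

# Each operator matches one of the methods from the recap.
print((set1 & set2) == set1.intersection(set2))          # True
print((set1 - set2) == set1.difference(set2))            # True
print((set1 ^ set2) == set1.symmetric_difference(set2))  # True

# The methods accept any iterable, while the operators need sets on both sides.
common = set1.intersection(['hi', 'bye'])
print(sorted(common))  # ['bye', 'hi']
```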
Union
The union() method returns a new set containing the unique elements of both sets. Any repeated element is removed, as usual.
set1 = {'hi', 'hello', 'goodbye', 'bye'}
set2 = {'hi', 'hello', 'welcome', 'see ya!'}
set1.union(set2)
> {'bye', 'goodbye', 'hello', 'hi', 'see ya!', 'welcome'}
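Because union() returns a new set, the original sets stay untouched. The sketch below shows that, together with the `|` operator and the in-place update() method, which this post does not otherwise cover. The smaller sets and the name `merged` are made up for illustration.

```python
set1 = {'hi', 'hello'}
set2 = {'hello', 'welcome'}

# The | operator is equivalent to union() and builds a new set.
merged = set1 | set2
print(sorted(merged))  # ['hello', 'hi', 'welcome']
print(sorted(set1))    # ['hello', 'hi']

# update() merges set2 into set1 in place instead of creating a new set.
set1.update(set2)
print(set1 == merged)  # True
```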
Subset and Superset
The issubset() and issuperset() methods check whether every element of one set is contained in the other, and whether one set contains every element of the other, respectively.
set1 = {1, 2, 3, 4, 5}
set2 = {1, 2, 3}
set2.issubset(set1)  # Every element of set2 is in set1
> True
set1.issubset(set2)  # Not every element of set1 is in set2
> False
set1.issuperset(set2)  # set1 contains every element of set2
> True
set2.issuperset(set1)  # set2 does not contain every element of set1
> False
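The same checks can be written with comparison operators. A short sketch, which also shows the stricter `<` form that returns False when both sets are equal:

```python
set1 = {1, 2, 3, 4, 5}
set2 = {1, 2, 3}

# <= and >= are equivalent to issubset() and issuperset().
print(set2 <= set1)  # True
print(set1 >= set2)  # True

# Every set is a subset of itself...
print(set1 <= set1)  # True

# ...but not a proper subset: < also requires the sets to differ.
print(set1 < set1)   # False
print(set2 < set1)   # True
```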
Summary
We have learnt that Sets:
- Are an unordered collection of elements where each element is unique
- Hold elements that cannot be changed in place
- Should be used when we want a collection of elements with no repeated values and when the order of the elements is not important
- Have methods to add and remove elements
- Can be compared with other Sets
- Can be merged with other Sets
- Can be checked for being a subset or superset of another Set
Challenges
We have learnt a lot, so let’s put it to use.
Challenge 1: Replace a value from the set
As you can tell, I’m not a native English speaker. I made a mistake while writing my grocery list. Can you change it to ‘Ice cream’?
grocery_list = {'Bananas', 'Apples', 'Pizza', 'I scream', 'Lettuce'}
Challenge 2: Count how many unique items there are
Our music band has a lot of instruments in its inventory, but how many unique instruments are there?
band_instruments = ['trumpet', 'saxo', 'accordion', 'clarinet', 'clarinet', 'saxo', 'accordion', 'trumpet', 'accordion', 'tuba', 'trumpet', 'accordion', 'clarinet', 'trumpet', 'saxo', 'clarinet', 'tuba']
Challenge 3: Display your working days of the week
We have two sets: the workable days of the week and the vacation days. Remove the vacation days from the workable days to display a set with the days we have to work.
Extra points for using few lines!
workable_days = {'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday'}
vacation_days = {'Saturday', 'Sunday', 'Friday', 'Wednesday'}
# Result
print(working_days)
> {'Monday', 'Thursday', 'Tuesday'}
Take your time, try different things. Re-read sections of this post to see what options you have in your toolbox. Google if you need to.