Python’s set is a powerful built-in collection data type that offers performance benefits over list objects in certain cases. The set.update() method provides a convenient way to add multiple items to a set in one call, whereas the set.add() method adds a single item at a time.
Python is one of the world’s most popular programming languages for good reason: it provides powerful built-in data types to handle common problems. The set datatype suits collections where uniqueness is required. Python’s sets are faster than lists for membership lookups but can be slightly slower to iterate over.
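The lookup difference can be measured with the standard-library timeit module. The following is a rough sketch, assuming a hypothetical collection of 100,000 integers. Absolute timings depend on the machine and Python version, so no output is shown here.

```python
import timeit

# Hypothetical sample data: the same 100,000 integers stored both ways
items_list = list(range(100_000))
items_set = set(items_list)
target = 99_999  # worst case for the list: the last element

# Time 100 membership tests against each collection
list_seconds = timeit.timeit(lambda: target in items_list, number=100)
set_seconds = timeit.timeit(lambda: target in items_set, number=100)

print(f"list lookup: {list_seconds:.6f} s")
print(f"set lookup:  {set_seconds:.6f} s")
```

The list has to be scanned element by element, while the set uses a hash lookup, so the gap typically widens as the collection grows.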
Quick Introduction to Python’s Set Datatype
Python sets provide several methods by which elements can be added, removed, or retrieved. The most straightforward way to add an element is the set.add() method. The official Python documentation describes sets in the following way:
A set is an unordered collection with no duplicate elements. Basic uses include membership testing and eliminating duplicate entries. Set objects also support mathematical operations like union, intersection, difference, and symmetric difference.
Some notable takeaways are no duplicate elements, membership testing, and fancy mathematical operations. Those are considered the defining characteristics of Python’s set
datatype.
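Those three characteristics can be seen in a short sketch. The variable names are illustrative only, and the results are sorted before printing because sets make no ordering guarantee.

```python
# Duplicates are dropped when a set is built from a list
readings = [3, 1, 3, 2, 1]
unique_readings = set(readings)
print(sorted(unique_readings))  # [1, 2, 3]

# Membership testing uses the `in` operator
print(2 in unique_readings)  # True
print(9 in unique_readings)  # False

# Mathematical set operations
evens = {2, 4, 6}
union = unique_readings | evens
intersection = unique_readings & evens
difference = unique_readings - evens
symmetric_difference = unique_readings ^ evens

print(sorted(union))                 # [1, 2, 3, 4, 6]
print(sorted(intersection))          # [2]
print(sorted(difference))            # [1, 3]
print(sorted(symmetric_difference))  # [1, 3, 4, 6]
```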
Python’s set.update Method
To get a clear idea of Python’s set.update() method, it helps to compare it with the set.add() method. The comparison clarifies the use cases and underlying behavior for which set.update() should be chosen. So, the focus is on set.update(), but I’ll include a few comparative examples along the way.
Adding Single Elements
Below is a quick example of creating a set and adding elements via the set.add
method. Note the second example’s error indicating that only single items can be added this way. That’s the hallmark indication that one should consider the update method. Otherwise, one must iterate over all elements intended to be added in a loop (last example). See below:
# Create a set literal
sample_set = {1, 2, 3}
print(sample_set)  # {1, 2, 3}

# Add a single element
sample_set.add(4)
print(sample_set)  # {1, 2, 3, 4}

# Add multiple elements (fails: add() accepts exactly one argument)
try:
    sample_set.add(5, 6, 7)
except TypeError as error:
    print(error)  # e.g. add() takes exactly one argument (3 given)

# A possible workaround: add the items one at a time
for item in [5, 6, 7]:
    sample_set.add(item)
print(sample_set)  # {1, 2, 3, 4, 5, 6, 7}
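Note: the example above creates its set with literal syntax, such as {1, 2, 3}, rather than the set() notation. The literal syntax cannot be used to create an empty set, as {} creates an empty dictionary object.
Adding Multiple Elements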
The addition of multiple elements to a set
is where the set.update()
method comes in. Its use is straightforward, but there’s some notable context in which one must be careful.
# Create a set literal
sample_set = {1, 2, 3}
print(sample_set)  # {1, 2, 3}

# Passing bare integers fails: every argument must be an iterable
try:
    sample_set.update(4, 5, 6)
except TypeError as error:
    print(error)  # 'int' object is not iterable

# Add multiple elements from an iterable
sample_set.update([4, 5, 6])
print(sample_set)  # {1, 2, 3, 4, 5, 6}
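Although update() rejects non-iterable arguments such as bare integers, it does accept more than one iterable in a single call, and the |= operator performs the same in-place union when the right-hand side is itself a set. A brief sketch, with illustrative values:

```python
sample_set = {1, 2, 3}

# update() accepts any number of iterables in one call
sample_set.update([4, 5], (6, 7), range(8, 10))
print(sorted(sample_set))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]

# |= is the in-place union; its right-hand side must be a set or frozenset
sample_set |= {10, 11}
print(sorted(sample_set))  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

# update() modifies the set in place and returns None
result = sample_set.update([12])
print(result)  # None
```

Because update() returns None, writing sample_set = sample_set.update(...) would discard the set; the method should be called for its side effect only.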
Dictionary Objects
Note here that the set.update() method requires each of its arguments to be an iterable. Qualifying structures include lists, tuples, and even dictionary objects. Also, note that the set.update() method will use only the keys from a dictionary object. For example:
# Create a new set
set_one = {1, 2, 3}

# Update with dictionary literal
set_one.update({4: "a", 5: "b", 6: "c"})

# Note only dictionary keys being added
print(set_one)  # {1, 2, 3, 4, 5, 6}
Python will also allow a dictionary literal to repeat a key, though only the last value given for that key is kept. This may cause some confusion and is worth keeping in mind, particularly when moving data between dictionary objects and set objects. For example:
# Create a new dictionary object with a repeated key
dict_obj = {4: "a", 5: "b", 6: "c", 6: "d"}

# The earlier value for the duplicate key is discarded
print(dict_obj)  # {4: 'a', 5: 'b', 6: 'd'}
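Returning to update() with dictionaries: if the values or the key/value pairs are wanted rather than the keys, the dictionary's values() or items() view can be passed explicitly. A small sketch, using a made-up dictionary named codes:

```python
codes = {4: "a", 5: "b", 6: "c"}  # illustrative data

# Iterating a dictionary yields its keys, so only keys are added
keys_only = {1, 2, 3}
keys_only.update(codes)
print(sorted(keys_only))  # [1, 2, 3, 4, 5, 6]

# Pass the values view to add the values instead
values_only = set()
values_only.update(codes.values())
print(sorted(values_only))  # ['a', 'b', 'c']

# Pass the items view to add (key, value) tuples
pairs = set()
pairs.update(codes.items())
print(sorted(pairs))  # [(4, 'a'), (5, 'b'), (6, 'c')]
```

The values and pairs must be hashable for this to work; a dictionary whose values are lists, for instance, would raise a TypeError.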
Tuple Objects
Tuples are another case where one needs to be careful. The add() method adds a tuple in its entirety as a single member, whereas the update() method adds the elements of the tuple individually. If the tuple itself should be a member of the set, the add() method is more appropriate. For example:
# Create a new set object
set_one = {1, 2, 3}
print(set_one)  # {1, 2, 3}

# Update with tuple value
set_one.update((4, 5, 6, 6))
print(set_one)  # {1, 2, 3, 4, 5, 6}

# Update with iterable containing tuple
set_one.update(((7, 8, 9, 9),))
print(set_one)  # {1, 2, 3, 4, 5, 6, (7, 8, 9, 9)}
In the last example, note that a tuple added as a member of an iterable is added in its entirety. This illustrates that the set.update()
method is not recursive.
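When the inner elements are what should end up in the set, the nesting has to be flattened by the caller. One possible approach, sketched here with the standard-library itertools.chain.from_iterable and illustrative data, flattens exactly one level:

```python
from itertools import chain

set_one = {1, 2, 3}

# add() stores the tuple itself as a single member
set_one.add((4, 5))

# update() unpacks one level only, so the inner tuple stays a tuple
set_one.update([(6, 7)])

# Flatten one level first to add the inner elements individually
nested = [(8, 9), (9, 10)]
set_one.update(chain.from_iterable(nested))

# Equivalent alternative: unpack the outer list into separate arguments
set_one.update(*nested)

# Members: 1, 2, 3, (4, 5), (6, 7), 8, 9, 10 (display order may vary)
print(set_one)
```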
String Objects
Adding strings to sets comes with a few edge cases as well, most notably the interpretation of a string as an iterable of characters. Remember that the update() method takes an iterable as its argument, while the add() method takes a single item. When a multi-character string is passed to the update() method, it is treated as an iterable and each character is added separately. Consider the following example:
# Create a new set object
set_one = {1, 2, 3}

# Using the add method, add a string
set_one.add("tabs")
print(set_one)  # {'tabs', 1, 2, 3}

# Using the update method, add a string
set_one.update("spaces")
print(set_one)  # {1, 2, 3, 'a', 'tabs', 'c', 'e', 's', 'p'}

# Using the update method and extra iterable
set_one.update(["EOL"])
print(set_one)  # {1, 2, 3, 's', 'p', 'c', 'e', 'tabs', 'EOL', 'a'}

# Display order may vary between runs
These examples illustrate two defining characteristics of the set.update() method in my opinion:
- The method takes an iterable as an argument;
- The method does not recurse into arguments.
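One way to guard against the string edge case is a small wrapper that treats a lone string as a single item. The add_items function below is a hypothetical helper written for illustration, not part of Python's set API:

```python
def add_items(target, items):
    """Add items to target, treating a lone str or bytes value as one item.

    Hypothetical helper for illustration; not part of Python's set API.
    """
    if isinstance(items, (str, bytes)):
        # A bare string would otherwise be split into characters
        target.add(items)
    else:
        # Any other iterable is unpacked one level, as update() does
        target.update(items)
    return target


tags = {"tabs"}
add_items(tags, "spaces")        # added whole, not as 's', 'p', 'a', ...
add_items(tags, ["EOL", "BOM"])  # each list element added separately
print(sorted(tags))  # ['BOM', 'EOL', 'spaces', 'tabs']
```

Whether a string should count as one item or as a sequence of characters depends on the data, so a check like this is a design choice rather than a rule.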
Final Thoughts
It’s clear that Python’s set.update() method has some preferred use cases. Generally, it’s the better choice when one needs to add multiple items to a set object. Sets provide much faster membership testing than lists, especially for large datasets, though iteration may be slightly slower.
Keeping in mind the use cases and edge cases listed here can help one judge when the performance gains of sets outweigh the losses compared to other collection data types such as lists. Knowing how the set.update() method treats dictionaries, tuples, and strings will also prove beneficial when wrangling data in any project.