Using Python's set.update() Method for Fun & Profit

Python’s set is a powerful collection data type that offers performance benefits over list objects in certain cases, most notably membership tests. The set.update() method provides a convenient way to add multiple items to a set in a single call, whereas the set.add() method accepts only one item at a time.

Python is one of the world’s most popular programming languages for good reason: it provides powerful built-in data types to handle common problems. The set data type suits collections where uniqueness is required. Because sets are hash-based, a membership lookup takes roughly constant time on average, whereas a list must be scanned element by element. The trade-off is that iterating over a set is slightly less performant than iterating over a list.
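
The lookup claim is easy to check informally. The sketch below is only a rough illustration: the collection size, the target value, and the variable names are my own assumptions, and the timings will vary by machine and Python version.

```python
import timeit

# Hypothetical data: the same 100,000 integers stored as a list and as a set.
items_list = list(range(100_000))
items_set = set(items_list)

# Worst case for the list: the target is its last element.
target = 99_999

# Time 100 membership tests against each collection.
list_time = timeit.timeit(lambda: target in items_list, number=100)
set_time = timeit.timeit(lambda: target in items_set, number=100)

print(f"list lookup: {list_time:.6f}s")
print(f"set lookup:  {set_time:.6f}s")
```

On most machines the set lookup should come out far faster, though the exact ratio depends on the data and the environment.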

Quick Introduction to Python’s Set Datatype

Python sets provide several methods by which elements can be added, removed, or retrieved. The most straightforward (ish) method is the set.add() method. The official Python documentation describes sets in the following way:

A set is an unordered collection with no duplicate elements. Basic uses include membership testing and eliminating duplicate entries. Set objects also support mathematical operations like union, intersection, difference, and symmetric difference.

Some notable takeaways are no duplicate elements, membership testing, and fancy mathematical operations. Those are considered the defining characteristics of Python’s set datatype.
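
Those three characteristics can be seen in a few lines. This is a minimal sketch; the variable names and values are made up for illustration.

```python
# Duplicates are discarded when the set is built.
readings = {3, 1, 2, 3, 1}
unique_count = len(readings)  # 3

# Membership testing uses the "in" operator.
has_two = 2 in readings    # True
has_nine = 9 in readings   # False

# Mathematical operations via operators.
evens = {2, 4, 6}
union = readings | evens                 # {1, 2, 3, 4, 6}
intersection = readings & evens          # {2}
difference = readings - evens            # {1, 3}
symmetric_difference = readings ^ evens  # {1, 3, 4, 6}
```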

Python’s set.update Method

To really get an idea of Python’s set.update method, one needs to compare it with the set.add method. This comparison clarifies the use cases in which set.update should be chosen and how the two methods differ in what they accept. So, we’re focused on set.update, but I’ll be including a few comparative examples along the way.

Adding Single Elements

Below is a quick example of creating a set and adding elements via the set.add method. Note the error in the third example, which indicates that only a single item can be added this way. That’s the hallmark indication that one should consider the update method. Otherwise, one must loop over every element to be added and call add() once per element (last example). See below:

# Create a set literal
sample_set = {1, 2, 3}
>>> {1, 2, 3}

# Add a single element
sample_set.add(4)
>>> {1, 2, 3, 4}

# Add multiple elements
sample_set.add(5, 6, 7)
>>> TypeError: add() takes exactly one argument (3 given)

# A possible work-around
for item in [5, 6, 7]:
    sample_set.add(item)
>>> {1, 2, 3, 4, 5, 6, 7}

Note: I’ve created a set using literal notation rather than the set() constructor. Literal notation cannot be used to create an empty set, because empty braces, {}, create an empty dictionary object. An empty set must be created with set().
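
A quick way to confirm the empty-literal behavior is shown in this small sketch; the variable names are illustrative only.

```python
# Empty braces produce a dictionary, not a set.
empty_braces = {}
print(type(empty_braces).__name__)  # dict

# The set() constructor is the way to get an empty set.
empty_set = set()
print(type(empty_set).__name__)  # set

# The empty set can then be filled as usual.
empty_set.update([1, 2])
```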

Adding Multiple Elements

The addition of multiple elements to a set is where the set.update() method comes in. Its use is straightforward, but there are a few notable cases in which one must be careful. The first is that every argument must itself be an iterable.

# Create a set literal
sample_set = {1, 2, 3}
>>> {1, 2, 3}

# Add multiple elements
sample_set.update(4, 5, 6)
>>> TypeError: 'int' object is not iterable

# Add multiple elements, from an iterable
sample_set.update([4, 5, 6])
>>> {1, 2, 3, 4, 5, 6}
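
A few related behaviors are worth a sketch: update() accepts several iterables at once, it returns None, and the |= operator is a stricter alternative. The variable names below are my own.

```python
sample_set = {1, 2, 3}

# update() accepts any number of iterables in one call.
sample_set.update([4, 5], (6, 7), range(8, 10))
print(sample_set == {1, 2, 3, 4, 5, 6, 7, 8, 9})  # True

# update() mutates the set in place and returns None,
# so its result should not be assigned back to the variable.
return_value = sample_set.update([10])
print(return_value)  # None

# The |= operator performs the same in-place union,
# but only when the right-hand side is a set (or frozenset).
other_set = {1, 2, 3}
other_set |= {4, 5}

try:
    other_set |= [6]  # a list is rejected by the operator
except TypeError as error:
    operator_error = error
else:
    operator_error = None
```

The method form is the more forgiving choice when the incoming data might be a list, tuple, or generator rather than a set.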

Dictionary Objects

Note here that the set.update method requires its arguments to be iterables. Qualifying structures include lists, tuples, and even dictionary objects. Also, note that the set.update() method will use only the keys from a dictionary object. For example:

# Create a new set
set_one = {1, 2, 3}

# Update with dictionary literal
set_one.update({4: "a", 5: "b", 6: "c"})

# Note only dictionary keys being added
>>> {1, 2, 3, 4, 5, 6}
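
If the values or the key-value pairs are what is needed, the dictionary’s own view methods can be passed instead. A minimal sketch, with a made-up dictionary named inventory:

```python
# Hypothetical dictionary used only for illustration.
inventory = {4: "a", 5: "b", 6: "c"}

# Passing the dictionary itself adds its keys.
keys_set = {1, 2, 3}
keys_set.update(inventory)

# Passing .values() adds the values instead.
values_set = set()
values_set.update(inventory.values())

# Passing .items() adds each (key, value) pair as a tuple.
pairs_set = set()
pairs_set.update(inventory.items())
```

Note that the values and pairs must be hashable for this to work, which holds for the strings and integers used here.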

Python will also allow a dictionary literal to repeat the same key, though only the last value assigned to that key is kept. This may cause some confusion and is worth keeping in mind, particularly when working with dictionary objects and set objects together. For example:

# Create a new dictionary object
dict_obj = {4: "a", 5: "b", 6: "c", 6: "d"}

# Data discarded for duplicate keys
>>> {4: 'a', 5: 'b', 6: 'd'}

Tuple Objects

Tuples are another case where one needs to be careful. Whereas the add() method adds a tuple in its entirety, the update() method adds the tuple’s individual elements. If the tuple itself should be a member of the set, the add() method is more appropriate. For example:

# Create a new set object
set_one = {1, 2, 3}

>>> {1, 2, 3}

# Update with tuple value
set_one.update((4, 5, 6, 6))

>>> {1, 2, 3, 4, 5, 6}

# Update with iterable containing tuple
set_one.update(((7, 8, 9, 9), ))

>>> {1, 2, 3, 4, 5, 6, (7, 8, 9, 9)}

In the last example, note that a tuple added as a member of an iterable is added in its entirety. This illustrates that the set.update() method is not recursive.
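
The sketch below puts add() and update() side by side for the same tuple, and shows one possible way to flatten a single level of nesting when the individual elements are wanted. The use of itertools.chain.from_iterable is my own suggestion, and the variable names are illustrative.

```python
from itertools import chain

# add() stores the tuple itself as one member.
with_add = {1, 2, 3}
with_add.add((4, 5))

# update() unpacks the tuple and stores each element.
with_update = {1, 2, 3}
with_update.update((4, 5))

# update() does not recurse, so nested tuples stay intact
# unless they are flattened first (one level only here).
nested = [(7, 8), (9, 9)]
flattened = {1, 2, 3}
flattened.update(chain.from_iterable(nested))
```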

String Objects

Adding strings to sets comes with a few edge cases to be aware of as well, most notably the interpretation of strings as iterables of characters. Remember that the update() method takes an iterable as its argument, whereas the add() method takes a single item. When a string of multiple characters is passed to the update() method, it is treated as an iterable and each character is added separately. Consider the following example:

# Create a new set object
set_one = {1, 2, 3}

# Using the add method, add a string
set_one.add("tabs")

>>> {'tabs', 1, 2, 3}

# Using the update method, add a string
set_one.update("spaces")

>>> {1, 2, 3, 'a', 'tabs', 'c', 'e', 's', 'p'}

# Using the update method and extra iterable
set_one.update(["EOL"])

>>> {1, 2, 3, 's', 'p', 'c', 'e', 'tabs', 'EOL', 'a'}

These examples illustrate two defining characteristics of the set.update() method in my opinion:

The method takes an iterable as an argument;
The method does not recurse into arguments.
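
One way to guard against the string surprise is a small wrapper that treats a lone string as a single item. The add_items function below is a hypothetical helper of my own, not part of Python’s set API, and is only a sketch of the idea.

```python
def add_items(target, items):
    """Hypothetical helper: add items to target, treating a lone
    string (or bytes object) as one item rather than as characters."""
    if isinstance(items, (str, bytes)):
        target.add(items)
    else:
        target.update(items)
    return target


words = set()
add_items(words, "spaces")         # added whole, not as characters
add_items(words, ["tabs", "EOL"])  # each list element added
```

Whether this kind of wrapper is worthwhile depends on how often strings and lists of strings arrive through the same code path.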

Final Thoughts

It’s clear that Python’s set.update() method has some preferred use cases. Generally, it’s preferred when one needs to add multiple items to a set object. Sets provide much more performant membership testing than lists, especially for large datasets, though iterating over them is slightly less performant.

Keeping the use cases and edge cases listed here in mind can help one better assess when the performance gains of sets outweigh their losses compared to other collection data types such as lists. Knowing how set.update() treats dictionaries, tuples, and strings will almost certainly prove beneficial when wrangling data in any project.