A set is an unordered, mutable, iterable collection data type that can store heterogeneous elements. A set does not allow duplicate elements. Sets in Python support mathematical operations such as union, intersection, difference, and symmetric difference, as well as subset and superset checks. In this article, we will learn about the Union operation on Sets in Python.
Union of Sets
In set theory and in programming, the Union of Sets is an operation that combines all the elements from two or more sets into a single set. The resulting set contains every element that appears in any of the original sets. Since a set does not allow duplicate elements, the Union operation also removes duplicates.
In other words, Union of Sets merges the elements of two or more sets into a single set (the result set), ensuring that each element appears only once in the final result.
Key points about Union of Sets
- The Union operation includes every element from each set but avoids duplication of elements.
- Sets are unordered collections, so the order of elements in the result set is not defined.
- Sets do not allow duplicate elements, so an element that appears in more than one set appears only once in the result set.
Union of Sets – Mathematical Definition
For two sets A and B, the union of A and B, denoted A ∪ B, is defined as:
A ∪ B = { x ∣ x ∈ A or x ∈ B }
[Figure: Venn diagrams of Set A, Set B, and A ∪ B]
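The definition can be mirrored almost literally in Python with a set comprehension. This is only an illustrative sketch: the names set_a, set_b, candidates, and union_by_definition are made up for this example, and in practice the built-in union() method or the | operator would be used instead.

```python
# Illustrative sets (hypothetical values, not taken from the article's programs)
set_a = {1, 2, 3}
set_b = {3, 4, 5}

# Every candidate element, duplicates included
candidates = list(set_a) + list(set_b)

# A ∪ B = { x | x ∈ A or x ∈ B }, written as a set comprehension
union_by_definition = {x for x in candidates if x in set_a or x in set_b}

# The built-in union() gives the same result
print(union_by_definition == set_a.union(set_b))  # True
```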
In Python, there are multiple ways to perform the Union operation on Sets.
- Using update() method – The update() method is called on a Set and other Sets are passed as arguments. It does not return any value; it updates the original Set in place with the union elements.
- Using union() method – The union() method returns a new result set containing the union of the Set elements.
- Using | (or) operator – The | operator returns a new result set containing the union of the Set elements.
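The difference between the three approaches can be seen side by side in a minimal sketch. The set values and variable names below are hypothetical and chosen only for illustration.

```python
# Hypothetical example sets
set_1 = {'a', 'b'}
set_2 = {'b', 'c'}

# union() returns a new set; set_1 is left unchanged
via_method = set_1.union(set_2)

# | also returns a new set; set_1 is left unchanged
via_operator = set_1 | set_2

# update() returns None and modifies the set it is called on,
# so it is applied to a copy here to keep set_1 intact
via_update = set_1.copy()
return_value = via_update.update(set_2)

print(via_method == via_operator == via_update)  # True
print(return_value)                              # None
print(set_1 == {'a', 'b'})                       # True
```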
Union of Sets – using union() method
The union() method returns a result set with all the unique elements that are present in any of the sets.
- Syntax – set_1.union(set_2)
# join of sets - using union() method - remove duplicates
print("join of sets - using union() method - remove duplicates")
course_set_1 = {'Python','AWS','Java','Azure','ML'}
course_set_2 = {'Python', 'NumPy', 'Pandas', 'Matplotlib'}
courses = course_set_1.union(course_set_2) # union operation
print(course_set_1)
print(course_set_2)
print("course_set_1 U course_set_2 - ", courses)
In this program, we have defined two sets referenced by the variables ‘course_set_1’ and ‘course_set_2’. We call the built-in union() method on the first set and pass the second set as an argument. The union() method returns the union set, which we store in the variable ‘courses’. At the end, we print the elements of all three sets.
Program Output
join of sets - using union() method - remove duplicates
{'Azure', 'Java', 'Python', 'AWS', 'ML'} # output course_set_1
{'NumPy', 'Python', 'Pandas', 'Matplotlib'} # output course_set_2
course_set_1 U course_set_2 - {'NumPy', 'Java', 'Pandas', 'Matplotlib', 'AWS', 'ML', 'Azure', 'Python'} # union_set
The program output shows that the elements of ‘course_set_1’ and ‘course_set_2’ are unchanged. The Union operation produced the result set ‘courses’, which contains the elements from both sets without duplicates. Because sets are unordered, the order in which the elements are printed may differ from run to run.
union() method with 3 sets
The union() method can take one or more Sets as arguments. There are two ways to perform a Union of more than 2 Sets.
- set_1.union(set_2).union(set_3) – In this case we perform the union operation twice: first on set_1 and set_2, and then on their result set and set_3.
- set_1.union(set_2, set_3) – In this case we pass multiple sets to the union() method, so a single union call handles all the sets.
# using union() method with three sets
print("join of sets - using union() method with three sets")
number_set_1 = {10, 20, 30, 40}
number_set_2 = {30, 40, 50, 60}
number_set_3 = {60, 70, 80, 90}
number_union_1 = number_set_1.union(number_set_2).union(number_set_3) # union operation
print("number_set_1 U number_set_2 U number_set_3 - ", number_union_1)
number_union_2 = number_set_1.union(number_set_2, number_set_3) # union operation
print("number_set_1 U number_set_2 U number_set_3 - ", number_union_2)
In this program, we have defined three sets referenced by the variables ‘number_set_1’, ‘number_set_2’ and ‘number_set_3’.
In the first case, we call the built-in union() method on ‘number_set_1’ and pass ‘number_set_2’ as an argument. We then call union() on that result set and pass the third set, ‘number_set_3’. The final result set is referenced by the variable ‘number_union_1’.
In the second case, we call the union() method on ‘number_set_1’ and pass both of the other sets as arguments. The final result set is referenced by the variable ‘number_union_2’.
Program Output
join of sets - using union() method with three sets
number_set_1 U number_set_2 U number_set_3 - {70, 40, 10, 80, 50, 20, 90, 60, 30} # output result-set 1
number_set_1 U number_set_2 U number_set_3 - {70, 40, 10, 80, 50, 20, 90, 60, 30} # output result-set 2
The program output shows that both cases produce a result set containing the elements from all three sets without duplicates. In both cases, the result set has the same elements.
Union of Sets – using | (or) operator
- Syntax – set_1 | set_2
# union using | operator
print("join of sets - union using | operator")
number_set_1 = {10, 20, 30, 40}
number_set_2 = {30, 40, 50, 60}
number_set_3 = {60, 70, 80, 90}
number_union_1 = number_set_1 | number_set_2
print("number_set_1 U number_set_2 - ", number_union_1)
number_union_2 = number_set_1 | number_set_2 | number_set_3
print("number_set_1 U number_set_2 U number_set_3 - ", number_union_2)
We have defined three sets referenced by the variables ‘number_set_1’, ‘number_set_2’ and ‘number_set_3’.
In the first case, we use the | (or) operator with ‘number_set_1’ and ‘number_set_2’, so it is a union of two sets.
In the second case, we use the | (or) operator with ‘number_set_1’, ‘number_set_2’ and ‘number_set_3’, so it is a union of three sets.
Program Output
join of sets - union using | operator
number_set_1 U number_set_2 - {40, 10, 50, 20, 60, 30} # union of 2 sets
number_set_1 U number_set_2 U number_set_3 - {70, 40, 10, 80, 50, 20, 90, 60, 30} # union of 3 sets
After the Union operation, we got a result set in both cases. Each result set contains the elements of the sets being combined and avoids duplicates.
Union of Sets – using update() method
We learned about the update() method on Sets in a previous article. Please read the article – Add elements in Set – Python
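As a brief reminder of how update() behaves as an in-place union, here is a minimal sketch. The set names and values are hypothetical, and the |= operator is shown as the operator form of the same in-place operation.

```python
# Hypothetical example sets
skills = {'Python', 'SQL'}
more_skills = {'SQL', 'Git'}
extra_skills = {'Docker'}

# update() accepts one or more sets and modifies 'skills' in place
skills.update(more_skills, extra_skills)
print(skills == {'Python', 'SQL', 'Git', 'Docker'})  # True

# |= is the in-place operator form of the same operation
tools = {'pip'}
tools |= {'pip', 'venv'}
print(tools == {'pip', 'venv'})  # True
```

Unlike union() and |, both forms change the set on the left-hand side instead of creating a new one.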
Summary
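In this article we learned about the Union operation on Sets. The following scenarios were explored:
- Union of Sets using the union() method
- union() method with 3 sets
- Union of Sets using the | (or) operator
- Union of Sets using the update() method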
Code – GitHub Repository
All code snippets and programs for this article and for the Python tutorial can be accessed from the GitHub repository – Comments and Docstring in Python.
Interview Questions & Answers
Q: What happens if one or both sets are empty when performing a union operation?
When performing a union operation where one or both sets are empty, the result will be a set that contains all the elements of the non-empty set(s). If both sets are empty, the result will be an empty set.
- Example – One empty set
number_set = {10, 20, 30, 40}
empty_set = set()
union_set = number_set.union(empty_set)
print(union_set) # Elements in union_set {40, 10, 20, 30}
- Example – Both sets empty
empty_set_1 = set()
empty_set_2 = set()
union_set = empty_set_1.union(empty_set_2)
print(union_set) # union_set will be empty
Q: How does the union operation handle duplicate elements between sets?
In a union operation, duplicate elements between sets are automatically removed. Since sets only allow unique elements, the union will contain each element only once, even if it appears in both sets.
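A small sketch makes the effect visible by comparing the combined sizes of the inputs with the size of the result. The values are hypothetical.

```python
# Two sets that share the elements 2 and 3
set_1 = {1, 2, 3}
set_2 = {2, 3, 4}

union_set = set_1 | set_2

print(len(set_1) + len(set_2))  # 6 elements in total across both inputs
print(len(union_set))           # 4 elements after the shared ones are merged
print(sorted(union_set))        # [1, 2, 3, 4]
```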
Q: How can the union of sets be used in real-world applications?
The union of sets is useful whenever distinct groups of data need to be combined into one collection of unique elements. Some applications where the union of sets can be used:
- Database Query Results – Combining results from multiple queries where each query returns a unique set of results. The union operation ensures that the combined results do not have duplicates.
- Survey Analysis – In survey analysis, different groups may respond to different questions. The union operation can be used to find all unique respondents across multiple questions.
- Tagging and Categorization – When categorizing items with multiple tags, a union operation can combine tags from various categories, ensuring that there are no duplicate tags.
- Social Networks – In social networks, the union operation can be used to find all unique friends, followers, or connections between different users or groups.
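As one illustration, the survey analysis case could be sketched as follows. The respondent IDs and variable names are invented for this example and do not come from any real dataset.

```python
# Hypothetical respondent IDs for three survey questions
question_1_respondents = {101, 102, 103}
question_2_respondents = {102, 104}
question_3_respondents = {103, 105}

# All unique respondents across the three questions
all_respondents = question_1_respondents.union(
    question_2_respondents, question_3_respondents
)

print(sorted(all_respondents))  # [101, 102, 103, 104, 105]
print(len(all_respondents))     # 5
```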
Q: How can we perform the union operation on non-set iterables like lists or tuples?
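Before performing a union operation with non-set iterables like lists and tuples, we can convert them into sets using the set() constructor, e.g. set(list) and set(tuple). Then we can perform the union operation on these sets. Alternatively, the union() method accepts any iterable as an argument, so a list or tuple can be passed to it directly; the | operator, however, requires both operands to be sets.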
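A minimal sketch of both options, using hypothetical data. It also shows that the | operator raises a TypeError when one operand is not a set, whereas the union() method accepts any iterable.

```python
# Hypothetical non-set iterables containing duplicates
number_list = [1, 2, 2, 3]
number_tuple = (3, 4, 4, 5)

# Option 1: convert both to sets first, then use |
union_1 = set(number_list) | set(number_tuple)

# Option 2: union() accepts any iterable as an argument
union_2 = set(number_list).union(number_tuple)

print(union_1 == union_2 == {1, 2, 3, 4, 5})  # True

# The | operator requires sets on both sides
try:
    set(number_list) | number_tuple
except TypeError:
    print("| does not accept a tuple operand")
```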
Q: Explain the time complexity of the union operation in Python.
The time complexity of the union operation in Python is proportional to the total number of elements in all the sets, i.e. O(len(set1) + len(set2) + … + len(setN)) for N sets, where len(set1), len(set2) and len(setN) are the sizes of the individual sets. This is because the union operation iterates over each element of the sets to ensure that only unique elements are added to the result.
Since sets use hash tables as the underlying data structure, each insertion takes constant time on average, so union operations are quite efficient even for large datasets.
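The sketch below unions N sets in a single call and compares the total number of elements visited with the size of the result. The list of sets is hypothetical, and the code only illustrates the quantities in the bound; it is not a benchmark.

```python
# Hypothetical list of N = 4 sets with overlapping elements
sets = [{1, 2}, {2, 3}, {3, 4}, {4, 5}]

# One call that unions all N sets, starting from an empty set
combined = set().union(*sets)

# Total number of elements the union has to look at
total_elements = sum(len(s) for s in sets)

print(total_elements)  # 8
print(len(combined))   # 5
```

The work done grows with total_elements, while the result can be smaller because shared elements are stored only once.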
Q: Write a test case to verify that a union operation has been performed correctly.
We need to verify two things to ensure that a union operation has been performed correctly:
- The resulting set should contain every unique element from the sets involved in the union.
- There should be no duplicate elements in the resulting set.
set_1 = {12, 22, 32}
set_2 = {32, 44, 55}
union_set = set_1.union(set_2)
# Verification
assert union_set == {12, 22, 32, 44, 55}, "Union operation failed"
print("Union operation successful!")
If the elements in union_set do not match {12, 22, 32, 44, 55}, the assertion fails with the message "Union operation failed"; if they match, the assertion passes and the success message is printed.
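The same checks can also be written as general properties rather than against a hard-coded expected set. The following is a sketch under that assumption; check_union is a hypothetical helper name introduced only for this example.

```python
def check_union(set_a, set_b):
    """Return set_a ∪ set_b after asserting basic union properties."""
    result = set_a.union(set_b)

    # Every element of both inputs must be present in the result
    assert set_a <= result, "elements of set_a are missing"
    assert set_b <= result, "elements of set_b are missing"

    # The result must not contain anything that is in neither input
    assert all(x in set_a or x in set_b for x in result), "unexpected element"

    # Size must equal the number of distinct elements across both inputs
    distinct = set(list(set_a) + list(set_b))
    assert len(result) == len(distinct), "unexpected result size"

    return result


union_set = check_union({12, 22, 32}, {32, 44, 55})
print(union_set == {12, 22, 32, 44, 55})  # True
```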