defaultdict – Collections in Python

In Python, defaultdict is a subclass of the built-in dict class and is part of the collections module. Calling defaultdict() returns a new dictionary-like object. When a missing key is looked up with square brackets, it supplies a default value automatically instead of raising a KeyError.

defaultdict

Syntax to create a defaultdict object:

defaultdict_object = defaultdict(default_factory=None, *args, **kwargs)

  • defaultdict() is the constructor used to create the object.
  • default_factory is a callable (a datatype such as int or list, a function, or a lambda function) that is called without arguments to produce the default value for a missing key. By default, default_factory is None. When it is None, no default values are assigned and accessing a missing key raises a KeyError, as with a regular dict.
  • *args and **kwargs are the remaining arguments, which are handled the same way as in the dict() constructor: an existing dictionary (or an iterable of key-value pairs) and/or keyword arguments used as the initial content. These arguments are optional. If none are given, the defaultdict object will be empty with no key-value pairs in it.
  • defaultdict_object is the variable referring to the newly created defaultdict.
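
As a quick illustration of the arguments described above, the following sketch creates a defaultdict in a few different ways. The variable names are made up for this example.

```python
from collections import defaultdict

# default_factory only: the object starts empty
empty_dd = defaultdict(int)
print(len(empty_dd))                # Output => 0

# default_factory plus an existing dictionary
from_mapping = defaultdict(int, {'a': 1, 'b': 2})
print(from_mapping['a'])            # Output => 1

# default_factory plus keyword arguments, as with dict()
from_kwargs = defaultdict(int, a=1, b=2)
print(from_kwargs == from_mapping)  # Output => True

# default_factory left as None: missing keys raise KeyError
no_factory = defaultdict()
try:
    no_factory['missing_key']
except KeyError as err:
    print("KeyError:", err)         # Output => KeyError: 'missing_key'
```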

defaultdict with int default_factory

from collections import defaultdict

# Scenario 1 - defaultdict with int as default_factory
int_defaultdict = defaultdict(int)     # defaultdict with int as default_factory
print(int_defaultdict['missing_key'])  # Output => 0
int_defaultdict['existing_key'] += 5   # operation on the dictionary element
print(int_defaultdict)  # Output => defaultdict(<class 'int'>, {'missing_key': 0, 'existing_key': 5})
  • In this scenario, we are using defaultdict(int) to create a defaultdict object referenced by the variable int_defaultdict. This sets the int datatype as default_factory, so missing keys get the default value 0 (the result of calling int()).
  • We are not providing any initial data, so an empty defaultdict object is created with no key-value pairs.
  • int_defaultdict['missing_key'] => missing_key is not present in the dictionary, so it is added with the default value 0.
  • int_defaultdict['existing_key'] += 5 => existing_key is also not present in the dictionary. It is first added with the default value 0, and then 5 is added to it.
  • Output => defaultdict(<class 'int'>, {'missing_key': 0, 'existing_key': 5}). This shows the default_factory (class int) and the key-value pairs added to the object.

defaultdict with list default_factory

from collections import defaultdict

# Scenario 2 - defaultdict with list as default_factory
list_defaultdict = defaultdict(list)   # defaultdict with list as default_factory
print(list_defaultdict['missing_key']) # Output => []
list_defaultdict['new_key'].append(12) # operation on the dictionary element
list_defaultdict['new_key'].append(22) # operation on the dictionary element
print(list_defaultdict)  # Output => defaultdict(<class 'list'>, {'missing_key': [], 'new_key': [12, 22]})
  • In this scenario, we are using defaultdict(list) to create a defaultdict object referenced by the variable list_defaultdict. This sets the list datatype as default_factory, so missing keys get a new empty list [] as the default value.
  • We are not providing any initial data, so an empty defaultdict object is created with no key-value pairs.
  • list_defaultdict['missing_key'] => missing_key is not present in the dictionary, so it is added with an empty list [] as its value.
  • list_defaultdict['new_key'].append(12) => new_key is also not present in the dictionary. It is first added with an empty list [], and then element 12 is appended to that list.
  • list_defaultdict['new_key'].append(22) => new_key was added in the previous step, so its existing list is returned and element 22 is appended to it.
  • Output => defaultdict(<class 'list'>, {'missing_key': [], 'new_key': [12, 22]}). This shows the default_factory (class list) and the key-value pairs added to the object.
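
For comparison, the sketch below performs the same appends on a regular dict using dict.setdefault(), next to the defaultdict version. The variable names plain and auto are only illustrative.

```python
from collections import defaultdict

# Regular dict: setdefault() supplies the empty list on first use
plain = {}
plain.setdefault('new_key', []).append(12)
plain.setdefault('new_key', []).append(22)

# defaultdict: the empty list is created automatically
auto = defaultdict(list)
auto['new_key'].append(12)
auto['new_key'].append(22)

print(plain)                # Output => {'new_key': [12, 22]}
print(dict(auto) == plain)  # Output => True
```

Both versions give the same result. The defaultdict version simply avoids repeating the default value at every call site.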

defaultdict with function definition

from collections import defaultdict

# create defaultdict with function definition
print("create defaultdict with function definition")
def def_course():
    return "no course"
      
course_default_dict = defaultdict(def_course, {'c1':'AWS','c2':'Python'})
print("initial default dict - ", course_default_dict)
print(course_default_dict["c1"])
print(course_default_dict["c2"])
print(course_default_dict["c3"])
print("default dict after setting - ", course_default_dict)
  • We have defined a function def_course(), which is used as default_factory to assign the default value "no course" to missing keys.
  • We are using defaultdict(def_course, {'c1':'AWS','c2':'Python'}) to create a defaultdict object referenced by the variable course_default_dict. The def_course function is used as default_factory to assign default values to missing keys.
  • We are also providing an initial dictionary => {'c1':'AWS','c2':'Python'}, so the object is created with these key-value pairs.
  • Using course_default_dict["c1"], we can access the value of a key. Keys c1 and c2 are already present in the dictionary. When we access key c3 (initially not present in the dictionary), the function def_course is invoked and its return value is assigned to c3.
  • Finally, the updated defaultdict object is printed.

Program Output

create defaultdict with function definition
# Two key-value pairs, provided as the initial dictionary
defaultdict(<function def_course at 0x0002A0>, {'c1': 'AWS', 'c2': 'Python'})
AWS
Python
no course  # default value
# Three key-value pairs after assigning default value to key c3
defaultdict(<function def_course at 0x0002A0>, {'c1': 'AWS', 'c2': 'Python', 'c3': 'no course'})

defaultdict with lambda

from collections import defaultdict

# create defaultdict with lambda
print("create defaultdict with lambda")
course_default_dict = defaultdict(lambda: "no course", {'c1':'AWS','c2':'Python'})
print(course_default_dict)       # initial defaultdict object
print(course_default_dict["c1"]) # access key c1 from dictionary
print(course_default_dict["c2"]) # access key c2 from dictionary
print(course_default_dict["c3"]) # access key c3 from dictionary
print(course_default_dict)       # defaultdict after setting default value
  • In this scenario, we are using a lambda, lambda: "no course", as default_factory instead of defining a separate function.
  • All other steps work the same way as in the function definition example.

Program Output

create defaultdict with lambda
# Two key-value pairs, provided as the initial dictionary
defaultdict(<function <lambda> at 0x000D60>, {'c1': 'AWS', 'c2': 'Python'})
AWS
Python
no course  # default value
# Three key-value pairs after assigning default value to key c3
defaultdict(<function <lambda> at 0x000D60>, {'c1': 'AWS', 'c2': 'Python', 'c3': 'no course'})
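
A lambda is also handy when the default is a constant that no built-in type produces, or when the default is itself a defaultdict. The following is a small sketch of both ideas. The names stock and scores are made up for illustration.

```python
from collections import defaultdict

# A lambda can return any constant, e.g. a non-zero starting value
stock = defaultdict(lambda: 10)
stock['pens'] -= 3
print(stock['pens'])              # Output => 7
print(stock['pencils'])           # Output => 10

# A lambda can also build a nested defaultdict
scores = defaultdict(lambda: defaultdict(int))
scores['alice']['python'] += 1
print(scores['alice']['python'])  # Output => 1
```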

defaultdict __missing__ method

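__missing__(key) is called by __getitem__() (that is, by defaultdict[key]) when the requested key is not found. If default_factory is None, this method raises a KeyError. Otherwise, it calls default_factory without arguments, stores the result under the key, and returns it => defaultdict[key] = value = self.default_factory(). Calling __missing__(key) directly skips the lookup, so it overwrites the value even when the key already exists.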

from collections import defaultdict

# defaultdict changed with the default value for the key
print("defaultdict changed with the default value for the key")
course_default_dict = defaultdict(lambda: "no course", {'c1':'AWS','c2':'Python'})
print(course_default_dict)                   # initial defaultdict object
print(course_default_dict.__missing__('c1')) # treats c1 as missing and overwrites its value with the default
print(course_default_dict.__missing__('c2')) # treats c2 as missing and overwrites its value with the default
print(course_default_dict.__missing__('c3')) # c3 is missing, so the default value is assigned
print(course_default_dict)                   # defaultdict after assigning default value
  • We have defined a defaultdict object using a lambda.
  • Initially, the dictionary has the key-value pairs {'c1':'AWS','c2':'Python'}.
  • When we call the __missing__(key) method directly, it treats the key as missing and assigns the default value to it, even if the key already has a value.
  • __missing__('c1') => the value of c1 is replaced with "no course". Similarly, the default value is assigned to c2 and c3.
  • In normal use, __missing__() is not called directly. It is triggered only by defaultdict[key] for a key that is not present.

Program Output

defaultdict changed with the default value for the key
# Two key-value pairs, provided as the initial dictionary
defaultdict(<function <lambda> at 0x000B60>, {'c1': 'AWS', 'c2': 'Python'})
no course
no course
no course
# Three key-value pairs: c1 and c2 were overwritten and c3 was added with the default value
defaultdict(<function <lambda> at 0x000B60>, {'c1': 'no course', 'c2': 'no course', 'c3': 'no course'})
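
To make the mechanism in this section more concrete, here is a minimal sketch of a plain dict subclass that imitates this behaviour. SimpleDefaultDict is a hypothetical, simplified class written for illustration. It is not how collections.defaultdict is actually implemented.

```python
class SimpleDefaultDict(dict):
    """Hypothetical, simplified stand-in for collections.defaultdict."""

    def __init__(self, default_factory=None, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.default_factory = default_factory

    def __missing__(self, key):
        # dict.__getitem__ calls this only when the key is absent
        if self.default_factory is None:
            raise KeyError(key)
        value = self.default_factory()
        self[key] = value  # store the default under the key
        return value


courses = SimpleDefaultDict(lambda: "no course", {'c1': 'AWS'})
print(courses['c1'])  # Output => AWS
print(courses['c3'])  # Output => no course
print(courses)        # Output => {'c1': 'AWS', 'c3': 'no course'}
```

Here c1 keeps its value because square-bracket access only reaches __missing__() for keys that are absent.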

defaultdict – Use Cases

  • Grouping items: defaultdict(list) is useful for grouping items under keys.
from collections import defaultdict

data = [('a', 1), ('b', 2), ('a', 3), ('b', 4), ('b', 5)]
d = defaultdict(list)
for k, v in data:
    d[k].append(v)
print(d)  # Output => defaultdict(<class 'list'>, {'a': [1, 3], 'b': [2, 4, 5]})

We iterate over a list of (key, value) tuples. With defaultdict(list), each value is appended to the list stored under its key, so values that share a key are grouped together. The final output is defaultdict(<class 'list'>, {'a': [1, 3], 'b': [2, 4, 5]}).

  • Counting occurrences: defaultdict(int) is used for counting the frequencies of elements. This can be used for counting common words, numbers or other elements.
course_list = ['Python', 'Power BI', 'Java', 'Python', 'Power BI', 'Power BI']
count_dict = defaultdict(int)
for course in course_list:
    count_dict[course] += 1
print(count_dict)  # Output => defaultdict(<class 'int'>, {'Python': 2, 'Power BI': 3, 'Java': 1})
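
The same counting pattern applies to words. The sketch below counts the words in a made-up sentence and then picks the most frequent one. The sample text and variable names are only for illustration.

```python
from collections import defaultdict

sentence = "the quick brown fox jumps over the lazy dog the end"
word_count = defaultdict(int)
for word in sentence.split():
    word_count[word] += 1  # missing words start from 0

print(word_count['the'])                    # Output => 3
print(max(word_count, key=word_count.get))  # Output => the
```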

Summary

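In this article, we learned about the defaultdict collection in Python. The following scenarios were discussed:

  • defaultdict with int default_factory
  • defaultdict with list default_factory
  • defaultdict with function definition
  • defaultdict with lambda
  • defaultdict __missing__ method
  • defaultdict use cases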

Code – GitHub Repository

All code snippets and programs for this article and for the Python tutorial can be accessed from the GitHub repository – Comments and Docstring in Python.

Interview Questions & Answers

Q: How does a defaultdict differ from a regular dict?

  • In a regular dictionary, accessing a non-existent key raises a KeyError.
  • In a defaultdict, accessing a non-existent key with square brackets automatically creates the key with a default value provided by the default_factory. The default_factory is a callable, like int, list, or a user-defined function. Other lookups, such as the get() method or the in operator, do not create the key.
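
A short sketch of this difference between square-bracket access and other lookups:

```python
from collections import defaultdict

d = defaultdict(int)
print('x' in d)    # Output => False
print(d.get('x'))  # Output => None
print(len(d))      # Output => 0  (no key was created so far)
print(d['x'])      # Output => 0  (square brackets create the key)
print(len(d))      # Output => 1
```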

Q: How to convert a defaultdict back to a regular dict?

We can easily convert a defaultdict back into a regular dict by passing it to dict().

from collections import defaultdict

d = defaultdict(int, {'a': 14, 'b': 42})
regular_dict = dict(d)
print(regular_dict)  # Output => {'a': 14, 'b': 42}

Q: Can we set a different default value for individual keys in a defaultdict?

No, we cannot set a different default value for individual keys directly with defaultdict. The default_factory function is called without arguments, so it does not know which key is missing and applies to all missing keys uniformly. However, once the key is created, we can modify the value as needed.
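
One possible workaround, which is not part of the built-in behaviour, is to subclass defaultdict and override __missing__() so that the factory receives the key. The class name KeyAwareDefaultDict below is hypothetical and the code is only a sketch.

```python
from collections import defaultdict


class KeyAwareDefaultDict(defaultdict):
    """Hypothetical subclass: default_factory receives the missing key."""

    def __missing__(self, key):
        if self.default_factory is None:
            raise KeyError(key)
        value = self.default_factory(key)  # pass the key to the factory
        self[key] = value
        return value


lengths = KeyAwareDefaultDict(len)
print(lengths['Python'])  # Output => 6
print(lengths['AWS'])     # Output => 3
print(dict(lengths))      # Output => {'Python': 6, 'AWS': 3}
```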

Q: How does the performance of defaultdict compare to a regular dictionary?

The performance of defaultdict is similar to a regular dict. Lookups of existing keys work the same way as in a regular dict. The only extra work is calling default_factory when a missing key is accessed, and in most cases this overhead is negligible. The benefit of cleaner and more maintainable code often outweighs the minor performance cost.
