


Detailed explanation of defaultdict in Python (code example)
This article explains the defaultdict class in Python in detail, with code examples.
Default values can be very convenient
As we all know, accessing a key that does not exist in a Python dictionary raises a KeyError exception (in JavaScript, by contrast, accessing a property that does not exist on an object returns undefined). Sometimes, though, it is very convenient for every key in a dictionary to have a default value. Consider the following example:

    strings = ('puppy', 'kitten', 'puppy', 'puppy',
               'weasel', 'puppy', 'kitten', 'puppy')
    counts = {}

    for kw in strings:
        counts[kw] += 1
This example counts how many times each word appears in strings and records the result in the counts dictionary: each time a word appears, the value stored under its key in counts is incremented by 1. In practice, however, running this code raises a KeyError the first time each word is counted, because Python's dict has no default values. This can be verified in the Python interpreter:

    >>> counts = dict()
    >>> counts
    {}
    >>> counts['puppy'] += 1
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    KeyError: 'puppy'
Use a conditional statement to check
The first approach that comes to mind is to store a value of 1 under the key in counts when a word is counted for the first time. This requires a conditional check inside the loop:

    strings = ('puppy', 'kitten', 'puppy', 'puppy',
               'weasel', 'puppy', 'kitten', 'puppy')
    counts = {}

    for kw in strings:
        if kw not in counts:
            counts[kw] = 1
        else:
            counts[kw] += 1

    # counts:
    # {'puppy': 5, 'kitten': 2, 'weasel': 1}
Use the dict.setdefault() method
You can also set the default value through the dict.setdefault() method:
    strings = ('puppy', 'kitten', 'puppy', 'puppy',
               'weasel', 'puppy', 'kitten', 'puppy')
    counts = {}

    for kw in strings:
        counts.setdefault(kw, 0)
        counts[kw] += 1
The dict.setdefault() method takes two parameters: the key and a default value. If the key does not exist in the dictionary, the default value is stored under that key and returned; otherwise the value already in the dictionary is returned and nothing is changed. The body of the for loop can therefore be made more concise by using the return value of dict.setdefault():

    strings = ('puppy', 'kitten', 'puppy', 'puppy',
               'weasel', 'puppy', 'kitten', 'puppy')
    counts = {}

    for kw in strings:
        counts[kw] = counts.setdefault(kw, 0) + 1
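To make the return value of dict.setdefault() concrete, here is a minimal sketch. The names d, first, second and groups are illustrative and do not come from the original article.

```python
d = {}

# Key is missing: setdefault() stores the default and returns it.
first = d.setdefault('puppy', 0)
print(first, d)    # 0 {'puppy': 0}

# Key already exists: the stored value is returned, the default is ignored.
d['puppy'] = 5
second = d.setdefault('puppy', 0)
print(second, d)   # 5 {'puppy': 5}

# The returned object is the one stored in the dict, so a mutable
# default such as a list can be updated in place.
groups = {}
for word in ('puppy', 'kitten', 'pony'):
    groups.setdefault(word[0], []).append(word)
print(groups)      # {'p': ['puppy', 'pony'], 'k': ['kitten']}
```

The last loop is the pattern that defaultdict(list) replaces, at the cost of building a new empty list on every call even when the key already exists.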
Use the collections.defaultdict class
Although the methods above work around the lack of default values in dict to some extent, it is natural to ask whether there is a dictionary type that provides default values itself. There is: collections.defaultdict. The defaultdict class behaves like a dict, but it is initialized with a type:

    >>> from collections import defaultdict
    >>> dd = defaultdict(list)
    >>> dd
    defaultdict(<class 'list'>, {})
The initializer of the defaultdict class accepts a type as its argument. When a key that does not exist is accessed, the type is instantiated and the new instance is stored as the default value for that key:

    >>> dd['foo']
    []
    >>> dd
    defaultdict(<class 'list'>, {'foo': []})
    >>> dd['bar'].append('quux')
    >>> dd
    defaultdict(<class 'list'>, {'foo': [], 'bar': ['quux']})
Note that the default value is only produced when a key is accessed through dict[key] or dict.__getitem__(key). Other operations, such as the in operator, pop() and get(), do not trigger it. The reason is explained below.

    >>> from collections import defaultdict
    >>> dd = defaultdict(list)
    >>> 'something' in dd
    False
    >>> dd.pop('something')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    KeyError: 'something'
    >>> dd.get('something')
    >>> dd['something']
    []
Besides a type, the initializer also accepts any callable that takes no arguments. The return value of that callable is then used as the default value, which makes default values more flexible. The following example passes a custom zero-argument function zero() to the initializer:

    >>> from collections import defaultdict
    >>> def zero():
    ...     return 0
    ...
    >>> dd = defaultdict(zero)
    >>> dd
    defaultdict(<function zero at 0xb7ed2684>, {})
    >>> dd['foo']
    0
    >>> dd
    defaultdict(<function zero at 0xb7ed2684>, {'foo': 0})
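Because the factory is called without arguments, one way to get an arbitrary constant default is to wrap the constant in a function that returns a zero-argument callable. The sketch below assumes a helper named constant_factory, which is a name chosen for this example and not part of the collections module.

```python
from collections import defaultdict


def constant_factory(value):
    """Return a zero-argument callable that always returns `value`."""
    return lambda: value


dd = defaultdict(constant_factory('<missing>'))
dd['name'] = 'John'

print(dd['name'])     # John
print(dd['object'])   # <missing>
print(dict(dd))       # {'name': 'John', 'object': '<missing>'}
```

Reading dd['object'] stores the default under that key, just as with a type-based factory.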
Using collections.defaultdict to solve the original word-counting problem, the code is as follows:

    from collections import defaultdict

    strings = ('puppy', 'kitten', 'puppy', 'puppy',
               'weasel', 'puppy', 'kitten', 'puppy')
    counts = defaultdict(lambda: 0)  # use a lambda to define a simple function

    for s in strings:
        counts[s] += 1
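As a variation on the code above, the built-in int type can serve as the factory, because calling int() with no arguments returns 0. This is a sketch of an alternative, not the form used in the original article.

```python
from collections import defaultdict

strings = ('puppy', 'kitten', 'puppy', 'puppy',
           'weasel', 'puppy', 'kitten', 'puppy')

# int() with no arguments returns 0, so missing words start at 0.
counts = defaultdict(int)
for s in strings:
    counts[s] += 1

print(dict(counts))   # {'puppy': 5, 'kitten': 2, 'weasel': 1}
```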
How the defaultdict class is implemented
Now that the usage of the defaultdict class is clear, how does the class actually implement default values? The key is the __missing__() method:

    >>> from collections import defaultdict
    >>> print(defaultdict.__missing__.__doc__)
    __missing__(key) # Called by __getitem__ for missing key; pseudo-code:
      if self.default_factory is None: raise KeyError((key,))
      self[key] = value = self.default_factory()
      return value
The docstring of the __missing__() method shows that when a non-existent key is accessed through the __getitem__() method (the form dict[key] is shorthand for calling __getitem__()), __missing__() is called to obtain the default value and add the key to the dictionary.
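The behaviour described by the docstring can be observed directly. The sketch below uses a hypothetical subclass, TracingDefaultDict, that records each key passed to __missing__(); the class and its missed attribute exist only for this illustration.

```python
from collections import defaultdict


class TracingDefaultDict(defaultdict):
    """Hypothetical subclass that records every key passed to __missing__()."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.missed = []

    def __missing__(self, key):
        self.missed.append(key)
        return super().__missing__(key)


dd = TracingDefaultDict(list)
dd['a']          # key is missing, so __missing__('a') is called
dd['a']          # key now exists, so __missing__() is not called
dd.get('b')      # get() does not go through __missing__()
'c' in dd        # neither does the in operator

print(dd.missed)   # ['a']
print(dict(dd))    # {'a': []}

# With no factory (default_factory is None), a missing key raises KeyError.
plain = defaultdict()
try:
    plain['x']
    raised = False
except KeyError:
    raised = True
print(raised)      # True
```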
For a detailed introduction to the __missing__() method, please refer to the "Mapping Types — dict" section in the official Python documentation.
According to the documentation, starting with version 2.5, if a subclass of dict defines a __missing__() method, dict[key] calls that method to obtain a value whenever the key does not exist.
It can be seen from this that although dict supports the __missing__() method, this method does not exist in dict itself. Instead, this method needs to be implemented in the derived subclass. This can be easily verified:
    >>> print(dict.__missing__.__doc__)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: type object 'dict' has no attribute '__missing__'
We can also experiment further by defining a subclass Missing that implements the __missing__() method:

    >>> class Missing(dict):
    ...     def __missing__(self, key):
    ...         return 'missing'
    ...
    >>> d = Missing()
    >>> d
    {}
    >>> d['foo']
    'missing'
    >>> d
    {}
The result shows that the __missing__() method does take effect. Note that d is still empty afterwards, because this version of __missing__() only returns a value and does not store it. Building on this, we modify __missing__() slightly so that the subclass, like the defaultdict class, also stores a default value for keys that do not exist:

    >>> class Defaulting(dict):
    ...     def __missing__(self, key):
    ...         self[key] = 'default'
    ...         return 'default'
    ...
    >>> d = Defaulting()
    >>> d
    {}
    >>> d['foo']
    'default'
    >>> d
    {'foo': 'default'}
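Since __missing__() receives the key, a dict subclass can also compute a default that depends on the key, which a defaultdict factory cannot do because it is called without arguments. The following sketch assumes a hypothetical class KeyedDefault whose factory takes the missing key as its only argument.

```python
class KeyedDefault(dict):
    """Hypothetical dict subclass whose default is computed from the missing key."""

    def __init__(self, factory, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.factory = factory  # callable taking the missing key

    def __missing__(self, key):
        value = self.factory(key)
        self[key] = value       # store it, as defaultdict does
        return value


lengths = KeyedDefault(len)
print(lengths['puppy'])   # 5
print(lengths)            # {'puppy': 5}
```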
Implementing the function of defaultdict in older versions of Python
The defaultdict class was added in version 2.5 and is not available in older versions, so a compatible defaultdict class has to be implemented for them. This is quite simple. Although its performance may not match the defaultdict class that ships with version 2.5, it is functionally equivalent. First, the __getitem__() method needs to call the __missing__() method when the key lookup fails:

    class defaultdict(dict):
        def __getitem__(self, key):
            try:
                return dict.__getitem__(self, key)
            except KeyError:
                return self.__missing__(key)
Secondly, the __missing__() method needs to be implemented to set the default value:

    class defaultdict(dict):
        def __getitem__(self, key):
            try:
                return dict.__getitem__(self, key)
            except KeyError:
                return self.__missing__(key)

        def __missing__(self, key):
            self[key] = value = self.default_factory()
            return value
Then, the initializer of the defaultdict class, __init__(), needs to accept a type or a callable as a parameter:

    class defaultdict(dict):
        def __init__(self, default_factory=None, *a, **kw):
            dict.__init__(self, *a, **kw)
            self.default_factory = default_factory

        def __getitem__(self, key):
            try:
                return dict.__getitem__(self, key)
            except KeyError:
                return self.__missing__(key)

        def __missing__(self, key):
            # With no factory, behave like a plain dict
            if self.default_factory is None:
                raise KeyError(key)
            self[key] = value = self.default_factory()
            return value
Finally, putting all of the above together, the following code works on both old and new versions of Python:

    try:
        from collections import defaultdict
    except ImportError:
        class defaultdict(dict):
            def __init__(self, default_factory=None, *a, **kw):
                dict.__init__(self, *a, **kw)
                self.default_factory = default_factory

            def __getitem__(self, key):
                try:
                    return dict.__getitem__(self, key)
                except KeyError:
                    return self.__missing__(key)

            def __missing__(self, key):
                # With no factory, behave like a plain dict
                if self.default_factory is None:
                    raise KeyError(key)
                self[key] = value = self.default_factory()
                return value
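As a quick sanity check, the fallback class can be compared with the standard one on a small grouping task. In this sketch the fallback is renamed CompatDefaultDict (a name chosen here so that it does not shadow collections.defaultdict), and the word list is made up for the example.

```python
from collections import defaultdict as builtin_defaultdict


class CompatDefaultDict(dict):
    """Fallback implementation from the article, renamed for comparison."""

    def __init__(self, default_factory=None, *a, **kw):
        dict.__init__(self, *a, **kw)
        self.default_factory = default_factory

    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            return self.__missing__(key)

    def __missing__(self, key):
        self[key] = value = self.default_factory()
        return value


words = ('puppy', 'kitten', 'pony', 'koala')
compat = CompatDefaultDict(list)
builtin = builtin_defaultdict(list)

# Group the words by first letter with both implementations.
for w in words:
    compat[w[0]].append(w)
    builtin[w[0]].append(w)

print(compat == builtin)   # True
print(dict(compat))        # {'p': ['puppy', 'pony'], 'k': ['kitten', 'koala']}
```

This only exercises the list factory; it is not a full compatibility test of the two classes.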