Default Dict
You have probably run into this situation: you don't know whether a key is in the dict, but you want to change its value, append to it, or otherwise modify the data. For example:
Dict Awkward Situation
There is a list of animal types:
animals = ['cat', 'dog', 'cat', 'parrot', 'lion', 'tiger', 'dog', 'parrot', 'dog']
Suppose we want to use a dict to store the count of each type.
Let's start with a naive approach:
Codes
def count_animals(animals):
    stat = {}
    for animal in animals:
        if animal not in stat:
            stat[animal] = 0
        stat[animal] += 1
    return stat
Run
count_animals(animals)
Output
{'cat': 2, 'dog': 3, 'parrot': 2, 'lion': 1, 'tiger': 1}
It solves the problem, but handling the missing-key case by hand becomes annoying as the situation gets more complicated.
Wouldn't it be great if we could handle it in one place and use it anywhere? Sounds familiar?
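To see why the guard in the naive version matters, here is a minimal sketch of what happens without it. The function name count_animals_unguarded is made up for this illustration.

```python
animals = ['cat', 'dog', 'cat', 'parrot', 'lion', 'tiger', 'dog', 'parrot', 'dog']


def count_animals_unguarded(animals):
    """Same loop as the naive version, but without the membership check."""
    stat = {}
    for animal in animals:
        # "+=" reads stat[animal] before writing, so a new key fails here
        stat[animal] += 1
    return stat


try:
    count_animals_unguarded(animals)
except KeyError as exc:
    missing_key = exc.args[0]
    print(f"KeyError for missing key: {missing_key!r}")
```

The very first animal already triggers the error, which is why every such loop ends up carrying its own "if key not in dict" check.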
Poor Man Default Dict
I know implementing a perfect default dict is appealing, but the details would distract you from the mechanism. Here, we only build a poor man's version of defaultdict to show the concept of how it works.
When you look up a key that doesn't exist, dict.__getitem__ calls the __missing__ method if a dict subclass defines one, and the lookup returns whatever __missing__ returns. Implementing it will do the trick.
Codes
class PoorManDefaultDictV1(dict):
    def __missing__(self, key):
        return 0
Run
stat = PoorManDefaultDictV1()
stat['cat']
Output
0
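One limitation of this first version is worth checking by hand: __missing__ here only returns a value, it does not store anything. A short sketch, repeating the class so it runs on its own:

```python
class PoorManDefaultDictV1(dict):
    def __missing__(self, key):
        return 0


stat = PoorManDefaultDictV1()

value = stat['cat']          # __getitem__ falls back to __missing__
stored = 'cat' in stat       # the lookup did not insert the key
via_get = stat.get('cat')    # dict.get() does not call __missing__

print(value, stored, via_get)

# Only the assignment hidden inside "+=" actually adds a key.
stat['dog'] += 1
print(stat)
```

So the default is hard-coded to 0 and a plain lookup leaves the dict empty, which is fine for counting but not for defaults you want to mutate in place, such as lists.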
Next, we let the user provide a callable that produces the default value, and we store that value in the dict under the missing key.
Codes
class PoorManDefaultDictV2(dict):
    def __init__(self, default_factory, *args, **kwargs):
        # set the default factory
        self.default_factory = default_factory
        super().__init__(*args, **kwargs)

    def __missing__(self, key):
        value = self.default_factory()
        self.__setitem__(key, value)
        return value
Run
stat = PoorManDefaultDictV2(int)
stat['cat']
Output
0
We can also use list as the factory:
Run
stat = PoorManDefaultDictV2(list)
stat['cat']
Output
[]
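The list factory is handy for grouping. Here is a minimal sketch, repeating the class so it runs standalone; the helper group_by_first_letter and its sample words are made up for this illustration.

```python
class PoorManDefaultDictV2(dict):
    def __init__(self, default_factory, *args, **kwargs):
        self.default_factory = default_factory
        super().__init__(*args, **kwargs)

    def __missing__(self, key):
        value = self.default_factory()
        self[key] = value
        return value


def group_by_first_letter(words):
    groups = PoorManDefaultDictV2(list)
    for word in words:
        # a missing letter gets a fresh, stored list before append runs
        groups[word[0]].append(word)
    return groups


groups = group_by_first_letter(['cat', 'dog', 'crow', 'parrot', 'deer'])
print(groups)
```

Because the factory is called once per missing key, every key gets its own list rather than a shared one.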
Now let's use our poor man's version of defaultdict to solve the previous problem.
Codes
def poor_man_count_animals(animals):
    stat = PoorManDefaultDictV2(int)
    for animal in animals:
        stat[animal] += 1
    return stat
Run
poor_man_count_animals(animals)
Output
{'cat': 2, 'dog': 3, 'parrot': 2, 'lion': 1, 'tiger': 1}
By now you have seen how to implement a very basic defaultdict.
Default Dict Source Code
Default Dict
defaultdict is a great tool and very straightforward to use. Its implementation is quite similar to the one above, but it is written in C.
Here is how it defines the __missing__ function for defaultdict:
static PyObject *
defdict_missing(defdictobject *dd, PyObject *key)
{
    PyObject *factory = dd->default_factory;
    PyObject *value;
    if (factory == NULL || factory == Py_None) {
        /* XXX Call dict.__missing__(key) */
        PyObject *tup;
        tup = PyTuple_Pack(1, key);
        if (!tup) return NULL;
        PyErr_SetObject(PyExc_KeyError, tup);
        Py_DECREF(tup);
        return NULL;
    }
    value = _PyObject_CallNoArg(factory);
    if (value == NULL)
        return value;
    if (PyObject_SetItem((PyObject *)dd, key, value) < 0) {
        Py_DECREF(value);
        return NULL;
    }
    return value;
}
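Read as Python, the C function above does roughly the following. This is only a sketch for comparison, not the actual implementation; DefaultDictSketch is a made-up name.

```python
from collections import defaultdict


class DefaultDictSketch(dict):
    def __init__(self, default_factory=None, *args, **kwargs):
        self.default_factory = default_factory
        super().__init__(*args, **kwargs)

    def __missing__(self, key):
        factory = self.default_factory
        if factory is None:
            # mirrors the "factory == NULL || factory == Py_None" branch
            raise KeyError(key)
        value = factory()      # mirrors _PyObject_CallNoArg(factory)
        self[key] = value      # mirrors PyObject_SetItem(dd, key, value)
        return value


# Run the same checks against the sketch and the real defaultdict.
for cls in (DefaultDictSketch, defaultdict):
    d = cls(list)
    d['a'].append(1)
    assert d == {'a': [1]}

    no_factory = cls(None)
    try:
        no_factory['a']
    except KeyError as exc:
        assert exc.args == ('a',)
    else:
        raise AssertionError("expected KeyError")
```

With no factory set, both behave like a plain dict and raise KeyError for a missing key.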
Then it registers these functions as methods of the type:
static PyMethodDef defdict_methods[] = {
    {"__missing__", (PyCFunction)defdict_missing, METH_O,
     defdict_missing_doc},
    {"copy", (PyCFunction)defdict_copy, METH_NOARGS,
     defdict_copy_doc},
    {"__copy__", (PyCFunction)defdict_copy, METH_NOARGS,
     defdict_copy_doc},
    {"__reduce__", (PyCFunction)defdict_reduce, METH_NOARGS,
     reduce_doc},
    {"__class_getitem__", (PyCFunction)Py_GenericAlias, METH_O|METH_CLASS,
     PyDoc_STR("See PEP 585")},
    {NULL}
};
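Each entry in that table shows up as a regular method on the Python side. Here is a short sketch exercising them; the generic alias line assumes Python 3.9 or later, where PEP 585 is available.

```python
import copy
import pickle
from collections import defaultdict

d = defaultdict(int, cat=2)

# "__missing__" can be called directly; it stores and returns the default
assert d.__missing__('dog') == 0
assert d == {'cat': 2, 'dog': 0}

# "copy" and "__copy__" keep the default factory
assert d.copy().default_factory is int
assert copy.copy(d).default_factory is int

# "__reduce__" lets pickle rebuild the object together with its factory
restored = pickle.loads(pickle.dumps(d))
assert restored == d and restored.default_factory is int

# "__class_getitem__" enables the PEP 585 generic alias syntax
alias = defaultdict[str, int]
print(alias)
```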
defaultdict Examples
Codes
from collections import defaultdict

def count_animals(animals):
    stat = defaultdict(int)
    for animal in animals:
        stat[animal] += 1
    return stat
Run
count_animals(animals)
Output
defaultdict(<class 'int'>, {'cat': 2, 'dog': 3, 'parrot': 2, 'lion': 1, 'tiger': 1})
Run
import random
from collections import defaultdict
random_default = defaultdict(lambda: random.randint(0, 10))
random_default['cat']
The default value of a non-existing key is a random integer between 0 and 10 (both inclusive), so your result may differ.
Output
4
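Note that the factory runs only on the first lookup of a missing key. The result is then stored, so later lookups of the same key return the same number. A small sketch to check this:

```python
import random
from collections import defaultdict

random_default = defaultdict(lambda: random.randint(0, 10))

first = random_default['cat']    # factory is called, value is stored
second = random_default['cat']   # key exists now, factory is not called

assert 0 <= first <= 10
assert first == second
assert list(random_default) == ['cat']
```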