Let a class behave like it’s a list in Python

If you only want part of the list behavior, use composition: each instance holds a reference to an actual list, and the class implements only the methods needed for the behavior you want. These methods delegate the work to the list the instance holds, for example:

def __getitem__(self, item):
    return self.li[item] # delegate to li.__getitem__

Implementing __getitem__ alone already gives you a surprising number of features, for example iteration and slicing:

>>> class WrappedList:
...     def __init__(self, lst):
...         self._lst = lst
...     def __getitem__(self, item):
...         return self._lst[item]
... 
>>> w = WrappedList([1, 2, 3])
>>> for x in w:
...     x
... 
1
2
3
>>> w[1:]
[2, 3]
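
Building on the same idea, here is a slightly fuller sketch of the composition approach. The extra methods (__len__ and __repr__) and the decision to re-wrap slices are illustrative additions, not part of the session above; implement only what your class actually needs.

```python
class WrappedList:
    """Holds a real list and delegates selected operations to it."""

    def __init__(self, lst):
        self._lst = lst

    def __getitem__(self, item):
        result = self._lst[item]  # delegate to list.__getitem__
        # A slice of a list is a plain list; re-wrap it so callers
        # keep getting WrappedList objects (a design choice, not a must).
        if isinstance(item, slice):
            return WrappedList(result)
        return result

    def __len__(self):
        return len(self._lst)  # delegate to list.__len__

    def __repr__(self):
        return f"WrappedList({self._lst!r})"


w = WrappedList([1, 2, 3])
print(list(w))   # iteration falls back to __getitem__ with 0, 1, 2, ...
print(2 in w)    # membership tests fall back to iteration
print(len(w))
print(w[1:])
```

With __len__ defined, len(w) and reversed(w) work as well, and w[1:] now returns a WrappedList rather than a plain list.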

If you want the full behavior of a list, inherit from collections.UserList. UserList is a pure-Python class that wraps a real list (kept in its data attribute) and implements the complete list interface on top of it.
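
As a minimal sketch of what subclassing UserList can look like (the Tags class and its lowercasing rule are hypothetical, chosen only for illustration):

```python
from collections import UserList


class Tags(UserList):
    """Hypothetical list-like class that lowercases appended items."""

    def append(self, item):
        # The underlying real list lives in self.data.
        self.data.append(str(item).lower())


tags = Tags(["python"])
tags.append("LIST")
combined = tags + ["extra"]  # UserList.__add__ builds a new Tags

print(tags)
print(type(combined).__name__)
print(isinstance(tags, list))
```

Note that a UserList is not a list subclass, so isinstance(tags, list) is False. If other code checks for list explicitly, test against collections.abc.MutableSequence instead.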

So why not inherit from list directly?

One major problem with inheriting directly from list (or any other built-in written in C) is that the built-in's own code may or may not call special methods that a user-defined subclass overrides. Here’s a relevant excerpt from the PyPy docs:

Officially, CPython has no rule at all for when exactly overridden method of subclasses of built-in types get implicitly called or not. As an approximation, these methods are never called by other built-in methods of the same object. For example, an overridden __getitem__ in a subclass of dict will not be called by e.g. the built-in get method.
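
The dict example from that excerpt can be reproduced with a short sketch. It assumes CPython, and the Shouting class is a hypothetical name used only for illustration:

```python
class Shouting(dict):
    """dict subclass whose item access upper-cases the stored value."""

    def __getitem__(self, key):
        return str(super().__getitem__(key)).upper()


d = Shouting(a="x")

print(d["a"])      # goes through the overridden __getitem__
print(d.get("a"))  # dict.get reads the stored value directly
```

On CPython the first lookup returns 'X' while get returns the untouched 'x', so the two access paths disagree about the same key.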

Another quote, from Luciano Ramalho’s Fluent Python, page 351:

Subclassing built-in types like dict or list or str directly is error-prone because the built-in methods mostly ignore user-defined overrides. Instead of subclassing the built-ins, derive your classes from UserDict, UserList and UserString from the collections module, which are designed to be easily extended.

… and more, page 370+:

Misbehaving built-ins: bug or feature?
The built-in dict, list and str types are essential building blocks of Python itself, so
they must be fast — any performance issues in them would severely impact pretty much
everything else. That’s why CPython adopted the shortcuts that cause their built-in
methods to misbehave by not cooperating with methods overridden by subclasses.

After experimenting a bit, the issues with the list built-in seem less critical: I tried to break it in Python 3.4 for a while and did not find any really obvious unexpected behavior. I still want to demonstrate what can happen in principle, so here’s an example with a dict and a UserDict:

>>> class MyDict(dict):
...     def __setitem__(self, key, value):
...         super().__setitem__(key, [value])
... 
>>> d = MyDict(a=1)
>>> d
{'a': 1}

>>> from collections import UserDict
>>> class MyUserDict(UserDict):
...     def __setitem__(self, key, value):
...         super().__setitem__(key, [value])
... 
>>> m = MyUserDict(a=1)
>>> m
{'a': [1]}

As you can see, the __init__ method from dict ignored the overridden __setitem__ method, while the __init__ method from our UserDict did not.
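
A comparable divergence can be sketched for list, although it is arguably less surprising than the dict case. The sketch assumes CPython, the class names are hypothetical, and the override only handles integer indexes:

```python
from collections import UserList


class DoubledList(list):
    def __getitem__(self, index):
        # Integer indexes only; doubles the stored value on access.
        return super().__getitem__(index) * 2


class DoubledUserList(UserList):
    def __getitem__(self, index):
        # Same rule, applied to the wrapped list in self.data.
        return self.data[index] * 2


plain = DoubledList([1, 2, 3])
wrapped = DoubledUserList([1, 2, 3])

print(plain[0], list(plain))      # iteration bypasses the override
print(wrapped[0], list(wrapped))  # iteration goes through the override
```

The built-in list iterator reads the underlying storage directly, so iterating over plain yields the original values even though indexing doubles them. UserList inherits its iteration from the sequence abstract base class, which calls self[i], so indexing and iteration stay consistent.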
