URL: https://towardsdatascience.com/dont-let-the-python-dir-function-trick-you-3c42abbdb4d9/


Don’t Let the Python dir() Function Trick You!

The dir() function returns an object's attributes. Unfortunately, it doesn't reveal everything – discover how to find all attributes…

PYTHON PROGRAMMING

Is this really everything? The dir() function doesn’t guarantee to show everything! Photo by Cristiano Pinto on Unsplash

This is Python’s built-in dir() function:

The help on the dir() function. Image by author

Simply put, when called on an object, say obj, the function returns the names of its attributes… or, technically, some of its attributes, and this "some of" makes quite a difference.

If you search the Internet, you’ll find quite a few questions about this problem with dir(). On StackOverflow, you’ll find questions about this issue for a variety of Python objects, such as Outlook messages, json objects, parsed XML objects, or scipy.sparse objects. Clearly, people expect dir() to return all attributes. Before researching this topic, I expected that, too, so I understand the confusion.

Even if you do read the function’s docstring and notice the "some of" phrase, it doesn’t help you find all the attributes.

This article deals with this strange aspect of the dir() function. We’ll discuss why it doesn’t list all attributes of an object, and what to do when you need all of them. To that end, we’ll write a function that is particularly useful during debugging, when you need to understand the behavior of an object. Perhaps the most significant purpose of the article, however, is to help you avoid the mistake of assuming that the attribute names dir() returns are complete.

About dir()

The dir() function is a widely used tool for introspecting Python objects, providing developers with a quick way to list the names of attributes and methods associated with an object.¹ Thus, it’s an extremely useful function when you’re trying to understand the intricacies of a Python object. I’ve been using it quite extensively, mostly during interactive sessions, and I guess most Pythonistas do it, too.

dir() aims to return the names of as many attributes of an object as possible – but it does not guarantee to return all its attributes. This is what the "some of" phrase means in the function’s docstring.

First, remember this when using this function. Second, if you do need to find the names of all the attributes, dir() is not your friend. In the next section, we’ll implement a function that’ll meet your need.
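One documented reason why dir() can be incomplete is that it consults the object’s __dir__() method when one is defined, and that method is free to return whatever names it likes. Below is a minimal sketch of this mechanism; the Account class and its attributes are made up for illustration and don’t come from any library.

```python
class Account:
    """Hypothetical class whose __dir__ deliberately hides one attribute."""

    def __init__(self) -> None:
        self.owner = "alice"
        self.token = "s3cr3t"

    def __dir__(self) -> list[str]:
        # dir() calls this hook and sorts whatever it returns.
        return ["owner"]


account = Account()
print(dir(account))           # ['owner']
print(sorted(vars(account)))  # ['owner', 'token']
print(account.token)          # s3cr3t (hidden from dir(), but still there)
```

So an attribute can be perfectly usable and still be absent from dir()’s output.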

Finding all attributes

In this section, we’ll implement the get_all_attributes() function, which combines four introspection techniques to provide a comprehensive overview of an object’s attributes:

  1. The built-in dir() function.
  2. The object’s __dict__ attribute, if available. __dict__ typically contains all writable attributes of an object.
  3. The inspect module from the Python standard library. We’ll employ the inspect.getmembers() function, as it provides a complete list of all the members of the object, not limited to those in __dict__, so it includes dynamically added attributes, inherited attributes, and more.
  4. Metaclass information. We will extract attributes from the object’s metaclass (i.e., the class of the class), which can give insights into class-level methods and properties.

import inspect
from typing import Any, Union


def get_all_attributes(
    obj: Any,
    unique: bool = True,
) -> Union[set[str], dict[str, list[str]]]:
    """Collect attribute names of obj from several introspection sources."""
    attrs = {}
    attrs["dir"] = dir(obj)

    # Not every object has a __dict__ (e.g., instances of classes
    # that define __slots__).
    if hasattr(obj, "__dict__"):
        attrs["__dict__"] = list(obj.__dict__.keys())

    attrs["inspect.getmembers"] = [
        name
        for name, _ in inspect.getmembers(obj)
    ]

    # type(obj) is the metaclass when obj is a class,
    # and simply the class when obj is an instance.
    metaclass = type(obj)
    attrs["metaclass"] = dir(metaclass)

    if unique:
        # Merge all sources into one set of unique names.
        attrs = {
            element
            for sublist in attrs.values()
            for element in sublist
        }
    return attrs

Let’s analyze the function. First of all, the function aims to list all the attributes of a Python object, including those from the dir() function and the object’s __dict__ attribute, those revealed by the inspect.getmembers() function, and those from its metaclass.

The function has two arguments: obj, that is, the analyzed object; and unique, a Boolean flag. When unique is True, the function returns a set of the unique names of all the attributes of obj, the behavior we typically expect from such a function. When it’s False, the function returns a dictionary of lists, where each key points to a different source of attributes; these lists will partially overlap. Hence, use unique=True (the default) if you’re only interested in the attribute names, and unique=False when you want to compare the different sources of attributes.
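To make the comparison of sources more concrete, here is a small sketch that reports, for each additional source, only the names that dir() does not list. The names_missed_by_dir() helper and the Plain class are hypothetical names introduced for this illustration; the helper reads the same sources as described above, just organized differently.

```python
import inspect
from typing import Any


def names_missed_by_dir(obj: Any) -> dict[str, list[str]]:
    """For each extra source, list the names that dir(obj) does not report."""
    from_dir = set(dir(obj))
    sources = {
        # Fall back to an empty mapping for objects without __dict__.
        "__dict__": list(getattr(obj, "__dict__", {})),
        "inspect.getmembers": [name for name, _ in inspect.getmembers(obj)],
        "type(obj)": dir(type(obj)),
    }
    return {
        source: sorted(set(names) - from_dir)
        for source, names in sources.items()
    }


class Plain:
    pass


missed = names_missed_by_dir(Plain)
print(missed["__dict__"])            # []
print("mro" in missed["type(obj)"])  # True: mro() lives on type, the metaclass
```

Even for an ordinary class, the names defined on its metaclass (here, type) don’t appear in dir(). For an ordinary instance such as Plain(), all three lists come back empty.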

Example

You can use the get_all_attributes() function for any object. Usually, its output will fully overlap with that from dir(), but for some objects, it won’t. I’d like to show here a prominent example from the Python standard library, where get_all_attributes() returns several times more attributes than dir() does: the enum.Enum class and its values.

First, let’s define a very simple enumeration class:²

>>> from enum import Enum
>>> class X(Enum):
...     X1 = 1
...     X2 = 2
...     X3 = 3

Inheriting from enum.Enum, the X class provides an enumeration with three possible values:

>>> X.__members__
mappingproxy({'X1': <X.X1: 1>, 'X2': <X.X2: 2>, 'X3': <X.X3: 3>})

First, let’s see the attributes of the class itself. This is what dir() returns:

>>> dir_X = dir(X)
>>> len(dir_X)
14
>>> dir_X # doctest: +NORMALIZE_WHITESPACE
['X1', 'X2', 'X3', '__class__', '__contains__',
 '__doc__', '__getitem__', '__init_subclass__',
 '__iter__', '__len__', '__members__', '__module__',
 '__name__', '__qualname__']

And this is get_all_attributes() in action:

>>> get_all_X = get_all_attributes(X)
>>> len(get_all_X)
76
>>> get_all_X # doctest: +NORMALIZE_WHITESPACE
{'_value_repr_', 'mro', '__eq__', '__dict__',
 '_value2member_map_', '__dir__', '__getattribute__',
 '__itemsize__', '__reduce__', '__subclasshook__',
 '_check_for_existing_members_', '__annotations__',
 '__str__', '__class__', '__new__', '__base__',
 '__abstractmethods__', '__bool__', '__delattr__',
 '_generate_next_value_', '__bases__', '__ne__',
 '__prepare__', '__call__', '__flags__',
 '__reduce_ex__', 'X3', '_use_args_', 'value',
 '__hash__', '__members__', '__sizeof__', '__setattr__',
 '_get_mixins_', '__iter__', '__name__', '__init__',
 '_member_names_', '__module__', '_create_', '__doc__',
 '__basicsize__', '__format__', '__len__',
 '__weakrefoffset__', '__reversed__', '__ge__',
 '__init_subclass__', '__subclasscheck__', '_convert_',
 '__mro__', '__or__', '_find_data_type_', '_member_type_',
 '_new_member_', '__le__', '__contains__', 'X1', '__gt__',
 'name', '__instancecheck__', '__repr__', '__text_signature__',
 '_unhashable_values_', '__qualname__', '__getattr__',
 '__getstate__', '_member_map_', '__subclasses__',
 '__dictoffset__', '__lt__', 'X2', '__getitem__',
 '_find_data_repr_', '__ror__', '_find_new_'}

Wow, that’s quite a difference, isn’t it? This is 14 attributes from dir() against as many as 76 attributes from get_all_attributes()! Let’s analyze the latter function with unique=False, so that we can see which source provides these 62 additional attributes:

>>> get_all_X_dict = get_all_attributes(X, unique=False)
>>> {k: len(v) for k, v in get_all_X_dict.items()} # doctest: +NORMALIZE_WHITESPACE
{'dir': 14,
 '__dict__': 15,
 'inspect.getmembers': 16,
 'metaclass': 62}

As you can see, dir() returns the fewest attribute names. It turns out that dir(X) and X.__dict__ only partly overlap: ten names from X.__dict__ are missing from dir(X), and nine names from dir(X) are not in X.__dict__:

>>> [a for a in X.__dict__ if a not in dir_X] # doctest: +NORMALIZE_WHITESPACE
['_generate_next_value_', '_new_member_', '_use_args_',
 '_member_names_', '_member_map_', '_value2member_map_',
 '_unhashable_values_', '_member_type_', '_value_repr_',
 '__new__']
>>> [a for a in dir_X if a not in X.__dict__] # doctest: +NORMALIZE_WHITESPACE
['__class__', '__contains__', '__getitem__',
 '__init_subclass__', '__iter__', '__len__',
 '__members__', '__name__', '__qualname__'] 

What about inspect.getmembers()?

>>> [a for a in dir_X if a not in get_all_X_dict["inspect.getmembers"]]
[]
>>> [a for a in get_all_X_dict["inspect.getmembers"] if a not in dir_X]
['name', 'value']

This time, all attributes from dir() are returned by inspect.getmembers(), but the latter includes two more: "name" and "value".
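A likely explanation for the short dir() output, as far as I can tell from CPython, is that the Enum metaclass defines its own __dir__(), which returns a curated list of names. The sketch below checks this on a throwaway Color enumeration (a hypothetical class used only here) and compares it with type.__dir__(), the default listing for classes, which walks the class and its base classes.

```python
from enum import Enum


class Color(Enum):
    RED = 1
    GREEN = 2


# The Enum metaclass; its name differs between Python versions.
enum_meta = type(Color)

# The metaclass overrides __dir__, so dir(Color) is a curated list.
print("__dir__" in vars(enum_meta))

# Calling type.__dir__ directly bypasses that override.
default_listing = type.__dir__(Color)
print("name" in default_listing)
print("name" in dir(Color))
```

On the CPython versions I’m aware of, this prints True, True, and False: "name" is defined on the Enum base class, yet the customized dir() leaves it out.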

Clearly, it is the metaclass source that contains the most attributes, most of which aren’t listed in any other source.

Let’s see how this works for an X value, e.g., X.X1. Note that it’s enough to call get_all_attributes() with unique=False, since its result also includes the output from dir():


>>> get_all_X1 = get_all_attributes(X.X1, False)
>>> {k: len(v) for k, v in get_all_X1.items()}
{'dir': 10, '__dict__': 4, 'inspect.getmembers': 10, 'metaclass': 14} 
>>> from pprint import pp
>>> pp(get_all_X1)
{'dir': ['X1',
         'X2',
         'X3',
         '__class__',
         '__doc__',
         '__eq__',
         '__hash__',
         '__module__',
         'name',
         'value'],
 '__dict__': ['_value_', '_name_', '__objclass__', '_sort_order_'],
 'inspect.getmembers': ['X1',
                        'X2',
                        'X3',
                        '__class__',
                        '__doc__',
                        '__eq__',
                        '__hash__',
                        '__module__',
                        'name',
                        'value'],
 'metaclass': ['X1',
               'X2',
               'X3',
               '__class__',
               '__contains__',
               '__doc__',
               '__getitem__',
               '__init_subclass__',
               '__iter__',
               '__len__',
               '__members__',
               '__module__',
               '__name__',
               '__qualname__']}

Again, we see quite a few differences.

Conclusion

This article doesn’t aim to analyze the intricacies of the enum.Enum class and its child classes, but to show how to analyze its attributes. This is why we didn’t analyze the results we obtained thanks to the get_all_attributes() function. The point is that the dir() function doesn’t provide all attributes of some Python objects, and enumeration classes and their values are just clear examples of this phenomenon.

While dir() will prove sufficient for many Python objects, the get_all_attributes() function will sometimes offer a more detailed and comprehensive view of an object’s attributes. This can be crucial for debugging, developing libraries, or any situation where a deeper understanding of object attributes is necessary.

You may think you’ll seldom find this function useful, as dir() often returns all the attributes of an object. That’s true, but here’s the problem: you won’t know whether dir()'s output is complete without checking, and you can check using the get_all_attributes() function. So, if your aim is to closely inspect a particular object, you should use this function instead of dir(). With unique=False, you’ll get a more detailed picture, showing which attributes are caught by dir() and which by the other sources.

Footnotes

¹ The name dir most likely originates from "directory." This is because it lists the attributes and methods of an object, much like a directory lists files or items it contains. Thus, the name dir reflects its directory-like view of the namespace of the object.

² Most of the code blocks in the article are formatted as doctests. You can read more about doctesting in the following article:

Python Documentation Testing with doctest: The Easy Way


Written By

Marcin Kozak
