I l@ve RuBoard Previous Section Next Section

17.9 Data Structures Versus Python Built-ins

Now that I've shown you all these complicated algorithms, I need to also tell you that at least in some cases, they may not be an optimal approach. Built-in types like lists and dictionaries are often a simpler and more efficient way to represent data. For instance:

Binary trees

These may be useful in many applications, but Python dictionaries already provide a highly optimized, C-coded, search table tool. Indexing a dictionary by key is likely to be faster than searching a Python-coded tree structure:

>>> x = {}
>>> for i in [3,1,9,2,7]: x[i] = None                 # insert
>>> for i in range(10): print (i, x.has_key(i)),      # lookup
(0, 0) (1, 1) (2, 1) (3, 1) (4, 0) (5, 0) (6, 0) (7, 1) (8, 0) (9, 1)

Because dictionaries are built in to the language, they are always available, and will usually be faster than Python-based data structure implementations.

Graph algorithms

These serve many purposes, but a purely Python-coded implementation of a very large graph might be less efficient than you want in some applications. Graph programs tend to require peak performance; using dictionaries instead of class instances to represent graphs may boost performance some, but using linked-in compiled extensions may as well.

Sorting algorithms

These are an important part of many programs too, but Python's built-in list sort method is so fast that you would be hard pressed to beat it in Python in most scenarios. In fact, it's generally better to convert sequences to lists first just so you can use the built-in:

temp = list(sequence)
temp.sort(  )
...use items in temp...

For custom sorts, simply pass in a comparison function of your own:

>>> L = [{'n':3}, {'n':20}, {'n':0}, {'n':9}]    
>>> L.sort( lambda x, y: cmp(x['n'], y['n']) )    
>>> L            
[{'n': 0}, {'n': 3}, {'n': 9}, {'n': 20}]
Reversal algorithms

These are generally superfluous by the same token -- because Python lists provide a fast reverse method, you may be better off converting a non-list to a list first, just so that you can run the built-in list method.

Don't misunderstand: sometimes you really do need objects that add functionality to built-in types, or do something more custom. The set classes we met, for instance, add tools not directly supported by Python today, and the tuple-tree stack implementation was actually faster than one based upon built-in lists for common usage patterns. Permutations are something you need to add on your own too.

Moreover, class encapsulations make it possible to change and extend object internals without impacting the rest of your system. They also support reuse much better than built-in types -- types are not classes today, and cannot be specialized directly without wrapper class logic.

Yet because Python comes with a set of built-in, flexible, and optimized datatypes, data structure implementations are often not as important in Python as they are in lesser-equipped languages such as C or C++. Before you code that new datatype, be sure to ask yourself if a built-in type or call might be more in line with the Python way of thinking.

    I l@ve RuBoard Previous Section Next Section