
8.11. The itertools Module

The itertools module offers many powerful, high-performance building blocks to build or manipulate iterator objects. Manipulating iterators is often better than manipulating lists, thanks to iterators' intrinsic "lazy evaluation" approach: items of an iterator are produced one at a time, as needed, while all the items of a list (or other sequence) must exist in memory at the same time. This lazy approach even makes it feasible to build and manipulate unbounded iterators, while all lists must always have finite numbers of items.

This section documents the most frequently used attributes of module itertools; each of them is an iterator type, which you can call to get an instance of the type in question.

chain

chain(*iterables)

Builds and returns an iterator whose items are all those from the first iterable passed, followed by all those from the second iterable passed, and so on until the end of the last iterable passed, just like the generator expression:

(item for iterable in iterables for item in iterable)
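As a minimal usage sketch (the names letters and numbers are made up for illustration), chain can walk a list, a tuple, and a string as if they were one sequence, without ever building a combined list:

```python
import itertools

letters = ['a', 'b']
numbers = (1, 2, 3)

# chain takes items from each iterable in turn; list() is used here
# only to make the resulting items visible
combined = list(itertools.chain(letters, numbers, 'xy'))
print(combined)
```

The iterables need not be of the same type; any mix of iterables is acceptable.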

count

count(firstval=0)

Builds and returns an unbounded iterator whose items are consecutive integers starting from firstval, just like the generator:

def count(firstval=0):
    while True:
        yield firstval
        firstval += 1
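Because count never ends, it is normally paired with something that does end. The following sketch, with a made-up list named names, numbers the items of a list starting from 1; it relies on zip stopping as soon as its shortest argument is exhausted:

```python
import itertools

names = ['ann', 'bob', 'cy']

# count(1) is unbounded, but zip stops as soon as names runs out
numbered = list(zip(itertools.count(1), names))
print(numbered)
```

Calling list directly on itertools.count() would never terminate, so always bound the iteration in some way.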

cycle

cycle(iterable)

Builds and returns an unbounded iterator whose items are the items of iterable, endlessly repeating the items from the beginning each time the cycle reaches the end, just like the generator:

def cycle(iterable):
    buffer = []
    for item in iterable:
        yield item
        buffer.append(item)
    while True:
        for item in buffer: yield item
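One typical use of cycle is round-robin assignment. In this sketch, the task and worker names are purely illustrative, and zip bounds the otherwise endless cycle:

```python
import itertools

tasks = ['a', 'b', 'c', 'd', 'e']
workers = itertools.cycle(['w1', 'w2'])

# tasks is finite, so zip stops after the fifth pair even though
# workers would keep repeating w1, w2, w1, ... forever
assignment = list(zip(tasks, workers))
print(assignment)
```

Note that cycle keeps a copy of every item it has seen, so memory use grows with the length of iterable.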

ifilter

ifilter(func,iterable)

Builds and returns an iterator whose items are those items of iterable for which func is true, just like the generator expression:

(item for item in iterable if func(item))

func can be any callable object that accepts a single argument, or None. When func is None, ifilter tests for true items, just like the generator expression:

(item for item in iterable if item)
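The two forms can be sketched as follows. The fallback to the built-in filter is an assumption made only so that the example also runs on Python 3, where itertools no longer supplies ifilter and the built-in filter is itself lazy:

```python
try:
    from itertools import ifilter   # Python 2, as covered in this section
except ImportError:
    ifilter = filter                # assumption: Python 3's lazy built-in

# an explicit predicate keeps the even numbers
evens = list(ifilter(lambda n: n % 2 == 0, range(10)))

# None as the predicate keeps only the items that are true
truthy = list(ifilter(None, [0, 1, '', 'a', None, [], [0]]))

print(evens)
print(truthy)
```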

imap

imap(func,*iterables)

Builds and returns an iterator whose items are the results of func, called with one corresponding argument from each of the iterables; stops when the shortest of the iterables is exhausted, just like the generator:

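def imap(func,*iterables):
    next_items = [iter(x).next for x in iterables]
    # a list, not a generator expression, so that StopIteration from an
    # exhausted iterable ends imap itself instead of shortening the arguments
    while True: yield func(*[get_next() for get_next in next_items])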
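A short sketch of the stop-at-the-shortest behavior follows. The fallback to the built-in map is an assumption made only so that the example also runs on Python 3, where itertools no longer supplies imap and the built-in map is lazy and also stops at the shortest iterable:

```python
try:
    from itertools import imap   # Python 2, as covered in this section
except ImportError:
    imap = map                   # assumption: Python 3's lazy built-in

bases = [2, 3, 10]
exponents = [5, 2, 3, 99]        # one item longer than bases

# pow is called with one argument from each iterable; the unmatched 99
# is never used because bases is exhausted first
powers = list(imap(pow, bases, exponents))
print(powers)
```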

islice

islice(iterable[,start],stop[,step])

Builds and returns an iterator whose items are items of iterable, skipping the first start ones (default is 0), stopping before the stopth one, and advancing by steps of step (default is 1) at a time. All arguments must be nonnegative integers, and step must be >0. Apart from checks and optionality of the arguments, just like the generator:

def islice(iterable,start,stop,step=1):
    en = enumerate(iterable)
    for n, item in en:
        if n>=start: break
    else:
        # iterable ran out before reaching index start: nothing to yield
        return
    while n<stop:
        yield item
        for x in range(step): n, item = en.next()
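As a brief illustration of the argument forms, the following sketch slices a string with a stop only, then slices an unbounded count with start, stop, and step:

```python
import itertools

# with a single extra argument, that argument is stop
first_three = list(itertools.islice('abcdefg', 3))

# start=2, stop=11, step=3 selects the items at indices 2, 5, and 8
every_third = list(itertools.islice(itertools.count(), 2, 11, 3))

print(first_three)
print(every_third)
```

Unlike slicing a list, islice works on any iterable, including unbounded ones, since it never needs to know the length in advance.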

izip

izip(*iterables)

Builds and returns an iterator whose items are tuples with one corresponding item from each of the iterables; stops when the shortest of the iterables is exhausted. Just like imap(lambda *args: args, *iterables).
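For instance, in the following sketch the third letter is dropped because the list of numbers is shorter. The fallback to the built-in zip is an assumption made only so that the example also runs on Python 3, where itertools no longer supplies izip and the built-in zip is lazy:

```python
try:
    from itertools import izip   # Python 2, as covered in this section
except ImportError:
    izip = zip                   # assumption: Python 3's lazy built-in

# 'c' has no partner in the shorter list, so only two tuples result
pairs = list(izip('abc', [1, 2]))
print(pairs)
```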

repeat

repeat(item[,times])

Builds and returns an iterator with times items, each of which is the object item, just like the generator expression:

(item for x in xrange(times))

When times is absent, the iterator is unbounded, with a potentially infinite number of items, which are all the object item, just like the generator:

def repeat_unbounded(item):
    while True: yield item
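Both forms can be sketched briefly. The bounded form ends by itself, while the unbounded form is handy as a constant stream that some other, finite iterable bounds (here via zip):

```python
import itertools

# bounded: exactly three references to the same object
three = list(itertools.repeat('x', 3))

# unbounded: zip stops when the string 'abc' runs out
padded = list(zip('abc', itertools.repeat(0)))

print(three)
print(padded)
```

All the items are the very same object, not copies, which matters if item is mutable.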

tee

tee(iterable,n=2)

Builds and returns a tuple of n independent iterators, each of which has items that are the same as the items of iterable. While the returned iterators are independent from each other, they are not independent from iterable; therefore, you must avoid altering object iterable in any way, as long as you're still using any of the returned iterators (this caveat is roughly the same as for the result of iter(iterable)). New in 2.4.
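A minimal sketch of tee, using a made-up iterator named source, makes two passes over data that could otherwise be iterated only once:

```python
import itertools

source = iter([1, 2, 3, 4])      # a one-shot iterator
a, b = itertools.tee(source)     # from here on, use only a and b

total = sum(a)       # consumes a completely
biggest = max(b)     # b still yields every item, independently of a

print(total)
print(biggest)
```

When one returned iterator runs far ahead of the other, as here, tee must buffer all the items in between; in such cases, building a list from iterable may be just as economical.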

Python's online documentation has abundant examples and explanations about each of these types of module itertools, as well as about other types of itertools that I do not cover in this book. One surprising fact is the sheer speed of itertools types. To take a trivial example, consider repeating some action 10 times. On my laptop:

for x in itertools.repeat(0, 10): pass

turns out to be about 10 percent faster than the second-fastest alternative:

for x in xrange(10): pass

and almost twice as fast as using range instead of xrange in this case.


