Section 4.6. Sequence Operations

4.6. Sequence Operations

Python supports a variety of operations applicable to all sequences, including strings, lists, and tuples. Some sequence operations apply to all containers (including, for example, sets and dictionaries, which are not sequences), and some apply to all iterables (meaning "any object on which you can loop," as covered in "Iterables" on page 40; all containers, be they sequences or otherwise, are iterable, and so are many objects that are not containers, such as files, covered in "File Objects" on page 216, and generators, covered in "Generators" on page 78). In the following, I use the terms sequence, container, and iterable, quite precisely and specifically, to indicate exactly which operations apply to each category.

4.6.1. Sequences in General

Sequences are containers with items that are accessible by indexing or slicing. The built-in len function takes any container as an argument and returns the number of items in the container. The built-in min and max functions take one argument, a nonempty iterable whose items are comparable, and return the smallest and largest items, respectively. You can also call min and max with multiple arguments, in which case they return the smallest and largest arguments, respectively. The built-in sum function takes one argument, an iterable whose items are numbers, and returns the sum of the numbers.

4.6.1.1. Sequence conversions

There is no implicit conversion between different sequence types, except that plain strings are converted to Unicode strings if needed. (String conversion is covered in detail in "Unicode" on page 198.) You can call the built-ins tuple and list with a single argument (any iterable) to get a new instance of the type you're calling, with the same items (in the same order) as in the argument.

4.6.1.2. Concatenation and repetition

You can concatenate sequences of the same type with the + operator. You can multiply a sequence S by an integer n with the * operator. S*n or n*S is the concatenation of n copies of S. When n<=0, S*n is an empty sequence of the same type as S.

4.6.1.3. Membership testing

The x in S operator tests to check whether object x equals any item in the sequence (or other kind of container or iterable) S. It returns TRue if it does and False if it doesn't. The x not in S operator is just like not (x in S). In the specific case of strings, though, x in S is more widely applicable; in this case, the operator tests whether x equals any substring of string S, not just any single character.

4.6.1.4. Indexing a sequence

The nth item of a sequence S is denoted by an indexing: S[n]. Indexing is zero-based (S's first item is S[0]). If S has L items, the index n may be 0, 1...up to and including L-1, but no larger. n may also be -1, -2...down to and including -L, but no smaller. A negative n indicates the same item in S as L+n does. In other words, S[-1], like S[L-1], is the last element of S, S[-2] is the next-to-last one, and so on. For example:

x = [1, 2, 3, 4]
x[1]                  # 2
x[-1]                 # 4

Using an index >=L or <-L raises an exception. Assigning to an item with an invalid index also raises an exception. You can add one or more elements to a list, but to do so you assign to a slice, not an item, as I'll discuss shortly.

4.6.1.5. Slicing a sequence

To indicate a subsequence of S, you can use a slicing, with the syntax S[i:j], where i and j are integers. S[i:j] is the subsequence of S from the ith item, included, to the jth item, excluded. In Python, ranges always include the lower bound and exclude the upper bound. A slice is an empty subsequence if j is less than or equal to i, or if i is greater than or equal to L, the length of S. You can omit i if it is equal to 0, so that the slice begins from the start of S. You can omit j if it is greater than or equal to L, so that the slice extends all the way to the end of S. You can even omit both indices, to mean a copy of the entire sequence: S[:]. Either or both indices may be less than 0. A negative index indicates the same spot in S as L+n, just like in indexing. An index greater than or equal to L means the end of S, while a negative index less than or equal to -L means the start of S. Here are some examples:

x = [1, 2, 3, 4]
x[1:3]                 # [2, 3]
x[1:]                  # [2, 3, 4]
x[:2]                  # [1, 2]

Slicing can also use the extended syntax S[i:j:k]. k is the stride of the slice, or the distance between successive indices. S[i:j] is equivalent to S[i:j:1], S[::2] is the subsequence of S that includes all items that have an even index in S, and S[::-1] has the same items as S, but in reverse order.

4.6.2. Strings

String objects are immutable, so attempting to rebind or delete an item or slice of a string raises an exception. The items of a string object (corresponding to each of the characters in the string) are themselves strings, each of length 1. The slices of a string object are also strings. String objects have several methods, which are covered in "Methods of String Objects" on page 186.

4.6.3. Tuples

Tuple objects are immutable, so attempting to rebind or delete an item or slice of a tuple raises an exception. The items of a tuple are arbitrary objects and may be of different types. The slices of a tuple are also tuples. Tuples have no normal (nonspecial) methods, only some of the special methods covered in "Special Methods" on page 104.

4.6.4. Lists

List objects are mutable, so you may rebind or delete items and slices of a list. The items of a list are arbitrary objects and may be of different types. The slices of a list are lists.

4.6.4.1. Modifying a list

You can modify a list by assigning to an indexing. For instance:

x = [1, 2, 3, 4]
x[1] = 42                # x is now [1, 42, 3, 4]

Another way to modify a list object L is to use a slice of L as the target (LHS) of an assignment statement. The RHS of the assignment must then be an iterable. If the LHS slice is in extended form, then the RHS must have just as many items as the number of items in the LHS slice. However, if the LHS slice is in normal (nonextended) form, then the LHS slice and the RHS may each be of any length; assigning to a (nonextended) slice of a list can add or remove list items. For example:

x = [1, 2, 3, 4]
x[1:3] = [22, 33, 44]     # x is now [1, 22, 33, 44, 4]
x[1:4] = [8, 9]           # x is now [1, 8, 9, 4]

Here are some important special cases of assignment to slices:

Using the empty list [] as the RHS expression removes the target slice from L. In other words, L[i:j]=[] has the same effect as del L[i:j].
Using an empty slice of L as the LHS target inserts the items of the RHS at the appropriate spot in L. In other words, L[i:i]=['a','b'] inserts the items 'a' and 'b' before the item that was at index i in L before the assignment.
Using a slice that covers the entire list object, L[:], as the LHS target totally replaces the contents of L.

You can delete an item or a slice from a list with del. For instance:

x = [1, 2, 3, 4, 5]
del x[1]                 # x is now [1, 3, 4, 5]
del x[::2]               # x is now [3, 5]

4.6.4.2. In-place operations on a list

List objects define in-place versions of the + and * operators, which you can use via augmented assignment statements. The augmented assignment statement L+=L1 has the effect of adding the items of iterable L1 to the end of L. L*=n has the effect of adding n-1 copies of L to the end of L; if n<=0, L*=n empties the contents of L, like L[:]=[].

4.6.4.3. List methods

List objects provide several methods, as shown in Table 4-3. Nonmutating methods return a result without altering the object to which they apply, while mutating methods may alter the object to which they apply. Many of the mutating methods behave like assignments to appropriate slices of the list. In Table 4-3, L indicates any list object, i any valid index in L, s any iterable, and x any object.

Table 4-3. List object methods
Method
Description
Nonmutating methods

L.count(x)
Returns the number of items of L that are equal to x.
L.index(x)
Returns the index of the first occurrence of an item in L that is equal to x, or raises an exception if L has no such item.
Mutating methods

L.append(x)
Appends item x to the end of L ; e.g., L[len(L):]=[x].
L.extend(s)
Appends all the items of iterable s to the end of L; e.g., L[len(L):]=s.
L.insert(i, x)
Inserts item x in L before the item at index i, moving following items of L (if any) "rightward" to make space (increases len(L) by one, does not replace any item, does not raise exceptions: acts just like L[i:i]=[x]).
L.remove(x)
Removes from L the first occurrence of an item in L that is equal to x, or raises an exception if L has no such item.
L.pop([i])
Returns the value of the item at index i and removes it from L; if i is omitted, removes and returns the last item; raises an exception if L is empty or i is an invalid index in L.
L.reverse( )
Reverses, in place, the items of L.
L.sort([f]) (2.3)
Sorts, in place, the items of L, comparing items pairwise via function f; if f is omitted, comparison is via the built-in function cmp. For more details, see "Sorting a list" on page 57.
L.sort(cmp=cmp, key=None, reverse=False)(2.4)
Sorts, in-place, the items of L, comparing items pairwise via the function passed as cmp (by default, the built-in function cmp). When argument key is not None, what gets compared for each item x is key(x), not x itself. For more details, see "Sorting a list" on page 57.

All mutating methods of list objects, except pop, return None.

4.6.4.4. Sorting a list

A list's method sort causes the list to be sorted in-place (its items are reordered to place them in increasing order) in a way that is guaranteed to be stable (elements that compare equal are not exchanged). In practice, sort is extremely fast, often preternaturally fast, as it can exploit any order or reverse order that may be present in any sublist (the advanced algorithm sort uses, known as timsort to honor its developer, Tim Peters, is technically a "non-recursive adaptive stable natural mergesort/binary insertion sort hybrid").

In Python 2.3, a list's sort method takes one optional argument. If present, the argument must be a function that, when called with any two list items as arguments, returns -1, 0, or 1, depending on whether the first item is to be considered less than, equal to, or greater than the second item for sorting purposes. Passing the argument slows down the sort, although it makes it easy to sort small lists in flexible ways. The decorate-sort-undecorate idiom, covered in "Searching and sorting" on page 485 is faster, at least as flexible, and often less error-prone than passing an argument to sort.

In Python 2.4, the sort method takes three optional arguments, which may be passed with either positional or named-argument syntax. Argument cmp plays just the same role as the only (unnamed) optional argument in Python 2.3. Argument key, if not None, must be a function that can be called with any list item as its only argument. In this case, to compare any two items x and y, Python uses cmp(key(x),key(y)) rather than cmp(x,y) (in practice, this is implemented in the same way as the decorate-sort-undecorate idiom presented in "Searching and sorting" on page 485, and can be even faster). Argument reverse, if TRue, causes the result of each comparison to be reversed; this is not the same thing as reversing L after sorting because the sort is stable (elements that compare equal are never exchanged) whether argument reverse is true or false.

Python 2.4 also provides a new built-in function sorted (covered in sorted on page 167) to produce a sorted list from any input iterable. sorted accepts the same input arguments as a list's method sort. Moreover, Python 2.4 adds to module operator (covered in "The operator Module" on page 368) two new higher-order functions, attrgetter and itemgetter, which produce functions particularly suitable for the key= optional argument of lists' method sort and the new built-in function sorted. In Python 2.5, an identical key= optional argument has also been added to built-in functions min and max, and to functions nsmallest and nlargest in standard library module heapq, covered in "The heapq Module" on page 177.