
27.1. Python's distutils

distutils is a rich and flexible set of tools to package Python programs and extensions for distribution to third parties. I cover typical, simple uses of distutils for the most common packaging needs. For an in-depth, highly detailed discussion of distutils, I recommend two manuals that are part of Python's online documentation: Distributing Python Modules and Installing Python Modules, both by Greg Ward, the principal author of distutils.

27.1.1. The Distribution and Its Root

A distribution is the set of files to package into a single file for distribution purposes. A distribution may include zero, one, or more Python packages and other Python modules (as covered in Chapter 7), as well as, optionally, Python scripts, C-coded (and other) extensions, supporting datafiles, and auxiliary files containing metadata about the distribution itself. A distribution is said to be pure if all code it includes is Python, and nonpure if it includes non-Python code (most often, C-coded or Pyrex extensions).

You should normally place all the files of a distribution in a directory, known as the distribution root directory, and in subdirectories of the distribution root. Mostly, you can arrange the subtree of files and directories rooted at the distribution root to suit your own organizational needs. However, as covered in "Packages" on page 149, a Python package must reside in its own directory, and a package's directory must contain a file named __init__.py (and subdirectories with __init__.py files, for the package's subpackages, if any) as well as other modules that belong to that package.

27.1.2. The setup.py Script

The distribution root directory must contain a Python script that by convention is named setup.py. The setup.py script can, in theory, contain arbitrary Python code. However, in practice, setup.py always boils down to some variation of:

from distutils.core import setup, Extension

setup( many named arguments go here )

All the action is in the parameters you supply in the call to setup. You should not import Extension if your setup.py deals with a pure distribution. Extension is needed only for nonpure distributions, and you should import it only when you need it. It is fine, of course, to have a few statements before the call to setup in order to arrange setup's arguments in clearer and more readable ways than could be managed by having everything inline as part of the setup call.

The distutils.core.setup function accepts only named arguments, and there are a large number of such arguments that you could potentially supply. A few arguments deal with the internal operations of distutils itself, and you never supply such arguments unless you are extending or debugging distutils, an advanced subject that I do not cover in this book. Other named arguments to setup fall into two groups: metadata about the distribution and information about which files are in the distribution.
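As a minimal sketch of this pattern, the following setup.py for a pure distribution arranges its arguments in a dictionary before the call. The names mydist, mymodule, setup_args, and run_setup are hypothetical and chosen only for illustration. The import is deferred into the function so that the argument-building part can be read and checked on its own, even on an interpreter that does not ship distutils.

```python
# setup.py for a hypothetical pure distribution (all names are illustrative).

# Arrange setup's named arguments ahead of the call for readability.
setup_args = dict(
    name="mydist",
    version="0.1",
    description="A short, one-line summary of mydist",
    py_modules=["mymodule"],  # module names, not file paths
)


def run_setup():
    """Hand the prepared arguments to distutils.core.setup."""
    # Extension is deliberately not imported: this distribution is pure.
    from distutils.core import setup
    setup(**setup_args)


# A real setup.py would end with an unconditional call:
#     run_setup()
```

Keeping the arguments in a plain dictionary also makes it easy to build parts of them with ordinary Python statements before the call.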

27.1.3. Metadata About the Distribution

You should provide metadata about the distribution by supplying some of the following keyword arguments when you call the distutils.core.setup function. The value you associate with each argument name you supply is a string, intended mostly to be human-readable; the specifications about the string's format are mostly advisory. The explanations and recommendations about the metadata fields in the following are also non-normative and correspond only to common, not universal, conventions. Whenever the following explanations refer to "this distribution," the phrase can be taken to refer to the material included in the distribution rather than to the packaging of the distribution:


author

The name(s) of the author(s) of material included in the distribution. You should always provide this information, since authors deserve credit for their work.


author_email

Email address(es) of the author(s) named in argument author. You should provide this information only if the author is willing to receive email about this work.


classifiers

A list of Trove strings to classify your package; each string must be one of the classifiers listed at the Python Package Index (PyPI).


contact

The name of the principal contact person or mailing list for this distribution. You should provide this information if there is somebody who should be contacted in preference to people named in arguments author and maintainer.


contact_email

Email address of the contact named in argument contact. You should provide this information if and only if you supply the contact argument.


description

A concise description of this distribution, preferably fitting within one line of 80 characters or less. You should always provide this information.


fullname

The full name of this distribution. You should provide this information if the name supplied as argument name is in abbreviated or incomplete form (e.g., an acronym).


keywords

A list of keywords that would likely be searched for by somebody looking for the functionality provided by this distribution. You should provide this information if it might prove useful to somebody indexing this distribution in a search engine.


license

The licensing terms of this distribution, in a concise form that may refer for details to a file in the distribution or to a URL. You should always provide this information.


maintainer

The name(s) of the current maintainer(s) of this distribution. You should normally provide this information if the maintainer is different from the author.


maintainer_email

Email address(es) of the maintainer(s) named in argument maintainer. You should provide this information only if you supply the maintainer argument and if the maintainer is willing to receive email about this work.


name

The name of this distribution as a valid Python identifier (this often requires abbreviations, e.g., as an acronym). You should always provide this information.


platforms

A list of platforms on which this distribution is known to work. You should provide this information if you have reasons to believe this distribution may not work everywhere. This information should be reasonably concise, so this field may refer for details to a file in the distribution or to a URL.


url

A URL at which more information can be found about this distribution. You should always provide this information if any such URL exists.


version

The version of this distribution and/or its contents, normally structured as major.minor or even more finely. You should always provide this information.
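To make these recommendations concrete, here is a small sketch of a metadata dictionary together with a checker for a few of the advisory conventions: the fields that should always be present, a name that is a valid Python identifier, and a description of at most 80 characters. The sample values and the helper check_metadata are hypothetical. distutils itself does not provide such a function, and the checks are only as strict as the conventions described here.

```python
import keyword

# Illustrative metadata for a hypothetical distribution.
metadata = dict(
    name="mydist",
    version="0.1",
    description="Example tools packaged with distutils",
    author="A. N. Author",
    author_email="author@example.com",
    url="http://www.example.com/mydist",
    license="MIT; see LICENSE.txt",
    keywords=["example", "packaging"],
    platforms=["any"],
)


def check_metadata(md):
    """Return a list of warnings about the advisory metadata conventions."""
    problems = []
    # Fields the text recommends always supplying.
    for field in ("name", "version", "description", "author", "license"):
        if not md.get(field):
            problems.append("missing recommended field: " + field)
    name = md.get("name", "")
    if name and (not name.isidentifier() or keyword.iskeyword(name)):
        problems.append("name is not a valid Python identifier")
    if len(md.get("description", "")) > 80:
        problems.append("description is longer than 80 characters")
    return problems
```

The dictionary can then be passed to setup as keyword arguments, for example setup(**metadata), alongside the arguments that describe the distribution's contents.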

27.1.4. Distribution Contents

A distribution can contain a mix of Python source files, C-coded extensions, and datafiles. setup accepts optional keyword arguments that detail which files to put in the distribution. Whenever you specify file paths, the paths must be relative to the distribution root directory and use / as the path separator. distutils adapts location and separator appropriately when it installs the distribution. Note, however, that the keyword arguments packages and py_modules do not list file paths, but rather Python packages and modules, respectively. Therefore, in the values of these keyword arguments, don't use path separators or file extensions. When you list subpackage names in argument packages, use Python syntax instead (e.g., top_package.sub_package).

Python source files

By default, setup looks for Python modules (which you list in the value of the keyword argument py_modules) in the distribution root directory, and for Python packages (which you list in the value of the keyword argument packages) as subdirectories of the distribution root directory. You may specify keyword argument package_dir to change these defaults. However, things are simpler when you locate files according to setup's defaults, so I do not cover package_dir further in this book.
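Because argument packages takes dotted package names rather than paths, and every package and subpackage has to be named explicitly, it can be handy to compute the list from the directory layout. The following is a sketch under the default layout only (packages as subdirectories of the distribution root, each with an __init__.py file). The helper find_packages and the sample tree built by demo are hypothetical and written here for illustration; they are not part of distutils.

```python
import os
import tempfile


def find_packages(root):
    """Return the dotted names of all packages found under root, sorted."""
    packages = []
    for dirpath, dirnames, filenames in os.walk(root):
        if dirpath == root:
            continue  # the distribution root itself is not a package
        if "__init__.py" not in filenames:
            # Not a package, so nothing below it can be a subpackage.
            dirnames[:] = []
            continue
        relative = os.path.relpath(dirpath, root)
        packages.append(relative.replace(os.sep, "."))
    return sorted(packages)


def demo():
    """Build a throwaway distribution root and list its packages."""
    with tempfile.TemporaryDirectory() as root:
        for rel in ("top_package", "top_package/sub_package", "docs"):
            os.makedirs(os.path.join(root, *rel.split("/")))
        for rel in ("top_package", "top_package/sub_package"):
            init_path = os.path.join(root, *rel.split("/"), "__init__.py")
            open(init_path, "w").close()
        return find_packages(root)
```

In this sketch, demo() yields the names top_package and top_package.sub_package, while the docs directory is skipped because it has no __init__.py file.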

The setup keyword arguments you will most frequently use to detail which Python source files are part of the distribution are the following.


packages=[ list of package name strings ]

For each package name string p in the list, setup expects to find a subdirectory p in the distribution root directory and includes in the distribution the file p/__init__.py, which must be present, as well as any other file p/*.py (i.e., all the modules of package p). setup does not search for subpackages of p: you must explicitly list all subpackages, as well as top-level packages, in the value of keyword argument packages.


py_modules=[ list of module name strings ]

For each module name string m in the list, setup expects to find a file m.py in the distribution root directory and includes m.py in the distribution.


scripts=[ list of script file path strings ]

Scripts are Python source files that are meant to be run as main programs (generally from the command line). The value of the scripts keyword lists the path strings of these files, complete with .py extension, relative to the distribution root directory.

Each script file should have as its first line a shebang line, i.e., a line starting with #! and containing the substring python. When distutils installs the scripts included in the distribution, distutils alters each script's first line to point to the Python interpreter. This is quite useful on many platforms, since the shebang line is used by the platform's shells or by other programs that may run your scripts, such as web servers.

Datafiles

To put datafiles of any kind in the distribution, supply the following keyword argument.


data_files=[ list of pairs (target_directory,[list of files]) ]

The value of keyword argument data_files is a list of pairs. Each pair's first item is a string and names a target directory (i.e., a directory where distutils places datafiles when installing the distribution); the second item is the list of file path strings for files to put in the target directory. At installation time, distutils places each target directory as a subdirectory of Python's sys.prefix for a pure distribution, or of Python's sys.exec_prefix for a nonpure distribution. distutils places the given files directly in the respective target directory, never in subdirectories of the target. For example, given the following data_files usage:

data_files = [('miscdata', ['conf/config.txt', 'misc/sample.txt'])]

distutils includes in the distribution the file config.txt from subdirectory conf of the distribution root, and the file sample.txt from subdirectory misc of the distribution root. At installation time, distutils creates a subdirectory named miscdata in Python's sys.prefix directory (or in the sys.exec_prefix directory, if the distribution is nonpure), and copies the two files into miscdata/config.txt and miscdata/sample.txt.

C-coded extensions

To put C-coded extensions in the distribution, supply the following keyword argument.


ext_modules=[ list of instances of class Extension ]

All the details about each extension are supplied as arguments when instantiating the distutils.core.Extension class.

Extension's constructor accepts two mandatory arguments and many optional keyword arguments, as follows.


class Extension(name, sources, **kwds)

name is the module name string for the C-coded extension. name may include dots to indicate that the extension module resides within a package. sources is the list of C source files that the distutils must compile and link in order to build the extension. Each item of sources is a string that gives a source file's path relative to the distribution root directory, complete with file extension .c. kwds lets you pass other, optional arguments to Extension, as covered later in this section.

The Extension class also supports other file extensions besides .c, indicating other languages you may use to code Python extensions. On platforms having a C++ compiler, file extension .cpp indicates C++ source files. Other file extensions that may be supported, depending on the platform and on various add-ons to the distutils, include .f for Fortran, .i for SWIG, and .pyx for Pyrex files. See "Extending Python Without Python's C API" on page 645 for information about using different languages to extend Python.

In some cases, your extension needs no further information besides mandatory arguments name and sources. distutils implicitly performs all that is necessary to make the Python headers directory and the Python library available for your extension's compilation and linking, and provides whatever compiler or linker flags or options are needed to build extensions on a given platform.

When additional information is required to compile and link your extension correctly, you can supply such information via the keyword arguments of class Extension. Such arguments may potentially interfere with the cross-platform portability of your distribution. In particular, whenever you specify file or directory paths as the values of such arguments, the paths should be relative to the distribution root directory; using absolute paths seriously impairs your distribution's cross-platform portability.

Portability is not a problem when you just use distutils as a handy way to build your extension, as suggested in "Building and Installing C-Coded Python Extensions" on page 614. However, when you plan to distribute your extensions to other platforms, you should examine whether you really need to provide build information via keyword arguments to Extension. It is sometimes possible to bypass such needs by careful coding at the C level, and the already mentioned Distributing Python Modules manual provides important examples.

The keyword arguments that you may pass when calling Extension are the following:

define_macros = [ ( macro_name, macro_value)...]

Each of the items macro_name and macro_value, in the pairs listed as the value of define_macros, is a string, respectively the name and value for a C preprocessor macro definition, equivalent in effect to the C preprocessor directive:

#define macro_name macro_value

macro_value can also be None, to get the same effect as the C preprocessor directive:

#define macro_name

extra_compile_args = [ list of compile_arg strings ]

Each of the strings compile_arg listed as the value of extra_compile_args is placed among the command-line arguments for each invocation of the C compiler.

extra_link_args = [ list of link_arg strings ]

Each of the strings link_arg listed as the value of extra_link_args is placed among the command-line arguments for the invocation of the linker.

extra_objects = [ list of object_name strings ]

Each of the strings object_name listed as the value of extra_objects names an object file to add to the invocation of the linker. Do not specify the file extension as part of the object name: distutils adds the platform-appropriate file extension (such as .o on Unix-like platforms and .obj on Windows) to help you keep cross-platform portability.

include_dirs = [ list of directory_path strings ]

Each of the strings directory_path listed as the value of include_dirs identifies a directory to supply to the compiler as one where header files are found.

libraries = [ list of library_name strings ]

Each of the strings library_name listed as the value of libraries names a library to add when invoking the linker. Do not specify the file extension or any prefix as part of the library name: distutils, in cooperation with the linker, adds the platform-appropriate file extension and prefix (such as .a, and a prefix lib, on Unix-like platforms, and .lib on Windows) to help you keep cross-platform portability.

library_dirs = [ list of directory_path strings ]

Each of the strings directory_path listed as the value of library_dirs identifies a directory to supply to the linker as one where library files are found.

runtime_library_dirs = [ list of directory_path strings ]

Each of the strings directory_path listed as the value of runtime_library_dirs identifies a directory where dynamically loaded libraries are found at runtime.

undef_macros = [ list of macro_name strings ]

Each of the strings macro_name listed as the value of undef_macros is the name for a C preprocessor macro definition, equivalent in effect to the C preprocessor directive:

#undef macro_name
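The macro-related arguments can be illustrated with a short sketch that collects keyword arguments for one extension and prints the C preprocessor directives that define_macros and undef_macros are equivalent to. The extension name mypkg.speedups, the source path, the macro names, and the helper preprocessor_directives are all hypothetical; the helper only mirrors the equivalences described above and is not how distutils passes macros to a compiler.

```python
def preprocessor_directives(define_macros=(), undef_macros=()):
    """Return the C directives equivalent to the given macro arguments."""
    lines = []
    for macro_name, macro_value in define_macros:
        if macro_value is None:
            # A value of None defines the macro without a value.
            lines.append("#define %s" % macro_name)
        else:
            lines.append("#define %s %s" % (macro_name, macro_value))
    for macro_name in undef_macros:
        lines.append("#undef %s" % macro_name)
    return lines


# Keyword arguments for a hypothetical extension; paths are relative to
# the distribution root to preserve cross-platform portability.
extension_args = dict(
    name="mypkg.speedups",
    sources=["src/speedups.c"],
    define_macros=[("BUFFER_SIZE", "4096"), ("USE_FAST_PATH", None)],
    undef_macros=["DEBUG"],
    include_dirs=["include"],
)

for line in preprocessor_directives(
    extension_args["define_macros"], extension_args["undef_macros"]
):
    print(line)
```

In a setup.py for a nonpure distribution, a dictionary like this could be passed on as Extension(**extension_args), with the resulting instance listed in ext_modules.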

27.1.5. The setup.cfg File

distutils lets a user who's installing your distribution specify many options at installation time. Most often the user will simply enter at a command line:

C:\> python setup.py install

but the already mentioned manual Installing Python Modules explains many alternatives. To provide suggested values for installation options, you can put a setup.cfg file in your distribution root directory. setup.cfg can also provide appropriate defaults for options you can supply to build-time commands. For copious details on the format and contents of file setup.cfg, see the already mentioned manual Distributing Python Modules.
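As a rough illustration of the file's shape, setup.cfg uses an INI-style layout in which each section is named after a distutils command and each entry corresponds to one of that command's long options. The sketch below writes two such sections with the standard configparser module. The particular options chosen (inplace for build_ext and formats for sdist) are examples of build-time defaults, not recommendations, and the variable names are hypothetical.

```python
import configparser
import io

# Each section is a distutils command; each key is one of its options.
config = configparser.ConfigParser()
config["build_ext"] = {"inplace": "1"}
config["sdist"] = {"formats": "gztar,zip"}

# Render the text that would be saved as setup.cfg in the distribution root.
buffer = io.StringIO()
config.write(buffer)
setup_cfg_text = buffer.getvalue()
print(setup_cfg_text)
```

Writing the file by hand in a text editor works just as well; generating it is useful mainly when the defaults are computed by a build script.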

27.1.6. The MANIFEST.in and MANIFEST Files

When you run:

python setup.py sdist

to produce a packaged-up source distribution (typically a .zip file on Windows or a .tgz file, a.k.a. a tarball, on Unix), the distutils by default inserts in the distribution:

  • All Python and C source files, as well as datafiles, explicitly mentioned or directly implied by your setup.py file's options, as covered earlier in this chapter

  • Test files, located at test/test*.py under the distribution root directory

  • Files README.txt (if any), setup.cfg (if any), and setup.py

To add yet more files in the source distribution .zip file or tarball, place in the distribution root directory a manifest template file named MANIFEST.in, whose lines are rules, applied sequentially, about files to add (include) or subtract (prune) from the list of files to place in the distribution. The sdist command of the distutils produces an exact list of the files placed in the source distribution in a text file named MANIFEST in the distribution root directory.
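For a sense of what such rules look like, the sketch below holds a small manifest template as a string and splits it into (command, arguments) pairs in the order in which the rules would be applied. The directory names examples and examples/scratch are hypothetical, and parse_manifest_template is an illustrative helper rather than a distutils function: it only tokenizes the rules and does not select any files.

```python
# A small manifest template; each line is one rule, applied in order.
MANIFEST_TEMPLATE = """\
include *.txt
recursive-include examples *.py
prune examples/scratch
"""

# Command names accepted in a manifest template.
KNOWN_COMMANDS = {
    "include", "exclude",
    "recursive-include", "recursive-exclude",
    "global-include", "global-exclude",
    "graft", "prune",
}


def parse_manifest_template(text):
    """Split template text into (command, arguments) pairs, in order."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        command, *arguments = line.split()
        if command not in KNOWN_COMMANDS:
            raise ValueError("unknown manifest command: " + command)
        rules.append((command, arguments))
    return rules
```

Read top to bottom, these three rules add every .txt file in the distribution root, add all .py files under examples, and then remove the examples/scratch subtree again.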

27.1.7. Creating Prebuilt Distributions with distutils

The packaged source distributions you create with python setup.py sdist are the most useful files you can produce with distutils. However, you can make life even easier for users with specific platforms by also creating prebuilt forms of your distribution with the command python setup.py bdist.

For a pure distribution, supplying prebuilt forms is merely a matter of convenience for the users. You can create prebuilt, pure distributions for any platform, including ones different from those on which you work, as long as you have the needed commands (such as zip, gzip, bzip2, and tar) available on your path. Such commands are freely available on the Internet for all sorts of platforms, so you can easily stock up on them in order to provide maximum convenience to users who want to install your distribution.

For a nonpure distribution, making prebuilt forms available may be more than just an issue of convenience. A nonpure distribution, by definition, includes code that is not pure Python (generally, C code). Unless you supply a prebuilt form, users need to have the appropriate C compiler installed in order to build and install your distribution. This is not a terrible problem on platforms where the appropriate C compiler is the free and ubiquitous gcc (nowadays, this means most Unix-like platforms, including Mac OS X, where gcc is part of the free XCode IDE that comes with the operating system). However, on other platforms (mostly Windows), the C compiler needed for normal building of Python extensions is commercial and costly. For example, on Windows, the normal C compiler used by Python and its C-coded extensions is Microsoft Visual Studio (VS 2003, for Python 2.4 and 2.5). It is possible to substitute other compilers, including free ones such as the mingw32 and cygwin versions of gcc. However, using such alternative compilers, as documented in the Python online manuals, is rather intricate, particularly for end users who may not be experienced programmers.

Therefore, if you want your nonpure distribution to be widely adopted on such platforms as Windows, it's highly advisable to make your distribution also available in prebuilt form. However, unless you have developed or purchased advanced cross-compilation environments, building a nonpure distribution and packaging it in prebuilt form is feasible only on the target platform. You also need to have the necessary C compiler installed. When all of these conditions are satisfied, however, distutils makes the procedure quite simple. In particular, the command:

python setup.py bdist_wininst

creates an .exe file that is a Windows installer for your distribution. If your distribution is nonpure, the prebuilt distribution is dependent on the specific Python version. The distutils reflect this fact in the name of the .exe installer they create for you. Say, for example, that your distribution's name metadata is mydist, your distribution's version metadata is 0.1, and the Python version you use is 2.4. In this case, the distutils build a Windows installer named mydist-0.1.win32-py2.4.exe.
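The naming pattern in this example can be captured in a few lines. The helper wininst_filename below is hypothetical and simply reproduces the pattern by string formatting. Its handling of a pure distribution (omitting the Python version part of the name) is an assumption, based on the point that only nonpure prebuilt distributions depend on the specific Python version.

```python
def wininst_filename(name, version, python_version=None):
    """Compose the installer filename for the given metadata.

    Pass python_version (e.g. "2.4") for a nonpure distribution; leave it
    as None for a pure one (an assumption made for this sketch).
    """
    base = "%s-%s.win32" % (name, version)
    if python_version is not None:
        base += "-py%s" % python_version
    return base + ".exe"


print(wininst_filename("mydist", "0.1", "2.4"))  # mydist-0.1.win32-py2.4.exe
```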
