27.1. Python's distutils
distutils is a rich and flexible set of tools to package Python programs and extensions for distribution to third parties. I cover typical, simple uses of distutils for the most common packaging needs. For an in-depth, highly detailed discussion of distutils, I recommend two manuals that are part of Python's online documentation: Distributing Python Modules (available at http://www.python.org/doc/current/dist/) and Installing Python Modules (available at http://www.python.org/doc/current/inst/), both by Greg Ward, the principal author of distutils.
27.1.1. The Distribution and Its Root
A distribution is the set of files to package into a single file for distribution purposes. A distribution may include zero, one, or more Python packages and other Python modules (as covered in Chapter 7), as well as, optionally, Python scripts, C-coded (and other) extensions, supporting datafiles, and auxiliary files containing metadata about the distribution itself. A distribution is said to be pure if all code it includes is Python, and nonpure if it includes non-Python code (most often, C-coded or Pyrex extensions).
You should normally place all the files of a distribution in a directory, known as the distribution root directory, and in subdirectories of the distribution root. Mostly, you can arrange the subtree of files and directories rooted at the distribution root to suit your own organizational needs. However, as covered in "Packages" on page 149, a Python package must reside in its own directory, and a package's directory must contain a file named __init__.py (and subdirectories with __init__.py files, for the package's subpackages, if any) as well as other modules that belong to that package.
27.1.2. The setup.py Script
The distribution root directory must contain a Python script that by convention is named setup.py. The setup.py script can, in theory, contain arbitrary Python code. However, in practice, setup.py always boils down to some variation of:
from distutils.core import setup, Extension
setup( many named arguments go here )
All the action is in the parameters you supply in the call to setup. You should not import Extension if your setup.py deals with a pure distribution. Extension is needed only for nonpure distributions, and you should import it only when you need it. It is fine, of course, to have a few statements before the call to setup in order to arrange setup's arguments in clearer and more readable ways than could be managed by having everything inline as part of the setup call.
The distutils.core.setup function accepts only named arguments, and there are a large number of such arguments that you could potentially supply. A few arguments deal with the internal operations of distutils itself, and you never supply such arguments unless you are extending or debugging distutils, an advanced subject that I do not cover in this book. Other named arguments to setup fall into two groups: metadata about the distribution and information about which files are in the distribution.
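As a minimal sketch of this shape for a pure distribution, the arguments can be arranged in two dictionaries, one per group, before the call to setup. The distribution name mydist, the module name mymodule, and the helper function main are hypothetical, chosen only for illustration:

```python
# setup.py: a minimal sketch for a pure distribution, so Extension is not imported.
# All names and values here are hypothetical placeholders.

# Group 1: metadata about the distribution.
metadata = dict(
    name="mydist",
    version="0.1",
)

# Group 2: which files are in the distribution.
contents = dict(
    py_modules=["mymodule"],   # mymodule.py sits in the distribution root
)

def main():
    # Imported inside the function so that the two dictionaries can be
    # inspected without invoking distutils.
    from distutils.core import setup
    kwargs = dict(metadata)
    kwargs.update(contents)
    setup(**kwargs)
```

A real setup.py would end with a module-level call to main() (or would simply call setup directly), so that a command such as python setup.py install reaches it.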
27.1.3. Metadata About the Distribution
You should provide metadata about the distribution by supplying some of the following keyword arguments when you call the distutils.core.setup function. The value you associate with each argument name you supply is a string, intended mostly to be human-readable; the specifications about the string's format are mostly advisory. The explanations and recommendations about the metadata fields in the following are also non-normative and correspond only to common, not universal, conventions. Whenever the following explanations refer to "this distribution," the phrase can be taken to refer to the material included in the distribution rather than to the packaging of the distribution:
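As an illustration, the following sketch collects a few metadata arguments in a dictionary and checks that each value is a string. The argument names used (name, version, description, author, author_email, url, license) are keyword arguments that setup accepts; every value, and the helper non_string_fields, is invented for this example:

```python
# Hypothetical metadata for an imaginary distribution named "mydist".
# Every value is a plain, human-readable string.
metadata = dict(
    name="mydist",
    version="0.1",
    description="One-line summary of what mydist does",
    author="A. N. Author",
    author_email="author@example.com",
    url="http://www.example.com/mydist/",
    license="MIT",
)

def non_string_fields(fields):
    """Return the sorted names of metadata fields whose value is not a string."""
    return sorted(k for k, v in fields.items() if not isinstance(v, str))

# An empty list means every field follows the string convention.
problems = non_string_fields(metadata)
```

Such a dictionary can then be passed along with the other arguments, for example as setup(**metadata).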
27.1.4. Distribution Contents
A distribution can contain a mix of Python source files, C-coded extensions, and datafiles. setup accepts optional keyword arguments that detail which files to put in the distribution. Whenever you specify file paths, the paths must be relative to the distribution root directory and use / as the path separator. distutils adapts location and separator appropriately when it installs the distribution. Note, however, that the keyword arguments packages and py_modules do not list file paths, but rather Python packages and modules, respectively. Therefore, in the values of these keyword arguments, don't use path separators or file extensions. When you list subpackage names in argument packages, use Python syntax instead (e.g., top_package.sub_package).
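The difference between file paths and package names can be sketched with a small helper that walks a distribution root and returns the dotted names suitable for the packages argument. The function find_package_names is hypothetical and is not part of distutils; it only assumes the rule that a package directory contains __init__.py:

```python
import os

def find_package_names(root):
    """Return the dotted names of all packages found under directory root.

    A directory counts as a package only if it contains __init__.py and
    every directory between it and root does too.
    """
    names = []
    for dirpath, dirnames, filenames in os.walk(root):
        if dirpath == root:
            continue                  # the root itself is not a package
        if "__init__.py" not in filenames:
            dirnames[:] = []          # not a package: do not descend further
            continue
        relative = os.path.relpath(dirpath, root)
        # Turn a path such as top_package/sub_package into Python syntax.
        names.append(relative.replace(os.sep, "."))
    return sorted(names)
```

For a root holding top_package/__init__.py and top_package/sub_package/__init__.py, the result would be the two names top_package and top_package.sub_package, with no path separators or file extensions.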
27.1.4.1. Python source files
By default, setup looks for Python modules (which you list in the value of the keyword argument py_modules) in the distribution root directory, and for Python packages (which you list in the value of the keyword argument packages) as subdirectories of the distribution root directory. You may specify keyword argument package_dir to change these defaults. However, things are simpler when you locate files according to setup's defaults, so I do not cover package_dir further in this book.
The setup keyword arguments you will most frequently use to detail which Python source files are part of the distribution are the following.
To put datafiles of any kind in the distribution, supply the following keyword argument.
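A hedged sketch of listing datafiles follows. It assumes the setup keyword argument data_files, whose value is a list of (target_directory, list_of_files) pairs; the directory and file names and the helper check_relative are hypothetical. The helper enforces, in simplified form, the rule that paths are relative to the distribution root and use / as the separator:

```python
import posixpath

def check_relative(path):
    """Raise ValueError unless path is relative and uses / as its separator.

    Simplified check: any backslash or colon is rejected, which also
    catches Windows drive letters.
    """
    if posixpath.isabs(path) or "\\" in path or ":" in path:
        raise ValueError("not a portable relative path: %r" % (path,))
    return path

# data_files is a list of (target_directory, [source_files]) pairs.
# The directory and file names below are hypothetical.
data_files = [
    ("share/mydist", [check_relative("data/defaults.cfg"),
                      check_relative("data/icons/logo.png")]),
]
```

The list would then be supplied to setup as data_files=data_files.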
27.1.4.2. C-coded extensions
To put C-coded extensions in the distribution, supply the following keyword argument.
The Extension class also supports other file extensions besides .c, indicating other languages you may use to code Python extensions. On platforms having a C++ compiler, file extension .cpp indicates C++ source files. Other file extensions that may be supported, depending on the platform and on various add-ons to the distutils, include .f for Fortran, .i for SWIG, and .pyx for Pyrex files. See "Extending Python Without Python's C API" on page 645 for information about using different languages to extend Python.
In some cases, your extension needs no further information besides mandatory arguments name and sources. distutils implicitly performs all that is necessary to make the Python headers directory and the Python library available for your extension's compilation and linking, and provides whatever compiler or linker flags or options are needed to build extensions on a given platform.
When additional information is required to compile and link your extension correctly, you can supply such information via the keyword arguments of class Extension. Such arguments may potentially interfere with the cross-platform portability of your distribution. In particular, whenever you specify file or directory paths as the values of such arguments, the paths should be relative to the distribution root directory; using absolute paths seriously impairs your distribution's cross-platform portability.
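The two cases can be sketched as follows, assuming a hypothetical extension module named hello. The keyword arguments include_dirs and define_macros are ones that Extension accepts, and ext_modules is the setup argument that takes a list of Extension instances; the file names, the macro name, and the helper make_setup_kwargs are invented for illustration:

```python
# Keyword arguments for a hypothetical extension module named "hello",
# built from hello.c in the distribution root: only name and sources.
minimal_ext = dict(name="hello", sources=["hello.c"])

# The same extension with extra build information. All paths are
# relative to the distribution root, for the sake of portability.
detailed_ext = dict(
    name="hello",
    sources=["hello.c", "src/helper.c"],
    include_dirs=["include"],
    define_macros=[("HELLO_DEBUG", "1")],
)

def make_setup_kwargs(ext_kwargs):
    """Build the ext_modules argument for setup from Extension keyword arguments."""
    # Extension is needed only for nonpure distributions, so import it here.
    from distutils.core import Extension
    return dict(ext_modules=[Extension(**ext_kwargs)])
```

A setup.py for this nonpure distribution would pass the result on, for example setup(name="mydist", version="0.1", **make_setup_kwargs(detailed_ext)).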
Portability is not a problem when you just use distutils as a handy way to build your extension, as suggested in "Building and Installing C-Coded Python Extensions" on page 614. However, when you plan to distribute your extensions to other platforms, you should examine whether you really need to provide build information via keyword arguments to Extension. It is sometimes possible to bypass such needs by careful coding at the C level, and the already mentioned Distributing Python Modules manual provides important examples.
The keyword arguments that you may pass when calling Extension are the following:
27.1.5. The setup.cfg File
distutils lets a user who's installing your distribution specify many options at installation time. Most often the user will simply enter at a command line:
C:\> python setup.py install
but the already mentioned manual Installing Python Modules explains many alternatives. To provide suggested values for installation options, you can put a setup.cfg file in your distribution root directory. setup.cfg can also provide appropriate defaults for options you can supply to build-time commands. For copious details on the format and contents of file setup.cfg, see the already mentioned manual Distributing Python Modules.
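As a rough sketch of the file's shape, setup.cfg uses the familiar section-and-option layout, with one section per distutils command. The snippet below builds such text with the standard configparser module; the two options shown (inplace for build_ext, formats for sdist) and their values are assumptions chosen as examples, not recommendations:

```python
import configparser
import io

# Each section is named after a distutils command; each option is one of
# that command's options. The values chosen here are only examples.
config = configparser.ConfigParser()
config["build_ext"] = {"inplace": "1"}
config["sdist"] = {"formats": "gztar,zip"}

buffer = io.StringIO()
config.write(buffer)
setup_cfg_text = buffer.getvalue()   # text to save as setup.cfg in the root
```

Writing setup_cfg_text to a file named setup.cfg in the distribution root makes these values the defaults, which a user can still override on the command line.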
27.1.6. The MANIFEST.in and MANIFEST Files
When you run:
python setup.py sdist
to produce a packaged-up source distribution (typically a .zip file on Windows or a .tgz file, a.k.a. a tarball, on Unix), the distutils by default inserts in the distribution:
To add yet more files in the source distribution .zip file or tarball, place in the distribution root directory a manifest template file named MANIFEST.in, whose lines are rules, applied sequentially, about files to add (include) or subtract (prune) from the list of files to place in the distribution. The sdist command of the distutils produces an exact list of the files placed in the source distribution in a text file named MANIFEST in the distribution root directory.
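The sequential application of rules can be illustrated with a deliberately simplified simulation. This is not how distutils itself is implemented: the hypothetical function apply_rules handles only the include and prune rules, and its pattern matching is looser than that of distutils (here * also matches /). The file names are invented:

```python
import fnmatch

# Hypothetical MANIFEST.in contents: one rule per line, applied in order.
MANIFEST_IN = """\
include README.txt
include docs/*.txt
prune docs/drafts
"""

def apply_rules(template, candidates, initial=()):
    """Apply include and prune rules sequentially; return the sorted file list.

    candidates lists the files that exist under the distribution root;
    initial lists the files already selected by default.
    """
    selected = set(initial)
    for line in template.splitlines():
        parts = line.split()
        if not parts:
            continue
        command, args = parts[0], parts[1:]
        if command == "include":
            # Add every existing file that matches one of the patterns.
            for pattern in args:
                selected.update(fnmatch.filter(candidates, pattern))
        elif command == "prune":
            # Subtract everything under each named directory.
            for directory in args:
                prefix = directory.rstrip("/") + "/"
                selected = set(f for f in selected if not f.startswith(prefix))
        else:
            raise ValueError("unsupported rule: %s" % line)
    return sorted(selected)
```

Because the rules run in order, the prune line removes a file under docs/drafts even though an earlier include line had added it.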
27.1.7. Creating Prebuilt Distributions with distutils
The packaged source distributions you create with python setup.py sdist are the most useful files you can produce with distutils. However, you can make life even easier for users with specific platforms by also creating prebuilt forms of your distribution with the command python setup.py bdist.
For a pure distribution, supplying prebuilt forms is merely a matter of convenience for the users. You can create prebuilt, pure distributions for any platform, including ones different from those on which you work, as long as you have the needed commands (such as zip, gzip, bzip2, and tar) available on your path. Such commands are freely available on the Internet for all sorts of platforms, so you can easily stock up on them in order to provide maximum convenience to users who want to install your distribution.
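If you build several forms routinely, the commands can be scripted. The sketch below composes the command lines for sdist and bdist and runs them in the distribution root. It assumes the --formats option that both commands accept; the helper names dist_command and build_all, and the formats picked, are hypothetical:

```python
import subprocess
import sys

def dist_command(command, formats=None):
    """Return the argument list that runs a setup.py command with this Python."""
    argv = [sys.executable, "setup.py", command]
    if formats:
        argv.append("--formats=" + ",".join(formats))
    return argv

def build_all(root):
    """Run sdist and then bdist in the distribution root directory root."""
    for argv in (dist_command("sdist", ["gztar", "zip"]),
                 dist_command("bdist", ["zip"])):
        # check_call raises CalledProcessError if a command fails.
        subprocess.check_call(argv, cwd=root)
```

Calling build_all on the distribution root directory runs both commands in turn and stops at the first failure.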
For a nonpure distribution, making prebuilt forms available may be more than just an issue of convenience. A nonpure distribution, by definition, includes code that is not pure Python (generally, C code). Unless you supply a prebuilt form, users need to have the appropriate C compiler installed in order to build and install your distribution. This is not a terrible problem on platforms where the appropriate C compiler is the free and ubiquitous gcc (nowadays, this means most Unix-like platforms, including Mac OS X, where gcc is part of the free XCode IDE that comes with the operating system). However, on other platforms (mostly Windows), the C compiler needed for normal building of Python extensions is commercial and costly. For example, on Windows, the normal C compiler used by Python and its C-coded extensions is Microsoft Visual Studio (VS 2003, for Python 2.4 and 2.5). It is possible to substitute other compilers, including free ones such as the mingw32 and cygwin versions of gcc. However, using such alternative compilers, as documented in the Python online manuals, is rather intricate, particularly for end users who may not be experienced programmers.
Therefore, if you want your nonpure distribution to be widely adopted on such platforms as Windows, it's highly advisable to make your distribution also available in prebuilt form. However, unless you have developed or purchased advanced cross-compilation environments, building a nonpure distribution and packaging it in prebuilt form is feasible only on the target platform. You also need to have the necessary C compiler installed. When all of these conditions are satisfied, however, distutils makes the procedure quite simple. In particular, the command:
python setup.py bdist_wininst
creates an .exe file that is a Windows installer for your distribution. If your distribution is nonpure, the prebuilt distribution is dependent on the specific Python version. The distutils reflect this fact in the name of the .exe installer they create for you. Say, for example, that your distribution's name metadata is mydist, your distribution's version metadata is 0.1, and the Python version you use is 2.4. In this case, the distutils build a Windows installer named mydist-0.1.win32-py2.4.exe.
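The naming pattern in this example can be written out as a small function. The pattern is inferred from the single example above and covers only the nonpure case; the function name wininst_filename is hypothetical, since distutils computes the installer's name itself:

```python
def wininst_filename(name, version, python_version):
    """Return the installer name for a nonpure distribution, per the example above."""
    return "%s-%s.win32-py%s.exe" % (name, version, python_version)

installer = wininst_filename("mydist", "0.1", "2.4")
# installer is "mydist-0.1.win32-py2.4.exe"
```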