14.7. Running Other Programs

You can run other programs via functions in the os module or, in Python 2.4, by using the new subprocess module.

14.7.1. Running Other Programs with the os Module

In Python 2.4, the best way for your program to run other processes is with the new subprocess module, covered in "The Subprocess Module" on page 358. However, the os module also offers several ways to do this, which in some cases may be simpler or allow your code to remain backward-compatible to older versions of Python.

The simplest way to run another program is through function os.system, although this offers no way to control the external program. The os module also provides a number of functions whose names start with exec. These functions offer fine-grained control. A program run by one of the exec functions replaces the current program (i.e., the Python interpreter) in the same process. In practice, therefore, you use the exec functions mostly on platforms that let a process duplicate itself by fork (i.e., Unix-like platforms). os functions whose names start with spawn and popen offer intermediate simplicity and power: they are cross-platform and not quite as simple as system, but simple and usable enough for most purposes.

The exec and spawn functions run a specified executable file, given the executable file's path, arguments to pass to it, and optionally an environment mapping. The system and popen functions execute a command, which is a string passed to a new instance of the platform's default shell (typically /bin/sh on Unix; command.com or cmd.exe on Windows). A command is a more general concept than an executable file, as it can include shell functionality (pipes, redirection, built-in shell commands) using the normal shell syntax specific to the current platform.

execl, execle, execlp, execv, execve, execvp, execvpe
execl(path,*args)
execle(path,*args)
execlp(path,*args)
execv(path,args)
execve(path,args,env)
execvp(path,args)
execvpe(path,args,env)

These functions run the executable file (program) indicated by string path, replacing the current program (i.e., the Python interpreter) in the current process. The distinctions encoded in the function names (after the prefix exec) control three aspects of how the new program is found and run:

Does path have to be a complete path to the program's executable file, or can the function accept a name as the path argument and search for the executable in several directories, as operating system shells do? execlp, execvp, and execvpe can accept a path argument that is just a filename rather than a complete path. In this case, the functions search for an executable file of that name along the directories listed in os.environ['PATH']. The other functions require path to be a complete path to the executable file for the new program.
Are arguments for the new program accepted as a single sequence argument args to the function or as separate arguments to the function? Functions whose names start with execv take a single argument args that is the sequence of the arguments to use for the new program. Functions whose names start with execl take the new program's arguments as separate arguments (execle, in particular, uses its last argument as the environment for the new program).
Is the new program's environment accepted as an explicit mapping argument env to the function, or is os.environ implicitly used? execle, execve, and execvpe take an argument env that is a mapping to be used as the new program's environment (keys and values must be strings), while the other functions use os.environ for this purpose.

Each exec function uses the first item in args as the name under which the new program is told it's running (for example, argv[0] in a C program's main); only args[1:] is passed as arguments proper to the new program.
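The following is a minimal sketch of the usual fork-then-exec pattern, assuming a Unix-like platform. The helper name run_and_wait is hypothetical, and sys.executable (the running Python interpreter) merely stands in for whatever program you want to run. Note how the program's path is repeated as the first item of args.

```python
import os
import sys

def run_and_wait(path, args):
    """Fork, exec `path` in the child, and return the child's exit code.

    Hypothetical helper for illustration; Unix-like platforms only.
    """
    pid = os.fork()
    if pid == 0:
        # Child process: replace this Python interpreter with the new program.
        try:
            os.execv(path, args)
        finally:
            # Reached only if execv itself failed (e.g., path does not exist).
            os._exit(127)
    # Parent process: wait for the child and decode its exit code.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)

# args[0] is the name under which the new program is told it's running;
# only args[1:] are arguments proper.
code = run_and_wait(sys.executable,
                    [sys.executable, "-c", "import sys; sys.exit(3)"])
```

Here code is the exit code the child passed to sys.exit. Without the fork, the call to os.execv would never return to the calling Python program.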

popen
popen(cmd,mode='r',bufsize=-1)

Runs the string command cmd in a new process P and returns a file-like object f that wraps a pipe to P's standard input or from P's standard output (depending on mode). mode and bufsize have the same meaning as for Python's built-in open function, covered in "Creating a File Object with open" on page 216. When mode is 'r' (or 'rb', for binary-mode reading), f is read-only and wraps P's standard output. When mode is 'w' (or 'wb', for binary-mode writing), f is write-only and wraps P's standard input.

The key difference between f and other file-like objects is the behavior of method f.close. f.close waits for P to terminate and, when P's termination is successful, returns None, as close methods of file-like objects normally do. However, if the operating system associates an integer error code c with P's termination, indicating that P's termination was unsuccessful, f.close returns c instead. Not all operating systems support this mechanism: on some platforms, f.close therefore always returns None. On Unix-like platforms, if P terminates with the system call exit(n) (e.g., if P is a Python program and terminates by calling sys.exit(n)), f.close receives from the operating system, and returns to its caller, the code 256*n.
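As a rough illustration of the special close semantics, the sketch below assumes a Unix-like platform. The command string is a hypothetical example that uses sys.executable as the child program: it prints one line and then exits with code 2.

```python
import os
import sys

# Hypothetical child command: prints one line, then exits with code 2.
cmd = '"%s" -c "import sys; print(\'hello\'); sys.exit(2)"' % sys.executable

f = os.popen(cmd, 'r')   # f wraps the child's standard output
output = f.read()
status = f.close()       # waits for the child to terminate

# On Unix-like platforms, status is None on success, else 256 times
# the child's exit code.
if status is None:
    exit_code = 0
else:
    exit_code = status // 256
```

On platforms that do not report termination codes this way, status may simply be None, so code that relies on it is not portable.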

popen2, popen3, popen4
popen2(cmd,mode='t',bufsize=-1)
popen3(cmd,mode='t',bufsize=-1)
popen4(cmd,mode='t',bufsize=-1)

Each of these functions runs the string command cmd in a new process P, and returns a tuple of file-like objects that wrap pipes to P's standard input and from P's standard output and standard error. mode must be 't' to get file-like objects in text mode or 'b' to get them in binary mode. On Windows, bufsize must be -1. On Unix, bufsize has the same meaning as for Python's built-in open function, covered in "Creating a File Object with open" on page 216.

popen2 returns a pair (fi,fo), where fi wraps P's standard input (so the calling process can write to fi) and fo wraps P's standard output (so the calling process can read from fo). popen3 returns a tuple with three items (fi,fo,fe), where fe wraps P's standard error (so the calling process can read from fe). popen4 returns a pair (fi,foe), where foe wraps both P's standard output and error (so the calling process can read from foe). While popen3 is in a sense the most general of the three functions, it can be difficult to coordinate your reading from fo and fe. popen2 is simpler to use than popen3 when it's okay for cmd's standard error to go to the same destination as your own process's standard error, and popen4 is simpler when it's okay for cmd's standard error and output to be somewhat arbitrarily mixed with each other.

File objects fi, fo, fe, and foe are all normal ones, without the special semantics of the close method as covered for function popen. In other words, there is no way in which the caller of popen2, popen3, or popen4 can learn about P's termination code.

Depending on the buffering strategy of command cmd (which is normally out of your control, unless you're the author of cmd), there may be nothing to read on files fo, fe, and/or foe until your process has closed file fi. Therefore, the normal pattern of usage is something like:

import os

def pipethrough(cmd, list_of_lines):
    fi, fo = os.popen2(cmd, 't')
    fi.writelines(list_of_lines)
    # close cmd's standard input so that cmd sees end-of-file
    fi.close()
    result_lines = fo.readlines()
    fo.close()
    return result_lines

Functions in the popen group are generally not suitable for driving another process interactively (i.e., writing something, then reading cmd's response to that, then writing something else, and so on). The first time your program tries to read the response, if cmd is following a typical buffering strategy, everything blocks. In other words, your process is waiting for cmd's output but cmd has already placed its pending output in a memory buffer, which your process can't get at, and is now waiting for more input. This is a typical case of deadlock.

If you have some control over cmd, you can try to work around this issue by ensuring that cmd runs without buffering. For example, if cmd.py is a Python program, you can run cmd without buffering as follows:

C:/> python -u cmd.py
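To make the effect of -u concrete, here is a minimal sketch of a write-then-read exchange with an unbuffered child. It is written with class Popen of the subprocess module (covered in "The Subprocess Module") rather than with the popen functions, and it assumes a child that you control. The child source below is a hypothetical stand-in for cmd.py that answers each input line with the same line in uppercase.

```python
import subprocess
import sys

# Hypothetical stand-in for cmd.py: answers each line in uppercase.
CHILD_SOURCE = (
    "import sys\n"
    "while True:\n"
    "    line = sys.stdin.readline()\n"
    "    if not line:\n"
    "        break\n"
    "    sys.stdout.write(line.upper())\n"
)

# -u runs the child without buffering, so each reply is sent at once.
p = subprocess.Popen([sys.executable, "-u", "-c", CHILD_SOURCE],
                     stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                     universal_newlines=True)

replies = []
for request in ["first\n", "second\n"]:
    p.stdin.write(request)
    p.stdin.flush()                      # empty our own buffer, too
    replies.append(p.stdout.readline())  # would block with a buffering child

p.stdin.close()    # the child sees end-of-file and exits its loop
p.stdout.close()
p.wait()
```

The exchange works only because both sides avoid holding data back: the parent flushes after each write, and the child runs unbuffered. With a child that buffers its output, the first readline would block, as described above.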

Other possible approaches include module telnetlib (covered in "Telnet" on page 515) if your platform supports telnet, and third-party, Unix-like-only extensions such as expectpy.sf.net and packages such as pexpect.sf.net. There is no general solution applicable to all platforms and all cmds of interest.

spawnv, spawnve
spawnv(mode,path,args)
spawnve(mode,path,args,env)

These functions run the program indicated by path in a new process P, with the arguments passed as sequence args. spawnve uses mapping env as P's environment (both keys and values must be strings), while spawnv uses os.environ for this purpose. On Unix-like platforms only, there are other variations of os.spawn, corresponding to variations of os.exec, but spawnv and spawnve are the only two that exist on Windows.

mode must be one of two attributes supplied by the os module: os.P_WAIT indicates that the calling process waits until the new process terminates, while os.P_NOWAIT indicates that the calling process continues executing simultaneously with the new process. When mode is os.P_WAIT, the function returns the termination code c of P: 0 indicates successful termination, c less than 0 indicates P was killed by a signal, and c greater than 0 indicates normal but unsuccessful termination. When mode is os.P_NOWAIT, the function returns P's process ID (on Windows, P's process handle). There is no cross-platform way to use P's ID or handle; platform-specific ways (not covered further in this book) include function os.waitpid on Unix-like platforms and the win32all extensions (starship.python.net/crew/mhammond) on Windows.

For example, your interactive program can give the user a chance to edit a text file that your program is about to read and use. You must have previously determined the full path to the user's favorite text editor, such as c:\\windows\\notepad.exe on Windows or /bin/vim on a Unix-like platform. Say that this path string is bound to variable editor and the path of the text file you want to let the user edit is bound to textfile:

import os
os.spawnv(os.P_WAIT, editor, [editor, textfile])

The first item of the argument args is passed to the program being spawned as "the name under which the program is being invoked." Most programs don't look at this, so you can place any string here. Just in case the editor program does look at this special first argument, passing the same string editor that is used as the second argument to os.spawnv is the simplest and most effective approach.
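The two modes can be compared in a small sketch such as the following, which assumes a Unix-like platform (it uses os.waitpid to collect the child started with os.P_NOWAIT). The child program is just sys.executable running a hypothetical one-line script that exits with code 5.

```python
import os
import sys

# Hypothetical child: a Python one-liner that exits with code 5.
child_args = [sys.executable, "-c", "import sys; sys.exit(5)"]

# P_WAIT: the call returns only when the child terminates,
# and its result is the child's termination code.
code = os.spawnv(os.P_WAIT, sys.executable, child_args)

# P_NOWAIT: the call returns at once with the child's process ID;
# on Unix-like platforms, os.waitpid later collects the child.
pid = os.spawnv(os.P_NOWAIT, sys.executable, child_args)
_, status = os.waitpid(pid, 0)
code_nowait = os.WEXITSTATUS(status)
```

Both code and code_nowait end up holding the child's exit code. The second form lets the calling process do other work between the spawn and the wait.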

system
system(cmd)

Runs the string command cmd in a new process and returns 0 if the new process terminates successfully (or if Python is unable to ascertain the success status of the new process's termination, as happens on Windows 95 and 98). If the new process terminates unsuccessfully (and Python is able to ascertain this unsuccessful termination), system returns an integer error code not equal to 0.
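A quick sketch of checking the result of os.system follows. The two command strings are hypothetical examples that run sys.executable with a one-line script, and the quoting shown assumes a Unix-like shell.

```python
import os
import sys

# A command that succeeds, and one that exits with a nonzero code.
ok = os.system('"%s" -c "pass"' % sys.executable)
failed = os.system('"%s" -c "import sys; sys.exit(1)"' % sys.executable)

# Test only for zero versus nonzero: the exact nonzero value
# is platform-dependent.
succeeded = (ok == 0)
did_fail = (failed != 0)
```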

14.7.2. The Subprocess Module

The subprocess module, available only since Python 2.4, supplies one rich class Popen, which supports many diverse ways for your program to run another program.

Popen
class Popen(args, bufsize=0, executable=None, stdin=None, stdout=None, stderr=None, preexec_fn=None, close_fds=False, shell=False, cwd=None, env=None, universal_newlines=False, startupinfo=None, creationflags=0)
Popen starts a subprocess to run a distinct program, and creates and returns an object p, which represents that subprocess. The args mandatory argument and the many optional (named) arguments control all details of how the subprocess is to be run.
If any exception occurs during the subprocess creation, before the distinct program starts, the call to Popen re-raises that exception in the calling process with the addition of an attribute named child_traceback, which is the Python traceback object for the subprocess. Such an exception would normally be an instance of OSError (or possibly TypeError or ValueError, to indicate that you've passed to Popen an argument that's invalid in type or value).

14.7.2.1. What to run, and how: args, executable, shell

args is a sequence (normally a list) of strings: the first item is the path to the program to execute, and the following items, if any, are arguments to pass to the program (args can also be just a string, when you don't need to pass arguments). executable, when not None, overrides args in determining which program to execute. When shell is true, executable specifies which shell to use to run the subprocess; when shell is true and executable is None, the shell used is /bin/sh on Unix-like systems (on Windows, it's os.environ['COMSPEC']).
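The difference between passing a sequence and using a shell might be sketched as follows. The sketch assumes a Unix-like platform whose default shell understands &&. The commands themselves are hypothetical examples, and stdout and universal_newlines are used only to capture the output as text.

```python
import subprocess
import sys

# args as a list: the program is run directly, with no shell involved.
direct = subprocess.Popen([sys.executable, "-c", "print('direct')"],
                          stdout=subprocess.PIPE, universal_newlines=True)
direct_out = direct.communicate()[0]

# shell=True: the string is handed to the shell, so shell syntax
# such as && is available.
via_shell = subprocess.Popen("echo one && echo two", shell=True,
                             stdout=subprocess.PIPE, universal_newlines=True)
shell_out = via_shell.communicate()[0]
```

Passing a list avoids any need to quote arguments for the shell, so it is usually the safer choice when the arguments come from outside your program.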

14.7.2.2. Subprocess files: stdin, stdout, stderr, bufsize, universal_newlines, close_fds

stdin, stdout, and stderr specify the subprocess's standard input, output, and error files, respectively. Each may be PIPE, which creates a new pipe to/from the subprocess; None, meaning that the subprocess is to use the same file as this ("parent") process; or a file object (or file descriptor) that's already suitably open (for reading, for the standard input; for writing, for the standard output and standard error). stderr may also be STDOUT, meaning that the subprocess's standard error is to occur on the same file as its standard output. bufsize controls the buffering of these files (unless they're already open), with the same semantics as the same argument to the open function covered in "Creating a File Object with open" on page 216 (the default, 0, means "unbuffered"). When universal_newlines is true, stdout and stderr (unless they're already open) are opened in "universal newlines" ('rU') mode, covered in "File mode" on page 217. When close_fds is true, all other files (apart from standard input, output, and error) are closed in the subprocess before the subprocess's program or shell is executed.
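As an illustration of PIPE and STDOUT together, the sketch below merges a child's standard error into its standard output and reads both through one pipe. The child source is a hypothetical example that writes one line to each stream, flushing standard output first so that the order of the merged lines is predictable.

```python
import subprocess
import sys

# Hypothetical child: one line on stdout (flushed), then one on stderr.
source = ("import sys; sys.stdout.write('out\\n'); sys.stdout.flush(); "
          "sys.stderr.write('err\\n')")

p = subprocess.Popen([sys.executable, "-c", source],
                     stdout=subprocess.PIPE,     # new pipe from the child
                     stderr=subprocess.STDOUT,   # stderr joins stdout
                     universal_newlines=True)
merged = p.communicate()[0]
```

Because stderr is not a separate pipe here, p.stderr is None and everything the child writes arrives in merged.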

14.7.2.3. Other arguments: preexec_fn, cwd, env, startupinfo, creationflags

When preexec_fn is not None, it must be a function, or other callable object, and gets called in the subprocess before the subprocess's program or shell is executed.

When cwd is not None, it must be a string that gives the path to an existing directory; the current directory gets changed to cwd in the subprocess before the subprocess's program or shell is executed.

When env is not None, it must be a mapping (normally a dictionary) with strings as both keys and values, and fully defines the environment for the new process.
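A short sketch of cwd and env in use follows. The variable name GREETING and the temporary directory are hypothetical. Since env fully defines the child's environment, the sketch starts from a copy of os.environ rather than from an empty dictionary, so the child keeps whatever other settings it may need.

```python
import os
import subprocess
import sys
import tempfile

workdir = tempfile.mkdtemp()       # hypothetical working directory

child_env = dict(os.environ)       # copy the parent's environment...
child_env["GREETING"] = "hello"    # ...and add one hypothetical variable

# The child reports its current directory and the variable's value.
source = "import os; print(os.getcwd()); print(os.environ['GREETING'])"
p = subprocess.Popen([sys.executable, "-c", source],
                     cwd=workdir, env=child_env,
                     stdout=subprocess.PIPE, universal_newlines=True)
lines = p.communicate()[0].splitlines()

# Compare resolved paths, since the temporary directory may be
# reached through a symbolic link.
same_dir = os.path.realpath(lines[0]) == os.path.realpath(workdir)
os.rmdir(workdir)
```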

startupinfo and creationflags are Windows-only arguments to pass to the CreateProcess Win32 API call used to create the subprocess, for Windows-specific purposes (they are not covered further in this book, which focuses on cross-platform uses of Python).

14.7.2.4. Attributes of subprocess.Popen instances

An instance p of class Popen supplies the following attributes:

pid: The process ID of the subprocess.
returncode: None to indicate that the subprocess has not yet exited; otherwise, an integer: 0 for successful termination, >0 for termination with an error code, or <0 if the subprocess was killed by a signal.
stderr, stdin, stdout: When the corresponding argument to Popen was subprocess.PIPE, each of these attributes is a file object wrapping the corresponding pipe; otherwise, each of these attributes is None.

14.7.2.5. Methods of subprocess.Popen instances

An instance p of class Popen supplies the following methods.

communicate
p.communicate(input=None)

Sends the string input to the subprocess's standard input (when input is not None), then reads the subprocess's standard output and error files into in-memory strings so and se until both files are finished. Finally, communicate waits for the subprocess to terminate and returns the pair (two-item tuple) (so, se).
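For instance, a filter-style helper can be sketched with communicate as follows. The function name pipethrough and the uppercasing child are hypothetical, and universal_newlines is set so that input and output are handled as text.

```python
import subprocess
import sys

def pipethrough(args, text):
    """Feed text to the program's standard input; return (returncode, output).

    Hypothetical helper for illustration.
    """
    p = subprocess.Popen(args,
                         stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                         universal_newlines=True)
    # communicate writes all the input, closes the child's standard
    # input, reads all the output, and waits for termination.
    out, err = p.communicate(text)   # err is None: stderr is not a pipe
    return p.returncode, out

# Hypothetical child: copies standard input to standard output in uppercase.
child = [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.stdin.read().upper())"]
code, out = pipethrough(child, "spam\neggs\n")
```

Because communicate holds all the data in memory, this approach suits modest amounts of input and output rather than very large streams.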

poll
p.poll( )

Checks if the subprocess has terminated, then returns p.returncode.

wait
p.wait( )

Waits for the subprocess to terminate, then returns p.returncode.
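The two methods might be combined as in the following sketch, where the child is a hypothetical one-line script that sleeps briefly and then exits with code 4.

```python
import subprocess
import sys

# Hypothetical child: sleeps for half a second, then exits with code 4.
p = subprocess.Popen([sys.executable, "-c",
                      "import time, sys; time.sleep(0.5); sys.exit(4)"])

# poll never blocks: it returns None while the child is still running.
early = p.poll()

# wait blocks until the child terminates, then returns its code.
code = p.wait()

# After termination, poll and p.returncode both give the same code.
late = p.poll()
```

A typical use of poll is inside a loop that does other work and checks periodically whether the subprocess has finished.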