Clearing hung processes.
Solaris supports the concept of sending software signals to a process. These signals are ways for other processes to interact with a running process outside the context of the hardware. The kill command is used to send a signal to a process. System administrators most often use the signals SIGHUP, SIGKILL, SIGSTOP, and SIGTERM. The SIGHUP signal is used by some utilities as a way to notify the process to do something, such as re-read its configuration file. The SIGHUP signal is also sent to a process if the remote connection is lost or hangs up. The SIGKILL signal is used to abort a process, and the SIGSTOP signal is used to pause a process. The SIGTERM signal is the default signal sent to processes by commands such as k ill and pkill when no signal is specified. Table 5.12 describes the most common signals an administrator is likely to use.
Don't worry about remembering all of the signals listed; just be familiar with the more common signals, such as SIGHUP, SIGKILL, SIGSTOP, and SIGTERM.
In addition, you can write a signal handler, or trap, in a program to respond to a signal being sent. For example, many system programs, such as the name server daemon, respond to the SIGHUP signal by re-reading their configuration files. This signal can then be used to update the process while running, without having to terminate and restart the process. Signal handlers cannot be installed for SIGSTOP (23) or SIGKILL (9). Because the process cannot install a signal handler for signal 9, an otherwise well-behaved process may leave temporary files around or not be able to finish out critical operations that it is in the middle of. Thus, kill -9 invites corruption of application data files and should only be used as a last resort.
Here's an example of how to trap a signal in a script:
trap '/bin/rm tmp$$;exit 1' 1 2 3 15
As the name suggests, trap traps system interrupt until some command can be executed. The previous example traps the signals 1, 2, 3, and 15, and executes the /bin/rm tmp$$ command before exiting the program. The example deletes all tmp files even if the program terminates abnormally.
The kill command sends a terminate signal (signal 15) to the process, and the process is terminated. Signal 15, which is the default when no options are used with the kill command, is a gentle kill that allows a process to perform cleanup work before terminating. Signal 9, on the other hand, is called a sure, unconditional kill because it cannot be caught or ignored by a process. If the process is still around after a kill -9, either it is hung up in the Unix kernel, waiting for an event such as disk I/O to complete, or you are not the owner of the process.
The kill command is routinely used to send signals to a process. You can kill any process you own, and the superuser can kill all processes in the system except those that have process IDs 0, 1, 2, 3, and 4. The kill command is poorly named because not every signal sent by it is used to kill a process. This command gets its name from its most common useterminating a process with the kill -15 signal.
Forking Problem A common problem occurs when a process continually starts up new copies of itselfthis is referred to as forking or spawning. Users have a limit on the number of new processes they can fork. This limit is set in the kernel with the MAXUP (maximum number of user processes) value. Sometimes, through user error, a process keeps forking new copies of itself until the user hits the MAXUP limit. As a user reaches this limit, the system appears to be waiting. If you kill some of the user's processes, the system resumes creating new processes on behalf of the user. It can be a no-win situation. The best way to handle these runaway processes is to send the STOP signal to suspend all processes and then send a KILL signal to terminate the processes. Because the processes were first suspended, they can't create new ones as you kill them off.
You can send a signal to a process you own with the kill command. Many signals are available, as listed in Table 5.12. To send a signal to a process, first use the ps command to find the process ID (PID) number. For example, type ps -ef to list all processes and find the PID of the process you want to terminate:
ps -ef UID PID PPID C STIME TTY TIME CMD root 0 0 0 Nov 27 ? 0:01 sched root 1 0 0 Nov 27 ? 0:01 /etc/init - root 2 0 0 Nov 27 ? 0:00 pageout root 3 0 0 Nov 27 ? 12:52 fsflush root 101 1 0 Nov 27 ? 0:00 /usr/sbin/in.routed -q root 298 1 0 Nov 27 ? 0:00 /usr/lib/saf/sac -t 300 root 111 1 0 Nov 27 ? 0:02 /usr/sbin/rpcbind root 164 1 0 Nov 27 ? 0:01 /usr/sbin/syslogd -n -z 12 root 160 1 0 Nov 27 ? 0:01 /usr/lib/autofs/automountd . . . root 5497 433 1 09:58:02 pts/4 0:00 script psef
Another way to kill a process is to use the pkill command. pkill functions identically to pgrep, which was described earlier, except that instead of displaying information about each process, the process is terminated. A signal name or number may be specified as the first command-line option to pkill. The value for the signal can be any value described in Table 5.12. For example, to kill the process named psef with a SIGKILL signal, issue the following command:
pkill -9 psef
Killing a Process If no signal is specified, SIGTERM (15) is sent by default. This is the preferred signal to send when trying to kill a process. Only when a SIGTERM fails should you send a SIGKILL signal to a process. As stated earlier in this section, a process cannot install a signal handler for signal 9 and an otherwise well-behaved process might not shut down properly.
In addition, the Desktop Process Manager, which was described earlier, can be used to kill processes. In the Process Manager window, highlight the process that you want to terminate, click Process from the toolbar at the top of the window, and then select Kill from the pull-down menu, as shown in Figure 5.9.
Figure 5.9. Killing processes.
kill -9 <PID>
<PID> is the process ID of the selected process.
The preap command forces the killing of a defunct process, known as a zombie. In previous Solaris releases, zombie processes that could not be killed off remained until the next system reboot. Defunct processes do not normally impact system operation; however, they do consume a small amount of system memory. See the preap manual page for further details of this command.