Tidy Configuration Options
In tidy, configuration options represent the bulk of all the power found in the extension. Early on in the chapter, I ignored them for the sake of simplicity because approximately 80 options can be set that encompass a wide range of functionality. Now that the basics are out of the way, it's time to look at how configuration options work and what role they play when repairing and validating documents.
Every time a document is parsed, the behavior tidy exhibits is dictated by its configuration. Although a default configuration is provided every time tidy is used, this configuration can be overridden in a number of ways, which I will discuss shortly.
Note that although I will be discussing setting and retrieving configuration values in this section, only a handful of options are directly discussed throughout this chapter. Rather, a section has been devoted to useful applications of the tidy extension (which uses a number of the most common configuration settings). For a complete list of configuration options and their meaning, visit the tidy home page at http://tidy.sourceforge.net/ or the PHP manual at http://www.php.net/tidy.
Tidy Options at Runtime
tidy_parse_file($document [, $options [, $encoding [, $use_include_path]]]);
tidy_parse_string($data [, $options [, $encoding]]);
With the basic usage covered, it's time to revisit these functions, specifically the second parameter of each: the optional $options. This parameter provides the means to set configuration options at runtime from within PHP and can be either an associative array or a string. The behavior depends on the type of variable passed. If an array is passed, it is treated as a set of option/value pairs to apply to the document. If a string is passed, it is treated as the name of a tidy configuration file to load.
Because I discuss configuration files in the following section, let us take a look at the first option I presented: passing an array value for the $options parameter. As stated, this array should be an associative array of option/value pairs to set for the document to be processed by tidy. For example, Listing 15.4 applies the configuration option show-body-only to the parsed string. This option, when activated, tells tidy to produce only a document fragment (specifically, anything that would normally be within the <BODY> block) instead of a complete standalone document:
Listing 15.4. Passing Tidy Options at Runtime
<?php
$options = array("show-body-only" => true);
$tidy = tidy_parse_string("<B>Hello<I>World!</B></I>", $options);
echo $tidy;
?>
Reading Configuration Values
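Although all configuration options for a document must be set when the document is parsed, they can be read at any time after parsing. The value of a single option, or of all available tidy configuration options, is retrieved through two function calls. The first is the tidy_getopt() function:
tidy_getopt($tidy, $option);
$tidy is a valid tidy resource, and $option is the name of the configuration option to read. This function returns the current value of that option for the given tidy document resource.
The second is the tidy_get_config() function:
tidy_get_config($tidy);
Again, $tidy is a valid tidy resource. This function retrieves an associative array of all the configuration options and their respective values for the given tidy document resource, in the same format accepted by tidy_parse_file() and tidy_parse_string().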
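As a minimal sketch of reading configuration values back after parsing, assuming the tidy extension is loaded: the helper name read_tidy_settings() and the sample markup are illustrative only and are not part of the tidy API. The indent-spaces option is used here because it holds a plain integer.

```php
<?php
// Hypothetical helper: parse a string with one runtime option, then read
// the configuration back in both available ways.
function read_tidy_settings($markup)
{
    $tidy = tidy_parse_string($markup, array("indent-spaces" => 4));

    return array(
        // A single option, looked up by name.
        "single" => tidy_getopt($tidy, "indent-spaces"),
        // Every option as an option-name => value array.
        "all"    => tidy_get_config($tidy),
    );
}

// Only run the demonstration when the tidy extension is available.
if (extension_loaded("tidy")) {
    $settings = read_tidy_settings("<b>Hello <i>World!</i></b>");
    echo "indent-spaces: ", $settings["single"], "\n";
    echo count($settings["all"]), " options in total\n";
}
```

Because the array returned by tidy_get_config() uses option names as keys, it can be inspected or filtered like any other PHP array.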
Tidy Configuration Files
Depending on the application, with more than 80 possible configuration options, setting them all at runtime can quickly become inefficient and cumbersome. For this reason, tidy supports storing configuration options in configuration files that can be loaded at runtime. Tidy configuration files can also be loaded and applied universally to all documents by setting the tidy.default_config php.ini configuration directive.
A sample tidy configuration file is shown next:
indent-spaces: 4
wrap: 4096
indent: auto
tidy-mark: no
show-body-only: yes
force-output: yes
new-blocklevel-tags: mytag, anothertag
Through the use of configuration files, specific tidy profiles can be created to accomplish a particular task. For instance, you could create one configuration file specifically to "beautify" HTML for reading or editing and another to make the document as compact as possible to save bandwidth. Then, from within your PHP scripts, these configuration files can be loaded and applied to documents quickly and effectively by setting the $options parameter in tidy_parse_file() or tidy_parse_string() to the configuration file, as shown in Listing 15.5.
Listing 15.5. Using Tidy Configuration Files
<?php
$tidy = tidy_parse_file("myfile.html", "beautify.tcfg");
tidy_clean_repair($tidy);
echo $tidy;
?>
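The idea of task-specific profiles can be sketched as follows. This is only an illustration under stated assumptions: the profile names, the option choices in each profile, the helper write_tidy_profiles(), and the use of the system temporary directory are all hypothetical, and the tidy-dependent part runs only when the extension is loaded.

```php
<?php
// Hypothetical profiles; the option choices are illustrative only.
$profiles = array(
    "beautify" => "indent: auto\nindent-spaces: 4\nwrap: 80\ntidy-mark: no\n",
    "compact"  => "indent: no\nwrap: 0\ntidy-mark: no\n",
);

// Write each profile to its own configuration file and return the paths.
function write_tidy_profiles(array $profiles, $directory)
{
    $paths = array();
    foreach ($profiles as $name => $contents) {
        $paths[$name] = $directory . DIRECTORY_SEPARATOR . $name . ".tcfg";
        file_put_contents($paths[$name], $contents);
    }
    return $paths;
}

$paths = write_tidy_profiles($profiles, sys_get_temp_dir());

if (extension_loaded("tidy")) {
    $markup = "<ul><li>One<li>Two</ul>";
    foreach ($paths as $name => $path) {
        // Passing a string as $options tells tidy to load that file.
        $tidy = tidy_parse_string($markup, $path);
        tidy_clean_repair($tidy);
        echo "--- ", $name, " ---\n", $tidy, "\n";
    }
}
```

Keeping each profile in its own file means a script can switch behavior by changing a single filename rather than rebuilding an options array.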
Because the use of configuration files in the manner shown in Listing 15.5 is such a common task, tidy also provides two time-saver functions that roll the preceding functionality into a single function call depending on whether you are working with a file or a string as input: tidy_repair_file() and tidy_repair_string(). The syntax for each of these functions is as follows:
tidy_repair_file($filename [, $config_file [, $encoding [, $use_include_path]]]);
tidy_repair_string($data [, $config_file [, $encoding]]);
$filename is the file to repair when using tidy_repair_file(), and $data is the string to repair when using tidy_repair_string(). The optional $config_file parameter of each function represents the configuration file to apply to the input, and the optional $encoding parameter sets the character encoding, as it does for tidy_parse_file() and tidy_parse_string(). When executed, each of these functions attempts to parse and clean or repair the specified input based on the provided configuration and then returns the results. For the tidy_repair_file() function, the final optional parameter, $use_include_path, is a Boolean indicating whether PHP should search the include path for the input file if it is not found initially. An example of these functions in use is shown in Listing 15.6.
Listing 15.6. Using tidy_repair_file()
<?php
/* This code:

   $opts = array('show-body-only' => true);
   $tidy = tidy_parse_file('myfile.html', $opts);
   tidy_clean_repair($tidy);
   echo tidy_get_output($tidy);

   ... is identical to the below one-line statement assuming
   'myconfig.tcfg' has the show-body-only option set to "On".
*/
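A minimal sketch of the one-call form, assuming the tidy extension is loaded, might look like the following. It uses tidy_repair_string() with an options array so that it is self-contained; the helper names repair_long_form() and repair_short_form() and the sample markup are hypothetical. A call such as tidy_repair_file('myfile.html', 'myconfig.tcfg') has the same shape, with a filename and a configuration file in place of the string and the array.

```php
<?php
// Long form: parse, clean and repair, then fetch the output.
function repair_long_form($markup, $config)
{
    $tidy = tidy_parse_string($markup, $config);
    tidy_clean_repair($tidy);
    return tidy_get_output($tidy);
}

// Short form: a single call that returns the repaired markup.
function repair_short_form($markup, $config)
{
    return tidy_repair_string($markup, $config);
}

// Only run the demonstration when the tidy extension is available.
if (extension_loaded("tidy")) {
    $config = array("show-body-only" => true);
    $markup = "<b>Hello <i>World!</b></i>";

    $long  = repair_long_form($markup, $config);
    $short = repair_short_form($markup, $config);

    echo $short, "\n";
    echo ($long === $short) ? "Both forms agree\n" : "Outputs differ\n";
}
```

The short form is convenient when only the repaired markup is needed. The long form remains useful when the tidy resource itself is needed afterward, for example to read configuration values.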