
Working with Files in PHP

As with almost every aspect of PHP, there is extensive native support for working with files. I'll start this chapter with a discussion of the basics, including reading and writing text files, followed by an introduction to using PHP to read and write binary files. If you have experience with C development, you'll find many of your favorite functions in PHP and should quickly be well on your way. Let's get right into it and introduce the primary file-access function: fopen().

In PHP, almost every file system function (except a few specialty functions) operates on a file that has first been opened with fopen(), which makes the file available for reading and writing. The formal syntax of the fopen() function is as follows:

fopen(string $filename, string $mode [, boolean $use_include_path ])

$filename represents the filename to open, $mode represents the "mode" under which the fopen() function is opening the file (see Table 20.1), and $use_include_path is a Boolean value that indicates whether the file should be looked for in the PHP include path. The fopen() function then returns a "reference" to the opened file, which is used when working with other file system functions, or returns false if the fopen() call fails.

Table 20.1. Acceptable fopen() Modes

Mode   Description
r      Open the file for reading.
r+     Open the file for reading and writing.
w      Open the file for writing, overwriting existing files, and creating the file if it does not exist.
w+     Open the file for reading and writing, overwriting existing files, and creating the file if it does not exist.
a      Open the file for writing, creating the file if it does not exist, and appending to the file if it does.
a+     Open the file for reading and writing, creating the file if it does not exist, and appending to the file if it does.
b      Open the file in binary reading/writing mode (applicable only on Windows systems; however, recommended in all scripts).
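
As a minimal sketch of how the modes in Table 20.1 differ, the following script writes to, appends to, and then reads back a scratch file. The temporary file created by tempnam() and the variable names are assumptions chosen purely for illustration.

```php
<?php
// Scratch file in the system temp directory (illustrative location)
$path = tempnam(sys_get_temp_dir(), 'modes');

// 'w' truncates the file (or creates it) before writing
$fr = fopen($path, 'w');
fputs($fr, "first\n");
fclose($fr);

// 'a' keeps the existing contents and writes at the end
$fr = fopen($path, 'a');
fputs($fr, "second\n");
fclose($fr);

// 'r' opens the file for reading only
$fr = fopen($path, 'r');
$line_one = fgets($fr);
$line_two = fgets($fr);
fclose($fr);

// Opening with 'w' again discards everything written so far
$fr = fopen($path, 'w');
fclose($fr);
clearstatcache();
$size_after_truncate = filesize($path);

unlink($path);
```

After the script runs, $line_one and $line_two hold the two lines in the order they were written, and the final 'w' open leaves the file empty.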


One of the more distinctive capabilities of the PHP file system functions is the ability to open files remotely by specifying a URL for the $filename parameter of fopen(). Although files opened over HTTP cannot be written to, PHP can read data from files that reside on both Web and FTP servers when given the full URL to the file. Beyond opening files remotely using either the HTTP or FTP protocol, PHP also allows you to access standard input/output streams using the following wrappers:

Wrapper        Description
php://stdin    Read from standard input (keyboard)
php://stdout   Write to standard output
php://stderr   Write to standard error


NOTE

When using the CLI (command line interface) version of PHP, rather than directly opening one of the preceding streams, the constants STDIN, STDOUT, and STDERR are always available as active file references to these streams.


When you're working with URL wrappers, it is important to realize that URLs containing invalid characters (such as whitespace in the filename) must be encoded before being used. Encode only the offending component, not the entire URL: passing a complete URL to urlencode() would also escape the colon and slashes that fopen() needs in order to recognize the wrapper. The rawurlencode() function takes a single parameter (the string to encode) and returns the encoded string, with spaces encoded as %20. Listing 20.1 provides a number of examples of the fopen() function:

Listing 20.1. Using the fopen() Function
<?php
     /* Open the file for reading */
     $fr = fopen("myfile.txt", 'r');
     /* Open the file for binary read/append writing
        (the b flag follows the base mode) */
     $fr = fopen("myfile.dat", 'a+b');
     /* Open the file for read/write (searching the include path) */
     $fr = fopen("code.php", 'w+', true);

     /* Open the file index.php on the php.net server for reading via HTTP*/
     $fr = fopen("http://www.php.net/index.php", 'r');
     /* Open the file index.php on the php.net server for reading via FTP */
     $fr = fopen("ftp://ftp.php.net/index.php", 'r');
     /* Encode the invalid filename, then open the URL for reading via HTTP */
     $url = "http://www.php.net/" . rawurlencode("this is my invalid URL.php");
     $fr = fopen($url, 'r');
?>
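
To see why it matters which part of a URL is encoded, the following sketch compares rawurlencode() applied to the filename alone with urlencode() applied to the complete URL. The variable names are illustrative; no network access takes place.

```php
<?php
$file = "this is my invalid URL.php";

// Encoding only the filename leaves the scheme and host intact
$good = "http://www.php.net/" . rawurlencode($file);

// Encoding the whole URL also escapes ":" and "/"
$bad = urlencode("http://www.php.net/" . $file);

echo $good, "\n";
echo $bad, "\n";
```

The first string is still recognizable as an HTTP URL, whereas the second no longer begins with http:// and so would be treated as a local filename.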

When you're working with file system functions, it is very important that the value returned from a successful fopen() function is saved in a variable for later use. Because multiple files can be opened at once, without this value there is no way to determine what file is being manipulated.

After a file reference has been created, it exists until one of two events occurs: the script ends or the file is closed using the fclose() function. It's always good practice to close a file reference after any work being done with the file is complete. To close the reference, call the fclose() function and provide it the variable that contains the file reference:

<?php
     $fr = fopen("php://stdout", 'w');
     /* Code to work with standard output */
     fclose($fr);
?>

Reading and Writing Text Files

To demonstrate how PHP file system functions work, the first example we will look at is a simple counter for a Web page. To understand how this script works, you'll need to understand how to write and read data from a text file. For this, you'll need two functions: fgets(), which retrieves a string from a file, and fputs(), which writes a string to a file.

Let's start with the fputs() function. This function writes a string (or any other data) to a given file reference and has the following syntax:

fputs($file_ref, $data_str [, int $length])

$file_ref refers to the file-reference value that was returned from the appropriate fopen() call, $data_str contains the data that will be written to the file referenced by $file_ref, and the optional parameter $length limits how many bytes of $data_str will actually be written.

NOTE

Although we will be dealing with text-file examples when using fputs(), be aware that when you're dealing with binary data, it is good practice to specify the $length parameter. In current PHP versions fputs() is binary safe and writes embedded null characters like any other byte, but in older versions with the magic_quotes_runtime option enabled, omitting $length caused slashes to be stripped from the data, which could corrupt binary output.


To demonstrate the fputs() function in an actual PHP script, we'll couple it with the special URL wrapper for standard output (php://stdout) and create our own custom echo function, as shown in Listing 20.2:

Listing 20.2. Using the fputs() Function
<?php
     function custom_echo($string) {
          $output = "Custom Message: $string";
          fputs(STDOUT, $output);
     }

     custom_echo("This is my custom echo function!");
?>
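
The effect of the optional $length parameter can be sketched as follows. The scratch file and variable names are assumptions for illustration, and file_get_contents() is used only to inspect the result.

```php
<?php
$path = tempnam(sys_get_temp_dir(), 'fputs');
$fr = fopen($path, 'wb');

// Write only the first 5 bytes of the string
$written_short = fputs($fr, "Hello, world", 5);

// Three raw bytes, including a null byte, written without a length
$written_bin = fputs($fr, "\0\1\2");

fclose($fr);
$contents = file_get_contents($path);
unlink($path);
```

fputs() returns the number of bytes written, so $written_short is 5 and $written_bin is 3, and the file ends up containing eight bytes in total.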

Now that you've been exposed to writing to files, let's look at the opposite operation and introduce the function used to read data from a text file. Unlike writing, where a single function handles both text and binary data, reading files in PHP is split between two functions, fgets() and fread(), depending on whether you are reading text or binary data. For our current discussion, we'll focus on the non-binary reading function fgets() and save fread() for later in the chapter. The formal declaration of the fgets() function is as follows:

fgets($file_ref [, $length]);

$file_ref refers to the file reference that will be read from and $length refers to the number of bytes to read from the file. When executed, this function returns the desired string from the text file. It is important to note that because fgets() is designed for text files, the fgets() function will read from the file until one of the following conditions is met:

  • ($length - 1) bytes have been read from the file.

  • A newline character is encountered.

  • The end of the file is reached.

NOTE

If the $length parameter is omitted, fgets() reads until the end of the current line.
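
The three stopping conditions can be observed with a short sketch. The scratch file and its contents are made up for illustration, and file_put_contents() is used only to create the sample data.

```php
<?php
$path = tempnam(sys_get_temp_dir(), 'fgets');
file_put_contents($path, "abcdefgh\nxy\n");

$fr = fopen($path, 'r');
$a = fgets($fr, 5);   // stops after $length - 1 = 4 bytes
$b = fgets($fr);      // stops at the newline, which is kept
$c = fgets($fr);      // reads the second line
$d = fgets($fr);      // nothing left to read
fclose($fr);
unlink($path);

var_dump($a, $b, $c, $d);
```

Note that the newline character is part of the returned string, and that fgets() returns false once the end of the file has been reached.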


Although fgets() (at least as it relates to text-based files) will always retrieve the desired data, an alternative is the fscanf() function. The fscanf() function is designed to read structured data from a text file and automatically store each individual piece of information into a variable. For instance, consider the following example of a text-based data file containing the names and birthdays of people (see Listing 20.3):

Listing 20.3. Sample Text Data File
04-25-81     John Coggeshall
01-23-81     Max Harmen
03-12-73     Amy Pellgram
06-54-72     Cliff Pellgram

In a case such as Listing 20.3, not only would each line of the file have to be read using fgets(), but a great deal of parsing would also have to be done to extract a piece of data (for instance, the year of birth). It is situations like this that the fscanf() function was designed for. With fscanf() you can read each line of the file according to a predetermined template and store each individual piece of information into a separate PHP variable automatically. The specific syntax for the fscanf() function is as follows:

fscanf($file_ref, $format [, $var_one [, $var_two [...]]])

$file_ref is the file reference to read from; $format represents the string defining the template to use when reading; and $var_one, $var_two represent the variables in which to store the parsed values (these optional parameters must be passed by reference). Upon success, fscanf() returns the number of items parsed according to the template or returns false upon failure.

NOTE

If no variables are provided to store the parsed values from fscanf(), the function will return an array with each parsed value in it instead of returning the number of items parsed. (fscanf() will still return false on failure.)
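
As a brief sketch of the array-returning form, the following reads one line in the format of Listing 20.3 from a scratch file. The file, the use of %d, and the variable names are assumptions for illustration.

```php
<?php
$path = tempnam(sys_get_temp_dir(), 'scan');
file_put_contents($path, "04-25-81     John Coggeshall\n");

$fr = fopen($path, 'r');
// No variables are supplied, so the parsed values come back as an array
$fields = fscanf($fr, "%d-%d-%d %s %s");
fclose($fr);
unlink($path);

list($month, $day, $year, $first, $last) = $fields;
echo "$first $last was born in month $month\n";
```

The numeric fields are returned as integers, so the leading zero in "04" is not preserved.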


Using the fscanf() function is similar to using the printf() function discussed in Chapter 1, "Basic PHP Development." Rather than outputting formatted data, the fscanf() function accepts a template that defines the format of input data. Table 20.2 lists the acceptable identifiers that can be used in the $format string:

Table 20.2. Acceptable Format Values for fscanf()

Identifier   Meaning
%b           Binary number
%c           Single character
%d           Signed decimal number
%u           Unsigned decimal number
%f           Float
%o           Octal
%s           String
%x           Hexadecimal number


Returning to our earlier text-file birthday example (see Listing 20.3), writing a script to extract specific pieces of information from the text file becomes fairly simple, as illustrated in Listing 20.4:

Listing 20.4. Reading Formatted Text Using fscanf()
<?php
     $fr = @fopen('birthdays.txt', 'r');
     if(!$fr) {
          echo "Error! Couldn't open the file.<BR>";
          exit;
     }

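     while(!feof($fr)) {
          /* The variables are filled in by reference; no & is needed
             (or allowed) at the call site */
          $parsed = fscanf($fr, "%u-%u-%u %s %s", $month, $day,
                                        $year, $first, $last);
          /* Skip blank or malformed lines, such as a trailing newline */
          if($parsed !== 5) continue;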
          echo "First Name: $first<BR>";
          echo "Last Name: $last<BR>";
          echo "Birthday: $month/$day/$year<BR>";
     }
     fclose($fr);
?>

This script, when executed, parses each line in our birthday text file example shown in Listing 20.3 and displays the data in a more human-friendly format. Note that this example uses another function you have not been formally introduced to: the feof() function. The syntax for this function is as follows:

feof($file_ref)

This function is used to determine if, during the course of reading from a file, there is any more data to be read. When executed, it returns a Boolean value of true if there is no more data to be read from the given file reference $file_ref. Hence, in Listing 20.4 this function is used to read every line in the file, allowing proper execution without knowing beforehand the number of lines or size of the file. When the entire file has been read and there is no more work to be done with the file, it is closed via an fclose() function.
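
A minimal sketch of a feof()-driven loop is shown here, reading a two-line scratch file with fgets(). The file and variable names are illustrative; the extra check on the return value of fgets() is a defensive assumption rather than something the listings above require.

```php
<?php
$path = tempnam(sys_get_temp_dir(), 'feof');
file_put_contents($path, "one\ntwo\n");

$fr = fopen($path, 'r');
$lines = array();
while (!feof($fr)) {
    $line = fgets($fr);
    // feof() may become true only after a read hits the end of the
    // file, so the final fgets() call can return false
    if ($line === false) {
        break;
    }
    $lines[] = rtrim($line, "\n");
}
$at_end = feof($fr);
fclose($fr);
unlink($path);

print_r($lines);
```

Checking the value returned by the read function inside the loop keeps a trailing newline from producing an empty final record.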

As you can see, accessing files from within PHP is a fairly simple task. Now that all the required individual functions have been explained, it's time to create the counter script discussed earlier in the chapter. This script will consist of a single function, retrieveCount(), which adds one to the hit count stored in a text file every time the function is executed and then returns the updated count back to be displayed to the client or returns false upon failure. It accepts one parameter, a file that is assumed to store a hit count. In Listing 20.5, let's take a look at the retrieveCount() function and its use:

Listing 20.5. A Simple Text-File Hit Counter
<?php
    function retrieveCount($hitfile) {

        /* Try to open an existing hit-count file,
        and either get the hitcount or assume the
        script has to open a new file and set $count
        to zero. */
        $fr = @fopen($hitfile, 'r');
        if(!$fr) {
            $count = 0;
        } else {
            $count = (int) fgets($fr, 4096);
            fclose($fr);
        }

        /* Now that $count has been determined, re-open
        the file and write the new count to it */
        $fr = @fopen($hitfile, 'w');
        if(!$fr) return false;

        $count++;
        if(@fputs($fr, (string) $count) === false) {
            fclose($fr);
            return false;
        }
        fclose($fr);

        return $count;
    }

    $count = retrieveCount('hitcount.dat');
    if($count !== false) {
        echo "This page has been visited $count times<BR>";
    } else {
        echo "An Error has occurred.<BR>";
    }
?>

Examining the retrieveCount() function, you can see that the code required to implement such a simple text file read/write application is a bit more complex than first expected! Because there are no assurances that the file provided to the retrieveCount() function actually exists, every time a file system function is used, its return value must be checked to ensure that the call actually succeeded. Starting from the top of the script, the first attempt is made to open the file in reading (r) mode. If this fopen() call succeeds, a subsequent call to fgets() is then made, which reads the current count from the text file, and the reference is closed. If the fopen() call fails, however, it must be assumed that the hit-count file does not exist, and therefore a $count of zero is assumed.

After the value of $count has been determined, the file is again reopened (this time in write mode) and the new value of $count is then written to the file (overwriting the previous value) using the fputs() function. The success of the fputs() statement is then confirmed, the file reference is closed, and the function returns successfully with the $count value.

Listing 20.5 not only demonstrates how to use PHP's file-system functions, but also implements a number of the operators discussed in Chapter 1. For instance, note the use of the error-suppression operator @ used on each file-access function. Because every file access is double-checked to ensure that it indeed did succeed, this operator is used to suppress any PHP-generated error messages. Furthermore, outside of the retrieveCount() function, note the use of the type-specific comparison done on the return value. Because in PHP a value of zero will evaluate to false in a standard comparison, the type-specific operator !== is used to assure that the error message is printed only when the function actually returns false and not zero.
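
The difference between the two comparisons can be sketched in a few lines. Here $count is set to zero by hand to stand in for a legitimate result that happens to be falsy.

```php
<?php
$count = 0;                      // a legitimate result, not an error

$loose  = ($count != false);     // 0 and false compare as equal
$strict = ($count !== false);    // an integer is never identical to false

var_dump($loose, $strict);
```

Only the type-specific comparison tells a zero result apart from a failure.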

Reading and Writing Binary Files

With the basics of reading and writing text files complete, let's now turn our attention to working with binary files. Unlike text files, binary files can be much harder both to work with and debug because they are by their very nature unreadable by anything but a computer. In PHP, writing binary files is done in the same manner as writing text files (via the fputs() function) and therefore requires no explanation. In fact, the only difference (which has already been mentioned) is the use of the b mode when the file is opened via fopen(). Hence, this section will focus primarily on those functions relating to reading binary data from a file and converting it into a form usable by PHP. Specifically, we will be constructing a function that will read the header from a Zip-compressed file and determine the minimum version number required to decompress the data. To accomplish this, we'll be examining the fseek(), fread(), and unpack() functions.

When you're working with binary files, it is often necessary to jump around different locations (or offsets) within the file. This is in contrast to working with text files, which are generally both read and written in a linear fashion. In PHP, adjusting the file pointer to a particular offset within an open file is done through the fseek() function using the following syntax:

fseek($file_ref, $offset [, $reference]);

NOTE

The fseek() function can be used only for files that exist on the local file system. This function will not work with files that are opened remotely via HTTP or FTP.


$file_ref is the file reference, and $offset is the number of bytes to move the file pointer relative to a reference point. The final parameter, $reference, selects that reference point and accepts one of the following three PHP constant values (see Table 20.3):

Table 20.3. Constant Reference Points for fseek()

Constant   Reference point
SEEK_SET   (Default) the beginning of the file
SEEK_CUR   The current location of the file pointer
SEEK_END   One byte past the end of the file


When using the fseek() function, it is perfectly acceptable to adjust the file pointer beyond the end of the given file. Although attempting to read when the file pointer is beyond this point will not return any data, writing to these locations will increase the size of the file to accommodate the new data. The fseek() function returns zero upon success, or returns -1 if the file pointer could not be adjusted. Also note that fseek() offsets are indexed starting at zero; hence, to reach the Nth byte of a file, pass N - 1 as $offset, as shown in Listing 20.6:

Listing 20.6. Using the fseek() Function
<?php
     $fr = fopen('mybinfile.dat', 'rb');
     if(!$fr) exit;
     /* Adjust the pointer to the 9th byte in the file (offset 8) */
     fseek($fr, 8);
     /* Adjust the pointer to 10 bytes from the end */
     fseek($fr, -10, SEEK_END);
     /* Move the pointer 2 bytes from its current location */
     fseek($fr, 2, SEEK_CUR);
     fclose($fr);
?>

After the file pointer has been adjusted (if needed) to the proper location, reading the appropriate binary data is done using the fread() function, which has the following syntax:

fread($file_ref, $length)

$file_ref refers to the appropriate file reference and $length determines the number of bytes to read from the file. Upon completion, any bytes read from the given file reference are returned.
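
The following sketch combines fseek() and fread() on a 16-byte scratch file so the effect of each reference point is visible. The file contents are made up for illustration, and ftell() is used only to report the current offset.

```php
<?php
$path = tempnam(sys_get_temp_dir(), 'seek');
file_put_contents($path, "0123456789ABCDEF");   // 16 bytes

$fr = fopen($path, 'rb');

fseek($fr, 8);                 // offset 8 is the 9th byte
$ninth = fread($fr, 1);

fseek($fr, -4, SEEK_END);      // 4 bytes before the end
$tail = fread($fr, 4);

fseek($fr, 2);                 // back to offset 2
fseek($fr, 3, SEEK_CUR);       // 3 bytes forward from there
$pos = ftell($fr);

fclose($fr);
unlink($path);

var_dump($ninth, $tail, $pos);
```

Because each byte of the sample data names its own offset, it is easy to confirm that offset 8 yields the character "8" and that the final position is offset 5.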

NOTE

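When you're working with binary files on older PHP installations, PHP "magic quotes" must be disabled before any fread() call is done! Failure to do so will cause affected values such as null characters to be converted to their escaped notations '\0'. (The magic quotes feature was removed in PHP 5.4, and the functions shown here were removed in later releases, so this step applies only to older versions.) To ensure that magic quotes are disabled, either turn them off in the php.ini file or use set_magic_quotes_runtime() to turn them off as shown: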

<?php set_magic_quotes_runtime(false); ?>

If you would like to return magic quotes to its original value, use get_magic_quotes_runtime() to retrieve the value of the configuration option prior to adjusting the value:

<?php
     $mquote_cfg = get_magic_quotes_runtime();
     /* Read from the file */
     set_magic_quotes_runtime($mquote_cfg);
?>


As mentioned earlier in the section, binary data often requires an intermediate step for any values stored within the read data to be converted into a format usable by PHP. This process (called unpacking) is accomplished via the unpack() function with the following syntax:

unpack($format, $data)

The $format string is a description that contains both the necessary format codes and the variable names to assign to each value, and $data represents the data on which the unpack operation is performed. When constructing a description string to use for the $format parameter, the same codes that are used to pack binary data (see the pack() function in the PHP manual) are used, with the following form:

<format codes><variable name>/<format codes><variable name>

Upon success, unpack() returns an associative array containing key values for each <variable name> unpacked from $data. Thus, extracting two integers (int1 and int2, respectively) is done in the following fashion (see Listing 20.7):

Listing 20.7. Unpacking Values from Binary Data Using unpack()
<?php
     /* Assume $data contains binary data for two packed integers */
     $bdata = unpack("nint1/nint2", $data);
     echo "The first integer in the packed data: {$bdata['int1']}<BR>";
     echo "The second integer in the packed data: {$bdata['int2']}<BR>";
?>
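
Listing 20.7 assumes that $data already exists. As a self-contained sketch, the data can be produced with pack() first; the values 1024 and 7 are arbitrary choices for illustration.

```php
<?php
// Pack two 16-bit big-endian integers so there is something to unpack
$data = pack("nn", 1024, 7);

$bdata = unpack("nint1/nint2", $data);
echo $bdata['int1'], " ", $bdata['int2'], "\n";
```

The packed string is four bytes long, and unpacking it with the matching format codes recovers the original two values under the keys int1 and int2.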

Pulling these functions together, we are now able to create a script that retrieves the version of a given Zip file. But before we can read the appropriate data from the Zip file, we need to know where to look for it within the archive. A quick visit to your favorite search engine will yield documentation on this widely used format, but I've saved you the time of doing so. For our purposes, we are concerned with the fifth and sixth bytes (the bytes that represent the Zip file version). With this information in hand, we have all that is needed to create a function we'll call getZipVer() to retrieve the version from an arbitrary Zip file (see Listing 20.8):

Listing 20.8. Getting the Version of a Zip File from PHP
<?php
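    function getZipVer($zipfile) {

        /* Magic quotes exist only in older PHP releases, so save and
           disable the runtime setting only when it is available */
        $has_mq = function_exists('set_magic_quotes_runtime');
        if($has_mq) {
            $quote_val = get_magic_quotes_runtime();
            set_magic_quotes_runtime(false);
        }

        $verdata = false;
        $fr = @fopen($zipfile, 'rb');
        if($fr) {
            /* Skip the 4-byte signature, then read the 2-byte version */
            $ver = (fseek($fr, 4) == 0) ? fread($fr, 2) : false;
            fclose($fr);
            if($ver !== false && strlen($ver) == 2) {
                $values = unpack("vversion", $ver);
                /* The cast keeps the major version a whole number */
                $verdata = array(
                    'major' => (int) ($values['version'] / 10),
                    'minor' => $values['version'] % 10);
            }
        }
        if($has_mq) set_magic_quotes_runtime($quote_val);
        return $verdata;

    }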

    $version = getZipVer('test.zip');
    if(!$version) {
        echo "Error reading version information!";
    } else {
         echo "Version info: {$version['major']} (major)" .
              ", {$version['minor']} (minor)";
    }
?>

Looking at the getZipVer() function, the first thing done by the function is to ensure that magic quotes in PHP are disabled (although the current value of the configuration option is saved). The file is then opened in binary-read mode and the file pointer is advanced to offset 4 using the fseek() function. Two bytes are then read (the version information) and the file is closed.

With the necessary data read from the file, the data must now be unpacked for PHP to make any sense of it. According to the Zip file specification, these two bytes represent a 16-bit unsigned short. This number divided by 10 represents the major version number of the Zip file, whereas the same number modulus 10 represents the minor version. The number is then unpacked from the binary data using the format code v for the unpack() function, which stores the resulting integer in the version key of an array returned and stored in the $values array. This value is then used to create the $verdata array, which contains the two separate values for the major and minor Zip file version. To wrap up the function, the original magic quote configuration option is restored, and the function returns our completed version array. These values are then displayed to the client, and the script exits.
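
The arithmetic can be checked without a real archive. The following sketch writes only the first six bytes of a Zip local file header (the "PK" signature followed by a version value of 20) to a scratch file and reads the version back. The scratch file and variable names are assumptions for illustration, and intdiv() requires PHP 7 or later.

```php
<?php
// Signature bytes followed by a little-endian 16-bit version (20 means 2.0)
$path = tempnam(sys_get_temp_dir(), 'zip');
file_put_contents($path, "PK\x03\x04" . pack("v", 20));

$fr = fopen($path, 'rb');
fseek($fr, 4);                                // skip the signature
$values = unpack("vversion", fread($fr, 2));  // fifth and sixth bytes
fclose($fr);
unlink($path);

$major = intdiv($values['version'], 10);
$minor = $values['version'] % 10;
echo "Version needed to extract: $major.$minor\n";
```

A stored value of 20 therefore corresponds to version 2.0.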

Working with Directories in PHP

Along with the extensive support for file access, PHP also provides a complete set of directory manipulation functions. PHP natively supports functionality to make, remove, and display the contents of directories. This section is devoted to using these methods and shows how they can be used to both gather information and manipulate the directory tree from within PHP.

Working with directories in PHP is much like working with files: the directory must first be opened before any action is taken, and it is closed afterward. To do this, PHP provides counterparts to the fopen() and fclose() functions for files: the opendir() and closedir() functions. The syntax for the opendir() function is as follows:

opendir($dir_path)

$dir_path represents the pathname to open a handle to. The $dir_path parameter does not necessarily have to be completely qualified (meaning it can be relative to the current directory); however, opendir() will display an error message and return false if the provided directory does not exist. Upon success, the opendir() function returns a directory reference (the directory version of the previously discussed file reference), which is then used with other directory functions.

NOTE

Although PHP will open directories that exist on mapped drives in Windows environments, the process under which PHP runs (usually IIS or the Apache service) must be given permission to access the shared resource. Consult your system's documentation or contact your network administrator for further information on Windows file permissions.


After a directory reference has been created, it is always good practice to close it when any necessary operations on the directory list have been completed. To do this, the closedir() function is used. The closedir() function takes a single parameter (the directory reference value returned from opendir()).

After a particular directory has been opened using the opendir() function, each entry in the directory can be read via a call to the readdir() function. The syntax of the readdir() function is as follows:

readdir($dir_reference)

$dir_reference refers to the value returned from a successful call to the opendir() function. Upon success, this function returns a string representing one of the files in the directory referenced by $dir_reference. Each subsequent call to readdir() will return the next file in the directory (as listed by the file system) until there are no more files to list. If no more files are in the directory, or another error has otherwise occurred, readdir() will return a Boolean false. In Listing 20.9, we use the PHP directory functions to read the files in the /tmp/ directory and store them in an array.

Listing 20.9. Reading a Directory Listing Using opendir()
<?php

     $dr = @opendir('/tmp/');
     if(!$dr) {
          echo "Error opening the /tmp/ directory!<BR>";
          exit;
     }

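     /* Collect each entry, stopping before the final false is stored */
     $files = array();
     while(($entry = readdir($dr)) !== false) {
          $files[] = $entry;
     }
     closedir($dr);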

     print_r($files);
?>
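
On most systems the listing returned by readdir() also includes the special entries "." and "..". The following sketch filters them out, using a scratch directory whose name and contents are invented for illustration.

```php
<?php
// Create a scratch directory with two files (names are illustrative)
$dir = sys_get_temp_dir() . '/readdir_demo_' . uniqid();
mkdir($dir);
touch("$dir/a.txt");
touch("$dir/b.txt");

$dr = opendir($dir);
$files = array();
while (($entry = readdir($dr)) !== false) {
    // Skip the entries for the current and parent directory
    if ($entry == '.' || $entry == '..') {
        continue;
    }
    $files[] = $entry;
}
closedir($dr);
sort($files);   // readdir() returns entries in file-system order

unlink("$dir/a.txt");
unlink("$dir/b.txt");
rmdir($dir);

print_r($files);
```

The call to sort() is there only to make the result predictable, because the order in which the file system lists entries is not guaranteed.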

Because the readdir() function returns a new file every time it is executed, each individual file in a given directory can be viewed only once. For situations when it is desirable to revisit a directory, PHP provides a function that allows you to "rewind" the directory listing to its initial state before the first time readdir() is called: rewinddir(). The syntax is as follows:

rewinddir($dir_reference)

$dir_reference refers to a valid directory reference returned from opendir().
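
As a small sketch of rewinding, the following walks the same directory reference twice and counts the entries on each pass. The scratch directory is an assumption made so the example does not depend on the contents of a shared directory.

```php
<?php
$dir = sys_get_temp_dir() . '/rewind_demo_' . uniqid();
mkdir($dir);
touch("$dir/only.txt");

$dr = opendir($dir);

$first_pass = 0;
while (readdir($dr) !== false) {
    $first_pass++;
}
$exhausted = readdir($dr);   // false: nothing left to read

rewinddir($dr);              // back to the first entry

$second_pass = 0;
while (readdir($dr) !== false) {
    $second_pass++;
}
closedir($dr);

unlink("$dir/only.txt");
rmdir($dir);

echo "$first_pass entries, then $second_pass entries\n";
```

Both passes see the same number of entries; without the rewinddir() call, the second loop would end immediately.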

Although opendir() and its related family of functions have their advantages, an alternative method is especially useful for retrieving a list of files that meet a certain criterion (that is, a filemask). That function is glob(), which has the following syntax:

glob($filemask [, $flags])

$filemask is a string representing the filemask to search for (for example, *.txt) and $flags represents one or more of the constants found in Table 20.4. Upon success, glob() returns a sorted array of filenames that matched the given filemask.

Table 20.4. glob() Constants

Constant       Description
GLOB_MARK      Append a slash to filenames that are really directories.
GLOB_NOSORT    Do not sort the returned filenames.
GLOB_NOCHECK   If no files were found that match the filemask, return the filemask instead of an empty array.
GLOB_ONLYDIR   Match only directories that meet the filemask.


NOTE

Table 20.4 is an incomplete list of the possible constants for glob(); many of the available constants do not have applications in PHP scripts and therefore have been omitted.


In Listing 20.10, the glob() function is used to create two separate arrays: one with all the files in the /tmp/ directory and a second containing a list of all the directories that exist in /tmp/:

Listing 20.10. Using the glob() Function
<?php
     $directories = glob("/tmp/*", GLOB_ONLYDIR);
     $complete = glob("/tmp/*");
     $files = array_diff($complete, $directories);

     echo "Directories in /tmp/<BR>";
     foreach($directories as $val) {
          echo "$val<BR>\n";
     }
     echo "<BR>Files in /tmp/<BR>";
     foreach($files as $val) {
          echo "$val<BR>\n";
     }
?>

Looking at Listing 20.10, you can see that although no flag is available to the glob() function to return only actual files (not directories), a simple call to the array_diff() function can be used to determine the differences between the complete listing and the directory-only listing (hence, only the files). For more information on the array_diff() function, see the PHP manual at http://www.php.net/manual/.
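
The same technique can be tried against a scratch directory whose contents are known in advance. The directory layout and names below are invented for illustration, and the GLOB_MARK call is included only to show the trailing slash described in Table 20.4 (the sketch assumes a Unix-style path separator).

```php
<?php
$dir = sys_get_temp_dir() . '/glob_demo_' . uniqid();
mkdir($dir);
mkdir("$dir/sub");
touch("$dir/notes.txt");

$complete    = glob("$dir/*");
$directories = glob("$dir/*", GLOB_ONLYDIR);
// Entries in the complete listing that are not directories
$files = array_values(array_diff($complete, $directories));

// GLOB_MARK appends a slash to directory names
$marked = glob("$dir/*", GLOB_MARK);

unlink("$dir/notes.txt");
rmdir("$dir/sub");
rmdir($dir);

print_r($files);
print_r($marked);
```

The order of the arguments to array_diff() matters: it returns the values of its first argument that are missing from the second, so the complete listing must come first.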
