
Strings and Locales

Because people live in different countries, it is often necessary to format strings according to different customs. We saw an example of this in the previous paragraph, but support for this type of functionality is much more generalized. The very operating system on which each copy of PHP runs provides a number of facilities to automatically and transparently handle localized strings.

Naturally, a single systemwide setting may not be the answer you're looking for, particularly if you're creating a Web site designed to serve people from different countries. As a result, PHP provides you with the setlocale function, which can be used to control the behavior of certain string formatting functions:

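string setlocale ($category, $locale[, $locale...]);

The function returns the name of the locale that was set, or false if none of the requested locales could be applied. The $category parameter determines which aspect of the locale functionality is controlled by the call to setlocale(), as you can see from Table 1.5.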

Table 1.5. setlocale() Options

Category      Description
LC_ALL        Modify all settings
LC_COLLATE    String comparison only
LC_CTYPE      String classification (for example, differentiation between upper/lowercase)
LC_MONETARY   Currency values
LC_NUMERIC    Numeric values
LC_TIME       Date/time values

The $locale parameter is the name of the locale that should be set for the class of settings specified by $category. You can actually specify more than one locale by adding more instances of this parameter. This is useful because the same locale can have different names, depending on which operating system you're using.
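Because locale names vary between systems, one way to use the extra $locale arguments is to list several spellings of the same locale and let setlocale() pick the first one that is installed. The following is a minimal sketch: the helper name pickLocale() and the candidate names are illustrative assumptions, and which names actually exist depends on the operating system.

```php
<?php
// Hypothetical helper: try several names for the same locale and report
// which one (if any) the operating system accepted.
function pickLocale(int $category, array $candidates)
{
    // setlocale() also accepts an array of names and tries them in order.
    // It returns the name that was set, or false if none was available.
    return setlocale($category, $candidates);
}

// The candidate spellings are assumptions; adjust them for your platform.
// "C" is listed last as a fallback that every system provides.
$chosen = pickLocale(
    LC_MONETARY,
    ['en_US.UTF-8', 'en_US', 'English_United States.1252', 'C']
);

if ($chosen === false) {
    echo "No candidate locale is installed\n";
} else {
    echo "Using locale: $chosen\n";
}
```

Checking the return value matters: if none of the names is recognized, the previous setting silently stays in effect.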

The fine-grained control that setlocale() provides over which aspects of string management its settings affect may seem excessive, but it can come in very handy at times. As an example, modifying the LC_NUMERIC class affects all numeric value conversions, both in input and output. This means that when you acquire a string from outside your script (be it from the user or from a database), it will have to be formatted according to the locale you have chosen in your call to setlocale(), or it won't be recognized properly by the system.

In most cases, you will want to limit your locale manipulations to LC_COLLATE, LC_MONETARY, LC_CTYPE, and LC_TIME. You probably don't want to change LC_NUMERIC except in narrowly scoped situations, because it will affect the way your strings are interpreted when you convert them to numbers. For example, if your database server returns its numeric values using the English notation (xxx.xx) and you have LC_NUMERIC set to a locale that uses a different decimal separator, the decimal portion will be ignored.
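When a change to LC_NUMERIC (or any other category) is needed only briefly, one option is to wrap it so that the previous setting is always restored. The sketch below assumes a hypothetical helper named withLocale(); the German locale names are examples and may not be installed on your system.

```php
<?php
// Hypothetical helper: run $callback with $category temporarily switched
// to the first available locale in $candidates, then restore the old one.
function withLocale(int $category, array $candidates, callable $callback)
{
    // Passing '0' queries the current setting without changing it.
    $previous = setlocale($category, '0');

    // Try the candidates in order. If none is available, setlocale()
    // returns false and the previous setting stays in effect.
    setlocale($category, $candidates);

    try {
        return $callback();
    } finally {
        // Restore the original setting even if the callback throws.
        if ($previous !== false) {
            setlocale($category, $previous);
        }
    }
}

// Usage: read the decimal separator of a German locale (names assumed)
// while the rest of the script keeps its original LC_NUMERIC setting.
$separator = withLocale(
    LC_NUMERIC,
    ['de_DE.UTF-8', 'de_DE', 'German_Germany.1252'],
    function () {
        return localeconv()['decimal_point'];
    }
);

echo "Decimal separator inside the callback: $separator\n";
```

Keeping the change inside a callback limits how much code runs under the altered locale, which is the main risk the paragraph above describes.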

Formatting Currency Values

The money_format() function can be used to format a numeric value in its currency representation for a given locale. The function takes two parameters:

money_format ($format, $number)

The $number parameter contains the floating-point value that must be formatted, and the $format parameter contains a string that provides the formatting rules for money_format to follow. A format string contains the following elements:

  • A % character

  • One or more optional flags

  • An optional field width

  • An optional alignment identifier

  • An optional integer precision

  • An optional dot and decimal precision

  • A conversion character

Therefore, the simplest format string is composed of the character % and a conversion character, which essentially determines how $number is formatted according to the information shown in Table 1.6.

Table 1.6. money_format Specifiers

Specifier   Description
%           Print a percent sign.
n           Format the currency value according to the locale's "national" setting.
i           Format the currency value according to the locale's "international" setting.

The difference between using the "national" and "international" currency format is relevant in the context of your intended audience. For example, consider the following script:


    $a = 1232322210.44;

    setlocale (LC_MONETARY, 'en_US');

    echo money_format ("%n", $a);
    echo "\n";
    echo money_format ("%i", $a);
    echo "\n";


If you execute it, it will print out the following result:

$1,232,322,210.44
USD 1,232,322,210.44

As you can see, the first directive (using the "national" type) formats the currency value as a person using the current locale would normally write it. The second directive, on the other hand, formats the value using a method that would be suitable for an international audience. If you're in the U.S., it's clear to you that $10 means "ten U.S. dollars," whereas to a Canadian, that would normally mean "ten Canadian dollars." As anyone who has ever lived in Canada knows, those are two very different interpretations. Therefore, if you cater to an international audience, you may want to use the i specifier, which outputs the universally recognized currency acronym "USD."
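The national and international symbols that money_format() draws on are also exposed through localeconv(). Since money_format() depends on the C library's strfmon() and is not available everywhere (it is absent on Windows and was removed in PHP 8.0), the following sketch approximates the two styles with number_format(). The helper formatCurrency() is hypothetical, and it simplifies heavily: it assumes a non-negative amount and a symbol that precedes the value.

```php
<?php
// Hypothetical helper, not part of PHP: format a non-negative amount from
// a localeconv()-style array. Simplification: the currency symbol always
// precedes the amount, and negative values are not handled.
function formatCurrency(float $amount, array $conv, bool $international = false): string
{
    $symbol = $international ? $conv['int_curr_symbol'] : $conv['currency_symbol'];
    $digits = $international ? $conv['int_frac_digits'] : $conv['frac_digits'];

    // The "C" locale reports a placeholder (CHAR_MAX) when the digit count
    // is unknown; falling back to two decimals is an assumption of this sketch.
    if ($digits < 0 || $digits > 10) {
        $digits = 2;
    }

    return $symbol . number_format(
        $amount,
        $digits,
        $conv['mon_decimal_point'],
        $conv['mon_thousands_sep']
    );
}

// Values as localeconv() typically reports them under en_US, written out
// by hand so the example does not depend on which locales are installed.
$enUS = [
    'currency_symbol'   => '$',
    'int_curr_symbol'   => 'USD ',
    'frac_digits'       => 2,
    'int_frac_digits'   => 2,
    'mon_decimal_point' => '.',
    'mon_thousands_sep' => ',',
];

echo formatCurrency(1232322210.44, $enUS), "\n";       // national style
echo formatCurrency(1232322210.44, $enUS, true), "\n"; // international style
```

In a real script, the array would come from calling localeconv() after setlocale (LC_MONETARY, ...) rather than being written by hand.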

You can modify the output of money_format to more closely suit your needs. For example, you can use the optional flags to change the minimum length of the result:


      setlocale (LC_MONETARY, 'en_US');
      echo money_format ('%=030#5.2i', 1000);


The #5.2 directive in the format specifier indicates that the resulting string should have at least five integer digits and two decimal digits. The =0 portion indicates that the minimum integer/decimal lengths should be reached by padding the string using the character 0. You could, in fact, use any character; for example, the asterisk (*) is often used when printing checks. Finally, the 30 portion indicates that the field should be at least 30 characters long; shorter results are padded with spaces. As a result, the preceding script will output (padding spaces aside):

USD 01,000.00

As you can see, the grouping symbols (the comma) are not counted in the total number of digits that you specify through the flags.
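To make the anatomy of a format string such as '%=030#5.2i' easier to see, the sketch below splits one into its parts with a regular expression. The helper parseMoneyFormat() and the array keys it returns are hypothetical names chosen for this illustration; the pattern covers only the flags and conversion characters discussed in this section, and the PREG_UNMATCHED_AS_NULL flag requires PHP 7.2 or later.

```php
<?php
// Hypothetical helper: decompose the first money_format()-style
// specification found in $format. Returns null if there is none.
function parseMoneyFormat(string $format): ?array
{
    // %  flags (=f, ^, +, (, !, -)  width  #left  .right  conversion
    $pattern = '/%((?:=.|[\^+(!\-])*)(\d+)?(?:#(\d+))?(?:\.(\d+))?([in%])/';

    if (!preg_match($pattern, $format, $m, PREG_UNMATCHED_AS_NULL)) {
        return null;
    }

    // Optional numeric parts are null when absent, integers otherwise.
    $toInt = function ($value) {
        return $value === null ? null : (int) $value;
    };

    return [
        'flags'           => $m[1],          // e.g. "=0" (fill with zeros)
        'width'           => $toInt($m[2]),  // minimum field width
        'left_precision'  => $toInt($m[3]),  // integer digits (after #)
        'right_precision' => $toInt($m[4]),  // decimal digits (after .)
        'conversion'      => $m[5],          // "n", "i", or "%"
    ];
}

print_r(parseMoneyFormat('%=030#5.2i'));
```

For the format string used above, this reports the flags "=0", a field width of 30, five integer digits, two decimal digits, and the conversion character i.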

The capabilities of money_format() do not end here. You can also use the ! flag to suppress the output of the currency identifier and the ^ flag to prevent the use of grouping symbols. This means that you can use money_format as a reasonable substitute for number_format(), although the former doesn't give you as much flexibility as the latter.

If you're wondering why you should even bother with this, remember that for number_format() to format values using a locale of your choice, you will have to change the LC_NUMERIC locale parameter, and this also affects the numerical input that you receive from the outside (including your databases). Thus, if, for example, your database runs on a locale that is different from the one you want to use when showing results to your user (as most will, particularly in a shared environment), you will constantly have to call setlocale() before and after executing number_format() to ensure that the data you read from and write to the database is formatted correctly. If, on the other hand, you use money_format() to print your numeric values, you need to change the LC_MONETARY locale only once and be done with it for the rest of the script.

Finally, you should note that the format parameter of money_format() can also contain text extraneous to the actual format specification of the function's output. The extra text will be returned as is by the function (but remember to escape any percent signs by using %%). Here's an example:


      setlocale (LC_MONETARY, 'en_US');
      echo money_format ('The total comes to %n, payable 50%% upon signature' .
      ' and 50%% upon completion of project', 1000);


This script will output the following string:

The total comes to $1,000.00, payable 50% upon signature and 50% upon completion of project
