10.3 The RegExp Object
As mentioned at the beginning of this
chapter, regular expressions are represented as RegExp objects. In
addition to the
RegExp( ) constructor, RegExp objects support
three methods and a number of properties. An unusual feature of the
RegExp class is that it defines both class (or static) properties and
instance properties. That is, it defines global properties that
belong to the RegExp( ) constructor as well as
other properties that belong to individual RegExp objects. RegExp
pattern-matching methods and properties are described in the next two
sections.
The RegExp( ) constructor takes one or two string
arguments and creates a new RegExp object. The first argument to this
constructor is a string that contains the body of the regular
expression -- the text that would appear within slashes in a
regular expression literal. Note that both string literals and
regular expressions use the \ character for escape
sequences, so when you pass a regular expression to RegExp( )
as a string literal, you must replace each \ character with \\. The second
argument to RegExp( ) is optional. If supplied, it
indicates the regular expression flags. It should be
g, i, m, or
a combination of those letters. For example:
// Find all five-digit numbers in a string. Note the double \\ in this case.
var zipcode = new RegExp("\\d{5}", "g");
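The effect of the doubled backslash can be checked directly. The following is a minimal sketch, with variable names chosen only for illustration, that compares a regular expression literal with the equivalent constructor call and shows what happens when the backslash is not doubled:

```javascript
// The same pattern written two ways.
var fromLiteral = /\d{5}/g;                  // one backslash in a literal
var fromString = new RegExp("\\d{5}", "g");  // two backslashes in a string

// Both objects hold the same pattern text and the same flag.
var sameSource = fromLiteral.source === fromString.source;  // true
var sameFlag = fromLiteral.global === fromString.global;    // true

// With a single backslash, the string literal "\d{5}" is just "d{5}",
// so the pattern matches five letter d's rather than five digits.
var mistaken = new RegExp("\d{5}");
var matchesDigits = mistaken.test("12345");  // false
var matchesDs = mistaken.test("ddddd");      // true
```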
The RegExp( ) constructor is useful when a regular
expression is being dynamically created and thus cannot be
represented with the regular expression literal syntax. For example,
to search for a string entered by the user, a regular expression must
be created at runtime with RegExp( ).
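As a rough sketch of that situation, the code below builds a pattern from a string at runtime. The helper escapeForRegExp( ) and the sample input are hypothetical, not part of the RegExp API. The helper rests on an assumption that goes beyond the text above: that the user's input should be matched literally, so any regular expression punctuation in it is escaped first.

```javascript
// Hypothetical helper: backslash-escape the characters that have special
// meaning in a regular expression so that the input is matched literally.
function escapeForRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

var userInput = "1+1";  // stands in for text typed by the user

// Build the pattern at runtime from the escaped input.
var literalSearch = new RegExp(escapeForRegExp(userInput));
var found = literalSearch.test("1+1=2");          // true

// Without escaping, + acts as a quantifier, so the pattern means
// "one or more 1s followed by a 1" and does not match the text.
var unescaped = new RegExp(userInput);
var foundUnescaped = unescaped.test("1+1=2");     // false
```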
10.3.1 RegExp Methods for Pattern Matching
RegExp objects
define two methods that
perform pattern-matching operations; they behave similarly to the
String methods described earlier. The main RegExp pattern-matching
method is exec( ). It is similar
to the String match( ) method described above,
except that it is a RegExp method that takes a string, rather than a
String method that takes a RegExp. The exec( )
method executes a regular expression on the specified string. That
is, it searches the string for a match. If it finds none, it returns
null. If it does find one, however, it returns an
array just like the array returned by the match( )
method for nonglobal searches. Element 0 of the array contains the
string that matched the regular expression, and any subsequent array
elements contain the substrings that matched any parenthesized
subexpressions. Furthermore, the index property
contains the character position at which the match occurred, and the
input property refers to the string that was
searched.
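A short illustration of the returned array may help. The pattern and the sample string here are made up for the example:

```javascript
var datePattern = /(\d+)-(\d+)/;   // two parenthesized subexpressions
var subject = "Order 2024-07 shipped";

var match = datePattern.exec(subject);
var whole = match[0];        // "2024-07": the text that matched the whole pattern
var year = match[1];         // "2024": first parenthesized subexpression
var month = match[2];        // "07": second parenthesized subexpression
var where = match.index;     // 6: position at which the match begins
var searched = match.input;  // the string that was searched

// When there is no match, exec( ) returns null rather than an empty array.
var noMatch = datePattern.exec("no digits here");  // null
```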
Unlike the match( ) method, exec( ) returns the same kind
of array whether or not the regular expression has the global
g flag. Recall that match( )
returns an array of matches when passed a global regular expression.
exec( ), by contrast, always returns a single
match and provides complete information about that match. When
exec( ) is called for a regular expression that
has the g flag, it sets the
lastIndex property of the regular expression
object to the character position immediately following the matched
substring. When exec( ) is invoked a second time
for the same regular expression, it begins its search at the
character position indicated by the lastIndex
property. If exec( ) does not find a match, it
resets lastIndex to 0. (You can also set
lastIndex to 0 at any time, which you should do
whenever you quit a search before you find the last match in one
string and begin searching another string with the same RegExp
object.) This special behavior allows us to call exec( )
repeatedly in order to loop through all the regular
expression matches in a string. For example:
var pattern = /Java/g;
var text = "JavaScript is more fun than Java!";
var result;
while((result = pattern.exec(text)) != null) {
    alert("Matched `" + result[0] + "'" +
          " at position " + result.index +
          "; next search begins at " + pattern.lastIndex);
}
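To make the contrast with match( ) concrete, here is a variation on the loop above that collects its results in an array instead of displaying them. The parenthesized subexpression and the details array are additions made for this sketch:

```javascript
var pattern = /Java(\w*)/g;
var text = "JavaScript is more fun than Java!";

// match( ) with a global pattern returns only the matched strings.
var allAtOnce = text.match(pattern);   // ["JavaScript", "Java"]

// exec( ) returns one match per call, with subexpressions and position.
var details = [];
var result;
while ((result = pattern.exec(text)) !== null) {
    details.push({
        matched: result[0],       // full matched text
        suffix: result[1],        // text matched by (\w*)
        index: result.index,      // where the match begins
        next: pattern.lastIndex   // where the next search begins
    });
}
// details[0]: matched "JavaScript", suffix "Script", index 0, next 10
// details[1]: matched "Java", suffix "", index 28, next 32
// After the final, failed call, pattern.lastIndex is 0 again.
```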
The other RegExp method is test( ).
test( ) is a much simpler method than
exec( ). It takes a string and returns
true if the string contains a match for the regular expression:
var pattern = /java/i;
pattern.test("JavaScript"); // Returns true
Calling test( ) is equivalent to calling
exec( ) and returning true if
the return value of exec( ) is not
null. Because of this equivalence, the
test( ) method behaves the same way as the
exec( ) method when invoked for a global regular
expression: it begins searching the specified string at the position
specified by lastIndex, and if it finds a match,
it sets lastIndex to the position of the character
immediately following the match. Thus, we can loop through a string
using the test( ) method just as we can with the
exec( ) method.
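For instance, when only the number of matches matters, a test( ) loop is enough. This is a small sketch with an invented pattern and string:

```javascript
var word = /fun/g;
var sentence = "fun, more fun, most fun";

var count = 0;
var positions = [];
while (word.test(sentence)) {
    count++;                         // one more match found
    positions.push(word.lastIndex);  // where the next search will begin
}
// count is 3 and positions is [3, 13, 23].
// The loop ends when test( ) fails, which also resets word.lastIndex to 0.
```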
The String methods search( ), replace( ), and
match( ) do not use the
lastIndex property as exec( )
and test( ) do. In fact, the String methods simply
reset lastIndex to 0. If you use exec( )
or test( ) on a pattern that has the
g flag set and you are searching multiple strings,
you must either find all the matches in each string, so that
lastIndex is automatically reset to zero (this
happens when the last search fails), or you must explicitly set the
lastIndex property to 0 yourself. If you forget to
do this, you may start searching a new string at some arbitrary
position within the string rather than from the beginning. Finally,
remember that this special lastIndex behavior
occurs only for regular expressions with the g
flag. exec( ) and test( )
ignore the lastIndex property of RegExp objects
that do not have the g flag.
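The pitfall is easy to reproduce. In the sketch below (the variable names are illustrative), the same global pattern is applied to two different strings without resetting lastIndex in between:

```javascript
var finder = /Java/g;

var first = finder.test("JavaScript");  // true; lastIndex is now 4
var leftover = finder.lastIndex;        // 4

// The search of the new string starts at position 4, past the end of the
// only possible match, so it fails (and the failure resets lastIndex to 0).
var second = finder.test("Java");       // false

// Resetting lastIndex before moving to a new string avoids the problem.
finder.test("JavaScript");              // leaves lastIndex at 4 again
finder.lastIndex = 0;
var third = finder.test("Java");        // true

// A pattern without the g flag always searches from the beginning.
var plain = /Java/;
plain.lastIndex = 4;
var fourth = plain.test("Java");        // true: lastIndex is ignored
```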
10.3.2 RegExp Instance Properties
Each RegExp object has five
properties. The
source property is a
read-only string that contains the text of the regular expression.
The global property
is a read-only boolean value that specifies whether the regular
expression has the g flag. The
ignoreCase
property is a read-only boolean value that specifies whether the
regular expression has the i flag. The
multiline property is a read-only boolean
value that specifies whether the regular expression has the
m flag. The final property is
lastIndex, a read-write integer. For patterns with
the g flag, this property stores the
position in the string at which the next search is to begin. It is
used by the exec( ) and test( )
methods, as described in the previous section.
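The five properties can be inspected on any RegExp object. The pattern in this brief example was chosen only for illustration:

```javascript
var re = /java\d+/gi;

var src = re.source;           // "java\\d+": the pattern text, without slashes or flags
var isGlobal = re.global;      // true: the g flag is set
var noCase = re.ignoreCase;    // true: the i flag is set
var multi = re.multiline;      // false: the m flag is not set
var start = re.lastIndex;      // 0 before any search has been made

re.exec("Java1 java22");       // matches "Java1" at position 0
var afterFirst = re.lastIndex; // 5: the next search begins after "Java1"
```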