Diffstat (limited to 'contrib/perl5/pod/perlre.pod')
-rw-r--r-- | contrib/perl5/pod/perlre.pod | 19 |
1 files changed, 15 insertions, 4 deletions
diff --git a/contrib/perl5/pod/perlre.pod b/contrib/perl5/pod/perlre.pod
index 382ba6524274..d4c1deee88f7 100644
--- a/contrib/perl5/pod/perlre.pod
+++ b/contrib/perl5/pod/perlre.pod
@@ -116,7 +116,11 @@ The following standard quantifiers are recognized:
 (If a curly bracket occurs in any other context, it is treated
 as a regular character.) The "*" modifier is equivalent to C<{0,}>, the "+"
 modifier to C<{1,}>, and the "?" modifier to C<{0,1}>. n and m are limited
-to integral values less than 65536.
+to integral values less than a preset limit defined when perl is built.
+This is usually 32766 on the most common platforms. The actual limit can
+be seen in the error message generated by code such as this:
+
+    $_ **= $_ , / {$_} / for 2 .. 42;
 
 By default, a quantified subpattern is "greedy", that is, it will match as
 many times as possible (given a particular starting location) while still
@@ -458,7 +462,7 @@ the time when used on a similar string with 1000000 C<a>s. Be aware,
 however, that this pattern currently triggers a warning message under
 B<-w> saying it C<"matches the null string many times">):
 
-On simple groups, such as the pattern C<(?> [^()]+ )>, a comparable
+On simple groups, such as the pattern C<(?E<gt> [^()]+ )>, a comparable
 effect may be achieved by negative lookahead, as in C<[^()]+ (?! [^()] )>.
 This was only 4 times slower on a string with 1000000 C<a>s.
 
@@ -730,6 +734,13 @@ following all specify the same class of three characters: C<[-az]>,
 C<[az-]>, and C<[a\-z]>. All are different from C<[a-z]>, which
 specifies a class containing twenty-six characters.)
 
+Note also that the whole range idea is rather unportable between
+character sets--and even within character sets they may cause results
+you probably didn't expect. A sound principle is to use only ranges
+that begin from and end at either alphabets of equal case ([a-e],
+[A-E]), or digits ([0-9]). Anything else is unsafe. If in doubt,
+spell out the character sets in full.
+
 Characters may be specified using a metacharacter syntax much like that
 used in C: "\n" matches a newline, "\t" a tab, "\r" a carriage return,
 "\f" a form feed, etc. More generally, \I<nnn>, where I<nnn> is a string
@@ -752,7 +763,7 @@ start and end.
 Alternatives are tried from left to right, so the first
 alternative found for which the entire expression matches, is the one that
 is chosen. This means that alternatives are not necessarily greedy. For
-example: when mathing C<foo|foot> against "barefoot", only the "foo"
+example: when matching C<foo|foot> against "barefoot", only the "foo"
 part will match, as that is the first alternative tried, and it successfully
 matches the target string. (This might not seem important, but it is
 important when you are capturing matched text using parentheses.)
@@ -805,7 +816,7 @@ with most other power tools, power comes together with the ability
 to wreak havoc.
 
 A common abuse of this power stems from the ability to make infinite
-loops using regular expressions, with something as innocous as:
+loops using regular expressions, with something as innocuous as:
 
     'foo' =~ m{ ( o? )* }x;
 