Greedy Search in Perl Regular Expressions

You might have run into the problem of .* capturing far more of the string than you intended, when you wanted only a part of it. For example:

$var = 'some_url?id=1&number=2&name=Peter&address=someaddress';

In the above string, if you want to capture only the value of “id” (i.e. 1) and you use this regex

$var =~ /\?(.*)=(.*)\&/ ;

it will not work as expected. $1 would contain this (and $2 would contain Peter):

id=1&number=2&name

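(.*) tries to capture as much matching content as possible. In other words, it is greedy: it takes as much as it can and gives characters back only as far as needed for the rest of the pattern to match. Here the first (.*) therefore runs up to the last = that is still followed by an &.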
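
The greedy behaviour can be checked with a short script. This is only a minimal sketch that reuses the string from above; the variable names $greedy_key and $greedy_value are made up for illustration.

```perl
use strict;
use warnings;

my $var = 'some_url?id=1&number=2&name=Peter&address=someaddress';

# Greedy: the first (.*) grabs as much as it can, then backtracks only
# far enough for "=", the second group and "&" to still match.
my ($greedy_key, $greedy_value) = $var =~ /\?(.*)=(.*)&/;

print "key:   $greedy_key\n";      # id=1&number=2&name
print "value: $greedy_value\n";    # Peter
```

The match is taken in list context, so the two captures are assigned directly instead of being read from $1 and $2.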

If you want to capture id and 1 in $1 and $2 respectively, you have to restrain the greed of the regex by putting a ? (question mark) after the quantifier, like this:

$var =~ /\?(.*?)=(.*?)\&/ ;
print "$1 and $2\n";

This prints:

id and 1

You can use (.+) in place of (.*) to make sure that it matches only when at least one character is present: + means one or more times, whereas * means zero or more times.

$var =~ /\?(.+?)=(.+?)\&/ ;
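
To see where the two quantifiers differ, here is a small sketch using a hypothetical input whose value is empty. The string in $empty and the array names @star and @plus are assumptions for illustration, not part of the example above.

```perl
use strict;
use warnings;

# Hypothetical input: the "id" parameter has an empty value.
my $empty = 'some_url?id=&';

# *? accepts zero characters, so the second group matches the empty string.
my @star = $empty =~ /\?(.*?)=(.*?)&/;

# +? needs at least one character in each group, so this match fails
# and the list of captures is empty.
my @plus = $empty =~ /\?(.+?)=(.+?)&/;

print "with *: ", (@star ? "matched" : "no match"), "\n";
print "with +: ", (@plus ? "matched" : "no match"), "\n";
```

Note that on a longer string, (.+?) does not necessarily fail at an empty value: it may instead extend past the & to find some later match, so it is worth checking the captured text as well.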
