Published: Monday, April 30, 2001
Utilizing Regular Expression SubMatches, Part 2
By Scott Mitchell
In Part 1 we examined how to use the dollar-sign notation ($N) to refer back to
the strings matched by parenthesized subexpressions when using the Replace method. However, as we noted in Part 1,
there may be times when it would be nice to access these same matched strings through the
Match object when we run the Execute method. Fortunately, we can do exactly that
through the SubMatches property of the Match object.
Using the SubMatches Collection
When we run the Execute method of the regular expression object, a series of
Match objects is returned
via the Matches collection. For example, if we extended the code in Part 1
to:
Dim strHTML
strHTML = "<html><body><a href=""http://www.aspfaqs.com/"">" & _
          "ASPFAQs.com</a><br><a href=""http://www.aspmessageboard.com/"">" & _
          "ASPMessageboard.com</a></body></html>"
'First, create a reg exp object
Dim objRegExp
Set objRegExp = New RegExp
objRegExp.IgnoreCase = True
objRegExp.Global = True
objRegExp.Pattern = "<a\s+href=""http://(.*?)"">\s*((\n|.)+?)\s*</a>"
'Display all of the matches
Dim objMatch
For Each objMatch in objRegExp.Execute(strHTML)
    Response.Write("<xmp>" & objMatch.Value & "</xmp><br>")
Next
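The output of the above code would be the full text of each matched anchor tag, shown literally because of the xmp tags:
<a href="http://www.aspfaqs.com/">ASPFAQs.com</a>
<a href="http://www.aspmessageboard.com/">ASPMessageboard.com</a>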
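To make the relationship between Execute, the Matches collection, and the individual Match objects a little more concrete, here is a minimal sketch. The variable names (strSample, objLinkRegExp, colMatches, objLinkMatch) and the simplified pattern are hypothetical and chosen only for illustration; Count, FirstIndex, and Length are standard properties of the VBScript Matches collection and Match object.

```vbscript
'A sketch only: strSample and the simplified pattern are assumptions
Dim strSample, objLinkRegExp, colMatches, objLinkMatch
strSample = "<a href=""http://www.aspfaqs.com/"">ASPFAQs.com</a>"

Set objLinkRegExp = New RegExp
objLinkRegExp.IgnoreCase = True
objLinkRegExp.Global = True
objLinkRegExp.Pattern = "<a\s+href=""http://(.*?)"">(.*?)</a>"

'Execute returns a Matches collection, so hold on to a reference
Set colMatches = objLinkRegExp.Execute(strSample)
Response.Write("Matches found: " & colMatches.Count & "<br>")

For Each objLinkMatch In colMatches
    'FirstIndex is the zero-based offset of the match within strSample
    Response.Write("Offset " & objLinkMatch.FirstIndex & _
                   ", length " & objLinkMatch.Length & "<br>")
Next
```

Storing the collection in a variable, rather than calling Execute directly in the For Each statement, is handy when you also need the match count before looping.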
If we want to get just the URL or the link text portion of the anchor tag, we could use the $1 or
$2 notation, respectively, in a Replace statement. To access these values
through the Match object, however, we have to use its SubMatches
property. Therefore, if we wanted to list just the URL portion of each match in the above code, we could
alter our For Each ... Next loop to output the value of objMatch.SubMatches(0) instead of
objMatch.Value:
'... continued from above ...
For Each objMatch in objRegExp.Execute(strHTML)
    Response.Write("http://" & objMatch.SubMatches(0) & "<br>")
Next
This gives us the following output:
http://www.aspfaqs.com/
http://www.aspmessageboard.com/
The SubMatches property is really a collection object that's indexed starting at zero. With the dollar-sign
notation we started at $1 and worked up incrementally for each parenthetical match; with the
SubMatches property, however, we start at zero and work up, so $1 corresponds to SubMatches(0), $2 to SubMatches(1), and so on. Pretty neat, eh?
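The off-by-one relationship between the two notations can be seen side by side in a short sketch. The names strLink, objPairRegExp, objPairMatch, and i are hypothetical, and the pattern is a simplified version of the one used earlier (two parenthesized groups only); this is an illustration under those assumptions rather than the article's original code.

```vbscript
'A sketch only: strLink and the two-group pattern are assumptions
Dim strLink, objPairRegExp, objPairMatch, i
strLink = "<a href=""http://www.aspfaqs.com/"">ASPFAQs.com</a>"

Set objPairRegExp = New RegExp
objPairRegExp.IgnoreCase = True
objPairRegExp.Global = True
objPairRegExp.Pattern = "<a\s+href=""http://(.*?)"">(.*?)</a>"

'Replace: $1 is the URL portion, $2 is the link text
Response.Write(objPairRegExp.Replace(strLink, "$2 (http://$1)") & "<br>")

'Execute: the same two values are SubMatches(0) and SubMatches(1)
For Each objPairMatch In objPairRegExp.Execute(strLink)
    For i = 0 To objPairMatch.SubMatches.Count - 1
        Response.Write("$" & (i + 1) & " is SubMatches(" & i & "): " & _
                       objPairMatch.SubMatches(i) & "<br>")
    Next
Next
```

Looping from 0 to SubMatches.Count - 1 avoids hard-coding the number of parenthesized groups, which helps if the pattern later gains or loses a group.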
Conclusion
This article examined some advanced features of regular expressions: using the dollar-sign notation to refer back
to a matched string when using the Replace method, and using the SubMatches property of
the Match object to access the same information when using the Execute method. For more
information on regular expressions, be sure to check out the Regular
Expressions Article Index.
Happy Programming!