
Re: Regex weirdness



But the point is that it should only match >'s that are NOT preceded by the pattern.

Why would an engine "match" a pattern it's not supposed to match?

(BTW, I think you'd be right if this were a positive look-behind, but it's not.)

Todd

On Apr 5, 2004, at 1:29 PM, Stephen Paulsen wrote:

After reading <http://www.regular-expressions.info/alternation.html> as
Larry Rosenstein suggested I have to conclude that your two regexes are
matching correctly and there is no bug. The first alternative that can
match is used, and using "(?<!<b/?|<br/?)>" matches "<br/>" by finding
the "<b" -- since the '/' is optional. The pointer is then at the 'r'
which does not match '>'.

I don't know if you can alter the behavior, because alternation is "eager",
which is not the same as "greedy".
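
To illustrate the "eager" point outside of a look-behind, here is a minimal sketch using java.util.regex. The class and method names are made up for this example and are not from anyone's actual code. It asks for the first match of each ordering against "<br/>"; the leftmost alternative that succeeds is the one reported, even when a later alternative would match more text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only.
class EagerAlternationDemo {
    // Returns the text of the first match of regex in input, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Alternatives are tried left to right; the first one that succeeds wins.
        System.out.println(firstMatch("<b/?|<br/?", "<br/>")); // prints "<b"
        System.out.println(firstMatch("<br/?|<b/?", "<br/>")); // prints "<br/"
    }
}
```

Within each alternative the "/?" is still greedy (it takes the slash when one is there); the ordering of the alternatives is a separate matter. Note that this only shows plain alternation; whether the same reasoning carries over to a look-behind is the question under discussion.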

Still side-stepping the issue of unconventional but legal tag formatting,
I think the most straightforward regex to skip the >'s of <b> <br> <b/> <br/>
is "(?<!<br?/?)>"
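
A rough sketch of how that single-alternative pattern could be tried with replaceAll follows. The class name, method name, and the "&gt;" replacement text are assumptions chosen for the example; the original code was not posted. The behaviour noted in the comment is what I would expect from a current JDK, not necessarily the 1.4 JDK.

```java
// Hypothetical demo class, for illustration only.
class LookbehindReplaceDemo {
    // Pattern from the message above: a '>' not preceded by <b, <br, <b/ or <br/.
    static final String GT_NOT_AFTER_B_OR_BR = "(?<!<br?/?)>";

    // Replaces every '>' that does not close one of the four tag forms with "&gt;".
    static String escapeGt(String input) {
        return input.replaceAll(GT_NOT_AFTER_B_OR_BR, "&gt;");
    }

    public static void main(String[] args) {
        // Expected on a current JDK: <b> <br> <b/> <br/> a &gt; b
        System.out.println(escapeGt("<b> <br> <b/> <br/> a > b"));
    }
}
```

Because there is no alternation in this version, the order-of-alternatives question does not arise at all.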

- Stephen


On Friday, April 2, 2004, at 07:49 pm, Todd O'Bryan wrote:

I read the first article, and it says that Sun's 1.4 JDK should be able to handle negative look-behinds that contain optional characters and alternatives of varying lengths.

To answer Greg Guerin's question,

(?<!<b/?|<br/?)> should match all >'s not preceded by <b, <b/, <br, or <br/.

Similarly,

(?<!<br/?|<b/?)> should do the same thing with just the disjunction stated in a different order.

To clarify for those who don't live in regexes, (?<!pattern1)pattern2 will match pattern2 when it is not preceded by pattern1. So, in my case, pattern1 is either <b/?|<br/? or <br/?|<b/? which should function identically. Right?
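
For anyone who wants to try this, here is a small self-contained sketch that runs both orderings over the same input and lists where each one matches. The class name, method name, and sample string are made up for this example. If the two orderings really are equivalent, the two lists should be identical, which is what I would expect on a current JDK; the 1.4 JDK may behave differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only.
class LookbehindOrderDemo {
    // Returns the start index of every match of regex in input.
    static List<Integer> matchPositions(String regex, String input) {
        List<Integer> positions = new ArrayList<Integer>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            positions.add(m.start());
        }
        return positions;
    }

    public static void main(String[] args) {
        // The '>' characters sit at indexes 2, 7, 12, 18 and 22.
        String sample = "<b> <br> <b/> <br/> <i>";
        // Only the '>' of <i> (index 22) is not preceded by one of the four forms.
        System.out.println(matchPositions("(?<!<b/?|<br/?)>", sample));
        System.out.println(matchPositions("(?<!<br/?|<b/?)>", sample));
    }
}
```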

Todd

P.S. to Greg. Using regexes probably results from the 2 years I spent parsing various documents in Perl. I just threw all the tags into a String[], made patterns that matched them, and then used replaceAll. Strangely, only <br/> has a problem, probably due to interference from <b/>.

On Apr 2, 2004, at 8:19 PM, Larry Rosenstein wrote:

At 8:29 AM -0500 4/2/04, Todd O'Bryan wrote:
Can someone confirm that (a) this actually happens, and that (b) it's a bug and there's not some reason the first regex should match while the second doesn't?

Take a look at <http://www.regular-expressions.info/lookaround.html> and <http://www.regular-expressions.info/alternation.html>. The site has some good information about how these regexp constructs are implemented.

--
Larry Rosenstein
email@hidden
_______________________________________________
--
Java is not C++. If you want C++, it's down the hall, on your right.
- John Brewer
_______________________________________________
java-dev mailing list | email@hidden
Help/Unsubscribe/Archives: http://www.lists.apple.com/mailman/listinfo/java-dev
Do not post admin requests to the list. They will be ignored.



