Re: RegEx question

  • Subject: Re: RegEx question
  • From: Walter Ian Kaye <email@hidden>
  • Date: Tue, 20 Apr 2004 03:41:29 -0700

At 11:22a +0200 04/20/2004, Wim Melis didst inscribe upon an electronic papyrus:

> This question applies to a 'do shell script' that calls Perl for some
> RegEx work. All works fine, but there's one thing I couldn't find.
>
> Is it possible to set RegEx so that when you search for a string, it also
> finds all the variations with diacriticals?
>
> For instance: from a search string '/xanax/', I would like Perl to also
> return spellings where the letters in it have accents, tildes, umlauts,
> etc.
>
> (I guess you can tell what it's for... we're now receiving over a
> thousand spams a day)

Yup, I wrote a spam filter in Perl several months ago. :-)

> Is this possible?

Sure, assuming you know which characters you're searching for.

x[\@a3]n[\@a3]x

(The @ needs a backslash inside a Perl pattern; otherwise Perl tries to interpolate an array named @a3.)

There are \-escaped codes (such as octal codes) you can use, but like I say, ya gotta know which ones you want in your list. I suppose you could just use the range from 128-255, i.e. \200-\377. I think this might work:

x[\@a3\200-\377]n[\@a3\200-\377]x


-boo
coding off the top of his head