Re: Removing html Tags From Text

  • Subject: Re: Removing html Tags From Text
  • From: Jeff Ganyard <email@hidden>
  • Date: Sun, 28 Oct 2001 18:49:15 -0800

At 2:46 AM -0600 10/28/01, Ehsan Saffari wrote:
Hi

When trying to remove html tags from text (archived html email messages),
there may be valid "<" and ">" characters in the text that are not part of
any tag, so removing tags by removing everything between those two
characters will mangle the text.

Has anyone come up with a better logic for removing html from text?

cheers
ehsan

Unfortunately, html email is rarely properly formed... web pages are much easier to deal with. But you could put together a list of opening tags, just the first part (e.g. "<img"), look for each one in the text, and remove everything from there through the immediately following ">". Then do the same for closing tags: look for "</" and the following ">". That should be mostly effective.
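
Here is a rough sketch of that idea in Python, just to make it concrete. The tag list, the `KNOWN_TAGS` name, and the `strip_known_tags` function are made up for illustration and the list is far from complete. It folds the opening and closing cases into one pattern, so a "<" or ">" that doesn't start a known tag is left alone. It will still trip on a ">" inside an attribute value, so treat it as "mostly effective" at best.

```python
import re

# Hypothetical, deliberately incomplete list of tag names to recognise.
# Extend it with whatever tags show up in your archived messages.
KNOWN_TAGS = [
    "a", "b", "body", "br", "div", "font", "head", "html",
    "i", "img", "p", "span", "table", "td", "tr",
]

# Matches "<name" or "</name" (case-insensitive), requires the tag name to
# end there (\b), then consumes everything up to and including the next ">".
_TAG_RE = re.compile(
    r"</?(?:" + "|".join(KNOWN_TAGS) + r")\b[^>]*>",
    re.IGNORECASE,
)


def strip_known_tags(text: str) -> str:
    """Remove only the tags named in KNOWN_TAGS; leave other < and > alone."""
    return _TAG_RE.sub("", text)
```

For example, `strip_known_tags('<p>if a < b and b > c</p>')` gives back `if a < b and b > c`, since neither the bare "<" nor the bare ">" begins a known tag.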

Tedious to create but html is tedious in sooo many ways. <sigh>

jeff

