Re: Getting Unicode Number

  • Subject: Re: Getting Unicode Number
  • From: has <email@hidden>
  • Date: Sun, 6 Feb 2005 18:57:37 +0000

Joseph Weaks wrote:

I'm working on a routine to encode Unicode characters into xhtml entities.

This is one of those tasks that tends to be speed-critical, so the faster the language, the better. AppleScript is just too slow for converting more than very small amounts of text. An osax would be best if you don't mind getting down-n-dirty with C. Otherwise, here's a simple Perl script that should be reasonably nippy:


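#!/usr/bin/perl
use strict;
use warnings;

# Usage: perl script <UTF-16 input file> <HTML-encoded output file>
die "Bad args\n" unless @ARGV == 2;
my ($srcfile, $destfile) = @ARGV;
writeFile($destfile, utf16ToHTML($srcfile));

# Read the file two bytes at a time. ASCII passes through unchanged;
# every other 16-bit unit becomes a decimal numeric character reference.
# Note: surrogate pairs (characters outside the BMP) are not combined.
sub utf16ToHTML {
	my ($path) = @_;
	open my $in, '<:raw', $path or die "Can't read $path: $!\n";
	my @chars;
	while ((read($in, my $c, 2) || 0) == 2) {
		# 'n' = big-endian unsigned 16-bit. This assumes big-endian
		# UTF-16 input; use 'v' instead for little-endian files.
		my $charnum = unpack 'n', $c;
		if ($charnum < 128) {
			push @chars, chr $charnum;
		} else {
			push @chars, '&#' . $charnum . ';';
		}
	}
	close $in;
	return join '', @chars;
}

sub writeFile {
	my ($f, $text) = @_;
	open my $out, '>', $f or die "Can't write $f: $!\n";
	print $out $text;
	close $out or die "Can't close $f: $!\n";
}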

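This takes a UTF-16 file (written using 'write <text> to <file> as Unicode text') and outputs HTML-encoded ASCII to a second file: characters below 128 are copied through unchanged, and everything else becomes a decimal numeric character reference such as &#233;. Call it using: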

do shell script "perl /path/to/script /path/to/inputfile /path/to/outputfile"

HTH

has

P.S. Python also has very good text processing libraries, so I could wrap a bunch of its text conversion routines in a scriptable FBA (faceless background application) if folks want to provide me with a list of requests and a bit of free hosting for it.
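
For anyone curious what the Python route might look like, here is a minimal sketch of the same conversion. It is not part of the Perl script above: the function names utf16_to_html and convert_file are made up for illustration, and the default encoding assumes big-endian UTF-16 without a byte-order mark. It leans on the standard 'xmlcharrefreplace' error handler, which also copes with characters outside the Basic Multilingual Plane.

```python
def utf16_to_html(data: bytes, encoding: str = "utf-16-be") -> str:
    """Decode UTF-16 bytes and return ASCII text with numeric character references.

    The default encoding is an assumption (big-endian, no BOM); pass
    "utf-16-le" or "utf-16" if the input differs.
    """
    text = data.decode(encoding)
    # 'xmlcharrefreplace' rewrites every non-ASCII character as &#NNNN;
    # e.g. "café" becomes "caf&#233;"
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def convert_file(src_path: str, dest_path: str) -> None:
    """Read a UTF-16 file and write the HTML-encoded ASCII result."""
    with open(src_path, "rb") as src:
        html = utf16_to_html(src.read())
    # newline="" keeps the original line endings untouched
    with open(dest_path, "w", encoding="ascii", newline="") as dest:
        dest.write(html)
```

Calling convert_file("/path/to/inputfile", "/path/to/outputfile") would then play the same role as the Perl script. Like that script, it leaves &, < and > alone.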
--
http://freespace.virgin.net/hamish.sanderson/