Re: Most efficient character parsing.

  • Subject: Re: Most efficient character parsing.
  • From: "Eric Stewart" <email@hidden>
  • Date: Mon, 20 Feb 2006 10:58:18 -0500

On 2/20/06, Anjo Krank <email@hidden> wrote:
> Uh, why don't you pre-compile the regex, pre-initialize the array
> (next time probably with a loop?) and measure more iterations? After
> all, a "0" millisecond result would suggest that the time needed is
> not actually measurable, while times larger could come from - say -
> the class loader, the garbage collector or whatever else? Or is this
> an actual example of how you will call up your character conversion
> routine?

The only point of the test was to see which solution was faster. I
wanted to know which one initialized faster, and also which one was
faster on each subsequent call.
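
Measuring more iterations, as suggested in the quoted text above, could look roughly like the sketch below. It is not the test that produced the posted numbers. The class name RepeatTimer, the method averageNanos and the iteration count are hypothetical, and it assumes a current JDK (lambdas, System.nanoTime). Any stripping routine can be passed in as the candidate.

```java
import java.util.function.UnaryOperator;

class RepeatTimer {

    // Average nanoseconds per call over `iterations` timed calls.
    // The same number of untimed warm-up calls runs first so that class
    // loading and JIT compilation are less likely to land in the timed loop.
    static double averageNanos(UnaryOperator<String> candidate, String input, int iterations) {
        long sink = 0;
        for (int i = 0; i < iterations; i++) {
            sink += candidate.apply(input).length();   // warm-up, not timed
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += candidate.apply(input).length();   // timed calls
        }
        long elapsed = System.nanoTime() - start;
        // Using the results keeps the JIT from discarding the calls entirely.
        if (sink < 0) {
            throw new IllegalStateException("unreachable: lengths are never negative");
        }
        return (double) elapsed / iterations;
    }

    public static void main(String[] args) {
        // Short test string in the style of the posted test data.
        String input = "fj fjau" + (char) 1000 + "fafj daf ds" + (char) 10000 + "K(@*NE";
        // Candidate under test: String.replaceAll compiles the regex on every call.
        UnaryOperator<String> regexPerCall =
            s -> s.replaceAll("[^\\x09\\x0A\\x0D\\x20-\\x7E\\xA0-\\xFF]+", "");
        System.out.println("ns per call: " + averageNanos(regexPerCall, input, 100_000));
    }
}
```

Averaging over many calls sidesteps the millisecond clock resolution that produces all the "0" readings, at the cost of hiding the one-time start-up cost, which would need a separate measurement.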

> Oh, while you're at it, you could pre-init your stringbuffer with the
> length of your result string, so it doesn't get re-allocated every
> few calls, pull up the chararray[i] into a variable and also with an
> explicit if(c == \n...) check.

Both of these are very good points. They should speed up the boolean
array solution even more. I'm not sure what the "if(c == \n...)" is in
reference to.
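
The first two suggestions (pre-sizing the buffer and pulling the character into a local variable) might be sketched as follows. This is an illustration, not the code that was benchmarked. The class name PresizedStripper is hypothetical, StringBuilder stands in for StringBuffer on the assumption that the buffer never leaves the method, and the table is static here, whereas the posted class fills an instance field in its constructor. The valid ranges are the ones from the posted charMap.

```java
class PresizedStripper {

    // Lookup table for the characters the posted class accepts:
    // tab, LF, CR, 32-126 and 160-255. Built once when the class loads.
    private static final boolean[] CHAR_MAP = new boolean[256];
    static {
        CHAR_MAP[9] = true;
        CHAR_MAP[10] = true;
        CHAR_MAP[13] = true;
        for (int i = 32; i <= 126; i++) CHAR_MAP[i] = true;
        for (int i = 160; i <= 255; i++) CHAR_MAP[i] = true;
    }

    static String stripInvalidCharsFromString(String value) {
        int length = value.length();
        // The result can never be longer than the input, so this capacity
        // means the buffer is never re-allocated while appending.
        StringBuilder buffer = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            char c = value.charAt(i);   // read once, reuse below
            if (c < 256 && CHAR_MAP[c]) {
                buffer.append(c);
            }
        }
        return buffer.toString();
    }
}
```

Whether this is measurably faster than the posted version would have to be confirmed by re-running the test.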

> After all, your test data is so short, it probably takes three times
> longer to pre-init the array and compile the pattern, than it takes
> to look at each char.

The boolean array is a class level property and not a method level
property so it should only be initialized once. But I really should
store the compiled regex as a class level property and not method
property as well. That is a glaring mistake and I will fix that and
run the test again.
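
Storing the compiled regex at class level could look like the sketch below. The class name PrecompiledRegexStripper and the field name INVALID_CHARS are hypothetical; the expression itself is the one from the posted regex class.

```java
import java.util.regex.Pattern;

class PrecompiledRegexStripper {

    // Compiled once when the class is loaded. Pattern instances are immutable
    // and safe to share between threads; each call still creates its own Matcher.
    private static final Pattern INVALID_CHARS =
        Pattern.compile("[^\\x09\\x0A\\x0D\\x20-\\x7E\\xA0-\\xFF]+");

    static String stripInvalidCharsFromString(String value) {
        return INVALID_CHARS.matcher(value).replaceAll("");
    }
}
```

With this change the per-call cost is only the matching and replacement, which is the fairer comparison against the boolean array.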

> Heed the advice given previously: don't care about this stuff until
> you actually see that you have a problem...

This is a problem, or I wouldn't have asked about it, and I wouldn't
have taken several hours to write two different solutions, write a
test, and share the results. I have one application that currently
takes 3 XServes to handle the total load, and I will be purchasing a
fourth XServe in the next few days. So I'm re-examining all the code,
looking for ways to save time. This particular operation happens
roughly 600 million times a day, so I'm looking to optimize it as best
I can.
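
To put that call volume in perspective, here is a small back-of-the-envelope calculation. The class name LoadEstimate and the one-microsecond saving are purely illustrative assumptions, not measurements.

```java
class LoadEstimate {

    static final long SECONDS_PER_DAY = 24L * 60 * 60;   // 86,400

    // Average calls per second if the daily volume were spread evenly.
    static double callsPerSecond(long callsPerDay) {
        return (double) callsPerDay / SECONDS_PER_DAY;
    }

    // CPU seconds saved per day for a given saving per call, in nanoseconds.
    static double cpuSecondsSavedPerDay(long callsPerDay, long nanosSavedPerCall) {
        return callsPerDay * (double) nanosSavedPerCall / 1_000_000_000.0;
    }

    public static void main(String[] args) {
        long callsPerDay = 600_000_000L;   // figure quoted in the message
        System.out.println("calls/second: " + callsPerSecond(callsPerDay));
        // Hypothetical saving of 1 microsecond (1,000 ns) per call.
        System.out.println("CPU seconds saved/day: " + cpuSecondsSavedPerDay(callsPerDay, 1_000L));
    }
}
```

600 million calls a day averages out to roughly 6,944 calls per second, and under the assumed saving each microsecond shaved off a call is worth about 600 CPU seconds (10 minutes) per day.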

Thank you for pointing those things out. I'm still fairly new to Java
and I'm still learning the most efficient ways to get things done.

> Cheers, Anjo
>
> Am 19.02.2006 um 23:45 schrieb Eric Stewart:
>
> > Okay, I took the advice on the board and chose to write a boolean
> > array solution and a regex solution and test the two head-to-head.
> > The bottom of the email contains both solution classes and the
> > testing class (so you can see exactly what I did).
> >
> > So here's a brief description of the testing process. The test file
> > has 5 separate strings. Each string is run through the character
> > checker and the time taken to run is recorded in milliseconds. For
> > each round I built the application and started it, then ran either the
> > boolean array or regex test three times in succession. Then I stopped
> > the application, rebuilt it, restarted it and ran the other test. 10
> > rounds were run.
> >
> > Here is how it went.
> >
> > Round 1.
> >
> > Built & ran application. Invoked boolean array direct action 3
> > times in a row.
> > Results in milliseconds: 4,0,0,1,3,0,0,0,0,0,0,0,0,0,1
> > Total Time: 9
> >
> > Built & ran application. Invoked regex direct action 3 times in a row.
> > Results in milliseconds: 19,4,1,6,1,1,1,0,3,1,0,0,0,1,1
> > Total Time: 39
> >
> > Round 2.
> >
> > Built & ran application. Invoked boolean array direct action 3
> > times in a row.
> > Results in milliseconds: 5,0,1,0,3,0,1,0,0,0,0,0,0,0,0
> > Total Time: 10
> >
> > Built & ran application. Invoked regex direct action 3 times in a row.
> > Results in milliseconds: 18,5,2,3,0,0,1,0,2,1,1,1,0,1,0
> > Total Time: 35
> >
> > Round 3.
> >
> > Built & ran application. Invoked boolean array direct action 3
> > times in a row.
> > Results in milliseconds: 4,1,0,0,4,0,1,0,0,0,0,0,0,0,0
> > Total Time: 10
> >
> > Built & ran application. Invoked regex direct action 3 times in a row.
> > Results in milliseconds: 19,7,2,3,2,0,0,1,3,0,1,0,1,0,1
> > Total Time: 40
> >
> > Round 4.
> >
> > Built & ran application. Invoked boolean array direct action 3
> > times in a row.
> > Results in milliseconds: 4,1,0,0,4,0,0,0,0,0,0,0,0,0,0
> > Total Time: 9
> >
> > Built & ran application. Invoked regex direct action 3 times in a row.
> > Results in milliseconds: 20,4,1,6,1,0,1,0,3,0,1,1,0,1,0
> > Total Time: 39
> >
> > Round 5.
> >
> > Built & ran application. Invoked boolean array direct action 3
> > times in a row.
> > Results in milliseconds: 4,0,0,0,3,0,0,0,0,0,0,0,0,0,0
> > Total Time: 7
> >
> > Built & ran application. Invoked regex direct action 3 times in a row.
> > Results in milliseconds: 19,4,1,5,0,0,0,1,2,0,0,0,0,1,0
> > Total Time: 33
> >
> > Round 6.
> >
> > Built & ran application. Invoked boolean array direct action 3
> > times in a row.
> > Results in milliseconds: 5,0,0,1,3,0,0,0,0,0,0,0,0,0,1
> > Total Time: 10
> >
> > Built & ran application. Invoked regex direct action 3 times in a row.
> > Results in milliseconds: 36,5,2,4,0,0,0,1,2,1,1,1,0,1,0
> > Total Time: 54
> >
> > Round 7.
> >
> > Built & ran application. Invoked boolean array direct action 3
> > times in a row.
> > Results in milliseconds: 4,0,0,1,3,0,0,0,0,1,0,0,1,1,0
> > Total Time: 11
> >
> > Built & ran application. Invoked regex direct action 3 times in a row.
> > Results in milliseconds: 19,4,1,4,0,1,0,0,2,1,1,1,0,0,1
> > Total Time: 35
> >
> > Round 8.
> >
> > Built & ran application. Invoked boolean array direct action 3
> > times in a row.
> > Results in milliseconds: 5,1,0,0,4,0,0,0,0,0,0,0,0,0,0
> > Total Time: 10
> >
> > Built & ran application. Invoked regex direct action 3 times in a row.
> > Results in milliseconds: 21,5,1,4,1,1,0,0,3,0,1,1,0,1,0
> > Total Time: 39
> >
> > Round 9.
> >
> > Built & ran application. Invoked boolean array direct action 3
> > times in a row.
> > Results in milliseconds: 5,0,0,0,3,0,0,0,0,0,0,1,0,0,0
> > Total Time: 9
> >
> > Built & ran application. Invoked regex direct action 3 times in a row.
> > Results in milliseconds: 18,4,2,5,1,0,0,1,2,0,0,0,0,0,0
> > Total Time: 34
> >
> > Round 10.
> >
> > Built & ran application. Invoked boolean array direct action 3
> > times in a row.
> > Results in milliseconds: 4,0,0,0,4,0,0,0,0,0,0,0,0,0,0
> > Total Time: 8
> >
> > Built & ran application. Invoked regex direct action 3 times in a row.
> > Results in milliseconds: 37,4,1,4,1,1,1,0,3,1,0,0,1,0,1
> > Total Time: 55
> >
> > Summary: The first time in each test was obviously higher because it
> > was the first time the solution object was instantiated. What was
> > interesting was that even though I was building a 256-element boolean
> > array with 194 entries set to true, it was still faster to initialize
> > than the regex solution, and by quite a bit at that.
> >
> > The boolean array solution was faster to initialize and ran faster
> > overall, both the first time it ran and over the majority of
> > subsequent runs.
> >
> > I hope this helps someone else.
> >
> > Here are the java files I used.
> >
> > DirectAction.java
> > -------------------------
> > //
> > // DirectAction.java
> > // Project Norway
> > //
> > // Created by ericstewart on 2/15/06
> > //
> >
> > import com.webobjects.foundation.*;
> > import com.webobjects.appserver.*;
> > import com.webobjects.eocontrol.*;
> > import java.util.*;
> >
> > public class DirectAction extends WODirectAction {
> >
> >     public DirectAction(WORequest aRequest) {
> >         super(aRequest);
> >     }
> >
> >     public WOActionResults defaultAction() {
> >         return pageWithName("Main");
> >     }
> >
> >       public WOActionResults charSpeedArrayMapAction() {
> >               // build test string
> >               StringBuffer testString = new StringBuffer("kfdlas;n 0wqm dsagjnoisa
> > fd;af[aghjr3q-tifnewna fafjpewiq nor0dafnlw;l jfh0w flw;f saofh8");
> >               testString.append((char)1000);
> >               testString.append("fd0 f023 fkdls anflrwjap  fsa[w fjnw f[2-
> > dawjv094 tn3oh9k04r3 309r3hg854mvrm3w0v5nw[0 v9");
> >               testString.append((char)10000);
> >               testString.append("qmgn vjdsop 00 89w nv3ni0vr nmv p3orm vnrv rm v
> > fw mdndw sjuio490n v uckm4uv4n fj iivkmcj");
> >               testString.append((char)100000);
> >               testString.append("o489jnrbnv8m 5tjvb6fci9   uv77vj vu v7v 678i9ls
> > fdgo09 i9 r98 jk  f78 fm,f juy fiker fdmf");
> >               testString.append((char)1000000);
> >               testString.append("irmvn 984mn  juf78 km 4 d0v76 7 m j37 67k
> > 6mbvjk8cv56 6yjn r vcjv u7849md cx;df]c0-8 end");
> >               testString.append((char)10000000);
> >
> >               // Strip illegal characters.
> >               NSTimestamp time1 = new NSTimestamp();
> >               ISOLatin1CharacterUtilityArrayMap charUtility = new
> > ISOLatin1CharacterUtilityArrayMap();
> >               String resultString =
> > charUtility.stripInvalidCharsFromString(testString.toString());
> >               NSTimestamp time2 = new NSTimestamp();
> >               GregorianCalendar startCal = new GregorianCalendar();
> >               GregorianCalendar endCal = new GregorianCalendar();
> >               long diffMillis = 0;
> >               startCal.setTime(time1);
> >               endCal.setTime(time2);
> >               diffMillis = endCal.getTimeInMillis() - startCal.getTimeInMillis();
> >               NSLog.debug.appendln("Array map time to parse string: "+diffMillis);
> >
> >               testString = new StringBuffer("9km4bnchj 8jk4 v y739dfj jme89vu8 n
> > d';v FRkfvK U*N UO (F&^#  VM#I( )*$ JM  KOW#M$ @M< IF");
> >               testString.append((char)1000);
> >               testString.append("4mvj930 fn89 2 no98304 nr0mj v8v87395 09vm vwlr e
> > ;vd s,mnrv K VUYRMNVDHJ SUISVI  DVO$MMV");
> >               testString.append((char)10000);
> >               testString.append("i4m *$N lfju67 K$(N kjgurn jkd7 KMN* JND^&V
> > kf9]6l4m,d id 8f4 j md k3idd8j4m cems  duij4m");
> >               testString.append((char)100000);
> >               testString.append("imn4nf8 IUj4nvjud8mner iec 883mnd  J893M K
> > VEniw8923m  mdwjw8m vmskl w o290894 vw m s s94");
> >               testString.append((char)1000000);
> >               testString.append("fwjo wo fro3neqwvr03 f94fdwc J VW)RJ)VJ EQW(
> > VNDSHVV@HPNVDSOPV)J*(J V)W)RHjiwo vhjdwj vlj");
> >               testString.append((char)10000000);
> >
> >               time1 = new NSTimestamp();
> >               resultString = charUtility.stripInvalidCharsFromString
> > (testString.toString());
> >               time2 = new NSTimestamp();
> >               diffMillis = 0;
> >               startCal.setTime(time1);
> >               endCal.setTime(time2);
> >               diffMillis = endCal.getTimeInMillis() - startCal.getTimeInMillis();
> >               NSLog.debug.appendln("Array map time to parse string: "+diffMillis);
> >
> >               testString = new StringBuffer("fj fjau");
> >               testString.append((char)1000);
> >               testString.append("fafj daf ds");
> >               testString.append((char)10000);
> >               testString.append("fjw csl jw ");
> >               testString.append((char)100000);
> >               testString.append("M)_CQ)");
> >               testString.append((char)1000000);
> >               testString.append("K(@*NE");
> >               testString.append((char)10000000);
> >
> >               time1 = new NSTimestamp();
> >               resultString = charUtility.stripInvalidCharsFromString
> > (testString.toString());
> >               time2 = new NSTimestamp();
> >               diffMillis = 0;
> >               startCal.setTime(time1);
> >               endCal.setTime(time2);
> >               diffMillis = endCal.getTimeInMillis() - startCal.getTimeInMillis();
> >               NSLog.debug.appendln("Array map time to parse string: "+diffMillis);
> >
> >               testString = new StringBuffer("9km4bnchj 8jk4 v y739dfj jme89vu8 n
> > d';v FRkfvK U*N UO (F&^#  VM#I( )*$ JM  KOW#M$ @M< IF");
> >               testString.append((char)1000);
> >               testString.append("4mvj930 fn89 2 no98304 nr0mj v8v87395 09vm vwlr e
> > ;vd s,mnrv K VUYRMNVDHJ SUISVI  DVO$MMV");
> >               testString.append((char)10000);
> >               testString.append("i4m *$N lfju67 K$(N kjgurn jkd7 KMN* JND^&V
> > kf9]6l4m,d id 8f4 j md k3idd8j4m cems  duij4m");
> >               testString.append((char)100000);
> >               testString.append("imn4nf8 IUj4nvjud8mner iec 883mnd  J893M K
> > VEniw8923m  mdwjw8m vmskl w o290894 vw m s s94");
> >               testString.append((char)1000000);
> >               testString.append("fwjo wo fro3neqwvr03 f94fdwc J VW)RJ)VJ EQW(
> > VNDSHVV@HPNVDSOPV)J*(J V)W)RHjiwo vhjdwj vlj");
> >               testString.append((char)10000000);
> >
> >               time1 = new NSTimestamp();
> >               resultString = charUtility.stripInvalidCharsFromString
> > (testString.toString());
> >               time2 = new NSTimestamp();
> >               diffMillis = 0;
> >               startCal.setTime(time1);
> >               endCal.setTime(time2);
> >               diffMillis = endCal.getTimeInMillis() - startCal.getTimeInMillis();
> >               NSLog.debug.appendln("Array map time to parse string: "+diffMillis);
> >
> >               testString = new StringBuffer("kfdlas;n 0wqm dsagjnoisa
> > fd;af[aghjr3q-tifnewna fafjpewiq nor0dafnlw;l jfh0w flw;f saofh8");
> >               testString.append((char)1000);
> >               testString.append("fd0 f023 fkdls anflrwjap  fsa[w fjnw f[2-
> > dawjv094 tn3oh9k04r3 309r3hg854mvrm3w0v5nw[0 v9");
> >               testString.append((char)10000);
> >               testString.append("qmgn vjdsop 00 89w nv3ni0vr nmv p3orm vnrv rm v
> > fw mdndw sjuio490n v uckm4uv4n fj iivkmcj");
> >               testString.append((char)100000);
> >               testString.append("o489jnrbnv8m 5tjvb6fci9   uv77vj vu v7v 678i9ls
> > fdgo09 i9 r98 jk  f78 fm,f juy fiker fdmf");
> >               testString.append((char)1000000);
> >               testString.append("irmvn 984mn  juf78 km 4 d0v76 7 m j37 67k
> > 6mbvjk8cv56 6yjn r vcjv u7849md cx;df]c0-8 end");
> >               testString.append((char)10000000);
> >
> >               time1 = new NSTimestamp();
> >               resultString = charUtility.stripInvalidCharsFromString
> > (testString.toString());
> >               time2 = new NSTimestamp();
> >               diffMillis = 0;
> >               startCal.setTime(time1);
> >               endCal.setTime(time2);
> >               diffMillis = endCal.getTimeInMillis() - startCal.getTimeInMillis();
> >               NSLog.debug.appendln("Array map time to parse string: "+diffMillis);
> >
> >               Main page = (Main)pageWithName("Main");
> >               page.setVersion(resultString);
> >
> >               return page;
> >       }
> >
> >       public WOActionResults charSpeedRegexAction() {
> >               // build test string
> >               StringBuffer testString = new StringBuffer("kfdlas;n 0wqm dsagjnoisa
> > fd;af[aghjr3q-tifnewna fafjpewiq nor0dafnlw;l jfh0w flw;f saofh8");
> >               testString.append((char)1000);
> >               testString.append("fd0 f023 fkdls anflrwjap  fsa[w fjnw f[2-
> > dawjv094 tn3oh9k04r3 309r3hg854mvrm3w0v5nw[0 v9");
> >               testString.append((char)10000);
> >               testString.append("qmgn vjdsop 00 89w nv3ni0vr nmv p3orm vnrv rm v
> > fw mdndw sjuio490n v uckm4uv4n fj iivkmcj");
> >               testString.append((char)100000);
> >               testString.append("o489jnrbnv8m 5tjvb6fci9   uv77vj vu v7v 678i9ls
> > fdgo09 i9 r98 jk  f78 fm,f juy fiker fdmf");
> >               testString.append((char)1000000);
> >               testString.append("irmvn 984mn  juf78 km 4 d0v76 7 m j37 67k
> > 6mbvjk8cv56 6yjn r vcjv u7849md cx;df]c0-8 end");
> >               testString.append((char)10000000);
> >
> >               // Strip illegal characters.
> >               NSTimestamp time1 = new NSTimestamp();
> >               ISOLatin1CharacterUtilityRegex charUtility = new
> > ISOLatin1CharacterUtilityRegex();
> >               String resultString =
> > charUtility.stripInvalidCharsFromString(testString.toString());
> >               NSTimestamp time2 = new NSTimestamp();
> >               GregorianCalendar startCal = new GregorianCalendar();
> >               GregorianCalendar endCal = new GregorianCalendar();
> >               long diffMillis = 0;
> >               startCal.setTime(time1);
> >               endCal.setTime(time2);
> >               diffMillis = endCal.getTimeInMillis() - startCal.getTimeInMillis();
> >               NSLog.debug.appendln("Regex time to parse string: "+diffMillis);
> >
> >               testString = new StringBuffer("9km4bnchj 8jk4 v y739dfj jme89vu8 n
> > d';v FRkfvK U*N UO (F&^#  VM#I( )*$ JM  KOW#M$ @M< IF");
> >               testString.append((char)1000);
> >               testString.append("4mvj930 fn89 2 no98304 nr0mj v8v87395 09vm vwlr e
> > ;vd s,mnrv K VUYRMNVDHJ SUISVI  DVO$MMV");
> >               testString.append((char)10000);
> >               testString.append("i4m *$N lfju67 K$(N kjgurn jkd7 KMN* JND^&V
> > kf9]6l4m,d id 8f4 j md k3idd8j4m cems  duij4m");
> >               testString.append((char)100000);
> >               testString.append("imn4nf8 IUj4nvjud8mner iec 883mnd  J893M K
> > VEniw8923m  mdwjw8m vmskl w o290894 vw m s s94");
> >               testString.append((char)1000000);
> >               testString.append("fwjo wo fro3neqwvr03 f94fdwc J VW)RJ)VJ EQW(
> > VNDSHVV@HPNVDSOPV)J*(J V)W)RHjiwo vhjdwj vlj");
> >               testString.append((char)10000000);
> >
> >               time1 = new NSTimestamp();
> >               resultString = charUtility.stripInvalidCharsFromString
> > (testString.toString());
> >               time2 = new NSTimestamp();
> >               diffMillis = 0;
> >               startCal.setTime(time1);
> >               endCal.setTime(time2);
> >               diffMillis = endCal.getTimeInMillis() - startCal.getTimeInMillis();
> >               NSLog.debug.appendln("Regex time to parse string: "+diffMillis);
> >
> >               testString = new StringBuffer("fj fjau");
> >               testString.append((char)1000);
> >               testString.append("fafj daf ds");
> >               testString.append((char)10000);
> >               testString.append("fjw csl jw ");
> >               testString.append((char)100000);
> >               testString.append("M)_CQ)");
> >               testString.append((char)1000000);
> >               testString.append("K(@*NE");
> >               testString.append((char)10000000);
> >
> >               time1 = new NSTimestamp();
> >               resultString = charUtility.stripInvalidCharsFromString
> > (testString.toString());
> >               time2 = new NSTimestamp();
> >               diffMillis = 0;
> >               startCal.setTime(time1);
> >               endCal.setTime(time2);
> >               diffMillis = endCal.getTimeInMillis() - startCal.getTimeInMillis();
> >               NSLog.debug.appendln("Regex time to parse string: "+diffMillis);
> >
> >               testString = new StringBuffer("9km4bnchj 8jk4 v y739dfj jme89vu8 n
> > d';v FRkfvK U*N UO (F&^#  VM#I( )*$ JM  KOW#M$ @M< IF");
> >               testString.append((char)1000);
> >               testString.append("4mvj930 fn89 2 no98304 nr0mj v8v87395 09vm vwlr e
> > ;vd s,mnrv K VUYRMNVDHJ SUISVI  DVO$MMV");
> >               testString.append((char)10000);
> >               testString.append("i4m *$N lfju67 K$(N kjgurn jkd7 KMN* JND^&V
> > kf9]6l4m,d id 8f4 j md k3idd8j4m cems  duij4m");
> >               testString.append((char)100000);
> >               testString.append("imn4nf8 IUj4nvjud8mner iec 883mnd  J893M K
> > VEniw8923m  mdwjw8m vmskl w o290894 vw m s s94");
> >               testString.append((char)1000000);
> >               testString.append("fwjo wo fro3neqwvr03 f94fdwc J VW)RJ)VJ EQW(
> > VNDSHVV@HPNVDSOPV)J*(J V)W)RHjiwo vhjdwj vlj");
> >               testString.append((char)10000000);
> >
> >               time1 = new NSTimestamp();
> >               resultString = charUtility.stripInvalidCharsFromString
> > (testString.toString());
> >               time2 = new NSTimestamp();
> >               diffMillis = 0;
> >               startCal.setTime(time1);
> >               endCal.setTime(time2);
> >               diffMillis = endCal.getTimeInMillis() - startCal.getTimeInMillis();
> >               NSLog.debug.appendln("Regex time to parse string: "+diffMillis);
> >
> >               testString = new StringBuffer("kfdlas;n 0wqm dsagjnoisa
> > fd;af[aghjr3q-tifnewna fafjpewiq nor0dafnlw;l jfh0w flw;f saofh8");
> >               testString.append((char)1000);
> >               testString.append("fd0 f023 fkdls anflrwjap  fsa[w fjnw f[2-
> > dawjv094 tn3oh9k04r3 309r3hg854mvrm3w0v5nw[0 v9");
> >               testString.append((char)10000);
> >               testString.append("qmgn vjdsop 00 89w nv3ni0vr nmv p3orm vnrv rm v
> > fw mdndw sjuio490n v uckm4uv4n fj iivkmcj");
> >               testString.append((char)100000);
> >               testString.append("o489jnrbnv8m 5tjvb6fci9   uv77vj vu v7v 678i9ls
> > fdgo09 i9 r98 jk  f78 fm,f juy fiker fdmf");
> >               testString.append((char)1000000);
> >               testString.append("irmvn 984mn  juf78 km 4 d0v76 7 m j37 67k
> > 6mbvjk8cv56 6yjn r vcjv u7849md cx;df]c0-8 end");
> >               testString.append((char)10000000);
> >
> >               time1 = new NSTimestamp();
> >               resultString = charUtility.stripInvalidCharsFromString
> > (testString.toString());
> >               time2 = new NSTimestamp();
> >               diffMillis = 0;
> >               startCal.setTime(time1);
> >               endCal.setTime(time2);
> >               diffMillis = endCal.getTimeInMillis() - startCal.getTimeInMillis();
> >               NSLog.debug.appendln("Regex time to parse string: "+diffMillis);
> >
> >
> >               Main page = (Main)pageWithName("Main");
> >               page.setVersion(resultString);
> >
> >               return page;
> >       }
> >
> > }
> >
> >
> > ISOLatin1CharacterUtilityArrayMap.java
> > ----------------------------------------------------------
> > //
> > //  ISOLatin1CharacterUtilityArrayMap.java
> > //  Norway
> > //
> > //  Created by Eric Stewart on 2/19/06.
> > //  Copyright 2006 __MyCompanyName__. All rights reserved.
> > //
> >
> > public class ISOLatin1CharacterUtilityArrayMap {
> >       private boolean[] charMap = new boolean[256];
> >
> >       public ISOLatin1CharacterUtilityArrayMap() {
> >               // Initialize ISO-8859-1 character map.
> >               charMap[9]   = true; charMap[10]  = true; charMap[13]  = true;
> > charMap[32] = true;
> >               charMap[33]  = true; charMap[34]  = true; charMap[35]  = true;
> > charMap[36] = true;
> >               charMap[37]  = true; charMap[38]  = true; charMap[39]  = true;
> > charMap[40] = true;
> >               charMap[41]  = true; charMap[42]  = true; charMap[43]  = true;
> > charMap[44] = true;
> >               charMap[45]  = true; charMap[46]  = true; charMap[47]  = true;
> > charMap[48] = true;
> >               charMap[49]  = true; charMap[50]  = true; charMap[51]  = true;
> > charMap[52] = true;
> >               charMap[53]  = true; charMap[54]  = true; charMap[55]  = true;
> > charMap[56] = true;
> >               charMap[57]  = true; charMap[58]  = true; charMap[59]  = true;
> > charMap[60] = true;
> >               charMap[61]  = true; charMap[62]  = true; charMap[63]  = true;
> > charMap[64] = true;
> >               charMap[65]  = true; charMap[66]  = true; charMap[67]  = true;
> > charMap[68] = true;
> >               charMap[69]  = true; charMap[70]  = true; charMap[71]  = true;
> > charMap[72] = true;
> >               charMap[73]  = true; charMap[74]  = true; charMap[75]  = true;
> > charMap[76] = true;
> >               charMap[77]  = true; charMap[78]  = true; charMap[79]  = true;
> > charMap[80] = true;
> >               charMap[81]  = true; charMap[82]  = true; charMap[83]  = true;
> > charMap[84] = true;
> >               charMap[85]  = true; charMap[86]  = true; charMap[87]  = true;
> > charMap[88] = true;
> >               charMap[89]  = true; charMap[90]  = true; charMap[91]  = true;
> > charMap[92] = true;
> >               charMap[93]  = true; charMap[94]  = true; charMap[95]  = true;
> > charMap[96] = true;
> >               charMap[97]  = true; charMap[98]  = true; charMap[99]  = true;
> > charMap[100] = true;
> >               charMap[101] = true; charMap[102] = true; charMap[103] = true;
> > charMap[104] = true;
> >               charMap[105] = true; charMap[106] = true; charMap[107] = true;
> > charMap[108] = true;
> >               charMap[109] = true; charMap[110] = true; charMap[111] = true;
> > charMap[112] = true;
> >               charMap[113] = true; charMap[114] = true; charMap[115] = true;
> > charMap[116] = true;
> >               charMap[117] = true; charMap[118] = true; charMap[119] = true;
> > charMap[120] = true;
> >               charMap[121] = true; charMap[122] = true; charMap[123] = true;
> > charMap[124] = true;
> >               charMap[125] = true; charMap[126] = true; charMap[160] = true;
> > charMap[161] = true;
> >               charMap[162] = true; charMap[163] = true; charMap[164] = true;
> > charMap[165] = true;
> >               charMap[166] = true; charMap[167] = true; charMap[168] = true;
> > charMap[169] = true;
> >               charMap[170] = true; charMap[171] = true; charMap[172] = true;
> > charMap[173] = true;
> >               charMap[174] = true; charMap[175] = true; charMap[176] = true;
> > charMap[177] = true;
> >               charMap[178] = true; charMap[179] = true; charMap[180] = true;
> > charMap[181] = true;
> >               charMap[182] = true; charMap[183] = true; charMap[184] = true;
> > charMap[185] = true;
> >               charMap[186] = true; charMap[187] = true; charMap[188] = true;
> > charMap[189] = true;
> >               charMap[190] = true; charMap[191] = true; charMap[192] = true;
> > charMap[193] = true;
> >               charMap[194] = true; charMap[195] = true; charMap[196] = true;
> > charMap[197] = true;
> >               charMap[198] = true; charMap[199] = true; charMap[200] = true;
> > charMap[201] = true;
> >               charMap[202] = true; charMap[203] = true; charMap[204] = true;
> > charMap[205] = true;
> >               charMap[206] = true; charMap[207] = true; charMap[208] = true;
> > charMap[209] = true;
> >               charMap[210] = true; charMap[211] = true; charMap[212] = true;
> > charMap[213] = true;
> >               charMap[214] = true; charMap[215] = true; charMap[216] = true;
> > charMap[217] = true;
> >               charMap[218] = true; charMap[219] = true; charMap[220] = true;
> > charMap[221] = true;
> >               charMap[222] = true; charMap[223] = true; charMap[224] = true;
> > charMap[225] = true;
> >               charMap[226] = true; charMap[227] = true; charMap[228] = true;
> > charMap[229] = true;
> >               charMap[230] = true; charMap[231] = true; charMap[232] = true;
> > charMap[233] = true;
> >               charMap[234] = true; charMap[235] = true; charMap[236] = true;
> > charMap[237] = true;
> >               charMap[238] = true; charMap[239] = true; charMap[240] = true;
> > charMap[241] = true;
> >               charMap[242] = true; charMap[243] = true; charMap[244] = true;
> > charMap[245] = true;
> >               charMap[246] = true; charMap[247] = true; charMap[248] = true;
> > charMap[249] = true;
> >               charMap[250] = true; charMap[251] = true; charMap[252] = true;
> > charMap[253] = true;
> >               charMap[254] = true; charMap[255] = true;
> >       }
> >
> >       /*
> >        * Determines if a char is a valid ISO-8859-1 character.
> >        */
> >       public boolean isCharValid(char value) {
> >               return value < 256 && charMap[value];
> >       }
> >
> >       /*
> >        * Returns a string clean of all invalid ISO-8859-1 characters.
> >        */
> >       public String stripInvalidCharsFromString(String value) {
> >               StringBuffer buffer = new StringBuffer();
> >               char[] charArray = value.toCharArray();
> >               int charArrayLength = charArray.length;
> >               for (int i = 0; i < charArrayLength; i++) {
> >                       if (((int)charArray[i]) < 256 && charMap[(int)charArray[i]]) {
> >                               buffer.append(charArray[i]);
> >                       }
> >               }
> >               return buffer.toString();
> >       }
> > }
> >
> > ISOLatin1CharacterUtilityRegex.java
> > -----------------------------------------------------
> > //
> > //  ISOLatin1CharCleaner.java
> > //  Norway
> > //
> > //  Created by Eric Stewart on 2/15/06.
> > //  Copyright 2006 __MyCompanyName__. All rights reserved.
> > //
> >
> > import java.util.regex.*;
> >
> > public class ISOLatin1CharacterUtilityRegex {
> >
> >       public ISOLatin1CharacterUtilityRegex() {
> >       }
> >
> >       /*
> >        * Returns a string clean of all invalid ISO-8859-1 characters.
> >        */
> >       public String stripInvalidCharsFromString(String value) {
> >               String regExp = "[^\\x09\\x0A\\x0D\\x20-\\x7E\\xA0-\\xFF]+";
> >               Pattern p = Pattern.compile(regExp);
> >               String result = p.matcher(value).replaceAll("");
> >               return result;
> >       }
> > }
 _______________________________________________
Do not post admin requests to the list. They will be ignored.
Webobjects-dev mailing list      (email@hidden)
