Re: Most efficient character parsing.
- Subject: Re: Most efficient character parsing.
- From: "Eric Stewart" <email@hidden>
- Date: Mon, 20 Feb 2006 10:58:18 -0500
On 2/20/06, Anjo Krank <email@hidden> wrote:
> Uh, why don't you pre-compile the regex, pre-initialize the array
> (next time probably with a loop?) and measure more iterations? After
> all, a "0" millisecond result would suggest that the time needed is
> not actually measurable, while times larger could come from - say -
> the class loader, the garbage collector or whatever else? Or is this
> an actual example of how you will call up your character conversion
> routine?
The only point of the test was to see which was faster. I wanted to
know which initialized faster, and I also wanted to know which was
faster on subsequent calls.
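
To measure per-call cost rather than one cold call, one option is a small harness that runs a warm-up pass and then averages many iterations, along the lines Anjo suggests. This is only a sketch: the class name StripBenchmark, the method averageNanos and the sample input are assumptions and not part of the original test, and it uses Java 8 lambdas and System.nanoTime() in place of NSTimestamp.

```java
import java.util.function.UnaryOperator;

final class StripBenchmark {

    /** Average nanoseconds per call over `iterations` timed calls, after a warm-up pass. */
    static double averageNanos(UnaryOperator<String> stripper, String input, int iterations) {
        int sink = 0; // accumulate result lengths so the work is not optimized away

        // Warm-up pass: class loading and JIT compilation happen here, not in the timed loop.
        for (int i = 0; i < iterations; i++) {
            sink += stripper.apply(input).length();
        }

        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += stripper.apply(input).length();
        }
        long elapsed = System.nanoTime() - start;

        if (sink == Integer.MIN_VALUE) {
            System.out.println(sink); // keeps `sink` observably used
        }
        return (double) elapsed / iterations;
    }

    public static void main(String[] args) {
        // Hypothetical sample input containing two characters outside ISO-8859-1.
        String input = "fj fjau\u03E8fafj daf ds\u2710fjw csl jw";

        // String.replaceAll compiles the pattern on every call, like the posted regex class.
        UnaryOperator<String> regex =
                s -> s.replaceAll("[^\\x09\\x0A\\x0D\\x20-\\x7E\\xA0-\\xFF]+", "");

        System.out.println("regex ns/call: " + averageNanos(regex, input, 100_000));
    }
}
```

A hand-rolled loop like this is still exposed to garbage collection and JIT effects, so the numbers are best read as rough comparisons between candidates rather than absolute costs.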
> Oh, while you're at it, you could pre-init your stringbuffer with the
> length of your result string, so it doesn't get re-allocated every
> few calls, pull up the chararray[i] into a variable and also with an
> explicit if(c == \n...) check.
Both of these are very good points. They should speed up the boolean
array solution even more. Not sure what the "if(c == \n...)" is in
reference to.
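
A minimal sketch of the boolean array approach with these suggestions applied is shown here. The class name Latin1Filter and method name strip are made up for illustration. It fills the table with loops, sizes the buffer to the input length, and reads each character into a local variable. It also swaps in StringBuilder (Java 5) for StringBuffer, which is a change from the posted code. The explicit '\n' comparison is only a guess at what the "if(c == \n...)" remark may have meant.

```java
final class Latin1Filter {

    // Shared lookup table, built once when the class loads; true marks a character to keep.
    private static final boolean[] KEEP = new boolean[256];

    static {
        KEEP['\t'] = true; // 9
        KEEP['\n'] = true; // 10
        KEEP['\r'] = true; // 13
        for (int c = 0x20; c <= 0x7E; c++) {
            KEEP[c] = true; // printable ASCII, 32..126
        }
        for (int c = 0xA0; c <= 0xFF; c++) {
            KEEP[c] = true; // upper ISO-8859-1 range, 160..255
        }
    }

    /** Returns `value` with every character outside the table removed. */
    static String strip(String value) {
        int length = value.length();
        // The result can never be longer than the input, so the buffer never has to grow.
        StringBuilder buffer = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            char c = value.charAt(i); // read the character once
            // Guess at the suggested explicit check: test a common character
            // directly before falling back to the table lookup.
            if (c == '\n' || (c < 256 && KEEP[c])) {
                buffer.append(c);
            }
        }
        return buffer.toString();
    }
}
```

The table holds the same 194 code points as the hand-written assignments in the posted class, so the output should match; whether the explicit comparison helps at all would need measuring.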
> After all, your test data is so short, it probably takes three times
> longer to pre-init the array and compile the pattern, than it takes
> to look at each char.
The boolean array is a class level property and not a method level
property so it should only be initialized once. But I really should
store the compiled regex as a class level property and not method
property as well. That is a glaring mistake and I will fix that and
run the test again.
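
A sketch of that fix, assuming the same regular expression as the posted class: the Pattern is compiled once as a static field and reused by every call. The class name Latin1RegexFilter and method name strip are illustrative only.

```java
import java.util.regex.Pattern;

final class Latin1RegexFilter {

    // Compiled once at class load. Pattern objects are immutable and safe to share
    // between threads; the Matcher created per call is not shared.
    private static final Pattern INVALID =
            Pattern.compile("[^\\x09\\x0A\\x0D\\x20-\\x7E\\xA0-\\xFF]+");

    /** Returns `value` with every run of characters matched by INVALID removed. */
    static String strip(String value) {
        return INVALID.matcher(value).replaceAll("");
    }
}
```

This removes the compile step from each call but still allocates a Matcher per call, so it may narrow the gap with the boolean array without closing it.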
> Heed the advice given previously: don't care about this stuff until
> you actually see that you have a problem...
This is a problem, or I wouldn't have asked about it and I wouldn't
have taken several hours to write two different solutions, write a
test and then share the results. I have one application that currently
takes 3 XServes to handle the total load and I will be purchasing a
fourth XServe in the next few days. So I'm re-examining all the code
looking for ways to save time. This particular operation is happening
roughly 600 million times a day so I'm looking to optimize it the best
that I can.
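
To put the call volume in perspective, the arithmetic can be sketched as below. Only the figure of 600 million calls a day comes from the numbers above; the class name CallVolume, the method names and the per-call saving passed in are illustrative assumptions, not measurements.

```java
final class CallVolume {

    static final long CALLS_PER_DAY = 600_000_000L;  // figure quoted above
    static final long SECONDS_PER_DAY = 24L * 60 * 60; // 86,400

    /** Average call rate if the load were spread evenly over the day. */
    static double averageCallsPerSecond() {
        return (double) CALLS_PER_DAY / SECONDS_PER_DAY;
    }

    /** CPU seconds per day accounted for by `microsPerCall` microseconds on each call. */
    static double cpuSecondsPerDay(double microsPerCall) {
        return CALLS_PER_DAY * microsPerCall / 1_000_000.0;
    }

    public static void main(String[] args) {
        System.out.println("calls/second (average): " + averageCallsPerSecond());
        System.out.println("CPU seconds/day per microsecond saved: " + cpuSecondsPerDay(1.0));
    }
}
```

On these assumptions the average rate is a little under 7,000 calls a second, and every microsecond shaved off a call is worth about ten minutes of CPU time a day.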
Thank you for pointing those things out. I'm still fairly new to Java
and I'm still learning the most efficient ways to get things done.
> Cheers, Anjo
>
> On 19.02.2006 at 23:45, Eric Stewart wrote:
>
> > Okay, I took the advice on the board and chose to write a boolean
> > array and regex solution and tested the two head-to-head. The bottom
> > of the email contains both solution classes and the testing class (So
> > you can see exactly what I did).
> >
> > So here's a brief description of the testing process. The test file
> > has 5 separate strings. Each string is run through the character
> > checker and the time taken to run is recorded in milliseconds. For
> > each round I built the application and started it, then ran either the
> > boolean array or regex test three times in succession. I then stopped
> > the application, rebuilt it, restarted it and ran the other test. 10
> > rounds were run.
> >
> > Here is how it went.
> >
> > Round 1:
> >
> > Built & ran application. Invoked boolean array direct action 3
> > times in a row.
> > Results in milliseconds: 4,0,0,1,3,0,0,0,0,0,0,0,0,0,1
> > Total Time: 9
> >
> > Built & ran application. Invoked regex direct action 3 times in a row.
> > Results in milliseconds: 19,4,1,6,1,1,1,0,3,1,0,0,0,1,1
> > Total Time: 39
> >
> > Round 2.
> >
> > Built & ran application. Invoked boolean array direct action 3
> > times in a row.
> > Results in milliseconds: 5,0,1,0,3,0,1,0,0,0,0,0,0,0,0
> > Total Time: 10
> >
> > Built & ran application. Invoked regex direct action 3 times in a row.
> > Results in milliseconds: 18,5,2,3,0,0,1,0,2,1,1,1,0,1,0
> > Total Time: 35
> >
> > Round 3.
> >
> > Built & ran application. Invoked boolean array direct action 3
> > times in a row.
> > Results in milliseconds: 4,1,0,0,4,0,1,0,0,0,0,0,0,0,0
> > Total Time: 10
> >
> > Built & ran application. Invoked regex direct action 3 times in a row.
> > Results in milliseconds: 19,7,2,3,2,0,0,1,3,0,1,0,1,0,1
> > Total Time: 40
> >
> > Round 4.
> >
> > Built & ran application. Invoked boolean array direct action 3
> > times in a row.
> > Results in milliseconds: 4,1,0,0,4,0,0,0,0,0,0,0,0,0,0
> > Total Time: 9
> >
> > Built & ran application. Invoked regex direct action 3 times in a row.
> > Results in milliseconds: 20,4,1,6,1,0,1,0,3,0,1,1,0,1,0
> > Total Time: 39
> >
> > Round 5.
> >
> > Built & ran application. Invoked boolean array direct action 3
> > times in a row.
> > Results in milliseconds: 4,0,0,0,3,0,0,0,0,0,0,0,0,0,0
> > Total Time: 7
> >
> > Built & ran application. Invoked regex direct action 3 times in a row.
> > Results in milliseconds: 19,4,1,5,0,0,0,1,2,0,0,0,0,1,0
> > Total Time: 33
> >
> > Round 6.
> >
> > Built & ran application. Invoked boolean array direct action 3
> > times in a row.
> > Results in milliseconds: 5,0,0,1,3,0,0,0,0,0,0,0,0,0,1
> > Total Time: 10
> >
> > Built & ran application. Invoked regex direct action 3 times in a row.
> > Results in milliseconds: 36,5,2,4,0,0,0,1,2,1,1,1,0,1,0
> > Total Time: 54
> >
> > Round 7.
> >
> > Built & ran application. Invoked boolean array direct action 3
> > times in a row.
> > Results in milliseconds: 4,0,0,1,3,0,0,0,0,1,0,0,1,1,0
> > Total Time: 11
> >
> > Built & ran application. Invoked regex direct action 3 times in a row.
> > Results in milliseconds: 19,4,1,4,0,1,0,0,2,1,1,1,0,0,1
> > Total Time: 35
> >
> > Round 8.
> >
> > Built & ran application. Invoked boolean array direct action 3
> > times in a row.
> > Results in milliseconds: 5,1,0,0,4,0,0,0,0,0,0,0,0,0,0
> > Total Time: 10
> >
> > Built & ran application. Invoked regex direct action 3 times in a row.
> > Results in milliseconds: 21,5,1,4,1,1,0,0,3,0,1,1,0,1,0
> > Total Time: 39
> >
> > Round 9.
> >
> > Built & ran application. Invoked boolean array direct action 3
> > times in a row.
> > Results in milliseconds: 5,0,0,0,3,0,0,0,0,0,0,1,0,0,0
> > Total Time: 9
> >
> > Built & ran application. Invoked regex direct action 3 times in a row.
> > Results in milliseconds: 18,4,2,5,1,0,0,1,2,0,0,0,0,0,0
> > Total Time: 34
> >
> > Round 10.
> >
> > Built & ran application. Invoked boolean array direct action 3
> > times in a row.
> > Results in milliseconds: 4,0,0,0,4,0,0,0,0,0,0,0,0,0,0
> > Total Time: 8
> >
> > Built & ran application. Invoked regex direct action 3 times in a row.
> > Results in milliseconds: 37,4,1,4,1,1,1,0,3,1,0,0,1,0,1
> > Total Time: 55
> >
> > Summary: The first time in each test was obviously higher because it
> > was the first time the solution object was instantiated. What was
> > interesting was that even though I was setting 194 elements of a
> > boolean array, it was still faster to initialize than the regex
> > solution, and by quite a bit at that.
> >
> > The boolean array solution was faster to initialize and ran faster
> > overall, both the first time it ran and over the majority of
> > subsequent runs.
> >
> > I hope this helps someone else.
> >
> > Here are the java files I used.
> >
> > DirectAction.java
> > -------------------------
> > //
> > // DirectAction.java
> > // Project Norway
> > //
> > // Created by ericstewart on 2/15/06
> > //
> >
> > import com.webobjects.foundation.*;
> > import com.webobjects.appserver.*;
> > import com.webobjects.eocontrol.*;
> > import java.util.*;
> >
> > public class DirectAction extends WODirectAction {
> >
> > public DirectAction(WORequest aRequest) {
> > super(aRequest);
> > }
> >
> > public WOActionResults defaultAction() {
> > return pageWithName("Main");
> > }
> >
> > public WOActionResults charSpeedArrayMapAction() {
> > // build test string
> > StringBuffer testString = new StringBuffer("kfdlas;n 0wqm dsagjnoisa "
> > + "fd;af[aghjr3q-tifnewna fafjpewiq nor0dafnlw;l jfh0w flw;f saofh8");
> > testString.append((char)1000);
> > testString.append("fd0 f023 fkdls anflrwjap fsa[w fjnw f[2- "
> > + "dawjv094 tn3oh9k04r3 309r3hg854mvrm3w0v5nw[0 v9");
> > testString.append((char)10000);
> > testString.append("qmgn vjdsop 00 89w nv3ni0vr nmv p3orm vnrv rm v "
> > + "fw mdndw sjuio490n v uckm4uv4n fj iivkmcj");
> > testString.append((char)100000);
> > testString.append("o489jnrbnv8m 5tjvb6fci9 uv77vj vu v7v 678i9ls "
> > + "fdgo09 i9 r98 jk f78 fm,f juy fiker fdmf");
> > testString.append((char)1000000);
> > testString.append("irmvn 984mn juf78 km 4 d0v76 7 m j37 67k "
> > + "6mbvjk8cv56 6yjn r vcjv u7849md cx;df]c0-8 end");
> > testString.append((char)10000000);
> >
> > // Strip illegal characters.
> > NSTimestamp time1 = new NSTimestamp();
> > ISOLatin1CharacterUtilityArrayMap charUtility = new
> > ISOLatin1CharacterUtilityArrayMap();
> > String resultString =
> > charUtility.stripInvalidCharsFromString(testString.toString());
> > NSTimestamp time2 = new NSTimestamp();
> > GregorianCalendar startCal = new GregorianCalendar();
> > GregorianCalendar endCal = new GregorianCalendar();
> > long diffMillis = 0;
> > startCal.setTime(time1);
> > endCal.setTime(time2);
> > diffMillis = endCal.getTimeInMillis() - startCal.getTimeInMillis();
> > NSLog.debug.appendln("Array map time to parse string: "+diffMillis);
> >
> > testString = new StringBuffer("9km4bnchj 8jk4 v y739dfj jme89vu8 n "
> > + "d';v FRkfvK U*N UO (F&^# VM#I( )*$ JM KOW#M$ @M< IF");
> > testString.append((char)1000);
> > testString.append("4mvj930 fn89 2 no98304 nr0mj v8v87395 09vm vwlr e "
> > + ";vd s,mnrv K VUYRMNVDHJ SUISVI DVO$MMV");
> > testString.append((char)10000);
> > testString.append("i4m *$N lfju67 K$(N kjgurn jkd7 KMN* JND^&V "
> > + "kf9]6l4m,d id 8f4 j md k3idd8j4m cems duij4m");
> > testString.append((char)100000);
> > testString.append("imn4nf8 IUj4nvjud8mner iec 883mnd J893M K "
> > + "VEniw8923m mdwjw8m vmskl w o290894 vw m s s94");
> > testString.append((char)1000000);
> > testString.append("fwjo wo fro3neqwvr03 f94fdwc J VW)RJ)VJ EQW( "
> > + "VNDSHVV@HPNVDSOPV)J*(J V)W)RHjiwo vhjdwj vlj");
> > testString.append((char)10000000);
> >
> > time1 = new NSTimestamp();
> > resultString = charUtility.stripInvalidCharsFromString
> > (testString.toString());
> > time2 = new NSTimestamp();
> > diffMillis = 0;
> > startCal.setTime(time1);
> > endCal.setTime(time2);
> > diffMillis = endCal.getTimeInMillis() - startCal.getTimeInMillis();
> > NSLog.debug.appendln("Array map time to parse string: "+diffMillis);
> >
> > testString = new StringBuffer("fj fjau");
> > testString.append((char)1000);
> > testString.append("fafj daf ds");
> > testString.append((char)10000);
> > testString.append("fjw csl jw ");
> > testString.append((char)100000);
> > testString.append("M)_CQ)");
> > testString.append((char)1000000);
> > testString.append("K(@*NE");
> > testString.append((char)10000000);
> >
> > time1 = new NSTimestamp();
> > resultString = charUtility.stripInvalidCharsFromString
> > (testString.toString());
> > time2 = new NSTimestamp();
> > diffMillis = 0;
> > startCal.setTime(time1);
> > endCal.setTime(time2);
> > diffMillis = endCal.getTimeInMillis() - startCal.getTimeInMillis();
> > NSLog.debug.appendln("Array map time to parse string: "+diffMillis);
> >
> > testString = new StringBuffer("9km4bnchj 8jk4 v y739dfj jme89vu8 n "
> > + "d';v FRkfvK U*N UO (F&^# VM#I( )*$ JM KOW#M$ @M< IF");
> > testString.append((char)1000);
> > testString.append("4mvj930 fn89 2 no98304 nr0mj v8v87395 09vm vwlr e "
> > + ";vd s,mnrv K VUYRMNVDHJ SUISVI DVO$MMV");
> > testString.append((char)10000);
> > testString.append("i4m *$N lfju67 K$(N kjgurn jkd7 KMN* JND^&V "
> > + "kf9]6l4m,d id 8f4 j md k3idd8j4m cems duij4m");
> > testString.append((char)100000);
> > testString.append("imn4nf8 IUj4nvjud8mner iec 883mnd J893M K "
> > + "VEniw8923m mdwjw8m vmskl w o290894 vw m s s94");
> > testString.append((char)1000000);
> > testString.append("fwjo wo fro3neqwvr03 f94fdwc J VW)RJ)VJ EQW( "
> > + "VNDSHVV@HPNVDSOPV)J*(J V)W)RHjiwo vhjdwj vlj");
> > testString.append((char)10000000);
> >
> > time1 = new NSTimestamp();
> > resultString = charUtility.stripInvalidCharsFromString
> > (testString.toString());
> > time2 = new NSTimestamp();
> > diffMillis = 0;
> > startCal.setTime(time1);
> > endCal.setTime(time2);
> > diffMillis = endCal.getTimeInMillis() - startCal.getTimeInMillis();
> > NSLog.debug.appendln("Array map time to parse string: "+diffMillis);
> >
> > testString = new StringBuffer("kfdlas;n 0wqm dsagjnoisa "
> > + "fd;af[aghjr3q-tifnewna fafjpewiq nor0dafnlw;l jfh0w flw;f saofh8");
> > testString.append((char)1000);
> > testString.append("fd0 f023 fkdls anflrwjap fsa[w fjnw f[2- "
> > + "dawjv094 tn3oh9k04r3 309r3hg854mvrm3w0v5nw[0 v9");
> > testString.append((char)10000);
> > testString.append("qmgn vjdsop 00 89w nv3ni0vr nmv p3orm vnrv rm v "
> > + "fw mdndw sjuio490n v uckm4uv4n fj iivkmcj");
> > testString.append((char)100000);
> > testString.append("o489jnrbnv8m 5tjvb6fci9 uv77vj vu v7v 678i9ls "
> > + "fdgo09 i9 r98 jk f78 fm,f juy fiker fdmf");
> > testString.append((char)1000000);
> > testString.append("irmvn 984mn juf78 km 4 d0v76 7 m j37 67k "
> > + "6mbvjk8cv56 6yjn r vcjv u7849md cx;df]c0-8 end");
> > testString.append((char)10000000);
> >
> > time1 = new NSTimestamp();
> > resultString = charUtility.stripInvalidCharsFromString
> > (testString.toString());
> > time2 = new NSTimestamp();
> > diffMillis = 0;
> > startCal.setTime(time1);
> > endCal.setTime(time2);
> > diffMillis = endCal.getTimeInMillis() - startCal.getTimeInMillis();
> > NSLog.debug.appendln("Array map time to parse string: "+diffMillis);
> >
> > Main page = (Main)pageWithName("Main");
> > page.setVersion(resultString);
> >
> > return page;
> > }
> >
> > public WOActionResults charSpeedRegexAction() {
> > // build test string
> > StringBuffer testString = new StringBuffer("kfdlas;n 0wqm dsagjnoisa "
> > + "fd;af[aghjr3q-tifnewna fafjpewiq nor0dafnlw;l jfh0w flw;f saofh8");
> > testString.append((char)1000);
> > testString.append("fd0 f023 fkdls anflrwjap fsa[w fjnw f[2- "
> > + "dawjv094 tn3oh9k04r3 309r3hg854mvrm3w0v5nw[0 v9");
> > testString.append((char)10000);
> > testString.append("qmgn vjdsop 00 89w nv3ni0vr nmv p3orm vnrv rm v "
> > + "fw mdndw sjuio490n v uckm4uv4n fj iivkmcj");
> > testString.append((char)100000);
> > testString.append("o489jnrbnv8m 5tjvb6fci9 uv77vj vu v7v 678i9ls "
> > + "fdgo09 i9 r98 jk f78 fm,f juy fiker fdmf");
> > testString.append((char)1000000);
> > testString.append("irmvn 984mn juf78 km 4 d0v76 7 m j37 67k "
> > + "6mbvjk8cv56 6yjn r vcjv u7849md cx;df]c0-8 end");
> > testString.append((char)10000000);
> >
> > // Strip illegal characters.
> > NSTimestamp time1 = new NSTimestamp();
> > ISOLatin1CharacterUtilityRegex charUtility = new
> > ISOLatin1CharacterUtilityRegex();
> > String resultString =
> > charUtility.stripInvalidCharsFromString(testString.toString());
> > NSTimestamp time2 = new NSTimestamp();
> > GregorianCalendar startCal = new GregorianCalendar();
> > GregorianCalendar endCal = new GregorianCalendar();
> > long diffMillis = 0;
> > startCal.setTime(time1);
> > endCal.setTime(time2);
> > diffMillis = endCal.getTimeInMillis() - startCal.getTimeInMillis();
> > NSLog.debug.appendln("Regex time to parse string: "+diffMillis);
> >
> > testString = new StringBuffer("9km4bnchj 8jk4 v y739dfj jme89vu8 n "
> > + "d';v FRkfvK U*N UO (F&^# VM#I( )*$ JM KOW#M$ @M< IF");
> > testString.append((char)1000);
> > testString.append("4mvj930 fn89 2 no98304 nr0mj v8v87395 09vm vwlr e "
> > + ";vd s,mnrv K VUYRMNVDHJ SUISVI DVO$MMV");
> > testString.append((char)10000);
> > testString.append("i4m *$N lfju67 K$(N kjgurn jkd7 KMN* JND^&V "
> > + "kf9]6l4m,d id 8f4 j md k3idd8j4m cems duij4m");
> > testString.append((char)100000);
> > testString.append("imn4nf8 IUj4nvjud8mner iec 883mnd J893M K "
> > + "VEniw8923m mdwjw8m vmskl w o290894 vw m s s94");
> > testString.append((char)1000000);
> > testString.append("fwjo wo fro3neqwvr03 f94fdwc J VW)RJ)VJ EQW( "
> > + "VNDSHVV@HPNVDSOPV)J*(J V)W)RHjiwo vhjdwj vlj");
> > testString.append((char)10000000);
> >
> > time1 = new NSTimestamp();
> > resultString = charUtility.stripInvalidCharsFromString
> > (testString.toString());
> > time2 = new NSTimestamp();
> > diffMillis = 0;
> > startCal.setTime(time1);
> > endCal.setTime(time2);
> > diffMillis = endCal.getTimeInMillis() - startCal.getTimeInMillis();
> > NSLog.debug.appendln("Regex time to parse string: "+diffMillis);
> >
> > testString = new StringBuffer("fj fjau");
> > testString.append((char)1000);
> > testString.append("fafj daf ds");
> > testString.append((char)10000);
> > testString.append("fjw csl jw ");
> > testString.append((char)100000);
> > testString.append("M)_CQ)");
> > testString.append((char)1000000);
> > testString.append("K(@*NE");
> > testString.append((char)10000000);
> >
> > time1 = new NSTimestamp();
> > resultString = charUtility.stripInvalidCharsFromString
> > (testString.toString());
> > time2 = new NSTimestamp();
> > diffMillis = 0;
> > startCal.setTime(time1);
> > endCal.setTime(time2);
> > diffMillis = endCal.getTimeInMillis() - startCal.getTimeInMillis();
> > NSLog.debug.appendln("Regex time to parse string: "+diffMillis);
> >
> > testString = new StringBuffer("9km4bnchj 8jk4 v y739dfj jme89vu8 n "
> > + "d';v FRkfvK U*N UO (F&^# VM#I( )*$ JM KOW#M$ @M< IF");
> > testString.append((char)1000);
> > testString.append("4mvj930 fn89 2 no98304 nr0mj v8v87395 09vm vwlr e "
> > + ";vd s,mnrv K VUYRMNVDHJ SUISVI DVO$MMV");
> > testString.append((char)10000);
> > testString.append("i4m *$N lfju67 K$(N kjgurn jkd7 KMN* JND^&V "
> > + "kf9]6l4m,d id 8f4 j md k3idd8j4m cems duij4m");
> > testString.append((char)100000);
> > testString.append("imn4nf8 IUj4nvjud8mner iec 883mnd J893M K "
> > + "VEniw8923m mdwjw8m vmskl w o290894 vw m s s94");
> > testString.append((char)1000000);
> > testString.append("fwjo wo fro3neqwvr03 f94fdwc J VW)RJ)VJ EQW( "
> > + "VNDSHVV@HPNVDSOPV)J*(J V)W)RHjiwo vhjdwj vlj");
> > testString.append((char)10000000);
> >
> > time1 = new NSTimestamp();
> > resultString = charUtility.stripInvalidCharsFromString
> > (testString.toString());
> > time2 = new NSTimestamp();
> > diffMillis = 0;
> > startCal.setTime(time1);
> > endCal.setTime(time2);
> > diffMillis = endCal.getTimeInMillis() - startCal.getTimeInMillis();
> > NSLog.debug.appendln("Regex time to parse string: "+diffMillis);
> >
> > testString = new StringBuffer("kfdlas;n 0wqm dsagjnoisa "
> > + "fd;af[aghjr3q-tifnewna fafjpewiq nor0dafnlw;l jfh0w flw;f saofh8");
> > testString.append((char)1000);
> > testString.append("fd0 f023 fkdls anflrwjap fsa[w fjnw f[2- "
> > + "dawjv094 tn3oh9k04r3 309r3hg854mvrm3w0v5nw[0 v9");
> > testString.append((char)10000);
> > testString.append("qmgn vjdsop 00 89w nv3ni0vr nmv p3orm vnrv rm v "
> > + "fw mdndw sjuio490n v uckm4uv4n fj iivkmcj");
> > testString.append((char)100000);
> > testString.append("o489jnrbnv8m 5tjvb6fci9 uv77vj vu v7v 678i9ls "
> > + "fdgo09 i9 r98 jk f78 fm,f juy fiker fdmf");
> > testString.append((char)1000000);
> > testString.append("irmvn 984mn juf78 km 4 d0v76 7 m j37 67k "
> > + "6mbvjk8cv56 6yjn r vcjv u7849md cx;df]c0-8 end");
> > testString.append((char)10000000);
> >
> > time1 = new NSTimestamp();
> > resultString = charUtility.stripInvalidCharsFromString
> > (testString.toString());
> > time2 = new NSTimestamp();
> > diffMillis = 0;
> > startCal.setTime(time1);
> > endCal.setTime(time2);
> > diffMillis = endCal.getTimeInMillis() - startCal.getTimeInMillis();
> > NSLog.debug.appendln("Regex time to parse string: "+diffMillis);
> >
> >
> > Main page = (Main)pageWithName("Main");
> > page.setVersion(resultString);
> >
> > return page;
> > }
> >
> > }
> >
> >
> > ISOLatin1CharacterUtilityArrayMap.java
> > ----------------------------------------------------------
> > //
> > // ISOLatin1CharacterUtilityArrayMap.java
> > // Norway
> > //
> > // Created by Eric Stewart on 2/19/06.
> > // Copyright 2006 __MyCompanyName__. All rights reserved.
> > //
> >
> > public class ISOLatin1CharacterUtilityArrayMap {
> > private boolean[] charMap = new boolean[256];
> >
> > public ISOLatin1CharacterUtilityArrayMap() {
> > // Initialize ISO-8859-1 character map.
> > charMap[9] = true; charMap[10] = true; charMap[13] = true;
> > charMap[32] = true;
> > charMap[33] = true; charMap[34] = true; charMap[35] = true;
> > charMap[36] = true;
> > charMap[37] = true; charMap[38] = true; charMap[39] = true;
> > charMap[40] = true;
> > charMap[41] = true; charMap[42] = true; charMap[43] = true;
> > charMap[44] = true;
> > charMap[45] = true; charMap[46] = true; charMap[47] = true;
> > charMap[48] = true;
> > charMap[49] = true; charMap[50] = true; charMap[51] = true;
> > charMap[52] = true;
> > charMap[53] = true; charMap[54] = true; charMap[55] = true;
> > charMap[56] = true;
> > charMap[57] = true; charMap[58] = true; charMap[59] = true;
> > charMap[60] = true;
> > charMap[61] = true; charMap[62] = true; charMap[63] = true;
> > charMap[64] = true;
> > charMap[65] = true; charMap[66] = true; charMap[67] = true;
> > charMap[68] = true;
> > charMap[69] = true; charMap[70] = true; charMap[71] = true;
> > charMap[72] = true;
> > charMap[73] = true; charMap[74] = true; charMap[75] = true;
> > charMap[76] = true;
> > charMap[77] = true; charMap[78] = true; charMap[79] = true;
> > charMap[80] = true;
> > charMap[81] = true; charMap[82] = true; charMap[83] = true;
> > charMap[84] = true;
> > charMap[85] = true; charMap[86] = true; charMap[87] = true;
> > charMap[88] = true;
> > charMap[89] = true; charMap[90] = true; charMap[91] = true;
> > charMap[92] = true;
> > charMap[93] = true; charMap[94] = true; charMap[95] = true;
> > charMap[96] = true;
> > charMap[97] = true; charMap[98] = true; charMap[99] = true;
> > charMap[100] = true;
> > charMap[101] = true; charMap[102] = true; charMap[103] = true;
> > charMap[104] = true;
> > charMap[105] = true; charMap[106] = true; charMap[107] = true;
> > charMap[108] = true;
> > charMap[109] = true; charMap[110] = true; charMap[111] = true;
> > charMap[112] = true;
> > charMap[113] = true; charMap[114] = true; charMap[115] = true;
> > charMap[116] = true;
> > charMap[117] = true; charMap[118] = true; charMap[119] = true;
> > charMap[120] = true;
> > charMap[121] = true; charMap[122] = true; charMap[123] = true;
> > charMap[124] = true;
> > charMap[125] = true; charMap[126] = true; charMap[160] = true;
> > charMap[161] = true;
> > charMap[162] = true; charMap[163] = true; charMap[164] = true;
> > charMap[165] = true;
> > charMap[166] = true; charMap[167] = true; charMap[168] = true;
> > charMap[169] = true;
> > charMap[170] = true; charMap[171] = true; charMap[172] = true;
> > charMap[173] = true;
> > charMap[174] = true; charMap[175] = true; charMap[176] = true;
> > charMap[177] = true;
> > charMap[178] = true; charMap[179] = true; charMap[180] = true;
> > charMap[181] = true;
> > charMap[182] = true; charMap[183] = true; charMap[184] = true;
> > charMap[185] = true;
> > charMap[186] = true; charMap[187] = true; charMap[188] = true;
> > charMap[189] = true;
> > charMap[190] = true; charMap[191] = true; charMap[192] = true;
> > charMap[193] = true;
> > charMap[194] = true; charMap[195] = true; charMap[196] = true;
> > charMap[197] = true;
> > charMap[198] = true; charMap[199] = true; charMap[200] = true;
> > charMap[201] = true;
> > charMap[202] = true; charMap[203] = true; charMap[204] = true;
> > charMap[205] = true;
> > charMap[206] = true; charMap[207] = true; charMap[208] = true;
> > charMap[209] = true;
> > charMap[210] = true; charMap[211] = true; charMap[212] = true;
> > charMap[213] = true;
> > charMap[214] = true; charMap[215] = true; charMap[216] = true;
> > charMap[217] = true;
> > charMap[218] = true; charMap[219] = true; charMap[220] = true;
> > charMap[221] = true;
> > charMap[222] = true; charMap[223] = true; charMap[224] = true;
> > charMap[225] = true;
> > charMap[226] = true; charMap[227] = true; charMap[228] = true;
> > charMap[229] = true;
> > charMap[230] = true; charMap[231] = true; charMap[232] = true;
> > charMap[233] = true;
> > charMap[234] = true; charMap[235] = true; charMap[236] = true;
> > charMap[237] = true;
> > charMap[238] = true; charMap[239] = true; charMap[240] = true;
> > charMap[241] = true;
> > charMap[242] = true; charMap[243] = true; charMap[244] = true;
> > charMap[245] = true;
> > charMap[246] = true; charMap[247] = true; charMap[248] = true;
> > charMap[249] = true;
> > charMap[250] = true; charMap[251] = true; charMap[252] = true;
> > charMap[253] = true;
> > charMap[254] = true; charMap[255] = true;
> > }
> >
> > /*
> > * Determines if a char is a valid ISO-8859-1 character.
> > */
> > public boolean isCharValid(char value) {
> > if (((int)value) < 256 && charMap[(int)value]) {
> > return true;
> > } else {
> > return false;
> > }
> > }
> >
> > /*
> > * Returns a string clean of all invalid ISO-8859-1 characters.
> > */
> > public String stripInvalidCharsFromString(String value) {
> > StringBuffer buffer = new StringBuffer();
> > char[] charArray = value.toCharArray();
> > int charArrayLength = charArray.length;
> > for (int i = 0; i < charArrayLength; i++) {
> > if (((int)charArray[i]) < 256 && charMap[(int)charArray[i]]) {
> > buffer.append(charArray[i]);
> > }
> > }
> > return buffer.toString();
> > }
> > }
> >
> > ISOLatin1CharacterUtilityRegex.java
> > -----------------------------------------------------
> > //
> > // ISOLatin1CharCleaner.java
> > // Norway
> > //
> > // Created by Eric Stewart on 2/15/06.
> > // Copyright 2006 __MyCompanyName__. All rights reserved.
> > //
> >
> > import java.util.regex.*;
> >
> > public class ISOLatin1CharacterUtilityRegex {
> >
> > public ISOLatin1CharacterUtilityRegex() {
> > }
> >
> > /*
> > * Returns a string clean of all invalid ISO-8859-1 characters.
> > */
> > public String stripInvalidCharsFromString(String value) {
> > String regExp = "[^\\x09\\x0A\\x0D\\x20-\\x7E\\xA0-\\xFF]+";
> > Pattern p = Pattern.compile(regExp);
> > String result = p.matcher(value).replaceAll("");
> > return result;
> > }
> > }
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Webobjects-dev mailing list (email@hidden)
Help/Unsubscribe/Update your Subscription:
This email sent to email@hidden