Re: [OT] File Encoding question again

Subject: Re: [OT] File Encoding question again
From: Ondra Cada <email@hidden>
Date: Mon, 5 Dec 2005 20:37:01 +0100

Hi,

On 5.12.2005, at 20:23, Chuck Hill wrote:

Is there a test I can run on the string (or the file) which gives back the current encoding? I could not find one in the Javadoc. I can change the encoding, but as far as I understand it I need to know which encoding the file came with.

Is there a way to know? When the user exports from Excel on a Windows machine, he neither knows nor cares which encoding is used (nor could I find a way to tell Excel which encoding to use). But in Germany we have all those umlauts, so we need to be careful with the encoding.

This is not really (at all) my area of expertise, but I don't think it is possible to know by examining the file. AFAIK, all the 8-bit encodings (Mac Roman, UTF-8, Win Latin, Western Latin, etc.) are too similar to guess correctly.

There are heuristics which can give *comparatively* good results, but they are *never* absolutely dependable, and they tend to be complex (and you generally need to know the language the text is written in). We are speaking of uglies like frequency analysis of the text, or even decoding the text with each candidate encoding, running the results through a spellchecker, and selecting the encoding which causes the smallest number of unknown words.
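To make the idea concrete, here is a minimal sketch of such a heuristic in Java. It is only an illustration under stated assumptions, not a dependable detector: the class and method names (`EncodingGuess`, `guess`, `score`) and the scoring weights are made up for this example, and the caller has to supply the letters expected in the language (for German, the umlauts and ß). It rejects any candidate that cannot decode the bytes strictly, then prefers the candidate whose decoded text contains the most expected letters and the fewest control or replacement characters.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.util.List;

public class EncodingGuess {

    // Hypothetical scoring: reward letters we expect in the language,
    // punish characters that usually indicate a wrong decoding.
    static int score(String text, String expectedLetters) {
        int score = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (expectedLetters.indexOf(c) >= 0) {
                score += 2;
            } else if (c == '\uFFFD'
                    || (Character.isISOControl(c) && !Character.isWhitespace(c))) {
                score -= 5; // replacement char or stray control char
            } else if (c > 127 && !Character.isLetter(c)) {
                score -= 1; // odd non-ASCII symbol such as a fraction sign
            }
        }
        return score;
    }

    // Returns the best-scoring candidate, or null if none can decode the data.
    // On a tie, the candidate listed first wins.
    static Charset guess(byte[] data, List<Charset> candidates, String expectedLetters) {
        Charset best = null;
        int bestScore = Integer.MIN_VALUE;
        for (Charset cs : candidates) {
            String text;
            try {
                text = cs.newDecoder()
                        .onMalformedInput(CodingErrorAction.REPORT)
                        .onUnmappableCharacter(CodingErrorAction.REPORT)
                        .decode(ByteBuffer.wrap(data))
                        .toString();
            } catch (CharacterCodingException e) {
                continue; // the bytes are not valid in this encoding
            }
            int s = score(text, expectedLetters);
            if (s > bestScore) {
                bestScore = s;
                best = cs;
            }
        }
        return best;
    }
}
```

A call might look like `EncodingGuess.guess(bytes, List.of(StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1), "\u00e4\u00f6\u00fc\u00c4\u00d6\u00dc\u00df")`. Listing UTF-8 first helps, because text in an 8-bit encoding that contains umlauts is rarely valid UTF-8 by accident. Two 8-bit encodings that place the expected letters at the same byte values (ISO-8859-1 and Windows Latin 1, for instance) will still tie, which is exactly why such guesses are never fully dependable.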

Unless one really has to guess automatically, it is much better to let the user select the encoding freely, with feedback (are the data displayed all right? If not, try another...).
---
Ondra Čada
OCSoftware: email@hidden http://www.ocs.cz
private email@hidden http://www.ocs.cz/oc
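The "let the user choose, with feedback" approach could be sketched roughly as follows. This is an assumption-laden illustration, not WebObjects API: `EncodingPreview` and `previews` are hypothetical names, and the candidate list is whatever encodings you decide to offer. The method decodes the same uploaded bytes once per candidate so the page can show a short sample for each, and the user picks the one in which the umlauts look right.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class EncodingPreview {

    // Decodes the data with every candidate charset and returns the first
    // maxChars characters of each result, in the order the candidates were given.
    static Map<Charset, String> previews(byte[] data, List<Charset> candidates, int maxChars) {
        Map<Charset, String> result = new LinkedHashMap<>();
        for (Charset cs : candidates) {
            // new String(byte[], Charset) never throws on bad input;
            // malformed sequences show up as U+FFFD in the preview.
            String text = new String(data, cs);
            result.put(cs, text.length() > maxChars ? text.substring(0, maxChars) : text);
        }
        return result;
    }
}
```

For large files it would be cheaper to decode only a leading slice of the bytes, at the cost of possibly cutting a multi-byte character at the end of the slice.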





References:  
  >[OT] File Encoding question again (From: Ute Hoffmann <email@hidden>)
  >Re: [OT] File Encoding question again (From: Chuck Hill <email@hidden>)



