Re: NSLinguisticTagger
- Subject: Re: NSLinguisticTagger
- From: "Gerriet M. Denkmann" <email@hidden>
- Date: Wed, 24 Sep 2014 12:02:11 +0700
On 24 Sep 2014, at 11:46, Roland King <email@hidden> wrote:
>
>> On 24 Sep 2014, at 12:31 pm, Gerriet M. Denkmann <email@hidden> wrote:
>>
>> I have a problem with NSLinguisticTagger / CFStringTokenizer on iOS 8.0
>>
>> OS X 10.9.5 (and iOS 7 and earlier) parses "สีเหลือง" quite rightly as two words: "สี" = colour and "เหลือง" = yellow.
>>
>> No dictionary will ever contain "yellow colour". Every dictionary will contain "yellow" and "colour".
>> There are hundreds, if not thousands, of such expressions that iOS 8 now wrongly classifies as one word.
>> This might have something to do with the new predictive keyboard.
>>
>> I am not writing this to complain, though, but to ask a favour: could anybody on 10.10 just double-click anywhere in "สีเหลือง" and tell me whether the whole string gets highlighted, or just a part (as in 10.9.5)?
>
>
> If I double click anywhere on the right of that I get the second part (all bar the first character) highlighted. Clicking on the first character I get just that character. So 10.10 (beta 8) splits that sequence into two ‘words’.
This is a big relief. Thanks a lot.
>
> Why do you suspect the predictive keyboard? It certainly wouldn’t be the first thing I’d think of on seeing that issue. I would probably assume instead that I’d written myself a bug.
Well, here is the code; maybe you can find a bug:
let text = "สีเหลือง"

// Create a tagger for the schemes of interest; no special options.
let opts: Int = 0
let schemes = [ NSLinguisticTagSchemeTokenType, NSLinguisticTagSchemeNameTypeOrLexicalClass ]
let tagger = NSLinguisticTagger(tagSchemes: schemes, options: opts)

// NSRange counts UTF-16 units, so take the length from the NSString.
let nsText = text as NSString
let length = nsText.length
tagger.string = nsText

let range = NSMakeRange(0, length)
let theScheme = NSLinguisticTagSchemeTokenType
let ops = NSLinguisticTaggerOptions(0)

// Print every token together with its tag.
tagger.enumerateTagsInRange(
    range,
    scheme: theScheme,
    options: ops,
    usingBlock: {
        ( tag: String!,
          tokenRange: NSRange,
          sentenceRange: NSRange,
          stop: UnsafeMutablePointer<ObjCBool>
        ) -> Void in
        let word = nsText.substringWithRange(tokenRange)
        println("\(tag) = \(word) ")
    }
)
Gerriet.
_______________________________________________
Cocoa-dev mailing list (email@hidden)