• Open Menu Close Menu
  • Apple
  • Shopping Bag
  • Apple
  • Mac
  • iPad
  • iPhone
  • Watch
  • TV
  • Music
  • Support
  • Search apple.com
  • Shopping Bag

Lists

Open Menu Close Menu
  • Terms and Conditions
  • Lists hosted on this site
  • Email the Postmaster
  • Tips for posting to public mailing lists
Re: Use case for a "case insensitive equals" qualifier
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Use case for a "case insensitive equals" qualifier


  • Subject: Re: Use case for a "case insensitive equals" qualifier
  • From: "Morris, Mark" <email@hidden>
  • Date: Thu, 25 Feb 2016 16:43:47 +0000
  • Thread-topic: Use case for a "case insensitive equals" qualifier

Hi Samuel,

That’s certainly a possibility we’ve talked about, and for sure it would improve the performance vs. my solution. The tradeoffs to consider are that we have many columns across many tables where this is used, with 100’s of millions of records, and so each new index is a huge chunk of storage space. I think in the end, though, we should probably do as you suggest, at least for a selected subset of those columns.

As a general case of just optimizing the qualifier though, having the 4 index lookups, with the merge, is in practice much faster than one full table scan, so it seems like a useful tool to have.

Regards,
Mark

On Feb 25, 2016, at 10:16 AM, Samuel Pelletier <email@hidden> wrote:

Hi Mark,

I suggest you look into adding a collation to your column or index. I do not know how Oracle does this but it surely exists.

With an index using a collation, your query is trivial and much more efficient because the index is already case insensitive, there is no OR.

A query like the one you build generate multiple index lookup, one for each part of the OR and the server need to merge all those results before iterating into the rows, compute the UPPER for each rows and compare. Imagine the speedup if a single index lookup can be used, I would expect at least 10X faster on large dataset... probably 100X if multiple OR parts returns 1000s rows.

Regards,

Samuel

Le 25 févr. 2016 à 10:16, Morris, Mark <email@hidden> a écrit :

Hi Lon,

We run into this a lot, and I have a rather complex solution for a “case insensitive begins with” qualifier that actually works well, at least with Oracle and its indexing. My method returning this qualifier is actually about 60 lines, but most of that is optimizations for special cases. (We have terabytes of data, so every little bit helps!)

The basic idea, though, is that it only takes a couple of characters to hit on an index and make a huge difference in performance. So…

Trying to case-insensitive-equals match "email@hidden” with the search term “email@hidden”, you would build this sort of qualifier:

(emailAddress like ‘PA*’ OR emailAddress like ‘Pa*’ OR emailAddress like ‘pA*’ OR emailAddress like ‘pa*’) AND UPPER(emailAddress) = UPPER('email@hidden’)

Hope this helps!

Regards,
Mark

On Feb 24, 2016, at 7:12 PM, Chuck Hill <email@hidden> wrote:

If you always want the comparison to be case insensitive, then you should be able to specify a database collation  for the column that is insensitive.   Then just just a regular equals qualifier.  If you want it case sensitive sometimes and insensitive others, that is a different problem.

Chuck

From: <webobjects-dev-bounces+chill=email@hidden> on behalf of Lon Varscsak <email@hidden>
Date: Wednesday, February 24, 2016 at 5:08 PM
To: Paul Hoadley <email@hidden>
Cc: WebObjects Development <email@hidden>
Subject: Re: Use case for a "case insensitive equals" qualifier

Hmm, it would be interesting if there was an alternative, but really the problem lies in the database.  Any wildcard type searches (at least that start with a wildcard) are not likely to use indexes and cause a table scan.  If your database supported the concept of having case-insensitive indexes, then I would think it would be pretty trivial to implement in your own qualifier.  In Sybase (the database I use mostly) you can create a functional index (an index based on a function), but it’s the equivalent creating a second column that’s always lower and maintaining an index on it.  The difference being that the database maintains it for you (which is usually annoying for EOF).

I’ve always wanted a better way to do this too.

-Lon

On Wed, Feb 24, 2016 at 5:13 PM, Paul Hoadley <email@hidden> wrote:
Hello,

Say you have a web application where the login identifier is the user’s email address. This works in the conventional way: the user supplies that address at sign-up, and it serves two in-app functions: login identifier, and actual email address to which notifications can be sent. This is a fairly common pattern among some large, modern web apps.

It turns out that not everyone understands case sensitivity. We are seeing login failures in the wild because a user that signed up as “email@hidden” is now trying to log in with “email@hidden”, or vice versa. Here are some facts:

1. It would seem to be at least reasonably common for modern web apps that use email addresses as login identifiers to ignore case at login time. (For example, I tested a couple I had open in browser tabs: Strava and Bitbucket ignore case.)

2. Although the domain part of an email address is case-insensitive, my understanding is that the relevant RFCs suggest that you shouldn’t make assumptions about the local part. While everything I’ve read claims that in practice it will make no difference, let’s assume that we need to preserve the address as entered at sign-up. (It’s fail-safe to do so, whether we strictly need to or not.)

So, 1 is our aim: ignore case on the login identifier at login time. But because of 2, we don’t want to, say, normalise the email address given at sign-up to lower case and just store that, on the off chance that it makes a difference for mail delivery for that particular user. (Again, it probably won’t, but let’s assume that it could for the exercise.)

What are our options for finding the right User entity at login time?

1. We can jump in and naively use a CaseInsensitiveLike qualifier, but then a user can stick ‘?’ and ‘*’ wildcards in the input. We could strip those out, but they’re actually both valid characters in the local part. I stopped short of trying to escape them, as this route is starting to seem a little dangerous.

2. We could track both the supplied and a lower-cased version of the identifier in separate attributes. This has the advantage of presumably working, but it’s awkward, requiring special attention to changing the normalised attribute when the user-supplied one changes.

Can anyone suggest a better way? What I really need is a CaseInsensitiveEquals qualifier, like Java’s equalsIgnoreCase(). Is there such a thing? Would it be easily implemented?


-- 
Paul Hoadley
http://logicsquad.net/




 _______________________________________________
Do not post admin requests to the list. They will be ignored.
Webobjects-dev mailing list      (email@hidden)
Help/Unsubscribe/Update your Subscription:

This email sent to email@hidden

_______________________________________________
Do not post admin requests to the list. They will be ignored.
Webobjects-dev mailing list      (email@hidden)
Help/Unsubscribe/Update your Subscription:

This email sent to email@hidden

_______________________________________________
Do not post admin requests to the list. They will be ignored.
Webobjects-dev mailing list      (email@hidden)
Help/Unsubscribe/Update your Subscription:

This email sent to email@hidden


 _______________________________________________
Do not post admin requests to the list. They will be ignored.
Webobjects-dev mailing list      (email@hidden)
Help/Unsubscribe/Update your Subscription:

This email sent to email@hidden

References: 
 >Use case for a "case insensitive equals" qualifier (From: Paul Hoadley <email@hidden>)
 >Re: Use case for a "case insensitive equals" qualifier (From: Lon Varscsak <email@hidden>)
 >Re: Use case for a "case insensitive equals" qualifier (From: Chuck Hill <email@hidden>)
 >Re: Use case for a "case insensitive equals" qualifier (From: "Morris, Mark" <email@hidden>)
 >Re: Use case for a "case insensitive equals" qualifier (From: Samuel Pelletier <email@hidden>)

  • Prev by Date: Re: Use case for a "case insensitive equals" qualifier
  • Next by Date: Re: Use case for a "case insensitive equals" qualifier
  • Previous by thread: Re: Use case for a "case insensitive equals" qualifier
  • Next by thread: Re: Use case for a "case insensitive equals" qualifier
  • Index(es):
    • Date
    • Thread