Avoiding duplicate records
- Subject: Avoiding duplicate records
- From: Miguel Arroz <email@hidden>
- Date: Tue, 15 Jan 2008 14:55:27 +0000
Hi!
I'm thinking about how to approach the following problem, and I would
like to hear opinions on it, because I may be overcomplicating
things, as I often do.
I need to manage contact lists. A contact is an object with an
email, first name, last name, and some flags. The important thing is
the email, that's what makes a contact unique.
A contact list may have tens of thousands of contacts (this is not
a theoretical limit, it's a requirement), and cannot have duplicate
records (i.e., two contacts with the same email).
Well, my first approach is to create a UNIQUE constraint on the DB that
will prevent the existence of two records with the same email on the
same contact list.
Then, let's suppose I have a contact list with 10k contacts, and
I'm adding another 10k contacts. The basic approach is:
1) Divide the 10k in batches of 100, to make this manageable.
2) Try to insert the 100 contacts.
3) If an exception is raised due to the UNIQUE constraint, remove the
offending object and try again.
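
The three steps above can be sketched roughly as follows. This is only an illustration in plain Java, not WebObjects/EOF code: InMemoryStore, DuplicateEmailException and insertAll are hypothetical names, and the store simply simulates a table with a UNIQUE constraint that rejects the whole batch on the first offending email (the behaviour assumed later in this post).

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class BatchInsertSketch {

    /** Hypothetical exception standing in for a UNIQUE constraint violation. */
    static class DuplicateEmailException extends Exception {
        final String email;

        DuplicateEmailException(String email) {
            super("duplicate email: " + email);
            this.email = email;
        }
    }

    /** In-memory stand-in for the table; each batch is all-or-nothing. */
    static class InMemoryStore {
        final Set<String> emails = new HashSet<>();
        int attempts = 0;

        void insertBatch(List<String> batch) throws DuplicateEmailException {
            attempts++;
            Set<String> staged = new HashSet<>();
            for (String email : batch) {
                // Fail on the first offender (assumption about the database).
                if (emails.contains(email) || !staged.add(email)) {
                    throw new DuplicateEmailException(email);
                }
            }
            emails.addAll(staged);
        }
    }

    /** Splits the input into batches; on failure drops the offender and retries. */
    static int insertAll(InMemoryStore store, List<String> emails, int batchSize) {
        int inserted = 0;
        for (int start = 0; start < emails.size(); start += batchSize) {
            int end = Math.min(start + batchSize, emails.size());
            List<String> batch = new ArrayList<>(emails.subList(start, end));
            while (!batch.isEmpty()) {
                try {
                    store.insertBatch(batch);
                    inserted += batch.size();
                    break;
                } catch (DuplicateEmailException e) {
                    // Remove one occurrence of the offending email and try again.
                    batch.remove(e.email);
                }
            }
        }
        return inserted;
    }
}
```

Each failed attempt removes exactly one contact, so a batch with k duplicates needs k + 1 attempts before it goes through.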
This has an obvious problem, which is the fact that in the worst
case, all 100 contacts may be duplicates, making this very inefficient.
So, what I thought was, if I have a failure:
1) Do a fetch request to get the contacts with the emails of the
100 contacts batch (i.e., blablabla where email = email1 or email =
email2 or email = email3 ...).
2) Remove duplicates in memory using a fast method, like putting
the stuff in NSSets or whatever.
3) Try to save again. Of course, it may still fail (concurrency
sucks) but the probability is much lower.
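
A minimal sketch of steps 1 and 2 of this fallback, again in plain Java rather than WebObjects/EOF. The names removeDuplicates and fetchExisting are hypothetical; fetchExisting stands for whatever single fetch returns the emails from the batch that are already in the contact list.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

class PreFilterSketch {

    /**
     * Returns the emails from the batch that look safe to insert: no repeats
     * within the batch and none already reported by fetchExisting.
     */
    static List<String> removeDuplicates(
            List<String> batch,
            Function<Collection<String>, Set<String>> fetchExisting) {
        // LinkedHashSet drops in-batch repeats while keeping the original order.
        Set<String> candidates = new LinkedHashSet<>(batch);
        // One round trip: "... where email = e1 or email = e2 ...".
        Set<String> existing = fetchExisting.apply(candidates);
        // Set difference in memory instead of one failed insert per duplicate.
        candidates.removeAll(existing);
        return new ArrayList<>(candidates);
    }
}
```

The filtered list still has to go through the normal save, and the UNIQUE constraint remains the real guard: another session may insert one of these emails between the fetch and the save.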
This all assumes that the UNIQUE-related exception is thrown when
the first offending object is inserted, so I won't get all the
information I need in one single exception. I'm not 100% sure
that's true yet.
So... suggestions! Is this too crappy? :)
Yours
Miguel Arroz
http://www.terminalapp.net
http://www.ipragma.com
Webobjects-dev mailing list (email@hidden)