Re: Hierarchical Data Handling
Re: Hierarchical Data Handling
- Subject: Re: Hierarchical Data Handling
- From: Mike Schrag <email@hidden>
- Date: Mon, 16 Jul 2007 13:24:33 -0400
I don't recall the specific email you're referring, to, but welcome
to one of the crappiest things to model for performance ... There is
no good way to solve this problem that I've found. There are a bunch
of really clever techniques, but you really can't escape doing a
bunch of updates for each insert if you want to be able to query
fast. Our case of this we have a graph structure rather than a tree
structure, which makes it even nastier. We ended up flattening out
the node relationships, so if A=>B=>C, there is an AB, BC, and AC
record, each of which has a direct reference count and inherited
reference count. This reference count is necessary because if you
have A=>B=>C, A=>D=>B=>C, removing B from A does not actually remove
C from A, because it's a descendent of A through two ways. Instead,
you can decrement the appropriate reference counts, and only when the
sum of direct+inherited = 0 do you ACTUALLY remove the relationship.
All of these methods have their own problems, but this has worked OK
for us. The big hit you take is if you have a larger number of leaf
nodes, because you end up generating a lot of inherited relationship
entries that you would not otherwise have. However, in exchange, you
get basically direct querying (you can just query for leaves in A and
get all the descendants). You could also limit the depth and
represent the hierarchy in a flat structure record, but this ends up
being even strange, in my experience.
Hopefully this will give you a start ... There is a lot written on
this particular topic, but almost all the solutions are surprisingly
bad given how common of a problem it is (this essentially is the
employee-manager model, which has got to be the most common database
model ever made). Oracle has an actual keyword specifically for this
(for trees, not for graphs), but it's really limited with what you
can do once you query against it (at least back in 9i, you could not
join a result that you obtained with a "tree query"). There are some
really clever implementations that use "high-low" values on nodes
that you can use to find all the descendants, but these also require
a lot of recalculation. Basically, there's no free lunch on this one.
ms
On Jul 16, 2007, at 12:01 PM, Cornelius Jaeger wrote:
Hi Mike
I remember seeing a post you sent about dealing with hierarchical
data, but i can't find it anymore.
I have a "convenience" model that i handle using recursion and the
data sets have just grown too large so i think the fetches are too
slow.
something with appending path values using a flat table rather than
a hierarchical one i thinked you talked about, i remember seeing
lots of letters A.B.D.R.D. etc
can you send me that post again or give me a tip/reading assignment
on how to implement hierarchies in one table or flat tables?
cheers
cornelius
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Webobjects-dev mailing list (email@hidden)
Help/Unsubscribe/Update your Subscription:
This email sent to email@hidden