Re: Implementing a pivot table / cross table
- Subject: Re: Implementing a pivot table / cross table
- From: Jean-François Veillette <email@hidden>
- Date: Fri, 17 Mar 2006 10:01:34 -0500
On 06-03-17, at 08:51, Rico Landefeld wrote:
hi jean-françois,
thank you for your answer. could you send me your class? that would be nice. for better understanding i'll describe my requirements in a little more depth.
i have an entity "failure" and some "failure attribute" entities like "outcome", "occurrence", "origin" with relations to "failure". these failure attributes are hierarchically organized, for example "extern -> subcontractor -> company_1" in the case of origin. only the root object of a hierarchy has a relation to "failure". the hierarchy is at most 4 levels deep. i want to analyze the costs and number of failures with the failure attributes (entities) as dimensions.
Ok, from what I understand, you need to fetch all the 'Failure' objects you want to analyse. That is your population.
You define a 'dimension' as a key-path (the default), or as whatever you want (using a delegate to do the segmentation).
Assuming a key-path will do (you decide), just find the key-path (crossing relationships is no problem) that gives you the value to use for segmentation.
For example, people with a 'sex' attribute can give you 'M', 'F' and null, so you can segment on 'sex' and you will get buckets filled with boys, girls and people with a null sex (if any).
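A minimal sketch of that bucketing idea, using plain java.util collections instead of NSArray so it runs without WebObjects. The Person class and the segmentBySex name are assumptions made up for this illustration; they are not part of the class posted below.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical population class, used only for this illustration.
class Person {
    final String name;
    final String sex; // "M", "F", or null when unknown

    Person(String name, String sex) {
        this.name = name;
        this.sex = sex;
    }
}

class SexSegmentation {
    /** Puts each person in the bucket for its 'sex' value; null is a valid bucket key. */
    static Map<String, List<Person>> segmentBySex(List<Person> population) {
        Map<String, List<Person>> buckets = new HashMap<String, List<Person>>();
        for (Person p : population) {
            List<Person> bucket = buckets.get(p.sex);
            if (bucket == null) {
                bucket = new ArrayList<Person>();
                buckets.put(p.sex, bucket); // HashMap accepts a null key
            }
            bucket.add(p);
        }
        return buckets;
    }
}
```

The map ends up with one entry per distinct value, including one for null if any person has no 'sex' set.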
In your case, I do not know your model well enough to say whether a key-path can be found from 'Failure' that will separate one bucket from another. I guess 'Failure' to 'FailureAttribute' is a to-many relationship. If you want to segment 'Failure' objects based on some of the elements in that relationship, either add a method to 'Failure' that returns a single value computed from the content of that relationship, and segment on that key-path, or use a delegate and implement whatever you want in there to differentiate Failures for a given dimension.
Or maybe just 'getClass' would be a good 'key-path'.
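The first option (a method on 'Failure' that collapses the to-many relationship into one value) might look roughly like this. Everything here is hypothetical: the FailureAttribute fields 'kind' and 'name', the 'attributes' list and the origin() accessor are guesses at a model that was not shown, and plain java.util stands in for the EOF classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model classes; the real entities and their attributes may differ.
class FailureAttribute {
    final String kind; // e.g. "origin", "outcome"
    final String name; // e.g. "extern"

    FailureAttribute(String kind, String name) {
        this.kind = kind;
        this.name = name;
    }
}

class Failure {
    // Assumed to-many relationship from 'Failure' to its root attribute objects.
    final List<FailureAttribute> attributes = new ArrayList<FailureAttribute>();

    /** Returns the name of the first attribute of the given kind, or null if there is none. */
    public String attributeName(String kind) {
        for (FailureAttribute a : attributes) {
            if (kind.equals(a.kind)) {
                return a.name;
            }
        }
        return null;
    }

    /** Single-valued accessor, meant to be usable as the key-path "origin". */
    public String origin() {
        return attributeName("origin");
    }
}
```

With an accessor like this, "origin" becomes an ordinary single-valued key, so it can be passed as one of the segmentation keys.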
In your case, once you get the NSArray for a given bucket, to get statistical values out of it you might want to count the population of that bucket, add up the 'cost' (assuming there is such an attribute) of those objects, etc.
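As a rough sketch of those per-bucket statistics, here is a count and a cost total over one bucket. CostedFailure and BucketStats are made-up names, and a java.util.List stands in for the NSArray so the example runs on its own. With a real NSArray, the key-value coding operators ("@count", "@sum.cost") should give the same figures.

```java
import java.math.BigDecimal;
import java.util.List;

// Hypothetical stand-in for a 'Failure' object that has a 'cost' attribute.
class CostedFailure {
    final BigDecimal cost;

    CostedFailure(BigDecimal cost) {
        this.cost = cost;
    }
}

class BucketStats {
    /** Number of objects in the bucket. */
    static int count(List<CostedFailure> bucket) {
        return bucket.size();
    }

    /** Sum of 'cost' over the bucket; a null cost is counted as zero. */
    static BigDecimal totalCost(List<CostedFailure> bucket) {
        BigDecimal total = BigDecimal.ZERO;
        for (CostedFailure f : bucket) {
            if (f.cost != null) {
                total = total.add(f.cost);
            }
        }
        return total;
    }
}
```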
I'm not sure about your model, so I'll continue with another model you will probably understand.
First, here is the file (I hope you can read French, since the identifiers are in French):
package org.champlain.tamia;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Enumeration;
import com.webobjects.foundation.NSArray;
import com.webobjects.foundation.NSMutableArray;
import com.webobjects.foundation.NSKeyValueCodingAdditions;
/**
* The <code>GestionnaireSyntheses</code> class summarizes information and groups it by one of its attributes.
*
* @version $Revision$, $Date$
* <br>© 2005 Jean-François Veillette. All rights reserved.
*/
public class GestionnaireSyntheses {
static Delegue delegueDeBase = new DelegueDefaut();
HashMap syntheses;
public GestionnaireSyntheses() {
super();
syntheses = new HashMap();
}
/** External entry point that takes advantage of a cache of computed results. */
public HashMap synthesePourObjetPopulationKeyPath(Object o, NSArray a, String kp, Delegue delegue) {
return synthesePourObjetPopulationKeyPath(o, a, new String[]{kp}, delegue);
}
/** External entry point that takes advantage of a cache of computed results. */
public HashMap synthesePourObjetPopulationKeyPath(Object o, NSArray a, String[] clefs, Delegue delegue) {
HashSet clefSet = (HashSet)new HashSet(java.util.Arrays.asList(clefs));
HashMap oHash = (HashMap)syntheses.get(o);
if(oHash == null) {
oHash = new HashMap();
syntheses.put(o, oHash);
}
HashMap synthese = (HashMap)oHash.get(clefSet);
if(synthese == null) {
synthese = regroupePopulation(a, clefs, delegue);
oHash.put(clefSet, synthese);
}
return synthese;
}
/** Computes the summaries over 'n' crossed evaluation keys (an 'n'-dimensional matrix). */
public HashMap regroupePopulation(NSArray population, String[] clefs, Delegue delegue) {
if(delegue == null) {
delegue = delegueDeBase;
}
HashMap synthese = new HashMap();
Enumeration enu = population.objectEnumerator();
while(enu.hasMoreElements()) {
NSKeyValueCodingAdditions item = (NSKeyValueCodingAdditions)enu.nextElement();
// build the key that identifies the cell
HashSet clefSet = new HashSet();
NSMutableArray l = new NSMutableArray();
for(int i = 0 ; i < clefs.length ; i++ ) {
String cl = clefs[i];
Object model = delegue.objetDeRegroupementPourClef(item, cl);
ClefValeur cv = new ClefValeur(cl, model);
l.addObject(cv);
// add to the set of possible key values for each grouping
HashSet listeParGroupe = (HashSet)synthese.get(cl);
if(listeParGroupe == null) {
listeParGroupe = new HashSet();
synthese.put(cl, listeParGroupe);
}
listeParGroupe.add(cv);
}
ajouteAuxBuckets(synthese, l, item);
}
return synthese;
}
/** Adds 'obj' to every bucket created by each possible key combination in 'attributs'. */
void ajouteAuxBuckets(HashMap synthese, NSArray attributs, Object obj) {
// for each key combination, fetch the bucket and add to its population
Enumeration attEnum = attributs.objectEnumerator();
NSMutableArray clefGenere = new NSMutableArray();
while(attEnum.hasMoreElements()) {
// generate the combinations
ClefValeur cv = (ClefValeur)attEnum.nextElement();
// new combinations are collected separately so that clefGenere is not modified while it is enumerated
NSMutableArray nouvelles = new NSMutableArray();
Enumeration clefEnum = clefGenere.objectEnumerator();
while(clefEnum.hasMoreElements()) {
HashSet clefs = (HashSet)clefEnum.nextElement();
HashSet nc = (HashSet)clefs.clone();
nc.add(cv);
ajouteAuBucket(synthese, nc, obj);
// keep the extended combination so that crossings of 3 or more keys are also generated
nouvelles.addObject(nc);
}
HashSet b = new HashSet();
b.add(cv);
ajouteAuBucket(synthese, b, obj);
nouvelles.addObject(b);
clefGenere.addObjectsFromArray(nouvelles);
}
}
/** Adds 'obj' to the list associated with 'bucket'. */
void ajouteAuBucket(HashMap synthese, HashSet bucket, Object obj) {
NSMutableArray b = (NSMutableArray)synthese.get(bucket);
if(b == null) {
b = new NSMutableArray();
synthese.put(bucket, b);
}
b.addObject(obj);
}
/** Represents one key element in the summary object produced by the GestionnaireSyntheses class. */
public static class ClefValeur {
public String clef;
public Object valeur;
public ClefValeur(String c, Object v) {
super();
clef = c;
valeur = v;
}
public int hashCode() {
int vh = valeur == null ? 0 : valeur.hashCode();
int ch = clef == null ? 0 : clef.hashCode();
return ch + vh;
}
public boolean equals(Object b) {
if (b instanceof ClefValeur) {
ClefValeur bb = (ClefValeur)b;
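// null-safe comparison: two null fields are equal, a null field never equals a non-null one
boolean memeClef = (clef == null) ? (bb.clef == null) : clef.equals(bb.clef);
boolean memeValeur = (valeur == null) ? (bb.valeur == null) : valeur.equals(bb.valeur);
return memeClef && memeValeur;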
}
return false;
}
public String toString() {
return "<clefValeur clef="+clef+" valeur="+valeur+" hashCode="+hashCode()+" />";
}
}
/** A delegate is used to obtain, from the objects of the population, the value for each of the keys. The default implementation uses valueForKeyPath on the object, but that may not return the object wanted for a particular grouping, for example when objects whose 'age' attribute falls in [0-6], [7-13], [14-18], [19-35], etc. should end up in the same group. The delegate can return a value of its own and so put elements with apparently different values into the same set. */
static public interface Delegue {
public Object objetDeRegroupementPourClef(Object objet, String clef);
}
static public class DelegueDefaut implements Delegue {
public Object objetDeRegroupementPourClef(Object objet, String clef) {
return NSKeyValueCodingAdditions.DefaultImplementation.valueForKeyPath(objet, clef);
}
}
}
For example, to produce a table matrix (2 dimensions), use the following code to segment the appreciations of a product (think of a movie or a book) by 'age' and 'sex'.
public HashMap statistique() {
String[] clefs = new String[]{"age", "codeSexe"};
return new GestionnaireSyntheses().synthesePourObjetPopulationKeyPath(produit(), populationTotal(), clefs, null);
}
The first argument is only used to cache the calculation result in case you ask for the same thing multiple times. (This is a 'buggy' design and shouldn't be part of this api. But it's still there, so we have to deal with it until I correct it.) For now, just pass a constant and ignore what it was meant to do (or take a look at the code and remove the related cache).
Note that 'age' and 'codeSexe' are attributes (key-paths) of the population objects.
Then, to get the segmented results (sexeItem and ageItem below are ClefValeur objects, taken from the per-key sets described further down):
For the population at the crossing of the two dimensions:
/** @TypeInfo Appreciation */
public NSArray populationCroise() {
HashSet hs = new HashSet();
hs.add(sexeItem);
hs.add(ageItem);
return (NSArray)statistique().get(hs);
}
For one dimension only, sex:
/** @TypeInfo Appreciation */
public NSArray populationTotalSexe() {
HashSet hs = new HashSet();
hs.add(sexeItem);
return (NSArray)statistique().get(hs);
}
For one dimension only, age:
/** @TypeInfo Appreciation */
public NSArray populationTotalAge() {
HashSet hs = new HashSet();
hs.add(ageItem);
return (NSArray)statistique().get(hs);
}
For the whole population, ask your model :-) ... yes, it's the same population we started with:
/** @TypeInfo Appreciation */
public NSArray populationTotal() {
return produit().appreciations();
}
To get the list of 'sex' values found in the population:
HashSet hs = (HashSet)statistique().get("codeSexe");
statistique().get(...) returns a HashSet when given a segmentation key; the set contains the values (as ClefValeur objects) found for that key.
statistique().get(...) returns an NSArray when given a set of key-values corresponding to a matrix coordinate; the resulting NSArray contains the population found for that segment.
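Putting the two kinds of lookup together, a cross table of counts can be filled by iterating over the values of each key and looking up the cell for every pair. The sketch below is only an illustration under assumptions: KeyValue is a stand-in for ClefValeur, java.util.List replaces NSArray, and CrossTable.counts is a made-up helper that expects a map laid out like the one the class returns.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Stand-in for GestionnaireSyntheses.ClefValeur, with null-safe equals/hashCode.
final class KeyValue {
    final String key;
    final Object value;

    KeyValue(String key, Object value) {
        this.key = key;
        this.value = value;
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof KeyValue)) return false;
        KeyValue other = (KeyValue) o;
        return Objects.equals(key, other.key) && Objects.equals(value, other.value);
    }

    @Override public int hashCode() {
        return Objects.hash(key, value);
    }
}

class CrossTable {
    /**
     * Counts the population of every (row, column) cell.
     * Both keys must have been used for the segmentation, so that
     * synthese.get(key) returns the set of values found for that key.
     */
    @SuppressWarnings("unchecked")
    static Map<KeyValue, Map<KeyValue, Integer>> counts(
            Map<Object, Object> synthese, String rowKey, String colKey) {
        Set<KeyValue> rows = (Set<KeyValue>) synthese.get(rowKey);
        Set<KeyValue> cols = (Set<KeyValue>) synthese.get(colKey);
        Map<KeyValue, Map<KeyValue, Integer>> table = new HashMap<>();
        for (KeyValue row : rows) {
            Map<KeyValue, Integer> line = new HashMap<>();
            for (KeyValue col : cols) {
                // the cell coordinate is the set holding one value per dimension
                Set<KeyValue> coordinate = new HashSet<>(Arrays.asList(row, col));
                List<?> cell = (List<?>) synthese.get(coordinate);
                line.put(col, cell == null ? 0 : cell.size()); // no entry means an empty cell
            }
            table.put(row, line);
        }
        return table;
    }
}
```

A combination that never occurs in the population has no entry in the map, which is why the lookup treats a missing cell as zero.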
Look at the delegate api in case you need more specialised handling than the default key-path. I used a delegate to group together in a single bucket multiple values found in one attribute, for example to group all ages from 10 to 20 in the same bucket.
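A delegate of that kind could be sketched as follows. This is not the delegate actually used. The Delegue interface is copied locally so the example compiles without the rest of the class, the items are assumed to be java.util.Map instances instead of enterprise objects, and the ten-year ranges (10-19, 20-29, ...) are an arbitrary choice.

```java
import java.util.Map;

// Local copy of the interface nested in GestionnaireSyntheses, so the sketch compiles on its own.
interface Delegue {
    Object objetDeRegroupementPourClef(Object objet, String clef);
}

/** Groups 'age' values into ten-year ranges; other keys are passed through unchanged. */
class DelegueTrancheAge implements Delegue {
    public Object objetDeRegroupementPourClef(Object objet, String clef) {
        // Assumption: items are Maps here; real code would call valueForKeyPath on the object.
        Object valeur = ((Map<?, ?>) objet).get(clef);
        if ("age".equals(clef) && valeur instanceof Number) {
            int age = ((Number) valeur).intValue();
            int debut = (age / 10) * 10; // lower bound of the range
            return debut + "-" + (debut + 9);
        }
        return valeur;
    }
}
```

Because the returned label is what ends up in the ClefValeur, all objects with ages in the same range land in the same bucket.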
This code is not really ready for the public (not much documentation, no examples, hard to read, French-based), but if you want to give it a try, go for it and ask questions, I'll answer. Once figured out, it works great.
maybe these requirements are heavy enough for dyna-reporting?
Dyna-Reporting can do it, but the provided class can do it just as well.
They both segment everything in memory, so you will have to fetch your population first whichever tool you use.
- jfv