CluE  1.0
 All Classes Namespaces Files Functions Variables Typedefs Enumerations Enumerator Macros Groups Pages
CluE::CFTree< T > Class Template Reference

Clustering feature tree. More...

#include <cftree.h>

Collaboration diagram for CluE::CFTree< T >:
Collaboration graph

Classes

struct  ThresholdCalculator
 Threshold calculation functor class. More...
 

Public Types

enum  ExceptionIds { CANNOT_MERGE_NODE_ITSELF, CANNOT_MERGE_NODE_DIFFERENT_PARENTS, CANNOT_MERGE_LEAF_NODES }
 

Public Member Functions

 CFTree (EuclideanSpaceProvider< T > *euclidianProvider, DissimilarityMeasure< CFEntry< T > > *clusterDistanceMeasure, AttributeCalculator< CFEntry< T > > *thresholdAttribute, ThresholdCalculator *thresholdFunctor, int innerBranching, int leafBranching, double threshold, int maxSize)
 Constructs a CFTree for a given vector space. More...
 
 CFTree (const CFTree< T > &)
 
CFTree< T > & operator= (const CFTree< T > &)
 
virtual ~CFTree ()
 
void insert (T point)
 Inserts a point. More...
 
void insert (CFEntry< T > feature)
 Inserts a CFEntry (clustering feature) More...
 
void rebuild (double newThreshold)
 Rebuild the tree. More...
 
std::vector< CFEntry< T > > leafEntries ()
 Returns the leaf entries. More...
 
int size () const
 Counts the number of nodes. More...
 
size_t numberOfPoints () const
 Counts the number of contained points. More...
 

Private Member Functions

lemon::ListDigraph::Node addNode ()
 Wrapper to maintain the graph maps. More...
 
void eraseNode (lemon::ListDigraph::Node node)
 Wrapper to maintain the graph maps. More...
 
void rebuildTraverse (lemon::ListDigraph::Node currentOldNode, CFTree< T > &newTree, lemon::ListDigraph::Node currentNewNode)
 Used by rebuild() More...
 
void leafEntriesTraverse (lemon::ListDigraph::Node currentNode, std::vector< CFEntry< T > > *vec)
 
void clean (lemon::ListDigraph::Node node)
 Remove empty clustering features. More...
 
std::pair
< lemon::ListDigraph::Node,
lemon::ListDigraph::Node > 
split (lemon::ListDigraph::Node node, CFEntry< T > *keepTrack=0, bool fillUp=false)
 Splits a node. More...
 
lemon::ListDigraph::Node merge (lemon::ListDigraph::Node node1, lemon::ListDigraph::Node node2)
 Merges two node entries. More...
 

Private Attributes

int innerBranching
 
int leafBranching
 
double threshold
 
int maxSize
 
EuclideanSpaceProvider< T > * euclidianProvider
 
DissimilarityMeasure< CFEntry
< T > > * 
clusterDistanceMeasure
 
AttributeCalculator< CFEntry
< T > > * 
thresholdAttribute
 
ThresholdCalculatorthresholdFunctor
 
lemon::ListDigraph tree
 
lemon::ListDigraph::ArcMap
< CFEntry< T > * > 
innerFeatures
 
lemon::ListDigraph::NodeMap
< std::vector< CFEntry< T > * > * > 
leafFeatures
 
lemon::ListDigraph::Node root
 

Detailed Description

template<typename T>
class CluE::CFTree< T >

Clustering feature tree.

Clustering tree as described in T. Zhang, R. Ramakrishan and M. Livny. "BIRCH: A New Data Clustering Algorithm and Its Applications". Data Mining and Knowledge Discovery, 10.1023/A:1009783824328, 1997.

Note
This class expects the following overloaded operators:
  • + : T x T -> T and +=: (vector) sum
  • - : T x T -> T and -=: (vector) subtraction
  • * : double x T -> T: scalar multiplication
  • * : T x T -> double: dot product

Definition at line 34 of file cftree.h.

Member Enumeration Documentation

template<typename T>
enum CluE::CFTree::ExceptionIds
Enumerator
CANNOT_MERGE_NODE_ITSELF 
CANNOT_MERGE_NODE_DIFFERENT_PARENTS 
CANNOT_MERGE_LEAF_NODES 

Definition at line 96 of file cftree.h.

Constructor & Destructor Documentation

template<typename T >
CluE::CFTree< T >::CFTree ( EuclideanSpaceProvider< T > *  euclidianProvider,
DissimilarityMeasure< CFEntry< T > > *  clusterDistanceMeasure,
AttributeCalculator< CFEntry< T > > *  thresholdAttribute,
ThresholdCalculator thresholdFunctor,
int  innerBranching,
int  leafBranching,
double  threshold,
int  maxSize 
)

Constructs a CFTree for a given vector space.

Parameters
euclidianProviderEuclidiean vector space provider
clusterDistanceMeasureTo calculate the distance between to given CFEntries
thresholdAttributeTo calculate the value of the threshold attribute of a given CFEntry
innerBranchingInner branching factor
leafBranchingLeaf branching factor
thresholdCFEntry size threshold
See also
"BIRCH: A New Data Clustering Algorithm and Its Applications", pp. 147-148 (see above)

Definition at line 181 of file cftree.h.

template<typename T >
CluE::CFTree< T >::CFTree ( const CFTree< T > &  x)

Definition at line 201 of file cftree.h.

template<typename T >
CluE::CFTree< T >::~CFTree ( )
virtual

Definition at line 300 of file cftree.h.

Member Function Documentation

template<typename T >
CFTree< T > & CluE::CFTree< T >::operator= ( const CFTree< T > &  x)

Definition at line 238 of file cftree.h.

template<typename T >
void CluE::CFTree< T >::insert ( point)

Inserts a point.

See also
"BIRCH: A New Data Clustering Algorithm and Its Applications", p. 149 (see above)
Warning
Be sure that properly overloaded operators exist! (see note above)

Definition at line 327 of file cftree.h.

template<typename T >
void CluE::CFTree< T >::insert ( CFEntry< T >  feature)

Inserts a CFEntry (clustering feature)

See also
"BIRCH: A New Data Clustering Algorithm and Its Applications", p. 149 (see above)
Warning
Be sure that properly overloaded operators exist! (see note above)

Definition at line 333 of file cftree.h.

template<typename T >
void CluE::CFTree< T >::rebuild ( double  newThreshold)

Rebuild the tree.

Parameters
newThresholdNew threshold

Definition at line 457 of file cftree.h.

template<typename T >
std::vector< CFEntry< T > > CluE::CFTree< T >::leafEntries ( )

Returns the leaf entries.

Definition at line 578 of file cftree.h.

template<typename T >
int CluE::CFTree< T >::size ( ) const

Counts the number of nodes.

Definition at line 637 of file cftree.h.

template<typename T >
size_t CluE::CFTree< T >::numberOfPoints ( ) const

Counts the number of contained points.

Definition at line 642 of file cftree.h.

template<typename T >
lemon::ListDigraph::Node CluE::CFTree< T >::addNode ( )
private

Wrapper to maintain the graph maps.

Definition at line 901 of file cftree.h.

template<typename T >
void CluE::CFTree< T >::eraseNode ( lemon::ListDigraph::Node  node)
private

Wrapper to maintain the graph maps.

Definition at line 908 of file cftree.h.

template<typename T >
void CluE::CFTree< T >::rebuildTraverse ( lemon::ListDigraph::Node  currentOldNode,
CFTree< T > &  newTree,
lemon::ListDigraph::Node  currentNewNode 
)
private

Used by rebuild()

Definition at line 473 of file cftree.h.

template<typename T >
void CluE::CFTree< T >::leafEntriesTraverse ( lemon::ListDigraph::Node  currentNode,
std::vector< CFEntry< T > > *  vec 
)
private

Used by leafEntries()

Definition at line 585 of file cftree.h.

template<typename T >
void CluE::CFTree< T >::clean ( lemon::ListDigraph::Node  node)
private

Remove empty clustering features.

Definition at line 607 of file cftree.h.

template<typename T >
std::pair< lemon::ListDigraph::Node, lemon::ListDigraph::Node > CluE::CFTree< T >::split ( lemon::ListDigraph::Node  node,
CFEntry< T > *  keepTrack = 0,
bool  fillUp = false 
)
private

Splits a node.

Parameters
nodeNode to be splitted
keepTrackReturn the node containing this CFEntry<T>* as first element
fillUpFill up mode (for Merging Refinements)

Definition at line 650 of file cftree.h.

template<typename T >
lemon::ListDigraph::Node CluE::CFTree< T >::merge ( lemon::ListDigraph::Node  node1,
lemon::ListDigraph::Node  node2 
)
private

Merges two node entries.

Exceptions
InvalidArgumentException[CANNOT_MERGE_NODE_ITSELF] Cannot merge node with itself.
InvalidArgumentException[CANNOT_MERGE_NODE_DIFFERENT_PARENTS] Cannot merge nodes with different parents.
InvalidRuntimeConfigurationException[CANNOT_MERGE_LEAF_NODES] Cannot merge leaf entries.

Definition at line 853 of file cftree.h.

Member Data Documentation

template<typename T>
int CluE::CFTree< T >::innerBranching
private

Definition at line 164 of file cftree.h.

template<typename T>
int CluE::CFTree< T >::leafBranching
private

Definition at line 165 of file cftree.h.

template<typename T>
double CluE::CFTree< T >::threshold
private

Definition at line 166 of file cftree.h.

template<typename T>
int CluE::CFTree< T >::maxSize
private

Definition at line 167 of file cftree.h.

template<typename T>
EuclideanSpaceProvider<T>* CluE::CFTree< T >::euclidianProvider
private

Definition at line 169 of file cftree.h.

template<typename T>
DissimilarityMeasure<CFEntry<T> >* CluE::CFTree< T >::clusterDistanceMeasure
private

Definition at line 170 of file cftree.h.

template<typename T>
AttributeCalculator<CFEntry<T> >* CluE::CFTree< T >::thresholdAttribute
private

Definition at line 171 of file cftree.h.

template<typename T>
ThresholdCalculator* CluE::CFTree< T >::thresholdFunctor
private

Definition at line 172 of file cftree.h.

template<typename T>
lemon::ListDigraph CluE::CFTree< T >::tree
private

Definition at line 174 of file cftree.h.

template<typename T>
lemon::ListDigraph::ArcMap<CFEntry<T>*> CluE::CFTree< T >::innerFeatures
private

Definition at line 175 of file cftree.h.

template<typename T>
lemon::ListDigraph::NodeMap<std::vector<CFEntry<T>*>*> CluE::CFTree< T >::leafFeatures
private

Definition at line 176 of file cftree.h.

template<typename T>
lemon::ListDigraph::Node CluE::CFTree< T >::root
private

Definition at line 177 of file cftree.h.


The documentation for this class was generated from the following file: