java.lang.Object
  weka.classifiers.Classifier
    weka.classifiers.rules.FRIP
public class FRIP
The FRip algorithm is a fuzzification of the JRip algorithm. It was designed not as a standalone classifier but as a base classifier for FR3 (the Fuzzy Round Robin algorithm). The main difference between FRip and JRip is that FRip is a binary classifier only and makes no use of default rules. FRip also uses a changed pruning procedure: pruning during the IREP* runs is permanently deactivated, which was found experimentally to improve the classification rate. The following description from the JRip class has been adapted to describe the methodology of FRip:
Initialize RS = {}, and for each of the two classes DO:
1. Building stage:
Repeat 1.1 until the description length (DL) of the ruleset and examples is 64 bits greater than the smallest DL met so far, or there are no positive examples, or the error rate >= 50%.
1.1. Grow phase:
Grow one rule by greedily adding antecedents (or conditions) to the rule until the rule is perfect (i.e. 100% accurate). The procedure tries every possible value of each attribute and selects the condition with highest information gain: p(log(p/t)-log(P/T)).
2. Optimization stage:
After generating the initial ruleset {Ri}, generate and prune two variants of each rule Ri from randomized data using procedures 1.1 and X.1. One variant is generated from an empty rule, while the other is generated by greedily adding antecedents to the original rule. The pruning metric used here is (TP+TN)/(P+N). Then the smallest possible DL for each variant and for the original rule is computed. The variant with the minimal DL is selected as the final representative of Ri in the ruleset. After all the rules in {Ri} have been examined, if there are still residual positives, more rules are generated from the residual positives using the building stage again.
3. Delete from the ruleset any rule that would increase the DL of the whole ruleset if it were in it, and add the resultant ruleset to RS.
ENDDO
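The information gain used in the grow phase (1.1), p(log(p/t)-log(P/T)), can be sketched as follows. This is a minimal illustration and not the FRIP source: the class and method names are hypothetical, the logarithm base (2) is an assumption because the description only says "log", and the reading of p/t as the positive/total weight covered after adding a condition and P/T as the same counts before adding it is likewise an assumption.

```java
// Hypothetical helper, not part of weka.classifiers.rules.FRIP.
final class FoilGainSketch {

    /** Base-2 logarithm; the base is an assumption, the source formula only says "log". */
    static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    /**
     * Gain of adding one condition to a rule: p * (log(p/t) - log(P/T)).
     *
     * @param p positive weight covered after adding the condition (assumed meaning)
     * @param t total weight covered after adding the condition (assumed meaning)
     * @param P positive weight covered before adding the condition (assumed meaning)
     * @param T total weight covered before adding the condition (assumed meaning)
     */
    static double gain(double p, double t, double P, double T) {
        // Guard against log(0) and division by zero; returning 0 here is a choice of this sketch.
        if (p <= 0.0 || t <= 0.0 || P <= 0.0 || T <= 0.0) {
            return 0.0;
        }
        return p * (log2(p / t) - log2(P / T));
    }

    public static void main(String[] args) {
        // A rule covering 50 positives out of 100 is narrowed to 40 positives out of 50.
        System.out.println(gain(40, 50, 50, 100));
        // A condition that leaves the coverage unchanged yields no gain.
        System.out.println(gain(50, 100, 50, 100));
    }
}
```

A condition that raises the share of positives among the covered instances gets a positive gain, weighted by how many positives it still covers.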
Fuzzification of RS:
For each rule r in every ruleset in RS DO
4. Fuzzification of antecedents:
Apply a greedy strategy to fuzzify the existing antecedents in r in the following way:
4.1 Examine all possible support bounds and select the one which gains the highest purity on the training data.
4.2 Set the maximum support bound determined in 4.1 and restart with 4.1, but without the fuzzified antecedent.
5. Bounding of open rules:
Bound open-sided rules by the last known instance value in that dimension.
6. Fuzzification of the bounded antecedent:
Fuzzify the bounded sides of the rule beyond the edge of the known dataspace.
ENDDO
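The fuzzification loop above can be illustrated with a small sketch for a single numeric antecedent of the form "x <= bound". Everything in it is an assumption made for illustration: the class and method names are hypothetical, the membership function is taken to fall linearly from 1 at the crisp bound to 0 at the support bound, purity is taken to be the positive membership mass divided by the total membership mass, and ties are resolved in favour of the larger support bound. The FRIP source may differ on each of these points.

```java
// Hypothetical illustration of steps 4.1 and 4.2; not the FRIP implementation.
final class FuzzyBoundSketch {

    /** Membership of x in the fuzzy antecedent "x <= coreBound" with the given support bound. */
    static double membershipUpper(double x, double coreBound, double supportBound) {
        if (x <= coreBound) return 1.0;      // fully covered by the crisp part
        if (x >= supportBound) return 0.0;   // outside the support
        return (supportBound - x) / (supportBound - coreBound); // linear decay (assumption)
    }

    /** Purity = positive membership mass / total membership mass (assumed definition). */
    static double purity(double[] values, boolean[] positive, double coreBound, double supportBound) {
        double posMass = 0.0;
        double allMass = 0.0;
        for (int i = 0; i < values.length; i++) {
            double mu = membershipUpper(values[i], coreBound, supportBound);
            allMass += mu;
            if (positive[i]) posMass += mu;
        }
        return allMass == 0.0 ? 0.0 : posMass / allMass;
    }

    /** Step 4.1: try each candidate support bound and keep the purest one (largest on ties). */
    static double bestSupportBound(double[] values, boolean[] positive,
                                   double coreBound, double[] candidates) {
        double best = coreBound; // start from the crisp antecedent
        double bestPurity = purity(values, positive, coreBound, coreBound);
        for (double c : candidates) {
            if (c < coreBound) continue; // a support bound cannot lie inside the core
            double p = purity(values, positive, coreBound, c);
            if (p > bestPurity || (p == bestPurity && c > best)) {
                best = c;
                bestPurity = p;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] values = {1.0, 3.0, 5.0};
        boolean[] positive = {true, true, false};
        System.out.println(bestSupportBound(values, positive, 2.0, new double[] {4.0, 6.0}));
    }
}
```

In this toy data, widening the support to 4 lets the positive instance at 3 be partially covered without admitting the negative instance at 5, whereas a support of 6 admits the negative and lowers purity.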
X.1. Pruning:
Incrementally prune each rule and allow the pruning of any final sequence of antecedents. The pruning metric is (p-n)/(p+n), which equals 2p/(p+n) - 1, so this implementation simply uses p/(p+n) (actually (p+1)/(p+n+2), so that the value is 0.5 when p+n is 0).
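The relation between the pruning metrics in X.1 can be checked with a short sketch. The class and method names are hypothetical; only the formulas come from the description above. Because (p-n)/(p+n) = 2p/(p+n) - 1 is an increasing function of p/(p+n), ranking rules by p/(p+n) gives the same order, and the smoothed form (p+1)/(p+n+2) stays defined when a rule covers nothing.

```java
// Hypothetical helper showing the pruning metrics named in X.1.
final class PruneMetricSketch {

    /** The metric as originally stated: (p - n) / (p + n). Undefined for p + n = 0. */
    static double originalMetric(double p, double n) {
        return (p - n) / (p + n);
    }

    /** The smoothed metric described for the implementation: (p + 1) / (p + n + 2). */
    static double smoothedMetric(double p, double n) {
        return (p + 1.0) / (p + n + 2.0);
    }

    public static void main(String[] args) {
        double p = 3.0;
        double n = 1.0;
        // Both lines print the same value, illustrating (p-n)/(p+n) = 2p/(p+n) - 1.
        System.out.println(originalMetric(p, n));
        System.out.println(2.0 * p / (p + n) - 1.0);
        // With no covered instances the smoothed metric falls back to 0.5.
        System.out.println(smoothedMetric(0.0, 0.0));
    }
}
```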
@article{huehn2008,
   author = {Jens Christian Hühn and Eyke Hüllermeier},
   journal = {},
   title = {FR3: A Fuzzy Rule Learner for Inducing Reliable Classifiers},
   year = {2008}
}

@inproceedings{Cohen1995,
   author = {William W. Cohen},
   booktitle = {Twelfth International Conference on Machine Learning},
   pages = {115-123},
   publisher = {Morgan Kaufmann},
   title = {Fast Effective Rule Induction},
   year = {1995}
}

Valid options are:
-F <number of folds> Set the number of folds for REP. One fold is used as the pruning set. (Default: 3)
-N <min. weights> Set the minimal weights of instances within a split. (Default: 2.0)
-O <number of runs> Set the number of runs of optimizations. (Default: 2)
-D Set whether to turn on debug mode. (Default: false)
-S <seed> The seed of randomization. (Default: 1)
-E Whether NOT to check the error rate >= 0.5 in the stopping criteria. (Default: check)
Date created: 08/07/2008
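To make the meaning and defaults of the option flags above concrete, here is a minimal, self-contained sketch that parses such a flag array into plain fields. It is a stand-in written for illustration, not Weka's own option handling: the class `FripOptionsSketch` and its field names are hypothetical, and only the flags and default values are taken from the list above.

```java
// Hypothetical stand-in for illustration; FRIP itself is configured through setOptions(String[]).
final class FripOptionsSketch {
    int folds = 3;                 // -F, default 3
    double minNo = 2.0;            // -N, default 2.0
    int optimizations = 2;         // -O, default 2
    boolean debug = false;         // -D, default false
    long seed = 1L;                // -S, default 1
    boolean checkErrorRate = true; // -E switches the check OFF

    static FripOptionsSketch parse(String[] options) {
        FripOptionsSketch o = new FripOptionsSketch();
        for (int i = 0; i < options.length; i++) {
            switch (options[i]) {
                case "-F": o.folds = Integer.parseInt(value(options, ++i)); break;
                case "-N": o.minNo = Double.parseDouble(value(options, ++i)); break;
                case "-O": o.optimizations = Integer.parseInt(value(options, ++i)); break;
                case "-S": o.seed = Long.parseLong(value(options, ++i)); break;
                case "-D": o.debug = true; break;
                case "-E": o.checkErrorRate = false; break;
                default:
                    throw new IllegalArgumentException("Unsupported option: " + options[i]);
            }
        }
        return o;
    }

    /** Returns the value following a flag, or fails if the array ends early. */
    private static String value(String[] options, int i) {
        if (i >= options.length) {
            throw new IllegalArgumentException("Missing value after " + options[i - 1]);
        }
        return options[i];
    }

    public static void main(String[] args) {
        FripOptionsSketch o = parse(new String[] {"-F", "5", "-E", "-S", "42"});
        System.out.println(o.folds + " " + o.checkErrorRate + " " + o.seed);
    }
}
```

Note that -E is a negative switch: passing it disables the error-rate check, while omitting it leaves the check on.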
Nested Class Summary | |
---|---|
protected class | FRIP.Antd: The single antecedent in the rule, which is composed of an attribute and the corresponding value. |
protected class | FRIP.NominalAntd: The antecedent with nominal attribute |
protected class | FRIP.NumericAntd: The antecedent with numeric attribute |
protected class | FRIP.RipperRule: This class implements a single rule that predicts specified class. |
Field Summary | |
---|---|
private double[][] | dataspaceEdges: The edges of the known dataspace |
private boolean | m_CheckErr: Whether to check the error rate >= 0.5 in the stopping criteria |
protected weka.core.Attribute | m_Class: The class attribute of the data |
weka.core.Instances | m_dataAllClasses |
protected boolean | m_Debug: Whether in debug mode |
protected weka.core.FastVector | m_Distributions: The predicted class distribution |
private int | m_Folds: The number of folds to split data into Grow and Prune for IREP |
(package private) double | m_MinNo: The minimal number of instance weights within a split |
private int | m_Optimizations: Runs of optimizations |
protected java.util.Random | m_Random: Random object used in this class |
protected weka.core.FastVector | m_Ruleset: The ruleset |
protected weka.core.FastVector | m_RulesetStats: The RuleStats for the ruleset of each class value |
protected long | m_Seed: The seed to perform randomization |
protected double | m_Total: # of all the possible conditions in a rule |
private static double | MAX_DL_SURPLUS: The limit of description length surplus in ruleset generation |
private static long | serialVersionUID: for serialization |
Constructor Summary | |
---|---|
FRIP() | |
Method Summary | |
---|---|
void | buildClassifier(weka.core.Instances instances): Builds Ripper; for each class it is built in three stages: building, optimization and fuzzification |
java.lang.String | checkErrorRateTipText(): Returns the tip text for this property |
private boolean | checkStop(double[] rst, double minDL, double dl): Checks whether the stopping criterion is met |
java.lang.String | debugTipText(): Returns the tip text for this property |
double[] | distributionForInstance(weka.core.Instance datum): Classifies the test instance with the rule learner and provides the class distributions |
java.util.Enumeration | enumerateMeasures(): Returns an enumeration of the additional measure names |
java.lang.String | foldsTipText(): Returns the tip text for this property |
weka.core.Capabilities | getCapabilities(): Returns default capabilities of the classifier. |
boolean | getCheckErrorRate(): Gets whether the error rate check is part of the stopping criterion |
double[][] | getDataspaceEdges(): Gets the dataspace edges |
boolean | getDebug(): Gets whether debug information is output to the console |
int | getFolds(): Gets the number of folds |
double | getMeasure(java.lang.String additionalMeasureName): Returns the value of the named measure |
double | getMinNo(): Gets the minimum total weight of the instances in a rule |
int | getOptimizations(): Gets the number of optimization runs |
java.lang.String[] | getOptions(): Gets the current settings of the Classifier. |
java.lang.String | getRevision(): Returns the revision string. |
weka.core.FastVector | getRuleset(): Gets the ruleset generated by FRipper |
weka.classifiers.rules.RuleStats | getRuleStats(int pos): Gets the statistics of the ruleset in the given position |
long | getSeed(): Gets the current seed value to use in randomizing the data |
weka.core.TechnicalInformation | getTechnicalInformation(): Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on. |
java.lang.String | globalInfo(): Returns a string describing the classifier |
java.util.Enumeration | listOptions(): Returns an enumeration describing the available options. |
java.lang.String | minNoTipText(): Returns the tip text for this property |
java.lang.String | optimizationsTipText(): Returns the tip text for this property |
protected weka.core.Instances | rulesetForOneClass(double expFPRate, weka.core.Instances data, double classIndex, double defDL): Builds a ruleset for the given class according to the given data |
java.lang.String | seedTipText(): Returns the tip text for this property |
void | setCheckErrorRate(boolean d): Sets whether the error rate check is part of the stopping criterion |
void | setDataspaceEdges(double[][] d): Sets the edges of the dataspace |
void | setDebug(boolean d): Sets whether debug information is output to the console |
void | setFolds(int fold): Sets the number of folds to use |
void | setMinNo(double m): Sets the minimum total weight of the instances in a rule |
void | setOptimizations(int run): Sets the number of optimization runs |
void | setOptions(java.lang.String[] options): Parses a given list of options. |
void | setSeed(long s): Sets the seed value to use in randomizing the data |
java.lang.String | toString(): Prints all the rules of the rule learner. |
Methods inherited from class weka.classifiers.Classifier |
---|
classifyInstance, forName, makeCopies, makeCopy, runClassifier |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Field Detail |
---|
private static final long serialVersionUID
private static double MAX_DL_SURPLUS
protected weka.core.Attribute m_Class
protected weka.core.FastVector m_Ruleset
protected weka.core.FastVector m_Distributions
private int m_Optimizations
protected java.util.Random m_Random
protected double m_Total
protected long m_Seed
private int m_Folds
double m_MinNo
protected boolean m_Debug
private boolean m_CheckErr
protected weka.core.FastVector m_RulesetStats
private double[][] dataspaceEdges
public weka.core.Instances m_dataAllClasses
Constructor Detail |
---|
public FRIP()
Method Detail |
---|
public java.lang.String globalInfo()
public weka.core.TechnicalInformation getTechnicalInformation()
getTechnicalInformation
in interface weka.core.TechnicalInformationHandler
public java.util.Enumeration listOptions()
-F number
The number of folds for reduced error pruning. One fold is used as the pruning set. (Default: 3)
-N number
The minimal weights of instances within a split. (Default: 2)
-O number
Set the number of runs of optimizations. (Default: 2)
-D
Whether to turn on debug mode.
-S number
The seed of randomization used in FRipper. (Default: 1)
-E
Whether NOT to check the error rate >= 0.5 in the stopping criteria. (Default: check)
listOptions
in interface weka.core.OptionHandler
listOptions
in class weka.classifiers.Classifier
public void setOptions(java.lang.String[] options) throws java.lang.Exception
-F <number of folds> Set the number of folds for REP. One fold is used as the pruning set. (Default: 3)
-N <min. weights> Set the minimal weights of instances within a split. (Default: 2.0)
-O <number of runs> Set the number of runs of optimizations. (Default: 2)
-D Set whether to turn on debug mode. (Default: false)
-S <seed> The seed of randomization. (Default: 1)
-E Whether NOT to check the error rate >= 0.5 in the stopping criteria. (Default: check)
-P Whether NOT to use pruning. (Default: use pruning)
setOptions
in interface weka.core.OptionHandler
setOptions
in class weka.classifiers.Classifier
options - the list of options as an array of strings
java.lang.Exception - if an option is not supported
public java.lang.String[] getOptions()
getOptions
in interface weka.core.OptionHandler
getOptions
in class weka.classifiers.Classifier
public java.util.Enumeration enumerateMeasures()
enumerateMeasures
in interface weka.core.AdditionalMeasureProducer
public double getMeasure(java.lang.String additionalMeasureName)
getMeasure
in interface weka.core.AdditionalMeasureProducer
additionalMeasureName - the name of the measure to query for its value
java.lang.IllegalArgumentException - if the named measure is not supported
public java.lang.String foldsTipText()
public void setFolds(int fold)
fold - the number of folds
public int getFolds()
public java.lang.String minNoTipText()
public void setMinNo(double m)
m - the minimum total weight of the instances in a rule
public double getMinNo()
public java.lang.String seedTipText()
public void setSeed(long s)
s - the new seed value
public long getSeed()
public java.lang.String optimizationsTipText()
public void setOptimizations(int run)
run - the number of optimization runs
public int getOptimizations()
public java.lang.String debugTipText()
debugTipText
in class weka.classifiers.Classifier
public void setDebug(boolean d)
setDebug
in class weka.classifiers.Classifier
d - whether debug information is output to the console
public boolean getDebug()
getDebug
in class weka.classifiers.Classifier
public java.lang.String checkErrorRateTipText()
public void setCheckErrorRate(boolean d)
d - whether the error rate check is part of the stopping criterion
public boolean getCheckErrorRate()
public void setDataspaceEdges(double[][] d)
d - the edges of the dataspace
public double[][] getDataspaceEdges()
public weka.core.FastVector getRuleset()
public weka.classifiers.rules.RuleStats getRuleStats(int pos)
pos - the position of the stats, assuming correct
public weka.core.Capabilities getCapabilities()
getCapabilities
in interface weka.core.CapabilitiesHandler
getCapabilities
in class weka.classifiers.Classifier
Capabilities
public void buildClassifier(weka.core.Instances instances) throws java.lang.Exception
buildClassifier
in class weka.classifiers.Classifier
instances - the training data
java.lang.Exception - if classifier can't be built successfully
public double[] distributionForInstance(weka.core.Instance datum)
distributionForInstance
in class weka.classifiers.Classifier
datum - the instance to be classified
protected weka.core.Instances rulesetForOneClass(double expFPRate, weka.core.Instances data, double classIndex, double defDL) throws java.lang.Exception
expFPRate - the expected FP/(FP+FN) used in DL calculation
data - the given data
classIndex - the given class index
defDL - the default DL in the data
java.lang.Exception - if the ruleset cannot be built properly
private boolean checkStop(double[] rst, double minDL, double dl)
rst - the statistic of the ruleset
minDL - the min description length so far
dl - the current description length of the ruleset
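The stopping rule of the building stage, as stated in the class description (DL more than 64 bits above the smallest DL so far, no positive examples left, or error rate >= 50%), can be sketched as follows. This is an illustration under assumptions and not the body of checkStop: the layout of the rst array is not documented here, so the sketch takes the remaining positive weight and the error rate as explicit, hypothetical parameters, and the constant value 64 is taken from the prose description rather than from the source of MAX_DL_SURPLUS.

```java
// Hypothetical sketch of the stopping rule; parameter names are not FRIP's.
final class StopCriterionSketch {

    /** Allowed description-length surplus in bits, per the class description. */
    static final double MAX_DL_SURPLUS = 64.0;

    /**
     * @param dl             current description length of the ruleset
     * @param minDL          smallest description length seen so far
     * @param positivesLeft  weight of positive examples not yet covered (assumed input)
     * @param errorRate      error rate of the latest rule (assumed input)
     * @param checkErrorRate false corresponds to passing the -E option
     */
    static boolean shouldStop(double dl, double minDL, double positivesLeft,
                              double errorRate, boolean checkErrorRate) {
        if (dl > minDL + MAX_DL_SURPLUS) return true;  // ruleset became too expensive to describe
        if (positivesLeft <= 0.0) return true;         // nothing left to cover
        return checkErrorRate && errorRate >= 0.5;     // rule is no better than chance
    }

    public static void main(String[] args) {
        System.out.println(shouldStop(170.0, 100.0, 10.0, 0.1, true));
        System.out.println(shouldStop(150.0, 100.0, 10.0, 0.6, false));
    }
}
```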
public java.lang.String toString()
toString
in class java.lang.Object
public java.lang.String getRevision()
weka.core.RevisionHandler
getRevision
in interface weka.core.RevisionHandler