java.lang.Object
weka.classifiers.Classifier
weka.classifiers.SingleClassifierEnhancer
weka.classifiers.RandomizableSingleClassifierEnhancer
weka.classifiers.meta.GridSearch
public class GridSearch
Performs a grid search of parameter pairs for a classifier (Y-axis, default is LinearRegression with the "Ridge" parameter) and the PLSFilter (X-axis, "# of Components") and chooses the best pair found for the actual prediction.
The initial grid is worked on with 2-fold CV to determine the values of the parameter pairs for the selected type of evaluation (e.g., accuracy). The best point in the grid is then taken and a 10-fold CV is performed with the adjacent parameter pairs. If a better pair is found, it becomes the new center and another 10-fold CV is performed (a kind of hill-climbing). This process is repeated until no better pair is found or the best pair is on the border of the grid.
In case the best pair is on the border, one can let GridSearch automatically extend the grid and continue the search. Check out the properties 'gridIsExtendable' (option '-extend-grid') and 'maxGridExtensions' (option '-max-grid-extensions <num>').
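The hill-climbing step described above can be sketched roughly as follows. This is a minimal illustration, not the GridSearch implementation: the class `GridHillClimbSketch`, the `Scorer` callback and the `Result` type are hypothetical names, the score stands in for a cross-validated evaluation (higher is better here), and the starting cell is assumed to come from the initial grid pass.

```java
final class GridHillClimbSketch {

    /** Hypothetical scoring callback; stands in for a cross-validated evaluation. */
    interface Scorer {
        double score(int x, int y);
    }

    /** Grid indices of the best cell found and whether it lies on the grid border. */
    static final class Result {
        final int x;
        final int y;
        final boolean onBorder;

        Result(int x, int y, boolean onBorder) {
            this.x = x;
            this.y = y;
            this.onBorder = onBorder;
        }
    }

    static boolean isOnBorder(int x, int y, int width, int height) {
        return x == 0 || y == 0 || x == width - 1 || y == height - 1;
    }

    /**
     * Starting from (startX, startY), evaluates the adjacent cells and moves
     * to the best one. Stops when no neighbour is better or when the current
     * best cell lies on the border of the grid.
     */
    static Result climb(int width, int height, int startX, int startY, Scorer scorer) {
        int bestX = startX;
        int bestY = startY;
        double bestScore = scorer.score(bestX, bestY);
        boolean improved = true;
        while (improved && !isOnBorder(bestX, bestY, width, height)) {
            improved = false;
            int centerX = bestX;
            int centerY = bestY;
            // The center is not on the border, so all 8 neighbours are inside the grid.
            for (int dx = -1; dx <= 1; dx++) {
                for (int dy = -1; dy <= 1; dy++) {
                    if (dx == 0 && dy == 0) {
                        continue;
                    }
                    double s = scorer.score(centerX + dx, centerY + dy);
                    if (s > bestScore) {
                        bestScore = s;
                        bestX = centerX + dx;
                        bestY = centerY + dy;
                        improved = true;
                    }
                }
            }
        }
        return new Result(bestX, bestY, isOnBorder(bestX, bestY, width, height));
    }

    public static void main(String[] args) {
        // Toy score with a single peak at grid cell (3, 4).
        Result r = climb(8, 8, 1, 1, (x, y) -> -((x - 3) * (x - 3) + (y - 4) * (y - 4)));
        System.out.println(r.x + "," + r.y + " onBorder=" + r.onBorder);
    }
}
```

When `onBorder` is true in this sketch, a caller could decide to enlarge the grid and search again, which is the situation the 'gridIsExtendable' and 'maxGridExtensions' properties address.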
GridSearch can handle doubles, integers (values are just cast to int) and booleans (0 is false, otherwise true). float, char and long are supported as well.
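The conversion rules above can be illustrated with a small sketch. The class `GridValueConversionSketch` and its `convert` method are hypothetical and only mirror the rules as stated (plain cast for integers, 0 is false for booleans); how GridSearch converts float, char and long internally is an assumption here (plain casts).

```java
final class GridValueConversionSketch {

    /** Converts a numeric grid value to the primitive type of the target property. */
    static Object convert(double value, Class<?> type) {
        if (type == double.class) {
            return value;
        }
        if (type == int.class) {
            return (int) value;      // plain cast, truncates toward zero
        }
        if (type == boolean.class) {
            return value != 0;       // 0 is false, anything else is true
        }
        if (type == float.class) {
            return (float) value;    // assumption: plain cast
        }
        if (type == long.class) {
            return (long) value;     // assumption: plain cast
        }
        if (type == char.class) {
            return (char) value;     // assumption: plain cast
        }
        throw new IllegalArgumentException("Unsupported type: " + type);
    }

    public static void main(String[] args) {
        System.out.println(convert(3.9, int.class));
        System.out.println(convert(0.0, boolean.class));
        System.out.println(convert(-2.0, boolean.class));
    }
}
```

Because integers are obtained by a cast, neighbouring grid values such as 3.2 and 3.9 map to the same integer setting, so a step size of at least 1 is the natural choice for integer properties.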
The best filter/classifier setup can be accessed after the buildClassifier call via the getBestFilter/getBestClassifier methods.
Note on the implementation: after the data has been passed through the filter, a default NumericCleaner filter is applied to the data to avoid numbers that become too small and might produce NaNs in other schemes.
-E <CC|RMSE|RRSE|MAE|RAE|COMB|ACC|KAP> Determines the parameter used for evaluation (default: CC):
CC = Correlation coefficient
RMSE = Root mean squared error
RRSE = Root relative squared error
MAE = Mean absolute error
RAE = Relative absolute error
COMB = Combined = (1-abs(CC)) + RRSE + RAE
ACC = Accuracy
KAP = Kappa
-y-property <option> The Y option to test (without leading dash). (default: classifier.ridge)
-y-min <num> The minimum for Y. (default: -10)
-y-max <num> The maximum for Y. (default: +5)
-y-step <num> The step size for Y. (default: 1)
-y-base <num> The base for Y. (default: 10)
-y-expression <expr> The expression for Y. Available parameters: BASE, FROM, TO, STEP and I (the current iteration value, from 'FROM' to 'TO' with step size 'STEP'). (default: 'pow(BASE,I)')
-filter <filter specification> The filter to use (on X axis). Full classname of filter to include, followed by scheme options. (default: weka.filters.supervised.attribute.PLSFilter)
-x-property <option> The X option to test (without leading dash). (default: filter.numComponents)
-x-min <num> The minimum for X. (default: +5)
-x-max <num> The maximum for X. (default: +20)
-x-step <num> The step size for X. (default: 1)
-x-base <num> The base for X. (default: 10)
-x-expression <expr> The expression for the X value. Available parameters: BASE, MIN, MAX, STEP and I (the current iteration value, from 'FROM' to 'TO' with step size 'STEP'). (default: 'pow(BASE,I)')
-extend-grid Whether the grid can be extended. (default: no)
-max-grid-extensions <num> The maximum number of grid extensions (-1 is unlimited). (default: 3)
-sample-size <num> The size (in percent) of the sample to search the initial grid with. (default: 100)
-traversal <ROW-WISE|COLUMN-WISE> The type of traversal for the grid. (default: COLUMN-WISE)
-log-file <filename> The log file to log the messages to. (default: none)
-S <num> Random number seed. (default 1)
-D If set, classifier is run in debug mode and may output additional info to the console
-W Full name of base classifier. (default: weka.classifiers.functions.LinearRegression)
Options specific to classifier weka.classifiers.functions.LinearRegression:
-D Produce debugging output. (default no debugging output)
-S <number of selection method> Set the attribute selection method to use. 1 = None, 2 = Greedy. (default 0 = M5' method)
-C Do not try to eliminate colinear attributes.
-R <double> Set ridge parameter (default 1.0e-8).
Options specific to filter weka.filters.supervised.attribute.PLSFilter ('-filter'):
-D Turns on output of debugging information.
-C <num> The number of components to compute. (default: 20)
-U Updates the class attribute as well. (default: off)
-M Turns replacing of missing values on. (default: off)
-A <SIMPLS|PLS1> The algorithm to use. (default: PLS1)
-P <none|center|standardize> The type of preprocessing that is applied to the data. (default: center)
Examples:
Set the filter to weka.filters.AllFilter, since we don't need any special data processing and we don't optimize the filter in this case (the data always gets passed through the filter).
Set weka.classifiers.functions.SMO as classifier with weka.classifiers.functions.supportVector.RBFKernel as kernel.
Set the filter to weka.filters.supervised.attribute.PLSFilter.
Set weka.classifiers.functions.LinearRegression as classifier and use no attribute selection and no elimination of colinear attributes.
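To make the axis options above more concrete, the following sketch expands a minimum, maximum, step size and base into the values tried along one axis, using the documented default expression 'pow(BASE,I)'. The class `GridAxisSketch` and the method `axisValues` are hypothetical names; GridSearch itself evaluates the expression string, which this sketch replaces with a direct call to `Math.pow`.

```java
final class GridAxisSketch {

    /**
     * Returns pow(base, I) for I = min, min + step, ..., up to max.
     * Mirrors the default expression 'pow(BASE,I)' only.
     */
    static double[] axisValues(double min, double max, double step, double base) {
        if (step <= 0 || max < min) {
            throw new IllegalArgumentException("Need step > 0 and max >= min");
        }
        // Small epsilon guards against floating-point error in the division.
        int count = (int) Math.floor((max - min) / step + 1e-9) + 1;
        double[] values = new double[count];
        for (int k = 0; k < count; k++) {
            double i = min + k * step;   // the current iteration value I
            values[k] = Math.pow(base, i);
        }
        return values;
    }

    public static void main(String[] args) {
        // Documented Y defaults: -y-min -10, -y-max 5, -y-step 1, -y-base 10.
        double[] ridge = axisValues(-10, 5, 1, 10);
        System.out.println(ridge.length + " values from " + ridge[0]
            + " to " + ridge[ridge.length - 1]);
    }
}
```

With the documented Y defaults this yields 16 candidate values for the ridge parameter, spaced logarithmically between 10^-10 and 10^5.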
See Also: PLSFilter, LinearRegression, NumericCleaner, Serialized Form
Field Summary | |
---|---|
static int | EVALUATION_ACC - evaluation via: Accuracy |
static int | EVALUATION_CC - evaluation via: Correlation coefficient |
static int | EVALUATION_COMBINED - evaluation via: Combined = (1-abs(CC)) + RRSE + RAE |
static int | EVALUATION_KAPPA - evaluation via: kappa statistic |
static int | EVALUATION_MAE - evaluation via: Mean absolute error |
static int | EVALUATION_RAE - evaluation via: Relative absolute error |
static int | EVALUATION_RMSE - evaluation via: Root mean squared error |
static int | EVALUATION_RRSE - evaluation via: Root relative squared error |
static java.lang.String | PREFIX_CLASSIFIER - the prefix to indicate that the option is for the classifier |
static java.lang.String | PREFIX_FILTER - the prefix to indicate that the option is for the filter |
static Tag[] | TAGS_EVALUATION - evaluation |
static Tag[] | TAGS_TRAVERSAL - traversal |
static int | TRAVERSAL_BY_COLUMN - column-wise grid traversal |
static int | TRAVERSAL_BY_ROW - row-wise grid traversal |
Constructor Summary | |
---|---|
GridSearch() - the default constructor |
Method Summary | |
---|---|
void | buildClassifier(Instances data) - builds the classifier |
double | classifyInstance(Instance instance) - Classifies the given instance. |
java.util.Enumeration | enumerateMeasures() - Returns an enumeration of the measure names. |
java.lang.String | evaluationTipText() - Returns the tip text for this property |
java.lang.String | filterTipText() - Returns the tip text for this property |
Classifier | getBestClassifier() - returns the best Classifier setup |
Filter | getBestFilter() - returns the best filter setup |
Capabilities | getCapabilities() - Returns default capabilities of the classifier. |
SelectedTag | getEvaluation() - Gets the criterion used for evaluating the classifier performance. |
Filter | getFilter() - Get the kernel filter. |
int | getGridExtensionsPerformed() - returns the number of grid extensions that took place during the search (only applicable if the grid was extendable). |
boolean | getGridIsExtendable() - Get whether the grid can be extended dynamically. |
java.io.File | getLogFile() - Gets current log file. |
int | getMaxGridExtensions() - Gets the maximum number of grid extensions, -1 for unlimited. |
double | getMeasure(java.lang.String measureName) - Returns the value of the named measure |
java.lang.String[] | getOptions() - returns the options of the current setup |
java.lang.String | getRevision() - Returns the revision string. |
double | getSampleSizePercent() - Gets the sample size for the initial grid search. |
SelectedTag | getTraversal() - Gets the type of traversal for the grid. |
weka.classifiers.meta.GridSearch.PointDouble | getValues() - returns the parameter pair that was found to work best |
double | getXBase() - Get the value of the base for X. |
java.lang.String | getXExpression() - Get the expression for the X value. |
double | getXMax() - Get the value of the Maximum of X. |
double | getXMin() - Get the value of the minimum of X. |
java.lang.String | getXProperty() - Get the X property to test (normally the filter). |
double | getXStep() - Get the value of the step size for X. |
double | getYBase() - Get the value of the base for Y. |
java.lang.String | getYExpression() - Get the expression for the Y value. |
double | getYMax() - Get the value of the Maximum of Y. |
double | getYMin() - Get the value of the minimum of Y. |
java.lang.String | getYProperty() - Get the Y property (normally the classifier). |
double | getYStep() - Get the value of the step size for Y. |
java.lang.String | globalInfo() - Returns a string describing classifier |
java.lang.String | gridIsExtendableTipText() - Returns the tip text for this property |
java.util.Enumeration | listOptions() - Gets an enumeration describing the available options. |
java.lang.String | logFileTipText() - Returns the tip text for this property |
static void | main(java.lang.String[] args) - Main method for running this classifier from commandline. |
java.lang.String | maxGridExtensionsTipText() - Returns the tip text for this property |
java.lang.String | sampleSizePercentTipText() - Returns the tip text for this property |
void | setClassifier(Classifier newClassifier) - Set the base learner. |
void | setEvaluation(SelectedTag value) - Sets the criterion to use for evaluating the classifier performance. |
void | setFilter(Filter value) - Set the kernel filter (only used for setup). |
void | setGridIsExtendable(boolean value) - Set whether the grid can be extended dynamically. |
void | setLogFile(java.io.File value) - Sets the log file to use. |
void | setMaxGridExtensions(int value) - Sets the maximum number of grid extensions, -1 for unlimited. |
void | setOptions(java.lang.String[] options) - Parses the options for this object. |
void | setSampleSizePercent(double value) - Sets the sample size for the initial grid search. |
void | setTraversal(SelectedTag value) - Sets the type of traversal for the grid. |
void | setXBase(double value) - Set the value of the base for X. |
void | setXExpression(java.lang.String value) - Set the expression for the X value. |
void | setXMax(double value) - Set the value of the Maximum of X. |
void | setXMin(double value) - Set the value of the minimum of X. |
void | setXProperty(java.lang.String value) - Set the X property. |
void | setXStep(double value) - Set the value of the step size for X. |
void | setYBase(double value) - Set the value of the base for Y. |
void | setYExpression(java.lang.String value) - Set the expression for the Y value. |
void | setYMax(double value) - Set the value of the Maximum of Y. |
void | setYMin(double value) - Set the value of the minimum of Y. |
void | setYProperty(java.lang.String value) - Set the Y property (normally the classifier). |
void | setYStep(double value) - Set the value of the step size for Y. |
java.lang.String | toString() - returns a string representation of the classifier |
java.lang.String | toSummaryString() - Returns a string that summarizes the object. |
java.lang.String | traversalTipText() - Returns the tip text for this property |
java.lang.String | XBaseTipText() - Returns the tip text for this property |
java.lang.String | XExpressionTipText() - Returns the tip text for this property |
java.lang.String | XMaxTipText() - Returns the tip text for this property |
java.lang.String | XMinTipText() - Returns the tip text for this property |
java.lang.String | XPropertyTipText() - Returns the tip text for this property |
java.lang.String | XStepTipText() - Returns the tip text for this property |
java.lang.String | YBaseTipText() - Returns the tip text for this property |
java.lang.String | YExpressionTipText() - Returns the tip text for this property |
java.lang.String | YMaxTipText() - Returns the tip text for this property |
java.lang.String | YMinTipText() - Returns the tip text for this property |
java.lang.String | YPropertyTipText() - Returns the tip text for this property |
java.lang.String | YStepTipText() - Returns the tip text for this property |
Methods inherited from class weka.classifiers.RandomizableSingleClassifierEnhancer |
---|
getSeed, seedTipText, setSeed |
Methods inherited from class weka.classifiers.SingleClassifierEnhancer |
---|
classifierTipText, getClassifier |
Methods inherited from class weka.classifiers.Classifier |
---|
debugTipText, distributionForInstance, forName, getDebug, makeCopies, makeCopy, setDebug |
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Field Detail |
---|
public static final int EVALUATION_CC
public static final int EVALUATION_RMSE
public static final int EVALUATION_RRSE
public static final int EVALUATION_MAE
public static final int EVALUATION_RAE
public static final int EVALUATION_COMBINED
public static final int EVALUATION_ACC
public static final int EVALUATION_KAPPA
public static final Tag[] TAGS_EVALUATION
public static final int TRAVERSAL_BY_ROW
public static final int TRAVERSAL_BY_COLUMN
public static final Tag[] TAGS_TRAVERSAL
public static final java.lang.String PREFIX_CLASSIFIER
public static final java.lang.String PREFIX_FILTER
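As a worked illustration of the combined measure behind EVALUATION_COMBINED, the '-E' option text gives it as (1-abs(CC)) + RRSE + RAE. The sketch below only restates that formula; the class `CombinedMeasureSketch` is a hypothetical name, and the inputs are treated as plain numbers (the scale on which GridSearch feeds RRSE and RAE into the sum is not stated in this page).

```java
final class CombinedMeasureSketch {

    /**
     * Combined measure as given in the option description:
     * (1 - abs(CC)) + RRSE + RAE. Smaller values indicate a better setup,
     * since a strong correlation and small errors all reduce the sum.
     */
    static double combined(double cc, double rrse, double rae) {
        return (1 - Math.abs(cc)) + rrse + rae;
    }

    public static void main(String[] args) {
        // A strongly (negatively) correlated model with moderate errors.
        System.out.println(combined(-0.9, 0.5, 0.4));
        // A perfect fit: correlation 1 and no error.
        System.out.println(combined(1.0, 0.0, 0.0));
    }
}
```

Taking the absolute value means that a strong negative correlation is rewarded just like a strong positive one.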
Constructor Detail |
---|
public GridSearch()
Method Detail |
---|
public java.lang.String globalInfo()
public java.util.Enumeration listOptions()
Specified by: listOptions in interface OptionHandler
Overrides: listOptions in class RandomizableSingleClassifierEnhancer
public java.lang.String[] getOptions()
Specified by: getOptions in interface OptionHandler
Overrides: getOptions in class RandomizableSingleClassifierEnhancer
public void setOptions(java.lang.String[] options) throws java.lang.Exception
Parses the options for this object. The valid options are the ones listed in the class description above.
Specified by: setOptions in interface OptionHandler
Overrides: setOptions in class RandomizableSingleClassifierEnhancer
Parameters: options - the options to use
Throws: java.lang.Exception - if setting of options fails
public void setClassifier(Classifier newClassifier)
Overrides: setClassifier in class SingleClassifierEnhancer
Parameters: newClassifier - the classifier to use.
public java.lang.String filterTipText()
public void setFilter(Filter value)
Parameters: value - the kernel filter.
public Filter getFilter()
public java.lang.String evaluationTipText()
public void setEvaluation(SelectedTag value)
Parameters: value - the evaluation criterion.
public SelectedTag getEvaluation()
public java.lang.String YPropertyTipText()
public java.lang.String getYProperty()
public void setYProperty(java.lang.String value)
Parameters: value - the Y property.
public java.lang.String YMinTipText()
public double getYMin()
public void setYMin(double value)
Parameters: value - Value to use as minimum of Y.
public java.lang.String YMaxTipText()
public double getYMax()
public void setYMax(double value)
Parameters: value - Value to use as Maximum of Y.
public java.lang.String YStepTipText()
public double getYStep()
public void setYStep(double value)
Parameters: value - Value to use as the step size for Y.
public java.lang.String YBaseTipText()
public double getYBase()
public void setYBase(double value)
Parameters: value - Value to use as the base for Y.
public java.lang.String YExpressionTipText()
public java.lang.String getYExpression()
public void setYExpression(java.lang.String value)
Parameters: value - Expression for the Y value.
public java.lang.String XPropertyTipText()
public java.lang.String getXProperty()
public void setXProperty(java.lang.String value)
Parameters: value - the X property.
public java.lang.String XMinTipText()
public double getXMin()
public void setXMin(double value)
Parameters: value - Value to use as minimum of X.
public java.lang.String XMaxTipText()
public double getXMax()
public void setXMax(double value)
Parameters: value - Value to use as Maximum of X.
public java.lang.String XStepTipText()
public double getXStep()
public void setXStep(double value)
Parameters: value - Value to use as the step size for X.
public java.lang.String XBaseTipText()
public double getXBase()
public void setXBase(double value)
Parameters: value - Value to use as the base for X.
public java.lang.String XExpressionTipText()
public java.lang.String getXExpression()
public void setXExpression(java.lang.String value)
Parameters: value - Expression for the X value.
public java.lang.String gridIsExtendableTipText()
public boolean getGridIsExtendable()
public void setGridIsExtendable(boolean value)
Parameters: value - whether the grid can be extended dynamically.
public java.lang.String maxGridExtensionsTipText()
public int getMaxGridExtensions()
public void setMaxGridExtensions(int value)
Parameters: value - the maximum of grid extensions.
public java.lang.String sampleSizePercentTipText()
public double getSampleSizePercent()
public void setSampleSizePercent(double value)
Parameters: value - the sample size for the initial grid search.
public java.lang.String traversalTipText()
public void setTraversal(SelectedTag value)
Parameters: value - the traversal type
public SelectedTag getTraversal()
public java.lang.String logFileTipText()
public java.io.File getLogFile()
public void setLogFile(java.io.File value)
Parameters: value - the log file.
public Filter getBestFilter()
public Classifier getBestClassifier()
public java.util.Enumeration enumerateMeasures()
Specified by: enumerateMeasures in interface AdditionalMeasureProducer
public double getMeasure(java.lang.String measureName)
Specified by: getMeasure in interface AdditionalMeasureProducer
Parameters: measureName - the name of the measure to query for its value
public weka.classifiers.meta.GridSearch.PointDouble getValues()
public int getGridExtensionsPerformed()
See Also: getGridIsExtendable()
public Capabilities getCapabilities()
Specified by: getCapabilities in interface CapabilitiesHandler
Overrides: getCapabilities in class SingleClassifierEnhancer
See Also: Capabilities
public void buildClassifier(Instances data) throws java.lang.Exception
Specified by: buildClassifier in class Classifier
Parameters: data - the training instances
Throws: java.lang.Exception - if something goes wrong
public double classifyInstance(Instance instance) throws java.lang.Exception
Overrides: classifyInstance in class Classifier
Parameters: instance - the test instance
Throws: java.lang.Exception - if classification can't be done successfully
public java.lang.String toString()
Overrides: toString in class java.lang.Object
public java.lang.String toSummaryString()
Specified by: toSummaryString in interface Summarizable
public java.lang.String getRevision()
Specified by: getRevision in interface RevisionHandler
public static void main(java.lang.String[] args)
Parameters: args - the options