Package mvpa :: Package clfs :: Module base :: Class Classifier

Class Classifier


Abstract classifier class to be inherited by all classifiers

Required behavior:

Every classifier must be instantiable without having to specify the training patterns.

Repeated calls to the train() method with different training data have to result in a valid classifier, trained for the particular dataset.

It must be possible to specify all classifier parameters as keyword arguments to the constructor.
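
The three requirements above can be illustrated with a small, self-contained sketch. It does not use mvpa itself: SketchClassifier, NearestMeanClassifier and the max_distance parameter are hypothetical, and a dataset is assumed to be a plain (samples, labels) pair rather than a real Dataset.

```python
class SketchClassifier:
    """Hypothetical stand-in for the abstract base class (not mvpa code)."""

    def __init__(self, **kwargs):
        # Requirement 3: every parameter arrives as a keyword argument.
        self.params = dict(kwargs)
        self._trained = False

    def train(self, dataset):
        # Requirement 2: forget any earlier fit before training again.
        self._trained = False
        self._train(dataset)
        self._trained = True

    def _train(self, dataset):
        raise NotImplementedError


class NearestMeanClassifier(SketchClassifier):
    """Toy 1-D classifier: predicts the label whose class mean is closest."""

    def __init__(self, max_distance=float("inf"), **kwargs):
        super().__init__(max_distance=max_distance, **kwargs)
        self.means = {}

    def _train(self, dataset):
        samples, labels = dataset
        sums, counts = {}, {}
        for x, y in zip(samples, labels):
            sums[y] = sums.get(y, 0.0) + x
            counts[y] = counts.get(y, 0) + 1
        # Rebuilt from scratch on every call, so nothing leaks between fits.
        self.means = {y: sums[y] / counts[y] for y in sums}

    def predict(self, data):
        if not self._trained:
            raise RuntimeError("classifier has to be trained first")
        out = []
        for x in data:
            label = min(self.means, key=lambda y: abs(self.means[y] - x))
            near = abs(self.means[label] - x) <= self.params["max_distance"]
            out.append(label if near else None)
        return out


# Requirement 1: no training patterns are needed at construction time.
clf = NearestMeanClassifier()
clf.train(([0.0, 1.0, 9.0, 10.0], ["a", "a", "b", "b"]))
first = clf.predict([2.0, 8.0])
# Training again on different data yields a classifier valid for that data.
clf.train(([0.0, 1.0, 9.0, 10.0], ["b", "b", "a", "a"]))
second = clf.predict([2.0, 8.0])
```

Because _train rebuilds the class means from scratch, the second call to train() leaves no trace of the first dataset.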

Recommended behavior:

Derived classifiers should provide access to values -- i.e. the information that is ultimately used to determine the predicted class label.

Michael: Maybe it works well if each classifier provides a 'values' state member. This variable is a list as long as, and in the same order as, Dataset.uniquelabels (of the training data). Each item in the list corresponds to the likelihood of a sample belonging to the respective class. However, the semantics might differ between classifiers: e.g. kNN would probably store distances to class neighbours, whereas PLR would store the raw function value of the logistic function. So in the case of kNN low is predictive, and for PLR high is predictive. It is unclear whether this needs to be unified.

As the storage and/or computation of this information might be demanding, its collection should be switchable and off by default.
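
A minimal sketch of such a switchable state is given below. It is not the StateVariable implementation from misc.state: SwitchableState and ToyClassifier are hypothetical, and only the idea of an enable_states argument is borrowed from the constructor parameters documented further down.

```python
class SwitchableState:
    """Hypothetical container that only stores a value while enabled."""

    def __init__(self, enabled=False):
        self.enabled = enabled
        self._value = None

    def set(self, value):
        # Storing is a no-op while the state is disabled.
        if self.enabled:
            self._value = value

    def get(self):
        if not self.enabled:
            raise RuntimeError("state is disabled; enable it before predicting")
        return self._value


class ToyClassifier:
    """Thresholds 1-D samples at 0.5; 'values' keeps the raw scores."""

    def __init__(self, enable_states=()):
        self.predictions = SwitchableState(enabled=True)
        # Off by default, switched on only when requested by name.
        self.values = SwitchableState(enabled="values" in enable_states)

    def predict(self, data):
        scores = [x - 0.5 for x in data]
        labels = ["pos" if s > 0 else "neg" for s in scores]
        self.predictions.set(labels)
        self.values.set(scores)  # kept only if 'values' was enabled
        return labels


quiet = ToyClassifier()
quiet.predict([0.25, 0.75])
verbose = ToyClassifier(enable_states=("values",))
verbose.predict([0.25, 0.75])
```

In this sketch only the storage is switched off. A classifier whose values are expensive to compute could also check the enabled flag before computing them.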

Nomenclature
Nested Classes

Inherited from misc.state.Stateful: __metaclass__

Instance Methods

__init__(self, train2predict=True, regression=False, retrainable=False, **kwargs)
Cheap initialization.

__str__(self)
str(x)

_pretrain(self, dataset)
Functionality prior to training

_posttrain(self, dataset)
Functionality post training

_getFeatureIds(self)
Virtual method to return feature_ids used while training

_train(self, dataset)
Function to be actually overridden in derived classes
 
train(self, dataset)
Train classifier on a dataset

_prepredict(self, data)
Functionality prior to prediction

_postpredict(self, data, result)
Functionality after prediction is computed

_predict(self, data)
Actual prediction

predict(self, data)
Predict on data with the classifier

isTrained(self, dataset=None)
Whether the classifier was already trained.
 
regression(self)

_regressionIsBogus(self)
Some classifiers like BinaryClassifier can't be used for regression

trained(self)
Whether the classifier was already trained

untrain(self)
Reset trained state

train2predict(self)
Whether the classifier has to be trained to predict

getSensitivityAnalyzer(self, **kwargs)
Factory method to return an appropriate sensitivity analyzer for the respective classifier.

_getRetrainable(self)

_setRetrainable(self, value)

Inherited from misc.state.Stateful: __getattribute__, __repr__, __setattr__, reset

Inherited from object: __delattr__, __hash__, __new__, __reduce__, __reduce_ex__

Class Variables
  trained_labels = StateVariable(enabled= True, doc= "Set of uni...
  trained_dataset = StateVariable(enabled= False, doc= "The data...
  training_confusion = StateVariable(enabled= False, doc= "Confu...
  predictions = StateVariable(enabled= True, doc= "Most recent s...
  values = StateVariable(enabled= False, doc= "Internal classifi...
  training_time = StateVariable(enabled= True, doc= "Time (in se...
  predicting_time = StateVariable(enabled= True, doc= "Time (in ...
  feature_ids = StateVariable(enabled= False, doc= "Feature IDS ...
  _clf_internals = []
Describes some specifics about the classifier -- is that it is doing regression for instance....
  retrainable = property(fget= _getRetrainable, fset= _setRetrai...
Instance Variables
  _train2predict
Some classifiers might not need to be trained to predict
  __trainednfeatures
Stores number of features for which classifier was trained.
  _regression
If True - perform regression, not classification
  __retrainable
If True - store anything necessary for efficient retrain
  __trainedidhash
Stores the id of the dataset on which the classifier was trained, so that trained() can signal whether it was already trained on the same dataset
Properties

Inherited from misc.state.Stateful: descr

Inherited from object: __class__

Method Details

__init__(self, train2predict=True, regression=False, retrainable=False, **kwargs)
(Constructor)

Cheap initialization.
Parameters:
  • enable_states - Names of the state variables which should be enabled additionally to default ones
  • disable_states - Names of the state variables which should be disabled
  • descr - Description of the instance
Overrides: object.__init__

__str__(self)
(Informal representation operator)

str(x)
Overrides: object.__str__
(inherited documentation)

_posttrain(self, dataset)

Functionality post training

For instance -- computing confusion matrix
Parameters:
  • dataset (Dataset) - Data which was used for training

_getFeatureIds(self)

Virtual method to return feature_ids used while training

Not intended to be called from anywhere but _posttrain, so the classifier is assumed to be trained at this point

train(self, dataset)

Train classifier on a dataset

Shouldn't be overridden in subclasses unless explicitly needed

predict(self, data)

Predict on data with the classifier

Shouldn't be overridden in subclasses unless explicitly needed. Also, a subclass that wants to reuse its superclass's prediction from within _predict should call the superclass's _predict rather than predict(), since calling predict() there would loop
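
The looping problem described above is easiest to see in a sketch. The following is not the mvpa source: it assumes that predict() simply wraps _predict with the _prepredict and _postpredict hooks in that order, and SketchClassifier, SignClassifier and FlippedSignClassifier are hypothetical.

```python
class SketchClassifier:
    """Hypothetical base class; hook names mirror the methods listed above."""

    def __init__(self):
        self.calls = []

    def _prepredict(self, data):
        self.calls.append("pre")

    def _predict(self, data):
        raise NotImplementedError

    def _postpredict(self, data, result):
        self.calls.append("post")

    def predict(self, data):
        # Public entry point: bookkeeping around the actual prediction.
        self._prepredict(data)
        result = self._predict(data)
        self._postpredict(data, result)
        return result


class SignClassifier(SketchClassifier):
    def _predict(self, data):
        self.calls.append("sign")
        return [1 if x >= 0 else -1 for x in data]


class FlippedSignClassifier(SignClassifier):
    def _predict(self, data):
        # Call the parent's _predict, not predict(): predict() would dispatch
        # back to this very method and recurse without end.
        return [-label for label in super()._predict(data)]


clf = FlippedSignClassifier()
result = clf.predict([-2.0, 3.0])
```

Subclasses override only _predict, so the pre- and post-processing hooks run exactly once per call to predict().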

isTrained(self, dataset=None)

Whether the classifier was already trained.

MUST BE USED WITH CARE IF EVER

regression(self)

Decorators:
  • @property

trained(self)

Whether the classifier was already trained
Decorators:
  • @property

train2predict(self)

Whether the classifier has to be trained to predict
Decorators:
  • @property

Class Variable Details

trained_labels

Value:
StateVariable(enabled= True, doc= "Set of unique labels it has been tr\
ained on")

trained_dataset

Value:
StateVariable(enabled= False, doc= "The dataset it has been trained on\
")

training_confusion

Value:
StateVariable(enabled= False, doc= "Confusion matrix of learning perfo\
rmance")

predictions

Value:
StateVariable(enabled= True, doc= "Most recent set of predictions")

values

Value:
StateVariable(enabled= False, doc= "Internal classifier values the mos\
t recent "+ "predictions are based on")

training_time

Value:
StateVariable(enabled= True, doc= "Time (in seconds) which took classi\
fier to train")

predicting_time

Value:
StateVariable(enabled= True, doc= "Time (in seconds) which took classi\
fier to predict")

feature_ids

Value:
StateVariable(enabled= False, doc= "Feature IDS which were used for th\
e actual training."+ " Some classifiers might internally do feature se\
lection (SMLR)")

retrainable

Value:
property(fget= _getRetrainable, fset= _setRetrainable, doc= "Specifies\
 either classifier should be retrainable")

Instance Variable Details

__trainednfeatures

Stores number of features for which classifier was trained. If None -- it wasn't trained at all
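
As a rough illustration of how the two bookkeeping attributes (number of trained features and dataset id) could support a check like isTrained(dataset=None), consider the sketch below. It is an assumption-laden toy, not the mvpa logic: TrainTracker and its method names are hypothetical, and a dataset is assumed to be a list of equal-length rows.

```python
class TrainTracker:
    """Hypothetical bookkeeping for 'was this classifier trained, and on what?'"""

    def __init__(self):
        self._trained_nfeatures = None  # None means "never trained"
        self._trained_idhash = None

    def mark_trained(self, dataset):
        self._trained_nfeatures = len(dataset[0])
        self._trained_idhash = id(dataset)

    def untrain(self):
        self._trained_nfeatures = None
        self._trained_idhash = None

    def is_trained(self, dataset=None):
        if dataset is None:
            # No dataset given: only report whether any training happened.
            return self._trained_nfeatures is not None
        # Dataset given: require the same feature count and the same object.
        return (self._trained_nfeatures == len(dataset[0])
                and self._trained_idhash == id(dataset))


tracker = TrainTracker()
data_a = [[0.0, 1.0], [1.0, 0.0]]
data_b = [[0.0, 1.0], [1.0, 0.0]]  # equal content, but a different object
before = tracker.is_trained()
tracker.mark_trained(data_a)
```

Note that an identity check of this kind cannot detect a dataset that was modified in place after training, and it treats an equal copy as a different dataset.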