Class GNB
Gaussian Naive Bayes Classifier.
GNB is a probabilistic classifier that relies on Bayes' rule to
estimate posterior probabilities of labels given the data. Its naive
assumption is that the features are independent (given the class),
which allows per-feature likelihoods to be combined by a simple
product across the "independent" features.
See http://en.wikipedia.org/wiki/Naive_bayes for more information.
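As a minimal sketch of the rule described above (not the code of this class), the following computes posteriors for one sample by multiplying per-feature Gaussian likelihoods with the class prior. The names `gaussian_pdf` and `naive_posterior` and all numbers are illustrative assumptions.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate normal distribution at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def naive_posterior(sample, means, variances, priors):
    """Return normalized posteriors P(label | sample), one per label.

    means[c][f] and variances[c][f] describe feature f within class c.
    """
    joint = []
    for c, prior in enumerate(priors):
        likelihood = 1.0
        # Naive assumption: features are independent given the class,
        # so the joint likelihood is a product of per-feature terms.
        for f, x in enumerate(sample):
            likelihood *= gaussian_pdf(x, means[c][f], variances[c][f])
        joint.append(prior * likelihood)
    evidence = sum(joint)  # P(data)
    return [j / evidence for j in joint]


# Hypothetical two-class, two-feature model
posteriors = naive_posterior(
    sample=[0.9, 2.1],
    means=[[1.0, 2.0], [-1.0, 0.0]],
    variances=[[1.0, 1.0], [1.0, 1.0]],
    priors=[0.5, 0.5],
)
```

The sample lies close to the mean of the first class, so the first posterior dominates. Multiplying many small densities can underflow, which is one reason to work with log-probabilities instead.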
The implementation provided here is "naive" in its own right: various
aspects could be improved, but it has its own advantages:
- the implementation is simple and straightforward
- no data is copied when considering the samples of a specific class
- it provides alternative ways to assess the prior distribution of the
  classes in the case of unbalanced sets of samples (see parameter
  prior)
- it makes use of the NumPy broadcasting mechanism, so it should be
  relatively efficient
- it should work for samples of any dimensionality
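The broadcasting point can be illustrated with a short sketch. It is an assumption about how such a computation might be vectorized, not the class's actual code: the function name `gnb_log_likelihoods` is hypothetical, samples are assumed to be 2-D (samples x features), and the boolean-mask indexing used to build the toy model does copy data, for brevity.

```python
import numpy as np


def gnb_log_likelihoods(data, means, variances):
    """Log-likelihood of every sample under every class.

    data:      (nsamples, nfeatures)
    means:     (nclasses, nfeatures)
    variances: (nclasses, nfeatures)
    returns:   (nsamples, nclasses)
    """
    # Singleton axes make the operands broadcast to
    # (nsamples, nclasses, nfeatures) without explicit loops.
    diff = data[:, np.newaxis, :] - means[np.newaxis, :, :]
    var = variances[np.newaxis, :, :]
    per_feature = -0.5 * (np.log(2.0 * np.pi * var) + diff ** 2 / var)
    # Independence: sum the log-likelihoods over the feature axis.
    return per_feature.sum(axis=-1)


# Toy data: 6 samples, 3 features, 2 classes
rng = np.random.RandomState(0)
data = rng.normal(size=(6, 3))
labels = np.array([0, 0, 0, 1, 1, 1])
ulabels = np.unique(labels)
means = np.array([data[labels == l].mean(axis=0) for l in ulabels])
variances = np.array([data[labels == l].var(axis=0) for l in ulabels])

loglik = gnb_log_likelihoods(data, means, variances)
```

Each entry `loglik[i, c]` equals the sum over features of the univariate Gaussian log-density of sample `i` under class `c`.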
GNB is listed as both a linear and a non-linear classifier, since the
specifics of the separating boundary depend on the data and/or
parameters: linear separation is achieved whenever samples are
balanced (or prior='uniform') and features have the same variance
across different classes (i.e. when common_variance=True enforces
this).
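To see why a shared variance gives a linear boundary, the sketch below compares the log-odds of two classes computed directly with the closed form w.x + b that remains once the quadratic terms cancel. The function `log_odds` and all numeric values are illustrative assumptions, with equal priors for both classes.

```python
import numpy as np


def log_odds(x, mu0, mu1, var0, var1, prior0=0.5, prior1=0.5):
    """log P(c0 | x) - log P(c1 | x) for a Gaussian naive Bayes model."""
    ll0 = -0.5 * np.sum(np.log(2 * np.pi * var0) + (x - mu0) ** 2 / var0)
    ll1 = -0.5 * np.sum(np.log(2 * np.pi * var1) + (x - mu1) ** 2 / var1)
    return (ll0 + np.log(prior0)) - (ll1 + np.log(prior1))


mu0 = np.array([1.0, 2.0])
mu1 = np.array([-1.0, 0.5])
var = np.array([0.5, 2.0])  # per-feature variance shared by both classes

# With a shared variance the x**2 terms cancel, leaving w.x + b.
w = (mu0 - mu1) / var
b = -0.5 * np.sum((mu0 ** 2 - mu1 ** 2) / var)

x = np.array([0.3, -1.2])
linear_value = np.dot(w, x) + b
direct_value = log_odds(x, mu0, mu1, var, var)
```

With class-specific variances the x**2 terms no longer cancel, and the boundary becomes quadratic in the features.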
Whenever decisions are made based on log-probabilities (parameter
logprob=True, which is the default), the state variable values,
if enabled, also contains log-probabilities. Note also
that normalization by the evidence (P(data)) is disabled by
default, since it has no impact on the classification decision.
Set the parameter normalize to True if you want to
access properly scaled probabilities in the values state variable.
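The following sketch illustrates why normalization does not affect the decision. The array `log_joint` holds hypothetical values of log(P(data | c) * P(c)), and subtracting the row maximum before exponentiating is a choice made here for numerical stability, not a description of what the class does internally.

```python
import numpy as np

# Hypothetical per-class log joint values for two samples (rows)
log_joint = np.array([[-1050.0, -1052.0, -1049.5],
                      [-3.2, -0.7, -5.1]])

# Predictions need only the argmax, so no normalization is required.
predicted = log_joint.argmax(axis=1)

# Normalizing by the evidence P(data) = sum_c P(data | c) P(c) requires
# leaving the log domain; subtracting the row maximum first avoids underflow.
shifted = np.exp(log_joint - log_joint.max(axis=1, keepdims=True))
posteriors = shifted / shifted.sum(axis=1, keepdims=True)
```

Each row of `posteriors` sums to one, and its argmax matches `predicted`, because dividing by a positive per-sample constant preserves the ordering of the classes.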
Inherited from base.Classifier :
__repr__ ,
__str__ ,
clone ,
getSensitivityAnalyzer ,
isTrained ,
predict ,
repredict ,
retrain ,
summary ,
train ,
trained
Inherited from misc.state.ClassWithCollections :
__getattribute__ ,
__new__ ,
__setattr__ ,
reset
Inherited from object :
__delattr__ ,
__format__ ,
__hash__ ,
__reduce__ ,
__reduce_ex__ ,
__sizeof__ ,
__subclasshook__
|
|
_clf_internals = ['gnb', 'linear', 'non-linear', 'binary', 'mu...
Describes some specifics of the classifier, for instance whether it
performs regression....
|
|
common_variance = Parameter(False, allowedtype= 'bool', doc= "...
|
|
prior = Parameter('laplacian_smoothing', allowedtype= 'basestr...
|
|
logprob = Parameter(True, allowedtype= 'bool', doc= """Operate...
|
|
normalize = Parameter(False, allowedtype= 'bool', doc= """Norm...
|
Inherited from base.Classifier :
_DEV__doc__ ,
feature_ids ,
predicting_time ,
predictions ,
regression ,
retrainable ,
trained_dataset ,
trained_labels ,
trained_nsamples ,
training_confusion ,
training_time ,
values
Inherited from misc.state.ClassWithCollections :
descr
|
|
means
Means of features per class
|
|
variances
Variances of features per class (named "variances" because "vars" is already taken)
|
|
ulabels
Labels classifier was trained on
|
|
priors
Class probabilities
|
Inherited from object :
__class__
|
Initialize a GNB classifier.
- Overrides:
object.__init__
|
_clf_internals
Describes some specifics of the classifier, for instance whether it
performs regression....
- Value:
['gnb', 'linear', 'non-linear', 'binary', 'multiclass']
|
|
common_variance
- Value:
Parameter(False, allowedtype= 'bool',
doc= """Use the same variance across all classes.""")
|
|
prior
- Value:
Parameter('laplacian_smoothing', allowedtype= 'basestring',
choices= ["laplacian_smoothing", "uniform", "ratio"],
doc= """How to compute prior distribution.""")
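The three choices could be read as follows. This is a sketch based on the usual meaning of these names (equal priors, observed class frequencies, and add-one smoothing), which is an assumption; the function `estimate_priors` is hypothetical and the class may compute its priors differently.

```python
import numpy as np


def estimate_priors(labels, prior="laplacian_smoothing"):
    """Return (unique labels, prior probability per label)."""
    ulabels, counts = np.unique(labels, return_counts=True)
    nsamples = float(counts.sum())
    nlabels = len(ulabels)
    if prior == "uniform":
        # Every class is equally likely, regardless of its sample count.
        priors = np.ones(nlabels) / nlabels
    elif prior == "ratio":
        # Observed class frequencies.
        priors = counts / nsamples
    elif prior == "laplacian_smoothing":
        # Add-one smoothing: pulls the frequencies toward uniform.
        priors = (counts + 1.0) / (nsamples + nlabels)
    else:
        raise ValueError("Unknown prior %r" % (prior,))
    return ulabels, priors


# Unbalanced toy labels: three samples of "a", one of "b"
labels = ["a", "a", "a", "b"]
_, p_uniform = estimate_priors(labels, "uniform")
_, p_ratio = estimate_priors(labels, "ratio")
_, p_laplace = estimate_priors(labels, "laplacian_smoothing")
```

For these labels the smoothed estimate falls between the uniform priors and the raw ratios.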
|
|
logprob
- Value:
Parameter(True, allowedtype= 'bool',
doc= """Operate on log probabilities. Preferable to avoid unneeded
exponentiation and loose precision.
If set, logprobs are stored in `values`""")
|
|
normalize
- Value:
Parameter(False, allowedtype= 'bool',
doc= """Normalize (log)prob by P(data). Requires probabilities thus
for `logprob` case would require exponentiation of 'logprob's, thus
disabled by default since does not impact classification output.
""")