Package mvpa :: Package datasets :: Module splitter :: Class NFoldSplitter

Class NFoldSplitter


Generic N-fold data splitter.

XXX: This docstring is a shame for such an important class!

Constructor information for NFoldSplitter class

Initialize the N-fold splitter.

Documentation for base classes of NFoldSplitter

Documentation for class Splitter

Base class of dataset splitters.

Each splitter should be initialized with all its necessary parameters. The actual splitting is done by running the splitter object on a Dataset via __call__(). This method has to be implemented as a generator, i.e. it has to hand back every possible split with a yield statement.

Each split has to be returned as a sequence of Datasets. The properties of the split datasets may vary between implementations. A sequence element may be None.

Please note that even if only one Dataset is returned, it has to be an element of a sequence and not just the bare Dataset object!
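The protocol described above can be sketched in plain Python. This is only an illustration, not the mvpa implementation: ToyDataset, its select() method, and ToyLeaveOneOutSplitter are hypothetical stand-ins, and the real Dataset and Splitter classes carry considerably more state.

```python
class ToyDataset:
    """Hypothetical stand-in for a Dataset: samples plus one attribute value each."""

    def __init__(self, samples, chunks):
        self.samples = list(samples)
        self.chunks = list(chunks)

    def select(self, keep):
        """Return a new ToyDataset with the samples whose chunk value is in `keep`."""
        pairs = [(s, c) for s, c in zip(self.samples, self.chunks) if c in keep]
        return ToyDataset([s for s, _ in pairs], [c for _, c in pairs])


class ToyLeaveOneOutSplitter:
    """Hypothetical splitter that follows the generator protocol."""

    def __call__(self, dataset):
        unique = sorted(set(dataset.chunks))
        for held_out in unique:
            rest = [c for c in unique if c != held_out]
            # Each split is yielded as a sequence of datasets, never a bare dataset.
            yield (dataset.select(rest), dataset.select([held_out]))


ds = ToyDataset(samples=[10, 11, 12, 13], chunks=[0, 0, 1, 2])
splits = list(ToyLeaveOneOutSplitter()(ds))
```

Because __call__ is a generator, the splits are produced lazily; wrapping the call in list() as above is only needed when all splits are wanted at once.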

Instance Methods
 
__init__(self, cvtype=1, **kwargs)
Initialize the N-fold splitter.
 
__str__(self)
String summary of the object
 
_getSplitConfig(self, uniqueattrs)
Returns proper split configuration for N-M fold split.

Inherited from Splitter: __call__, setNPerLabel, splitDataset, splitcfg

Inherited from object: __delattr__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__

Class Variables
  __doc__ = enhancedDocString('NFoldSplitter', locals(), Splitter)
Properties

Inherited from object: __class__

Method Details

__init__(self, cvtype=1, **kwargs)
(Constructor)

Initialize the N-fold splitter.
Parameters:
  • nperlabel - Number of dataset samples per label to be included in each split. Two special strings are recognized: 'all' uses all available samples (default) and 'equal' uses the maximum number of samples that can be provided by all of the classes. This value may also be given as a sequence whose length matches the number of datasets per split; each element then sets the configuration for the respective dataset in each split.
  • nrunspersplit, int - Number of times samples for each split are chosen. This is mostly useful if a subset of the available samples is used in each split and the subset is randomly selected for each run (see the nperlabel argument).
  • permute - If set to True, the labels of each generated dataset will be permuted on a per-chunk basis.
  • attr - Sample attribute used to determine splits.
Overrides: object.__init__
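
As an illustration of the nperlabel values described above, the following sketch resolves a setting into a per-label sample count. The helper resolve_nperlabel is hypothetical and not part of mvpa; it assumes that 'equal' means the size of the smallest class, since that is the largest count every class can supply, and it does not cover the sequence form.

```python
from collections import Counter


def resolve_nperlabel(labels, nperlabel="all"):
    """Hypothetical helper: map each label to the number of samples to draw."""
    counts = Counter(labels)
    if nperlabel == "all":
        # Use every available sample of each label.
        return dict(counts)
    if nperlabel == "equal":
        # Largest count that all classes can provide is the smallest class size.
        smallest = min(counts.values())
        return {label: smallest for label in counts}
    # Otherwise treat the value as a fixed number of samples per label.
    return {label: int(nperlabel) for label in counts}


labels = ["a", "a", "a", "b", "b"]
per_label = resolve_nperlabel(labels, "equal")
```

With 'equal', fewer samples than available are used for the larger class, which is the situation where nrunspersplit greater than 1 becomes useful: each run can draw a different random subset.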

__str__(self)
(Informal representation operator)

String summary of the object
Overrides: object.__str__

_getSplitConfig(self, uniqueattrs)

Returns proper split configuration for N-M fold split.
Overrides: Splitter._getSplitConfig
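
The page does not spell out what an N-M fold split configuration looks like. One plausible reading, sketched below, is that every combination of M unique attribute values is held out once while the remaining values form the other dataset. The function nm_fold_split_config, its parameter m, and the (rest, held_out) return layout are assumptions made for illustration; the source does not document how cvtype maps to M or what structure _getSplitConfig actually returns.

```python
from itertools import combinations


def nm_fold_split_config(uniqueattrs, m=1):
    """Hypothetical sketch: one (rest, held_out) pair per combination of m values."""
    values = sorted(uniqueattrs)
    config = []
    for held_out in combinations(values, m):
        # Everything not held out goes into the first dataset of the split.
        rest = [v for v in values if v not in held_out]
        config.append((rest, list(held_out)))
    return config


cfg = nm_fold_split_config([0, 1, 2, 3], m=2)
```

Under this reading, m=1 gives ordinary leave-one-out over the attribute values, and larger m grows the number of splits as the binomial coefficient of N over M.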