Package mvpa :: Package misc :: Module data_generators

Module data_generators

Miscellaneous data generators for unit tests and demos
Functions
 
multipleChunks(func, n_chunks, *args, **kwargs)
Replicate a dataset multiple times, placing each replica in a different chunk
 
dumbFeatureDataset()
Create a very simple dataset with 2 features and 3 labels
 
dumbFeatureBinaryDataset()
Very simple binary (2 labels) dataset
 
normalFeatureDataset(perlabel=50, nlabels=2, nfeatures=4, nchunks=5, means=None, nonbogus_features=None, snr=1.0)
Generate a dataset where each label is some normally distributed beastie around specified mean (0 if None).
 
pureMultivariateSignal(patterns, signal2noise=1.5, chunks=None)
Create a 2d dataset with a clear multivariate signal, but no univariate information.
 
normalFeatureDataset__(dataset=None, labels=None, nchunks=None, perlabel=50, activation_probability_steps=1, randomseed=None, randomvoxels=False)
NOT FINISHED
 
getMVPattern(s2n)
Simple multivariate dataset
 
wr1996(size=200)
Generate '6d robot arm' dataset (Williams and Rasmussen 1996)
 
sinModulated(n_instances, n_features, flat=False, noise=0.4)
Generate a (quite) complex multidimensional non-linear dataset
 
chirpLinear(n_instances, n_features=4, n_nonbogus_features=2, data_noise=0.4, noise=0.1)
Generate a simple dataset for linear regression
 
linear_awgn(size=10, intercept=0.0, slope=0.4, noise_std=0.01, flat=False)
Generate a dataset from a linear function with Additive White Gaussian Noise (AWGN).
 
noisy_2d_fx(size_per_fx, dfx, sfx, center, noise_std=1)

Imports: N, Set, Dataset, debug


Function Details

multipleChunks(func, n_chunks, *args, **kwargs)

Replicate a dataset multiple times, placing each replica in a different chunk

Given a randomized (noisy) generator that produces a dataset with a single chunk, call the generator n_chunks times and place each result into a distinct chunk.
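
The replication idea can be illustrated with a minimal standalone sketch. It is not the mvpa implementation: make_single_chunk and multiple_chunks_sketch are hypothetical names, and plain lists stand in for a Dataset.

```python
import random


def make_single_chunk(n_samples, rng):
    """Hypothetical noisy generator: returns (samples, labels) for one chunk."""
    samples = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(n_samples)]
    labels = [i % 2 for i in range(n_samples)]
    return samples, labels


def multiple_chunks_sketch(func, n_chunks, *args, **kwargs):
    """Call func n_chunks times and tag each result with its own chunk id."""
    samples, labels, chunks = [], [], []
    for chunk_id in range(n_chunks):
        chunk_samples, chunk_labels = func(*args, **kwargs)
        samples.extend(chunk_samples)
        labels.extend(chunk_labels)
        # Every sample produced by this call belongs to the same chunk.
        chunks.extend([chunk_id] * len(chunk_samples))
    return samples, labels, chunks


rng = random.Random(0)
samples, labels, chunks = multiple_chunks_sketch(make_single_chunk, 3, 4, rng)
```

Because the generator is noisy, each chunk holds a different random draw rather than an exact copy.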

normalFeatureDataset(perlabel=50, nlabels=2, nfeatures=4, nchunks=5, means=None, nonbogus_features=None, snr=1.0)

Generate a dataset where each label is some normally distributed beastie around specified mean (0 if None).

snr assumes that the signal has a standard deviation of 1.0, so the noise is simply divided by snr.

This is probably a generalization of pureMultivariateSignal, which corresponds to means=[[0, 1], [1, 0]].

Specify either means or nonbogus_features so that the means get assigned accordingly.
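
A rough sketch of the noise scaling described above, using only the standard library. The function name normal_feature_sketch and the seed parameter are hypothetical; chunks and nonbogus_features are left out, and the real function may differ in detail.

```python
import random


def normal_feature_sketch(perlabel=50, nlabels=2, nfeatures=4, means=None,
                          snr=1.0, seed=0):
    """Draw perlabel samples per label around each label's mean."""
    rng = random.Random(seed)
    if means is None:
        # Mean 0 for every feature of every label, as the docstring states.
        means = [[0.0] * nfeatures for _ in range(nlabels)]
    samples, labels = [], []
    for label, mean in enumerate(means):
        for _ in range(perlabel):
            # Unit-variance noise divided by snr, shifted by the label mean.
            samples.append([m + rng.gauss(0.0, 1.0) / snr for m in mean])
            labels.append(label)
    return samples, labels


samples, labels = normal_feature_sketch(perlabel=200, means=[[0, 1], [1, 0]],
                                        snr=10.0)
```

With snr=10.0 the per-feature noise has a standard deviation of about 0.1, so the two labels form tight clusters around [0, 1] and [1, 0].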

wr1996(size=200)

Generate '6d robot arm' dataset (Williams and Rasmussen 1996)

Was originally created in order to test the correctness of the implementation of kernel ARD. For full details see: http://www.gaussianprocess.org/gpml/code/matlab/doc/regression.html#ard

x_1 picked randomly in [-1.932, -0.453]
x_2 picked randomly in [0.534, 3.142]
r_1 = 2.0
r_2 = 1.3
f(x_1,x_2) = r_1 cos(x_1) + r_2 cos(x_1 + x_2) + N(0,0.0025)
etc.

Expected relevances:
ell_1      1.804377
ell_2      1.963956
ell_3      8.884361
ell_4     34.417657
ell_5   1081.610451
ell_6    375.445823
sigma_f    2.379139
sigma_n    0.050835
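
The part of the recipe that is spelled out above can be sketched as follows. This covers only x_1, x_2 and the target f; the remaining inputs hidden behind "etc." are not reproduced. Reading 0.0025 as a variance (standard deviation 0.05) is an assumption, suggested by the expected sigma_n of about 0.05. The name robot_arm_sketch and the seed parameter are hypothetical.

```python
import math
import random

R1, R2 = 2.0, 1.3
# Assumption: N(0, 0.0025) gives a variance, so the standard deviation is 0.05.
NOISE_STD = math.sqrt(0.0025)


def robot_arm_sketch(size=200, seed=0):
    """Sample (x_1, x_2) uniformly and compute the noisy target f."""
    rng = random.Random(seed)
    inputs, targets = [], []
    for _ in range(size):
        x1 = rng.uniform(-1.932, -0.453)
        x2 = rng.uniform(0.534, 3.142)
        f = (R1 * math.cos(x1) + R2 * math.cos(x1 + x2)
             + rng.gauss(0.0, NOISE_STD))
        inputs.append((x1, x2))
        targets.append(f)
    return inputs, targets


inputs, targets = robot_arm_sketch(size=200)
```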

sinModulated(n_instances, n_features, flat=False, noise=0.4)

Generate a (quite) complex multidimensional non-linear dataset

Used for regression testing. In the data, each label is the sine of x^2 plus uniform noise.
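
One possible reading of this description, as a hedged sketch rather than the module's code: x^2 is taken as the sum of squared features, features are drawn uniformly from [0, 1), and the noise is uniform in [0, noise). All three choices are assumptions, the flat option is omitted, and sin_modulated_sketch is a hypothetical name.

```python
import math
import random


def sin_modulated_sketch(n_instances, n_features, noise=0.4, seed=0):
    """Label = sin(sum of squared features) + uniform noise (assumed form)."""
    rng = random.Random(seed)
    data, labels = [], []
    for _ in range(n_instances):
        x = [rng.random() for _ in range(n_features)]
        clean = math.sin(sum(v * v for v in x))
        # Uniform noise in [0, noise), added on top of the clean label.
        labels.append(clean + rng.random() * noise)
        data.append(x)
    return data, labels


data, labels = sin_modulated_sketch(n_instances=20, n_features=3)
```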

chirpLinear(n_instances, n_features=4, n_nonbogus_features=2, data_noise=0.4, noise=0.1)

Generate a simple dataset for linear regression

Generates a chirp signal, populates n_nonbogus_features of the n_features features with it (at a different noise level), and then provides the signal itself, with additional noise, as the labels.
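
A minimal sketch of this construction, under stated assumptions: the chirp is taken to be sin(10*pi*t^2) on [0, 1], data_noise is applied to the features, and noise is applied to the labels. None of these details is confirmed by the text above, and chirp_linear_sketch is a hypothetical name.

```python
import math
import random


def chirp_linear_sketch(n_instances, n_features=4, n_nonbogus_features=2,
                        data_noise=0.4, noise=0.1, seed=0):
    """Embed a chirp in the first n_nonbogus_features noisy features."""
    rng = random.Random(seed)
    # Assumed chirp: a sine whose frequency grows with t (needs n_instances >= 2).
    ts = [i / (n_instances - 1) for i in range(n_instances)]
    signal = [math.sin(10 * math.pi * t * t) for t in ts]
    data = []
    for s in signal:
        # Every feature starts as pure Gaussian noise ...
        row = [rng.gauss(0.0, data_noise) for _ in range(n_features)]
        # ... and only the non-bogus ones also carry the signal.
        for j in range(n_nonbogus_features):
            row[j] += s
        data.append(row)
    # Labels are the signal itself plus independent noise.
    labels = [s + rng.gauss(0.0, noise) for s in signal]
    return data, labels, signal


data, labels, signal = chirp_linear_sketch(n_instances=50)
```

A linear regression fitted to such data should place its weight on the first n_nonbogus_features columns, since the remaining columns are noise only.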

linear_awgn(size=10, intercept=0.0, slope=0.4, noise_std=0.01, flat=False)


Generate a dataset from a linear function with Additive White Gaussian Noise (AWGN). It can be multidimensional if 'slope' is a vector. If flat is True (in 1 dimension), generate equally spaced samples instead of random ones. This is useful for the test phase.
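
A one-dimensional sketch of the model y = intercept + slope * x + noise, for illustration only. The input range [0, 1) and the seed parameter are assumptions, the vector-slope case is omitted, and linear_awgn_sketch is a hypothetical name.

```python
import random


def linear_awgn_sketch(size=10, intercept=0.0, slope=0.4, noise_std=0.01,
                       flat=False, seed=0):
    """Return inputs xs and noisy linear targets ys (1-D case only)."""
    rng = random.Random(seed)
    if flat:
        # Equally spaced inputs; the range [0, 1) is an assumption.
        xs = [i / size for i in range(size)]
    else:
        xs = [rng.random() for _ in range(size)]
    # Additive white Gaussian noise with standard deviation noise_std.
    ys = [intercept + slope * x + rng.gauss(0.0, noise_std) for x in xs]
    return xs, ys


xs, ys = linear_awgn_sketch(size=5, flat=True, noise_std=0.0)
```

With flat=True the inputs form a regular grid, which is convenient for evaluating a fitted regression on reproducible inputs.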