Public Member Functions | |
COligoKernel (int32_t cache_size, int32_t k, float64_t width) | |
~COligoKernel () | |
virtual bool | init (CFeatures *l, CFeatures *r) |
virtual bool | load_init (FILE *) |
virtual bool | save_init (FILE *) |
virtual EKernelType | get_kernel_type () |
virtual const char * | get_name () |
virtual float64_t | compute (int32_t x, int32_t y) |
Static Protected Member Functions | |
static void | encodeOligo (const std::string &sequence, uint32_t k_mer_length, const std::string &allowed_characters, std::vector< std::pair< int32_t, float64_t > > &values) |
encodes the signals of the sequence | |
static void | getSequences (const std::vector< std::string > &sequences, uint32_t k_mer_length, const std::string &allowed_characters, std::vector< std::vector< std::pair< int32_t, float64_t > > > &encoded_sequences) |
encodes all sequences with the encodeOligo function and stores them in 'encoded_sequences' | |
static void | getExpFunctionCache (float64_t sigma, uint32_t sequence_length, std::vector< float64_t > &cache) |
prepares the exp function cache of the oligo kernel | |
static float64_t | kernelOligoFast (const std::vector< std::pair< int32_t, float64_t > > &x, const std::vector< std::pair< int32_t, float64_t > > &y, const std::vector< float64_t > &exp_cache, int32_t max_distance=-1) |
returns the value of the oligo kernel for sequences 'x' and 'y' | |
static float64_t | kernelOligo (const std::vector< std::pair< int32_t, float64_t > > &x, const std::vector< std::pair< int32_t, float64_t > > &y, float64_t sigma_square) |
returns the value of the oligo kernel for sequences 'x' and 'y' | |
Protected Attributes | |
int32_t | k |
float64_t | width |
The class provides functions to preprocess the data so that the kernel can be computed faster. The kernel function itself is then either kernelOligoFast or kernelOligo.
The implementation still requires a significant speedup. It should work, but as it stands it is probably applicable only to small-scale academic problems.
Uses CSqrtDiagKernelNormalizer, as the vanilla kernel seems to be very diagonally dominant.
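As a rough illustration of why a square-root diagonal normalizer helps with a diagonally dominant kernel, the following minimal sketch rescales a kernel value so that every sequence has unit self-similarity, k'(x,y) = k(x,y) / sqrt(k(x,x) k(y,y)). The function name is hypothetical and this is not the CSqrtDiagKernelNormalizer code itself.

```cpp
#include <cmath>

// Hypothetical helper (not part of the library): rescales k(x,y) by the
// square root of the two diagonal entries, so that k'(x,x) == 1.
// Assumes k_xx and k_yy are strictly positive.
double sqrt_diag_normalize_sketch(double k_xy, double k_xx, double k_yy)
{
    return k_xy / std::sqrt(k_xx * k_yy);
}
```

After this rescaling the diagonal of the kernel matrix is constant, so large self-similarities no longer swamp the off-diagonal entries.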
Definition at line 39 of file OligoKernel.h.
COligoKernel::COligoKernel | ( | int32_t | cache_size, | |
int32_t | k, | |||
float64_t | width | |||
) |
Constructor
cache_size | cache size for kernel | |
k | k-mer length | |
width | sigma^2 |
Definition at line 24 of file OligoKernel.cpp.
COligoKernel::~COligoKernel | ( | ) |
Destructor
Definition at line 30 of file OligoKernel.cpp.
float64_t COligoKernel::compute | ( | int32_t | x, | |
int32_t | y | |||
) | [virtual] |
compute the kernel function for features a and b, where idx_a and idx_b denote the indices of the feature vectors in the corresponding feature objects
abstract base method
x | index a | |
y | index b |
Implements CKernel.
Definition at line 262 of file OligoKernel.cpp.
static void COligoKernel::encodeOligo | ( | const std::string & | sequence, | |
uint32_t | k_mer_length, | |||
const std::string & | allowed_characters, | |||
std::vector< std::pair< int32_t, float64_t > > & | values | |||
) | [static, protected] |
encodes the signals of the sequence
This function stores the oligo function signals in 'values'.
The 'k_mer_length' and the 'allowed_characters' determine which signals are used. Every pair contains the position of the signal and a numerical value identifying the signal. The numerical value encodes the k_mer as a number in base n = |allowed_characters|, where each character is replaced by its index in 'allowed_characters'. Example: for the allowed characters ACGT (n = 4, so C has index 1 and G has index 2), the value of the k_mer CG is 1 * n^1 + 2 * n^0 = 6.
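The base-n encoding described above can be sketched as follows. This is a minimal illustration, not the library's encodeOligo: the function name is hypothetical, it returns the pairs instead of filling an output parameter, and skipping k_mers that contain a character outside 'allowed' is an assumption.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: encodes every k-mer of `sequence` as a pair
// (start position, base-n value) with n = allowed.size().
// Assumption: k-mers containing a character not in `allowed` are skipped.
std::vector<std::pair<int32_t, double>> encode_kmers_sketch(
    const std::string& sequence, uint32_t k, const std::string& allowed)
{
    std::vector<std::pair<int32_t, double>> values;
    if (k == 0 || sequence.size() < k)
        return values;

    const double n = static_cast<double>(allowed.size());
    for (std::size_t pos = 0; pos + k <= sequence.size(); ++pos)
    {
        double value = 0.0;
        bool valid = true;
        for (uint32_t j = 0; j < k; ++j)
        {
            // The digit of a character is its index in `allowed`.
            const std::size_t digit = allowed.find(sequence[pos + j]);
            if (digit == std::string::npos)
            {
                valid = false;
                break;
            }
            value = value * n + static_cast<double>(digit);  // Horner's rule
        }
        if (valid)
            values.emplace_back(static_cast<int32_t>(pos), value);
    }
    return values;
}
```

For the sequence ACGT with k = 2 and allowed characters ACGT, this yields the pairs (0, 1), (1, 6) and (2, 11) for AC, CG and GT.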
virtual EKernelType COligoKernel::get_kernel_type | ( | ) | [virtual] |
return what type of kernel we are
Implements CKernel.
Definition at line 82 of file OligoKernel.h.
virtual const char* COligoKernel::get_name | ( | ) | [virtual] |
return the kernel's name
Implements CKernel.
Definition at line 88 of file OligoKernel.h.
static void COligoKernel::getExpFunctionCache | ( | float64_t | sigma, | |
uint32_t | sequence_length, | |||
std::vector< float64_t > & | cache | |||
) | [static, protected] |
prepares the exp function cache of the oligo kernel
The oligo kernel was introduced for sequences of fixed length. Let n be the length of sequences x and y. There can only be n different distances between a signal in sequence x and a signal in sequence y (0, 1, ..., n-1). Therefore, the corresponding values of the exponential function are precomputed once. These values can then be looked up in kernelOligoFast.
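A minimal sketch of such a cache is shown below. It assumes the Gaussian form exp(-d^2 / (4 sigma^2)) for a distance d, which is the usual form of the oligo kernel; the function name and the return-by-value interface are hypothetical and differ from getExpFunctionCache.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical helper: cache[d] = exp(-d^2 / (4 * sigma^2)) for every
// possible distance d = 0, ..., sequence_length - 1.
// Assumption: this Gaussian form matches the one used by the kernel.
std::vector<double> exp_cache_sketch(double sigma, uint32_t sequence_length)
{
    std::vector<double> cache(sequence_length);
    const double denom = 4.0 * sigma * sigma;
    for (uint32_t d = 0; d < sequence_length; ++d)
    {
        const double dist = static_cast<double>(d);
        cache[d] = std::exp(-dist * dist / denom);
    }
    return cache;
}
```

Because the distance between two positions is always an integer below n, each kernel term then costs one table lookup instead of one call to exp.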
static void COligoKernel::getSequences | ( | const std::vector< std::string > & | sequences, | |
uint32_t | k_mer_length, | |||
const std::string & | allowed_characters, | |||
std::vector< std::vector< std::pair< int32_t, float64_t > > > & | encoded_sequences | |||
) | [static, protected] |
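encodes all sequences with the encodeOligo function and stores them in 'encoded_sequences'
virtual bool COligoKernel::init | ( | CFeatures * | l, | |
CFeatures * | r | |||
) | [virtual] |
initialize kernel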
l | features of left-hand side | |
r | features of right-hand side |
Reimplemented from CStringKernel< char >.
Definition at line 35 of file OligoKernel.cpp.
static float64_t COligoKernel::kernelOligo | ( | const std::vector< std::pair< int32_t, float64_t > > & | x, | |
const std::vector< std::pair< int32_t, float64_t > > & | y, | |||
float64_t | sigma_square | |||
) | [static, protected] |
returns the value of the oligo kernel for sequences 'x' and 'y'
This function computes the kernel value of the oligo kernel, which was introduced by Meinicke et al. in 2004. 'x' and 'y' have to be encoded by encodeOligo.
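The computation can be sketched as a sum over all pairs of matching k_mers, each contributing a Gaussian of the distance between their positions. This is a simplified illustration, not the library code: the function name is hypothetical, the form exp(-d^2 / (4 sigma^2)) is assumed, any constant prefactor is omitted, and a plain double loop is used for clarity.

```cpp
#include <cmath>
#include <cstdint>
#include <utility>
#include <vector>

// (position, k-mer value) pairs, as produced by an encodeOligo-style encoder.
using Encoded = std::vector<std::pair<int32_t, double>>;

// Hypothetical helper: sums exp(-(p - q)^2 / (4 * sigma_square)) over all
// pairs of identical k-mers at position p in x and position q in y.
// Assumption: constant prefactors of the kernel are left out.
double oligo_kernel_sketch(const Encoded& x, const Encoded& y,
                           double sigma_square)
{
    double result = 0.0;
    for (const auto& a : x)
    {
        for (const auto& b : y)
        {
            if (a.second != b.second)
                continue;  // different k-mers do not interact
            const double d = static_cast<double>(a.first) - b.first;
            result += std::exp(-d * d / (4.0 * sigma_square));
        }
    }
    return result;
}
```

The double loop costs O(|x| |y|) comparisons; sorting the pairs by k_mer value would allow a merge-style traversal instead.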
static float64_t COligoKernel::kernelOligoFast | ( | const std::vector< std::pair< int32_t, float64_t > > & | x, | |
const std::vector< std::pair< int32_t, float64_t > > & | y, | |||
const std::vector< float64_t > & | exp_cache, | |||
int32_t | max_distance = -1 | |||
) | [static, protected] |
returns the value of the oligo kernel for sequences 'x' and 'y'
This function computes the kernel value of the oligo kernel, which was introduced by Meinicke et al. in 2004. 'x' and 'y' are encoded by encodeOligo and 'exp_cache' has to be constructed by getExpFunctionCache.
'max_distance' can be used to speed up the computation even further by restricting the maximum distance between a k_mer at position i in sequence 'x' and a k_mer at position j in sequence 'y'. If |i - j| > 'max_distance', the value is not added to the kernel value. This approximation is switched off by default (max_distance < 0).
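The effect of the cache and of 'max_distance' can be sketched as follows. This is a simplified illustration with a hypothetical function name, not the library's kernelOligoFast; it assumes 'exp_cache' is indexed by the absolute distance between the two positions and ignores distances that fall outside the cache.

```cpp
#include <cstdint>
#include <cstdlib>
#include <utility>
#include <vector>

// (position, k-mer value) pairs, as produced by an encodeOligo-style encoder.
using Encoded = std::vector<std::pair<int32_t, double>>;

// Hypothetical helper: like the plain oligo kernel sum, but each term is a
// lookup exp_cache[|i - j|], and terms with |i - j| > max_distance are
// dropped. A negative max_distance disables the cutoff.
double oligo_kernel_fast_sketch(const Encoded& x, const Encoded& y,
                                const std::vector<double>& exp_cache,
                                int32_t max_distance = -1)
{
    double result = 0.0;
    for (const auto& a : x)
    {
        for (const auto& b : y)
        {
            if (a.second != b.second)
                continue;  // different k-mers do not interact
            const int32_t dist = std::abs(a.first - b.first);
            if (max_distance >= 0 && dist > max_distance)
                continue;  // approximation: ignore distant pairs
            // Assumption: distances beyond the cache are simply skipped.
            if (static_cast<std::size_t>(dist) < exp_cache.size())
                result += exp_cache[dist];
        }
    }
    return result;
}
```

Since the Gaussian decays quickly with distance, the dropped terms are small whenever 'max_distance' is a few multiples of sigma.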
virtual bool COligoKernel::load_init | ( | FILE * | ) | [virtual] |
load kernel init_data
Implements CKernel.
Definition at line 64 of file OligoKernel.h.
virtual bool COligoKernel::save_init | ( | FILE * | ) | [virtual] |
save kernel init_data
Implements CKernel.
Definition at line 73 of file OligoKernel.h.
int32_t COligoKernel::k [protected] |
member variable k
Definition at line 177 of file OligoKernel.h.
float64_t COligoKernel::width [protected] |
width of kernel
Definition at line 179 of file OligoKernel.h.