
CCommWordStringKernel Class Reference


Detailed Description

The CommWordString kernel may be used to compute the spectrum kernel from strings that have been mapped into unsigned 16-bit integers.

These 16-bit integers correspond to k-mers. To be usable in this kernel they need to be sorted (e.g. via the SortWordString pre-processor).
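To make the mapping concrete, the following is a minimal sketch of how a DNA string could be turned into sorted 16-bit k-mer words. It is not SHOGUN code: the helper names `encode_base` and `sorted_kmer_words` and the letter-to-code assignment (A=0, C=1, G=2, T=3) are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: map one DNA letter to a 2-bit code (assumed ordering).
inline uint16_t encode_base(char c)
{
    switch (c)
    {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        case 'T': return 3;
        default: throw std::invalid_argument("not a DNA letter");
    }
}

// Hypothetical helper: slide a window of length k (1 <= k <= 8) over seq,
// pack every k-mer into one 16-bit word, and return the words sorted.
std::vector<uint16_t> sorted_kmer_words(const std::string& seq, int k)
{
    if (k < 1 || k > 8)
        throw std::invalid_argument("k must be in [1, 8] to fit 16 bits");

    std::vector<uint16_t> words;
    if (seq.size() < static_cast<std::size_t>(k))
        return words; // sequence too short to contain any k-mer

    // Mask that keeps the lowest 2*k bits (all 16 bits when k == 8).
    const uint32_t mask = (1u << (2 * k)) - 1u;
    uint32_t current = 0;
    for (std::size_t i = 0; i < seq.size(); ++i)
    {
        // Shift in the next letter and drop the letter that left the window.
        current = ((current << 2) | encode_base(seq[i])) & mask;
        if (i + 1 >= static_cast<std::size_t>(k))
            words.push_back(static_cast<uint16_t>(current));
    }

    // Sorting is what later allows a single linear pass over two strings.
    std::sort(words.begin(), words.end());
    return words;
}
```

Under these assumptions, `sorted_kmer_words("ACGT", 2)` packs the 2-mers AC, CG and GT into the words 1, 6 and 11.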

It essentially uses the algorithm of the Unix "comm" command (hence the name) to compute:

\[ k({\bf x},{\bf x'})= \Phi_k({\bf x})\cdot \Phi_k({\bf x'}) \]

where $\Phi_k$ maps a sequence ${\bf x}$ that consists of letters in $\Sigma$ to a feature vector of size $|\Sigma|^k$. Each entry of this feature vector counts how often the corresponding k-mer appears in ${\bf x}$.
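The "comm"-style computation can be sketched as a single merge pass over two sorted word arrays. This is an illustrative sketch, not the library's implementation: the function name `comm_word_kernel` is hypothetical, and the behaviour shown for `use_sign` (each shared k-mer contributes 1 regardless of its counts) is an assumption about what "if sign shall be used" means.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Sketch: dot product of two k-mer count vectors, computed directly from
// the sorted k-mer words of both strings without building the vectors.
// Assumption: with use_sign == true every shared k-mer contributes exactly 1.
double comm_word_kernel(const std::vector<uint16_t>& a,
                        const std::vector<uint16_t>& b,
                        bool use_sign = false)
{
    double result = 0.0;
    std::size_t i = 0, j = 0;

    while (i < a.size() && j < b.size())
    {
        if (a[i] < b[j])
            ++i; // word occurs only in a
        else if (a[i] > b[j])
            ++j; // word occurs only in b
        else
        {
            // Word occurs in both: count the run of equal words on each side.
            const uint16_t word = a[i];
            std::size_t count_a = 0, count_b = 0;
            while (i < a.size() && a[i] == word) { ++i; ++count_a; }
            while (j < b.size() && b[j] == word) { ++j; ++count_b; }

            result += use_sign ? 1.0
                               : static_cast<double>(count_a * count_b);
        }
    }
    return result;
}
```

Because both inputs are sorted, the pass is linear in the total number of words, and k-mers that occur in only one string are skipped without contributing.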

Note that this representation is especially tuned to small alphabets (like the 2-bit DNA alphabet), for which it enables spectrum kernels of order up to 8, since 8 letters of 2 bits each fill one 16-bit word.

For this kernel the linadd speedups are implemented efficiently using direct maps.

Definition at line 46 of file CommWordStringKernel.h.

[Inheritance graph for class CCommWordStringKernel]

Public Member Functions

 CCommWordStringKernel (int32_t size, bool use_sign)
 CCommWordStringKernel (CStringFeatures< uint16_t > *l, CStringFeatures< uint16_t > *r, bool use_sign=false, int32_t size=10)
virtual ~CCommWordStringKernel ()
virtual bool init (CFeatures *l, CFeatures *r)
virtual void cleanup ()
virtual EKernelType get_kernel_type ()
virtual const char * get_name () const
virtual bool init_dictionary (int32_t size)
virtual bool init_optimization (int32_t count, int32_t *IDX, float64_t *weights)
virtual bool delete_optimization ()
virtual float64_t compute_optimized (int32_t idx)
virtual void add_to_normal (int32_t idx, float64_t weight)
virtual void clear_normal ()
virtual EFeatureType get_feature_type ()
void get_dictionary (int32_t &dsize, float64_t *&dweights)
virtual float64_t * compute_scoring (int32_t max_degree, int32_t &num_feat, int32_t &num_sym, float64_t *target, int32_t num_suppvec, int32_t *IDX, float64_t *alphas, bool do_init=true)
char * compute_consensus (int32_t &num_feat, int32_t num_suppvec, int32_t *IDX, float64_t *alphas)
void set_use_dict_diagonal_optimization (bool flag)
bool get_use_dict_diagonal_optimization ()

Protected Member Functions

virtual float64_t compute (int32_t idx_a, int32_t idx_b)
virtual float64_t compute_helper (int32_t idx_a, int32_t idx_b, bool do_sort)
virtual float64_t compute_diag (int32_t idx_a)

Protected Attributes

int32_t dictionary_size
float64_t * dictionary_weights
bool use_sign
bool use_dict_diagonal_optimization
int32_t * dict_diagonal_optimization

Friends

class CVarianceKernelNormalizer
class CSqrtDiagKernelNormalizer
class CAvgDiagKernelNormalizer
class CRidgeKernelNormalizer
class CFirstElementKernelNormalizer
class CTanimotoKernelNormalizer
class CDiceKernelNormalizer

Constructor & Destructor Documentation

CCommWordStringKernel ( int32_t  size,
bool  use_sign 
)

constructor

Parameters:
size cache size
use_sign if sign shall be used

Definition at line 19 of file CommWordStringKernel.cpp.

CCommWordStringKernel ( CStringFeatures< uint16_t > *  l,
CStringFeatures< uint16_t > *  r,
bool  use_sign = false,
int32_t  size = 10 
)

constructor

Parameters:
l features of left-hand side
r features of right-hand side
use_sign if sign shall be used
size cache size

Definition at line 28 of file CommWordStringKernel.cpp.

~CCommWordStringKernel (  )  [virtual]

Definition at line 53 of file CommWordStringKernel.cpp.


Member Function Documentation

void add_to_normal ( int32_t  idx,
float64_t  weight 
) [virtual]

add to normal

Parameters:
idx where to add
weight what to add

Reimplemented from CKernel.

Reimplemented in CWeightedCommWordStringKernel.

Definition at line 237 of file CommWordStringKernel.cpp.

void cleanup (  )  [virtual]

clean up kernel

Reimplemented from CKernel.

Reimplemented in CWeightedCommWordStringKernel.

Definition at line 75 of file CommWordStringKernel.cpp.

void clear_normal (  )  [virtual]

clear normal

Reimplemented from CKernel.

Definition at line 282 of file CommWordStringKernel.cpp.

virtual float64_t compute ( int32_t  idx_a,
int32_t  idx_b 
) [protected, virtual]

compute kernel function for features a and b. idx_{a,b} denote the indices of the feature vectors in the corresponding feature object

Parameters:
idx_a index a
idx_b index b
Returns:
computed kernel function at indices a,b

Implements CKernel.

Definition at line 212 of file CommWordStringKernel.h.

char * compute_consensus ( int32_t &  num_feat,
int32_t  num_suppvec,
int32_t *  IDX,
float64_t *  alphas 
)

compute consensus

Parameters:
num_feat number of features
num_suppvec number of support vectors
IDX IDX
alphas alphas
Returns:
computed consensus

Definition at line 494 of file CommWordStringKernel.cpp.

float64_t compute_diag ( int32_t  idx_a  )  [protected, virtual]

helper to compute only diagonal normalization for training

Parameters:
idx_a index a
Returns:
unnormalized diagonal value

Definition at line 81 of file CommWordStringKernel.cpp.

float64_t compute_helper ( int32_t  idx_a,
int32_t  idx_b,
bool  do_sort 
) [protected, virtual]

helper for compute

Parameters:
idx_a index a
idx_b index b
do_sort if sorting shall be performed
Returns:
computed value

Reimplemented in CWeightedCommWordStringKernel.

Definition at line 125 of file CommWordStringKernel.cpp.

float64_t compute_optimized ( int32_t  idx  )  [virtual]

compute optimized

Parameters:
idx index to compute
Returns:
optimized value at given index

Reimplemented from CKernel.

Reimplemented in CWeightedCommWordStringKernel.

Definition at line 322 of file CommWordStringKernel.cpp.

float64_t * compute_scoring ( int32_t  max_degree,
int32_t &  num_feat,
int32_t &  num_sym,
float64_t *  target,
int32_t  num_suppvec,
int32_t *  IDX,
float64_t *  alphas,
bool  do_init = true 
) [virtual]

compute scoring

Parameters:
max_degree maximum degree
num_feat number of features
num_sym number of symbols
target target
num_suppvec number of support vectors
IDX IDX
alphas alphas
do_init if initialization shall be performed
Returns:
computed scores

Reimplemented in CWeightedCommWordStringKernel.

Definition at line 371 of file CommWordStringKernel.cpp.

bool delete_optimization (  )  [virtual]

delete optimization

Returns:
if deleting was successful

Reimplemented from CKernel.

Definition at line 314 of file CommWordStringKernel.cpp.

void get_dictionary ( int32_t &  dsize,
float64_t *&  dweights 
)

get dictionary

Parameters:
dsize dictionary size will be stored in here
dweights dictionary weights will be stored in here

Definition at line 150 of file CommWordStringKernel.h.

virtual EFeatureType get_feature_type (  )  [virtual]

return feature type the kernel can deal with

Returns:
feature type WORD

Reimplemented from CStringKernel< uint16_t >.

Reimplemented in CWeightedCommWordStringKernel.

Definition at line 143 of file CommWordStringKernel.h.

virtual EKernelType get_kernel_type (  )  [virtual]

return what type of kernel we are

Returns:
kernel type COMMWORDSTRING

Implements CKernel.

Reimplemented in CWeightedCommWordStringKernel.

Definition at line 92 of file CommWordStringKernel.h.

virtual const char* get_name (  )  const [virtual]

return the kernel's name

Returns:
name CommWordString

Implements CSGObject.

Reimplemented in CWeightedCommWordStringKernel.

Definition at line 98 of file CommWordStringKernel.h.

bool get_use_dict_diagonal_optimization (  ) 

get_use_dict_diagonal_optimization

Returns:
true if diagonal optimization is on

Definition at line 198 of file CommWordStringKernel.h.

bool init ( CFeatures *  l,
CFeatures *  r 
) [virtual]

initialize kernel

Parameters:
l features of left-hand side
r features of right-hand side
Returns:
if initializing was successful

Reimplemented from CStringKernel< uint16_t >.

Reimplemented in CWeightedCommWordStringKernel.

Definition at line 61 of file CommWordStringKernel.cpp.

bool init_dictionary ( int32_t  size  )  [virtual]

initialize dictionary

Parameters:
size size

Definition at line 42 of file CommWordStringKernel.cpp.

bool init_optimization ( int32_t  count,
int32_t *  IDX,
float64_t *  weights 
) [virtual]

initialize optimization

Parameters:
count count
IDX index
weights weights
Returns:
if initializing was successful

Reimplemented from CKernel.

Definition at line 288 of file CommWordStringKernel.cpp.

void set_use_dict_diagonal_optimization ( bool  flag  ) 

set_use_dict_diagonal_optimization

Parameters:
flag enable diagonal optimization

Definition at line 189 of file CommWordStringKernel.h.


Friends And Related Function Documentation

friend class CAvgDiagKernelNormalizer [friend]

Reimplemented from CKernel.

Definition at line 50 of file CommWordStringKernel.h.

friend class CDiceKernelNormalizer [friend]

Reimplemented from CKernel.

Definition at line 54 of file CommWordStringKernel.h.

friend class CFirstElementKernelNormalizer [friend]

Reimplemented from CKernel.

Definition at line 52 of file CommWordStringKernel.h.

friend class CRidgeKernelNormalizer [friend]

Reimplemented from CKernel.

Definition at line 51 of file CommWordStringKernel.h.

friend class CSqrtDiagKernelNormalizer [friend]

Reimplemented from CKernel.

Definition at line 49 of file CommWordStringKernel.h.

friend class CTanimotoKernelNormalizer [friend]

Reimplemented from CKernel.

Definition at line 53 of file CommWordStringKernel.h.

friend class CVarianceKernelNormalizer [friend]

Reimplemented from CKernel.

Definition at line 48 of file CommWordStringKernel.h.


Member Data Documentation

int32_t* dict_diagonal_optimization [protected]

array to hold counters for all strings

Definition at line 247 of file CommWordStringKernel.h.

int32_t dictionary_size [protected]

size of dictionary (number of possible strings)

Definition at line 236 of file CommWordStringKernel.h.

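float64_t* dictionary_weights [protected]

dictionary weights - array to hold counters for all possible strings

Definition at line 239 of file CommWordStringKernel.h.

bool use_dict_diagonal_optimization [protected]

whether diagonal optimization shall be used

Definition at line 245 of file CommWordStringKernel.h.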

bool use_sign [protected]

if sign shall be used

Definition at line 242 of file CommWordStringKernel.h.


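The documentation for this class was generated from the following files:

CommWordStringKernel.h
CommWordStringKernel.cpp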

SHOGUN Machine Learning Toolbox - Documentation