As mentioned before, SHOGUN provides interfaces to several programming languages and toolkits such as Matlab(tm), R, Python and Octave. The following sections give an overview of the static interface commands of SHOGUN. For the static interfaces we tried to keep the command syntax consistent across all languages. Where this was not possible, the subtle differences in syntax and semantics are documented for the respective toolkit. Instead of reading through all of this, we suggest having a look at the large number of examples available in the examples directory of each interface, e.g. R/examples or python/examples.
Overview of Static Interfaces & Testing the Installation
Interface Commands
Command Reference
Since Octave is nowadays up to par with Matlab, a single documentation covering both interfaces is sufficient; it is based on Octave (Matlab can be used synonymously).
To start SHOGUN in Octave, start Octave and check that it is correctly installed by typing (let ">" be the Octave prompt)
sg('help')
inside of Octave. This should show you some help text.
To start SHOGUN in Python, start Python and check that it is correctly installed by typing (let ">" be the Python prompt)
from sg import sg
sg('help')
inside of python. This should show you some help text.
To fire up SHOGUN in R, make sure that you have SHOGUN correctly installed in R. You can check this by typing (let ">" be the R prompt):
library()
inside of R. This command lists all R packages that are installed on your system; you should see an entry like:
sg The SHOGUN Machine Learning Toolbox
After you have made sure that SHOGUN is installed correctly, you can load it via:
library("sg")
You will see some information about the SHOGUN core (compile options etc.). After this command, R and SHOGUN are ready to receive your commands.
In general, all commands in SHOGUN are issued using the function sg(...). To invoke the SHOGUN help command one types:
sg('help')
and a help text appears giving a short description of all commands.
These functions transfer data from the interface to SHOGUN and back. Suppose you have a Matlab matrix or R vector "features" which contains your training data and you want to register this data; you simply type:
Transfer the features to shogun
- set_features
sg('set_features', 'TRAIN|TEST', features[, DNABINFILE|<ALPHABET>])
- add_features
sg('add_features', 'TRAIN|TEST', features[, DNABINFILE|<ALPHABET>])
Features can be char/byte/word/int/real valued matrices, real valued sparse matrices, or strings (lists or cell arrays of strings). When dealing with strings, an alphabet name has to be specified (DNA, RAW, ...). Use 'TRAIN' to tell SHOGUN that this is the data you want to train your classifier on, and 'TEST' for the test data.
In contrast to set_features, add_features will create a combined feature object and append the features to it. This is useful when dealing with a set of different features (real valued and strings) and multiple kernels.
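For example, using the Python static interface a real valued feature matrix might be registered like this (a minimal sketch; the NumPy arrays and their sizes are only for illustration):
from sg import sg
from numpy import random

traindat = random.rand(10, 100)          # some random training data
testdat = random.rand(10, 50)            # some random test data
sg('set_features', 'TRAIN', traindat)    # register the training features
sg('set_features', 'TEST', testdat)      # register the test features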
In case a single string was set using set_features, it can be "multiplexed" by sliding a window over it using
- from_position_list
sg('from_position_list', 'TRAIN|TEST', winsize, shift[, skip])
or
- obtain_from_sliding_window
sg('obtain_from_sliding_window', winsize, skip)
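For illustration, a single DNA string could be registered and then cut into windows like this (a sketch following the signatures above; the string, window size and skip value are arbitrary):
from sg import sg

sg('set_features', 'TRAIN', ['ACGTACGTACGTACGTACGT'], 'DNA')   # a single DNA string
sg('obtain_from_sliding_window', 10, 1)                        # winsize 10, skip 1 (see signature above)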
Deletes the features that were assigned earlier in the current SHOGUN session.
Obtain the Features from shogun
- get_features
[features]=sg('get_features', 'TRAIN|TEST')
One proceeds similarly when assigning labels to the training data and obtaining labels from SHOGUN. The commands
- set_labels
sg('set_labels', 'TRAIN', trainlab)
- get_labels
[labels]=sg('get_labels', 'TRAIN|TEST')
tell SHOGUN that the labels of the assigned training data reside in trainlab, and return the current labels, respectively.
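In Python, a minimal round trip might look like this (a sketch; the two class labels are just an example):
from sg import sg
from numpy import array

trainlab = array([-1.0, 1.0, 1.0, -1.0])   # one label per training example
sg('set_labels', 'TRAIN', trainlab)        # hand the labels to SHOGUN
labels = sg('get_labels', 'TRAIN')         # ... and read them back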
Kernel and distance matrix specific commands, used to create, set and obtain the kernel or distance matrix.
Creating a kernel in shogun
- set_kernel
sg('set_kernel KERNELNAME FEATURETYPE CACHESIZE PARAMETERS')
- add_kernel
sg('add_kernel WEIGHT KERNELNAME FEATURETYPE CACHESIZE PARAMETERS')
Here KERNELNAME is the name of the kernel one wishes to use, FEATURETYPE the type of features (e.g. REAL for standard real valued feature vectors), CACHESIZE the size of the kernel cache in megabytes, and PARAMETERS are kernel-specific additional parameters.
The following kernels are implemented in SHOGUN:
- AUC
- Chi2
- Spectrum
- Const Kernel
- User defined CustomKernel
- Diagonal Kernel
- Kernel from Distance
- Fixed Degree StringKernel
- Gaussian

To work with a gaussian kernel on real values one issues:
sg('set_kernel GAUSSIAN TYPE CACHESIZE SIGMA')
For example:
sg('set_kernel GAUSSIAN REAL 40 1')
creates a gaussian kernel on real values with a cache size of 40MB and a sigma value of one. Available types for the gaussian kernel: REAL, SPARSEREAL.
A linear kernel is created via:
sg('set_kernel LINEAR TYPE CACHESIZE')
For example:
sg('add_kernel 1.0 LINEAR REAL 50')
creates a linear kernel with a cache size of 50MB for real valued data and a weight of 1.0 (the weight matters when several kernels are combined; see the combined kernel sketch below).
Available types for the linear kernel: BYTE, WORD, CHAR, REAL, SPARSEREAL.
- Local Alignment StringKernel
- Locality Improved StringKernel
- Polynomial Kernel

A polynomial kernel is created via:
sg('set_kernel POLY TYPE CACHESIZE DEGREE INHOMOGENE NORMALIZE')
For example:
sg('add_kernel 0.1 POLY REAL 50 3 0')
adds a polynomial kernel. Available types for the polynomial kernel: REAL, CHAR, SPARSEREAL.
- Salzberg Kernel
- Sigmoid Kernel
To work with a sigmoid kernel on real values one issues:
sg('set_kernel SIGMOID TYPE CACHESIZE GAMMA COEFF')
For example:
sg('set_kernel SIGMOID REAL 40 0.1 0.1')
creates a sigmoid kernel on real values with a cache size of 40MB, a gamma value of 0.1 and a coefficient of 0.1. Available types for the sigmoid kernel: REAL.
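The weights passed to add_kernel become relevant when several kernels are combined. A sketch of such a setup, under the assumption that a COMBINED kernel taking only the cache size is available and that each add_features call provides the features for the corresponding subkernel:
from sg import sg
from numpy import random

feat = random.rand(10, 100)                  # one real valued feature set, used by both subkernels
sg('add_features', 'TRAIN', feat)            # features for the first subkernel
sg('add_features', 'TRAIN', feat)            # features for the second subkernel
sg('set_kernel COMBINED 100')                # combined kernel with 100MB cache (assumed syntax)
sg('add_kernel 1.0 LINEAR REAL 100')         # linear subkernel, weight 1.0
sg('add_kernel 0.5 GAUSSIAN REAL 100 2.0')   # Gaussian subkernel, weight 0.5, width 2.0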
Assigns a user defined custom kernel, for which only the upper triangle may be given (DIAG), the full matrix (FULL), or the full matrix which is then internally stored as an upper triangle (FULL2DIAG).
- set_custom_kernel
sg('set_custom_kernel', kernelmatrix, 'DIAG|FULL|FULL2DIAG')
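For instance, a precomputed kernel matrix could be handed over as follows (a sketch; the random symmetric matrix only stands in for a real precomputed kernel):
from sg import sg
from numpy import random

m = random.rand(5, 5)
km = m + m.T                          # a symmetric 5x5 matrix playing the role of a kernel
sg('set_custom_kernel', km, 'FULL')   # pass the full matrix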
The get_kernel_matrix and get_distance_matrix commands return the kernel or distance matrix for the current problem.
- get_distance_matrix
[D]=sg('get_distance_matrix')
- get_kernel_matrix
[K]=sg('get_kernel_matrix')
Here K and D refer to matrix objects in the respective interface.
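For example, once features and a kernel are set (see above), the training kernel matrix can be pulled into the interface (a sketch; the init_kernel argument syntax is an assumption):
from sg import sg
from numpy import random

traindat = random.rand(10, 20)
sg('set_features', 'TRAIN', traindat)
sg('set_kernel GAUSSIAN REAL 40 1.0')
sg('init_kernel', 'TRAIN')            # attach the data to the kernel (assumed syntax)
K = sg('get_kernel_matrix')           # 20x20 matrix of pairwise kernel values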
- new_svm Creates a new SVM instance.
- init_kernel Initializes the kernel on the assigned features.
- svm_train Starts the training of the SVM on the assigned features and kernels.
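Continuing a session in which features, labels and a kernel have already been set, a training run might look like this (a sketch; the init_kernel argument and the SVM name LIGHT are assumptions):
sg('init_kernel', 'TRAIN')   # attach the training data to the kernel (assumed syntax)
sg('new_svm', 'LIGHT')       # create an SVM instance ('LIGHT' is only one possible name)
sg('svm_train')              # train on the registered features and labels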
The get_svm command returns some properties of an SVM, such as the Lagrange multipliers alpha, the bias b and the indices of the support vectors SV (zero based).
- get_svm
[bias, alphas]=sg('get_svm')
- set_svm
sg('set_svm', bias, alphas)
This command returns a list of values, which may need treatment specific to the target interface.
set_svm may be used later on (after creating an SVM classifier) to set alphas and bias again.
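For example, the learned parameters can be fetched after training and put back into a freshly created SVM later on (a sketch; the exact return structure differs between interfaces, as noted above):
bias, alphas = sg('get_svm')    # Lagrange multipliers and bias of the trained SVM
# ... later, after creating a new SVM of the same kind:
sg('set_svm', bias, alphas)     # restore the trained model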
The result of the classification of the test samples is obtained via:
- classify
[result]=sg('classify')
- classify_example
[result]=sg('classify_example', feature_vector_index)
where result is a vector containing the classification result for each data point, while classify_example obtains the output for a single example only (the index is zero based).
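For example, after training and with test data in testdat (a sketch; the init_kernel call on the test data is an assumption):
sg('set_features', 'TEST', testdat)   # register the test samples
sg('init_kernel', 'TEST')             # attach them to the kernel (assumed syntax)
output = sg('classify')               # one decision value per test example
first = sg('classify_example', 0)     # output for the first test example only (zero based)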
- get_hmm
- set_hmm
- hmm_classify
- hmm_classify_example
- hmm_likelihood
- get_viterbi_path
- compute_poim_wd
- get_SPEC_consensus
- get_SPEC_scoring
- get_WD_consensus
- get_WD_scoring
Miscellaneous functions.
- get_version
Returns the SVN version number.
- help
Gives you a help text.
- loglevel LEVEL
Sets the debugging log level - useful to trace errors. LEVEL can be one of ALL, WARN, ERROR:
- ALL: verbose logging output (useful for debugging).
- WARN: less logging output (useful for error search).
- ERROR: only logging output on critical errors.
For example
sg('help')
gives you a list of instructions.
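Similarly, one might temporarily switch to verbose output while tracking down a problem (a sketch; the argument form follows the style of the other commands above):
sg('loglevel', 'ALL')     # verbose logging while debugging
# ... run the commands in question ...
sg('loglevel', 'ERROR')   # back to reporting only critical errors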
Let's get started: equipped with the above information on the basic SHOGUN commands, you are now able to create your own SHOGUN applications.
Let us discuss an example:
sg('set_features', 'TRAIN', traindat)
registers the training samples which reside in traindat.
sg('set_labels', 'TRAIN', trainlab)
registers the training labels.
sg('set_kernel GAUSSIAN REAL 100 1.0')
creates a new Gaussian kernel on real values with a cache size of 100MB and a width of 1.0.
- init_kernel attaches the data to the kernel and does some initialization.
- new_svm creates a new SVM object inside the SHOGUN core.
- the C value of the new SVM is set to 20.0.
- svm_train starts the training on the samples.
- set_features registers the test samples.
- init_kernel attaches the test data to the kernel.
- classify gives you the classification result as a vector.
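Put together as a Python session, the example might look like this (a sketch only: the SVM name LIGHT, the 'c' command for setting C, the init_kernel calls and the data layout are assumptions; random toy data stands in for a real dataset):
from sg import sg
from numpy import concatenate, ones, random

# toy data: two shifted clouds in 2D, one example per column
traindat = concatenate((random.rand(2, 50) - 0.5, random.rand(2, 50) + 0.5), axis=1)
trainlab = concatenate((-ones(50), ones(50)))
testdat = concatenate((random.rand(2, 10) - 0.5, random.rand(2, 10) + 0.5), axis=1)

sg('set_features', 'TRAIN', traindat)    # register the training samples
sg('set_labels', 'TRAIN', trainlab)      # register the training labels
sg('set_kernel GAUSSIAN REAL 100 1.0')   # Gaussian kernel, 100MB cache, width 1.0
sg('init_kernel', 'TRAIN')               # attach the data to the kernel (assumed syntax)
sg('new_svm', 'LIGHT')                   # create a new SVM ('LIGHT' assumed as SVM name)
sg('c', 20.0)                            # set C to 20.0 (assumed command name)
sg('svm_train')                          # train on the samples

sg('set_features', 'TEST', testdat)      # register the test samples
sg('init_kernel', 'TEST')                # attach the test data to the kernel (assumed syntax)
output = sg('classify')                  # classification result as a vector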