Package mvpa :: Package clfs :: Module distance

Module distance


Distance functions to be used in kernels and elsewhere
Functions
 
cartesianDistance(a, b)
Return Cartesian distance between a and b
 
absminDistance(a, b)
Returns the distance max(|a-b|). XXX There must be a better name!
 
manhattenDistance(a, b)
Return Manhattan distance between a and b
 
mahalanobisDistance(x, y=None, w=None)
Calculate Mahalanobis distance between pairs of points.
 
squared_euclidean_distance(data1, data2=None, weight=None)
Compute weighted Euclidean distance matrix between two datasets.
 
pnorm_w_python(data1, data2=None, weight=None, p=2, heuristic='auto', use_sq_euclidean=True)
Weighted p-norm between two datasets (pure Python implementation)
 
pnorm_w(data1, data2=None, weight=None, p=2, heuristic='auto', use_sq_euclidean=True)
Weighted p-norm between two datasets

Imports: N, externals, debug, warning, weave, converters


Function Details

absminDistance(a, b)


Returns the distance max(|a-b|). XXX There must be a better name!

Useful to select a whole cube of a given "radius"
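The max of per-dimension absolute differences is better known as the Chebyshev (L-infinity) distance; all points within a given distance of a center form an axis-aligned cube, which is what makes it useful for cube selection. A minimal NumPy sketch (the function name `absmin_distance` here is illustrative, not the module's own):

```python
import numpy as N

def absmin_distance(a, b):
    # Chebyshev (L-infinity) distance: the largest per-dimension
    # absolute difference between the two points
    return N.max(N.abs(N.asarray(a) - N.asarray(b)))

# Component-wise differences are 1, 3, 2 -> the distance is 3,
# so the point lies inside a cube of "radius" 3 around the origin
print(absmin_distance([0, 0, 0], [1, -3, 2]))  # 3
```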

mahalanobisDistance(x, y=None, w=None)


Calculate Mahalanobis distance between pairs of points.

The inverse covariance matrix can be calculated with either of the following:

w = N.linalg.solve(N.cov(x.T), N.identity(x.shape[1]))

or

w = N.linalg.inv(N.cov(x.T))
Parameters:
  • x - first list of points. Rows are samples, columns are features.
  • y - second list of points (optional)
  • w (N.ndarray) - optional inverse covariance matrix between the points. It is computed if not given

squared_euclidean_distance(data1, data2=None, weight=None)

Compute weighted euclidean distance matrix between two datasets.
Parameters:
  • data1 (N.ndarray) - first dataset
  • data2 (N.ndarray) - second dataset. If None, compute the Euclidean distance of the first dataset against itself. (Defaults to None)
  • weight (N.ndarray) - vector of weights, each one associated to each dimension of the dataset (Defaults to None)
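A minimal NumPy sketch of a pairwise weighted squared Euclidean distance matrix, using the identity ||a-b||^2 = ||a||^2 + ||b||^2 - 2 a.b applied per weighted feature (the name `squared_euclidean` and its exact signature are assumptions for illustration):

```python
import numpy as N

def squared_euclidean(data1, data2=None, weight=None):
    # Pairwise sum_i weight_i * (a_i - b_i)**2 between all rows of
    # data1 and data2, computed via the expanded dot-product identity
    data1 = N.asarray(data1, dtype=float)
    data2 = data1 if data2 is None else N.asarray(data2, dtype=float)
    if weight is None:
        weight = N.ones(data1.shape[1])
    w1 = data1 * weight
    norm1 = N.sum(w1 * data1, axis=1)               # sum_i w_i a_i^2
    norm2 = N.sum(data2 * weight * data2, axis=1)   # sum_i w_i b_i^2
    d = norm1[:, None] + norm2[None, :] - 2.0 * w1.dot(data2.T)
    return N.clip(d, 0.0, None)  # guard against tiny negative round-off

a = N.array([[0.0, 0.0], [1.0, 1.0]])
b = N.array([[1.0, 0.0]])
print(squared_euclidean(a, b))  # both rows are at squared distance 1
```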

pnorm_w_python(data1, data2=None, weight=None, p=2, heuristic='auto', use_sq_euclidean=True)


Weighted p-norm between two datasets (pure Python implementation)

||x - x'||_w = (sum_{i=1...N} (w_i*|x_i - x'_i|)**p)**(1/p)

Parameters:
  • data1 (N.ndarray) - First dataset
  • data2 (N.ndarray or None) - Optional second dataset
  • weight (N.ndarray or None) - Optional weights per 2nd dimension (features)
  • p - Power
  • heuristic (basestring) -
    Which heuristic to use:
    • 'samples' -- python sweep over 0th dim
    • 'features' -- python sweep over 1st dim
    • 'auto' decides automatically. If # of features (shape[1]) is much larger than # of samples (shape[0]) -- use 'samples', and use 'features' otherwise
  • use_sq_euclidean (bool) - Whether to use squared_euclidean_distance for the computation when p==2
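The formula above, applied to a single pair of points, can be sketched in a few lines of NumPy (the helper name `pnorm_w_pair` is illustrative; the module's functions return a full distance matrix between two datasets):

```python
import numpy as N

def pnorm_w_pair(x1, x2, weight=None, p=2):
    # ||x - x'||_w = (sum_i (w_i * |x_i - x'_i|)**p)**(1/p)
    x1 = N.asarray(x1, dtype=float)
    x2 = N.asarray(x2, dtype=float)
    if weight is None:
        weight = N.ones(len(x1))
    return float(N.sum((weight * N.abs(x1 - x2)) ** p) ** (1.0 / p))

print(pnorm_w_pair([0, 0], [3, 4]))                 # 5.0 (Euclidean)
print(pnorm_w_pair([0, 0], [3, 4], p=1))            # 7.0 (Manhattan)
print(pnorm_w_pair([0, 0], [3, 4], weight=[2, 1]))  # sqrt(6**2 + 4**2)
```

With p=2 and unit weights this reduces to the ordinary Euclidean distance, which is why the `use_sq_euclidean` shortcut is worthwhile in that case.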

pnorm_w(data1, data2=None, weight=None, p=2, heuristic='auto', use_sq_euclidean=True)


Weighted p-norm between two datasets

||x - x'||_w = (sum_{i=1...N} (w_i*|x_i - x'_i|)**p)**(1/p)

Parameters:
  • data1 (N.ndarray) - First dataset
  • data2 (N.ndarray or None) - Optional second dataset
  • weight (N.ndarray or None) - Optional weights per 2nd dimension (features)
  • p - Power
  • heuristic (basestring) -
    Which heuristic to use:
    • 'samples' -- python sweep over 0th dim
    • 'features' -- python sweep over 1st dim
    • 'auto' decides automatically. If # of features (shape[1]) is much larger than # of samples (shape[0]) -- use 'samples', and use 'features' otherwise

  • use_sq_euclidean (bool) - Whether to use squared_euclidean_distance for the computation when p==2