skdim.id.DANCo

class skdim.id.DANCo(k=10, D=None, calibration_data=None, ver='DANCo', fractal=True, verbose=False, random_state=None)[source]

Intrinsic dimension estimation using the Dimensionality from Angle and Norm Concentration algorithm. [Ceruti2012] [IDLombardi] [IDJohnsson]

Parameters:
k : int, default=10

Neighborhood parameter.

D : int, default=None

Maximal dimension.

ver : str, default='DANCo'

Version to use. Possible values: ‘DANCo’, ‘MIND_MLi’, ‘MIND_MLk’.

calibration_data : dict, default=None

Precomputed calibration data.

fractal : bool, default=True

Whether to return a fractal (non-integer) dimension rather than an integer dimension.

verbose : bool, default=False

Methods

fit(X[, y]) Fit the estimator to X and store the estimated intrinsic dimension in self.dimension_.
fit_pw(X[, precomputed_knn, smooth, …]) Creates an array of pointwise ID estimates (self.dimension_pw_) by fitting the estimator on the k-nearest-neighbor (kNN) neighborhood of each point.
fit_transform(X[, y]) Fit estimator and return ID
fit_transform_pw(X[, precomputed_knn, …]) Returns an array of pointwise ID estimates by fitting the estimator on the k-nearest-neighbor (kNN) neighborhood of each point.
get_params([deep]) Get parameters for this estimator.
set_params(**params) Set the parameters of this estimator.
transform([X]) Predict dimension after a previous call to self.fit
transform_pw([X]) Return an array of pointwise ID estimates after a previous call to self.fit_pw
fit(X, y=None)[source]

Fit the estimator to X.

Parameters:
X : {array-like}, shape (n_samples, n_features)

A data set for which the intrinsic dimension is estimated.

y : dummy parameter to respect the sklearn API

Returns:

  • self (object) – Returns self.
  • self.dimension_ (int (or float if fractal is True)) – The estimated intrinsic dimension
  • self.kl_divergence (float) – The KL divergence between data and reference data for the estimated dimension (if ver == ‘DANCo’).
  • self.calibration_data (dict) – Calibration data that can be reused when applying DANCo to data sets of the same size with the same neighborhood parameter k.

fit_pw(X, precomputed_knn=None, smooth=False, n_neighbors=100, n_jobs=1)

Creates an array of pointwise ID estimates (self.dimension_pw_) by fitting the estimator on the k-nearest-neighbor (kNN) neighborhood of each point.

Parameters:
X : np.array (n_samples x n_features)

Dataset to fit

precomputed_knn : np.array (n_samples x n_neighbors), default=None

An array of precomputed (sorted) nearest neighbor indices

n_neighbors : int, default=100

Number of nearest neighbors to use (ignored when using precomputed_knn)

n_jobs : int

Number of processes

smooth : bool, default=False

Additionally computes a smoothed version of pointwise estimates by taking the ID of a point as the average ID of each point in its neighborhood (self.dimension_pw_smooth_)

Returns:

self (object) – Returns self
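
The smoothing described for smooth=True can be sketched in plain NumPy as follows. This is an illustration of the averaging idea, not the library's implementation: the function name smooth_pointwise_id is hypothetical, and including each point in its own neighborhood is an assumption.

```python
import numpy as np

def smooth_pointwise_id(dimension_pw, knn_indices):
    """Average pointwise ID estimates over each point's neighborhood.

    dimension_pw : array of shape (n_samples,), one ID estimate per point.
    knn_indices  : array of shape (n_samples, n_neighbors), neighbor indices.
    """
    dimension_pw = np.asarray(dimension_pw, dtype=float)
    knn_indices = np.asarray(knn_indices, dtype=int)
    # Assumption: the neighborhood of a point includes the point itself.
    self_idx = np.arange(len(dimension_pw))[:, None]
    neighborhoods = np.hstack([self_idx, knn_indices])
    # Look up the estimate of every neighborhood member and average per row.
    return dimension_pw[neighborhoods].mean(axis=1)

# Toy input: four points, two neighbors each.
dimension_pw = np.array([2.0, 2.0, 3.0, 5.0])
knn_indices = np.array([[1, 2], [0, 2], [1, 3], [2, 1]])
smoothed = smooth_pointwise_id(dimension_pw, knn_indices)
```

Averaging over neighborhoods trades spatial resolution for stability: isolated outlier estimates, such as the value 5.0 above, are pulled toward the estimates of nearby points.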

fit_transform(X, y=None)

Fit estimator and return ID

Parameters:
X : {array-like}, shape (n_samples, n_features)

The training input samples.

Returns:

dimension_ ({int, float}) – The estimated intrinsic dimension

fit_transform_pw(X, precomputed_knn=None, smooth=False, n_neighbors=100, n_jobs=1)

Returns an array of pointwise ID estimates by fitting the estimator on the k-nearest-neighbor (kNN) neighborhood of each point.

Parameters:
X : np.array (n_samples x n_features)

Dataset to fit

precomputed_knn : np.array (n_samples x n_neighbors), default=None

An array of precomputed (sorted) nearest neighbor indices

n_neighbors : int, default=100

Number of nearest neighbors to use (ignored when using precomputed_knn)

n_jobs : int

Number of processes

smooth : bool, default=False

Additionally computes a smoothed version of pointwise estimates by taking the ID of a point as the average ID of each point in its neighborhood (self.dimension_pw_smooth_)

Returns:

  • dimension_pw_ (np.array with dtype {int, float}) – Pointwise ID estimates
  • dimension_pw_smooth_ (np.array with dtype float) – Smoothed pointwise ID estimates, returned only if smooth=True

get_params(deep=True)

Get parameters for this estimator.

Parameters:
deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params (dict) – Parameter names mapped to their values.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**params : dict

Estimator parameters.

Returns:

self (estimator instance) – Estimator instance.
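
The <component>__<parameter> naming convention can be illustrated with a small standalone sketch. The helper split_nested_params and the component name "estimator" are hypothetical and exist only for this example; the sketch shows how such keys separate into own and nested parameters, not how the library itself is implemented.

```python
def split_nested_params(params):
    """Group parameters by the component they target.

    Keys without "__" belong to the estimator itself; keys of the form
    "<component>__<parameter>" are grouped under their component name.
    """
    own, nested = {}, {}
    for key, value in params.items():
        # partition splits at the first "__" only, so deeper nesting
        # ("a__b__c") stays intact in sub_key for the component to resolve.
        component, sep, sub_key = key.partition("__")
        if sep:
            nested.setdefault(component, {})[sub_key] = value
        else:
            own[key] = value
    return own, nested

own, nested = split_nested_params(
    {"k": 20, "estimator__fractal": False, "estimator__ver": "MIND_MLk"}
)
print(own)     # {'k': 20}
print(nested)  # {'estimator': {'fractal': False, 'ver': 'MIND_MLk'}}
```

For a standalone DANCo instance only plain names apply, for example set_params(k=20, fractal=False).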

transform(X=None)

Predict dimension after a previous call to self.fit

Parameters:
X : Dummy parameter

Returns:

dimension_ ({int, float}) – The estimated ID

transform_pw(X=None)

Return an array of pointwise ID estimates after a previous call to self.fit_pw

Parameters:
X : Dummy parameter

Returns:

  • dimension_pw_ (np.array with dtype {int, float}) – Pointwise ID estimates
  • dimension_pw_smooth_ (np.array with dtype float) – Smoothed pointwise ID estimates, returned only if self.fit_pw was called with smooth=True