skdim.id.FisherS

class skdim.id.FisherS(conditional_number=10, project_on_sphere=1, alphas=None, produce_plots=False, verbose=0, limit_maxdim=False)

Intrinsic dimension estimation using the Fisher Separability algorithm. [Albergante2019]

Parameters:

conditional_number : float, default=10: A positive real value used to select the top principal components. Only PCs whose eigenvalues are at least the maximal eigenvalue divided by conditional_number are retained.
project_on_sphere : bool, default=True: Whether to project the data onto a sphere.
alphas : 2D np.array with dtype float, default=None: A row vector of alpha values, given in increasing order within the open interval (0, 1). If None, np.arange(.6,1,.02)[None] is used.
produce_plots : bool, default=False: Whether to draw the standard plots.
verbose : bool: Whether to print the number of retained principal components
limit_maxdim : bool: Whether to cap estimated maxdim to the embedding dimension
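
The role of conditional_number, project_on_sphere and alphas can be illustrated with a small NumPy sketch. This is not the library's code: the helper names `select_components` and `project_on_unit_sphere` are hypothetical, and the centering and eigendecomposition choices are assumptions made for illustration only.

```python
import numpy as np

def select_components(X, conditional_number=10.0):
    """Hypothetical helper: center X and keep the principal components whose
    eigenvalue is at least (largest eigenvalue / conditional_number)."""
    Xc = X - X.mean(axis=0)
    # eigh returns eigenvalues in ascending order, so reorder to descending
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals >= eigvals[0] / conditional_number
    return Xc @ eigvecs[:, keep], eigvals[keep]

def project_on_unit_sphere(X):
    """Hypothetical helper: rescale every row to unit Euclidean norm."""
    return X / np.linalg.norm(X, axis=1, keepdims=True)

rng = np.random.default_rng(0)
# Three directions with unit scale, seven with a much smaller scale
X = rng.normal(size=(500, 10)) * np.array([1.0] * 3 + [0.01] * 7)

Z, kept_eigvals = select_components(X, conditional_number=10)
S = project_on_unit_sphere(Z)

# The documented default alpha range: a row vector (2D array with one row)
alphas = np.arange(.6, 1, .02)[None]
```

With this toy data only the three high-variance directions pass the eigenvalue threshold, so `Z` has three columns.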

Methods

`buildSeparabilityGraph`(data, alpha[, top_edges])	Build a weighted directed separability graph, represented by a list of tuples (point i, point j) and an array of weights. Each tuple records the observation that point i is inseparable from point j; the weight is <x_i,x_j>/<x_i,x_i> - alpha.
`check_symmetric`(a[, rtol, atol])
`fit`(X[, y])	Fit the estimator on X and store the estimated intrinsic dimension in self.dimension_.
`fit_pw`(X[, precomputed_knn, smooth, …])	Creates an array of pointwise ID estimates (self.dimension_pw_) by fitting the estimator in kNN of each point.
`fit_transform`(X[, y])	Fit estimator and return ID
`fit_transform_pw`(X[, precomputed_knn, …])	Returns an array of pointwise ID estimates by fitting the estimator in kNN of each point.
`getSeparabilityGraph`([idx, top_edges])
`get_params`([deep])	Get parameters for this estimator.
`plotSeparabilityGraph`(x, y, edges[, alpha])
`point_inseparability_to_pointID`([idx, …])	Turn pointwise inseparability probability into pointwise global ID.
`set_params`(**params)	Set the parameters of this estimator.
`transform`([X])	Predict dimension after a previous call to self.fit
`transform_pw`([X])	Return an array of pointwise ID estimates after a previous call to self.fit_pw

static buildSeparabilityGraph(data, alpha, top_edges=10000)

Build a weighted directed separability graph, represented by a list of tuples (point i, point j) and an array of weights. Each tuple records the observation that point i is inseparable from point j; the weight is <x_i,x_j>/<x_i,x_i> - alpha.

data is a matrix of data that is assumed to be properly normalized. alpha is a single value in this case. top_edges is the number of edges to return; if top_edges is negative, all edges are returned.
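
The edge weights described above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the library's implementation: the function name `separability_graph_sketch` is hypothetical, an edge is assumed to exist exactly when its weight is positive, and edges are assumed to be ranked by decreasing weight.

```python
import numpy as np

def separability_graph_sketch(data, alpha, top_edges=10000):
    """Hypothetical sketch: return (edges, weights), where the weight of the
    edge (i, j) is <x_i,x_j>/<x_i,x_i> - alpha."""
    gram = data @ data.T                        # gram[i, j] = <x_i, x_j>
    weights = gram / np.diag(gram)[:, None] - alpha
    np.fill_diagonal(weights, -np.inf)          # ignore self-pairs
    # Assumption: point i is inseparable from point j when the weight is > 0
    i_idx, j_idx = np.nonzero(weights > 0)
    w = weights[i_idx, j_idx]
    order = np.argsort(w)[::-1]                 # heaviest edges first
    if top_edges >= 0:                          # negative means "all edges"
        order = order[:top_edges]
    edges = list(zip(i_idx[order].tolist(), j_idx[order].tolist()))
    return edges, w[order]

# Three unit vectors: the first two are close, the third is orthogonal to the first
points = np.array([[1.0, 0.0],
                   [np.cos(0.1), np.sin(0.1)],
                   [0.0, 1.0]])

edges, weights = separability_graph_sketch(points, alpha=0.8, top_edges=-1)
top_edge, top_weight = separability_graph_sketch(points, alpha=0.8, top_edges=1)
```

Only the two nearby points end up connected, in both directions, because their normalized inner product exceeds alpha.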

fit(X, y=None)

Fit the estimator on X and store the estimated intrinsic dimension in self.dimension_.

Parameters:

X : {array-like}, shape (n_samples, n_features): The training input samples.
y : dummy parameter to respect the sklearn API

Returns:

self (object) – Returns self.
self.dimension_ (float) – The estimated intrinsic dimension
self.n_alpha (1D np.array, float) – Effective dimension profile as a function of alpha
self.n_single (float) – A single estimate for the effective dimension
self.p_alpha (2D np.array, float) – Distributions as a function of alpha, matrix with columns corresponding to the alpha values, and with rows corresponding to objects.
self.separable_fraction (1D np.array, float) – Separable fraction of data points as a function of alpha
self.alphas (2D np.array, float) – Input alpha values
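
To make the shapes of p_alpha and separable_fraction concrete, here is a hedged NumPy sketch. It assumes that p_alpha[i, k] is the fraction of the other points from which point i is inseparable at the k-th alpha, and that the separable fraction is the share of points with zero inseparability. Both are interpretations of the descriptions above, and the function name `inseparability_profile` is hypothetical.

```python
import numpy as np

def inseparability_profile(data, alphas):
    """Hypothetical sketch: rows of p_alpha are points, columns are alphas."""
    n = data.shape[0]
    gram = data @ data.T
    ratio = gram / np.diag(gram)[:, None]       # <x_i,x_j>/<x_i,x_i>
    np.fill_diagonal(ratio, -np.inf)            # a point is not compared with itself
    # Broadcast to (n, n, n_alphas), then count inseparable partners j per point i
    counts = (ratio[:, :, None] > alphas[0][None, None, :]).sum(axis=1)
    p_alpha = counts / (n - 1)
    # Assumption: a point is "separable" when it has no inseparable partner
    separable_fraction = (p_alpha == 0).mean(axis=0)
    return p_alpha, separable_fraction

points = np.array([[1.0, 0.0],
                   [np.cos(0.1), np.sin(0.1)],
                   [0.0, 1.0]])
alphas = np.array([[0.5, 0.999]])               # row vector, as in the alphas parameter

p_alpha, separable_fraction = inseparability_profile(points, alphas)
```

Note that the smallest non-zero value p_alpha can take in this sketch is 1/(n-1), which matches the minimum detectable inseparability value mentioned for point_inseparability_to_pointID.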

fit_pw(X, precomputed_knn=None, smooth=False, n_neighbors=100, n_jobs=1)

Creates an array of pointwise ID estimates (self.dimension_pw_) by fitting the estimator in kNN of each point.

Parameters:

X : np.array (n_samples x n_features)

Dataset to fit

precomputed_knn : np.array (n_samples x n_neighbors)

An array of precomputed (sorted) nearest neighbor indices

n_neighbors : int, default=100

Number of nearest neighbors to use (ignored when using precomputed_knn)

n_jobs : int

Number of processes

smooth : bool, default=False

Additionally computes a smoothed version of the pointwise estimates by taking the ID of a point as the average ID of the points in its neighborhood (self.dimension_pw_smooth_)

Returns:

self (object) – Returns self
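
The neighborhood averaging behind smooth=True can be sketched as follows. This is a minimal illustration rather than the library's code: the helper names are hypothetical, the kNN search is brute force, and whether a point counts as part of its own neighborhood is an assumption (here it does).

```python
import numpy as np

def knn_indices(X, n_neighbors):
    """Hypothetical helper: sorted nearest-neighbor indices per point,
    excluding the point itself (brute-force squared Euclidean distances)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)
    return np.argsort(d2, axis=1)[:, :n_neighbors]

def smooth_pointwise(dimension_pw, knn):
    """Hypothetical helper: average each pointwise estimate over the point
    and its neighbors (including the point itself is an assumption)."""
    own = np.arange(len(knn))[:, None]
    neighborhood = np.hstack([own, knn])
    return dimension_pw[neighborhood].mean(axis=1)

X_toy = np.array([[0.0], [1.0], [10.0], [11.0]])
knn = knn_indices(X_toy, n_neighbors=1)          # two tight pairs of points
dimension_pw = np.array([1.0, 3.0, 5.0, 7.0])    # made-up pointwise estimates
dimension_pw_smooth = smooth_pointwise(dimension_pw, knn)
```

In this toy case each point is averaged with its single nearest neighbor, so both members of a pair receive the same smoothed value.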

fit_transform(X, y=None)

Fit estimator and return ID

Parameters:	X : {array-like}, shape (n_samples, n_features) The training input samples.
Returns:	dimension_ ({int, float}) – The estimated intrinsic dimension

fit_transform_pw(X, precomputed_knn=None, smooth=False, n_neighbors=100, n_jobs=1)

Returns an array of pointwise ID estimates by fitting the estimator in kNN of each point.

Parameters:

X : np.array (n_samples x n_features)

Dataset to fit

precomputed_knn : np.array (n_samples x n_neighbors)

An array of precomputed (sorted) nearest neighbor indices

n_neighbors : int, default=100

Number of nearest neighbors to use (ignored when using precomputed_knn)

n_jobs : int

Number of processes

smooth : bool, default=False

Additionally computes a smoothed version of the pointwise estimates by taking the ID of a point as the average ID of the points in its neighborhood (self.dimension_pw_smooth_)

Returns:

dimension_pw_ (np.array with dtype {int, float}) – Pointwise ID estimates
dimension_pw_smooth_ (np.array with dtype float) – Smoothed pointwise ID estimates returned if self.fit_pw(smooth=True)

get_params(deep=True)

Get parameters for this estimator.

Parameters:	deep : bool, default=True If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns:	params (dict) – Parameter names mapped to their values.

point_inseparability_to_pointID(idx='all_inseparable', force_definite_dim=True, verbose=True)

Turn pointwise inseparability probability into pointwise global ID.

Inputs:

args : same as SeparabilityAnalysis

kwargs :

idx : int or string

An int selects a custom alpha index; ‘all_inseparable’ chooses the alpha where all points have non-zero inseparability probability; ‘selected’ keeps the globally selected alpha.

force_definite_dim : bool

Whether to force fully separable points to take the minimum detectable inseparability value (1/(n-1)), i.e., the maximal detectable dimension.
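
The two options above can be illustrated with a short NumPy sketch. It is not the library's code: both function names are hypothetical, and the layout of the inseparability matrix (rows are points, columns are alphas) is an assumption. The first helper only lists the alpha indices at which every point has non-zero inseparability and leaves open which of them the library would pick.

```python
import numpy as np

def all_inseparable_alpha_indices(p_alpha):
    """Hypothetical helper: indices of the alphas (columns) at which every
    point has a non-zero inseparability probability."""
    return np.flatnonzero((p_alpha > 0).all(axis=0))

def floor_inseparability(p_point, force_definite_dim=True):
    """Hypothetical helper: replace zero inseparability by the minimum
    detectable value 1/(n-1) when force_definite_dim is True."""
    n = p_point.shape[0]
    p = p_point.astype(float).copy()
    if force_definite_dim:
        p[p == 0] = 1.0 / (n - 1)
    return p

# Made-up inseparability probabilities: 3 points (rows) x 3 alphas (columns)
p_alpha_toy = np.array([[0.50, 0.25, 0.00],
                        [0.50, 0.50, 0.25],
                        [0.25, 0.00, 0.00]])
candidate_idx = all_inseparable_alpha_indices(p_alpha_toy)

# Made-up pointwise probabilities for n = 5 points at one alpha
p_point = np.array([0.5, 0.0, 0.25, 0.0, 0.75])
p_floored = floor_inseparability(p_point)
```

Flooring at 1/(n-1) keeps every pointwise probability strictly positive, which is what lets fully separable points receive a finite (maximal detectable) dimension.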

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:	**params : dict Estimator parameters.
Returns:	self (estimator instance) – Estimator instance.

transform(X=None)

Predict dimension after a previous call to self.fit

Parameters:	X : Dummy parameter
Returns:	dimension_ ({int, float}) – The estimated ID

transform_pw(X=None)

Return an array of pointwise ID estimates after a previous call to self.fit_pw

Parameters:

X : Dummy parameter

Returns:

dimension_pw_ (np.array with dtype {int, float}) – Pointwise ID estimates
dimension_pw_smooth_ (np.array with dtype float) – Smoothed pointwise ID estimates returned if self.fit_pw(smooth=True)