Generate a mostly low rank matrix with bell-shaped singular values.
Most of the variance can be explained by a bell-shaped curve of width
effective_rank. For the singular value at index i, the low rank part of
the singular value profile is:
(1 - tail_strength) * exp(-1.0 * (i / effective_rank) ** 2)
The remaining singular values’ tail is fat, decreasing as:
tail_strength * exp(-0.1 * i / effective_rank).
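As a minimal sketch of the two formulas above, the snippet below evaluates both parts at each index i. The helper name `singular_value_profile` is hypothetical, and summing the two parts into a single profile is an assumption (it mirrors how scikit-learn's reference implementation combines them) rather than something stated here.

```python
import math


def singular_value_profile(n, effective_rank=10, tail_strength=0.5):
    """Hypothetical helper: evaluate the documented profile for i = 0..n-1."""
    profile = []
    for i in range(n):
        # Bell-shaped, low rank part of the profile.
        low_rank = (1 - tail_strength) * math.exp(-1.0 * (i / effective_rank) ** 2)
        # Fat tail that decays much more slowly.
        tail = tail_strength * math.exp(-0.1 * i / effective_rank)
        # Assumption: the two parts are added to form one singular value.
        profile.append(low_rank + tail)
    return profile


profile = singular_value_profile(100, effective_rank=10, tail_strength=0.5)
print(profile[0])
```

Both parts equal their full weight at i = 0, so the first value is (1 - tail_strength) + tail_strength = 1.0, and the profile decreases from there.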
The low rank part of the profile can be considered the structured
signal part of the data while the tail can be considered the noisy
part of the data that cannot be summarized by a low number of linear
components (singular vectors).
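One way to build a matrix with a prescribed singular value profile is to place the profile between two random orthonormal bases. The sketch below does this with NumPy; the function name `low_rank_matrix_sketch`, the QR-based bases, and the summed profile are illustrative assumptions, not necessarily how this function is implemented.

```python
import numpy as np


def low_rank_matrix_sketch(n_samples=100, n_features=100, effective_rank=10,
                           tail_strength=0.5, seed=0):
    """Hypothetical sketch: X = U @ diag(s) @ V.T with a bell-plus-tail profile s."""
    rng = np.random.default_rng(seed)
    n = min(n_samples, n_features)

    # Random orthonormal bases from the reduced QR of Gaussian matrices.
    u, _ = np.linalg.qr(rng.standard_normal((n_samples, n)))
    v, _ = np.linalg.qr(rng.standard_normal((n_features, n)))

    # Singular value profile: structured signal part plus noisy tail.
    i = np.arange(n, dtype=float)
    low_rank = (1 - tail_strength) * np.exp(-1.0 * (i / effective_rank) ** 2)
    tail = tail_strength * np.exp(-0.1 * i / effective_rank)
    s = low_rank + tail  # assumption: the two parts are summed

    # Scaling the columns of u by s is equivalent to u @ diag(s).
    return (u * s) @ v.T, s


X_sketch, s_sketch = low_rank_matrix_sketch(n_samples=60, n_features=40)
recovered = np.linalg.svd(X_sketch, compute_uv=False)
print(np.allclose(recovered, s_sketch))
```

Because the bases have orthonormal columns and the profile is positive and decreasing, the singular values recovered from the matrix match the profile that was put in.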
This kind of singular value profile is often seen in practice, for instance:
gray level pictures of faces
TF-IDF vectors of text documents crawled from the web
Read more in the User Guide.
n_samples (int, optional (default=100)) – The number of samples.
n_features (int, optional (default=100)) – The number of features.
effective_rank (int, optional (default=10)) – The approximate number of singular vectors required to explain most of
the data by linear combinations.
tail_strength (float between 0.0 and 1.0, optional (default=0.5)) – The relative importance of the fat noisy tail of the singular values
profile.
random_state (int, RandomState instance or None (default)) – Determines random number generation for dataset creation. Pass an int
for reproducible output across multiple function calls.
chunk_size (int or tuple of int or tuple of ints, optional) – Desired chunk size on each dimension.
X (array of shape [n_samples, n_features]) – The matrix.
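The import path of this function is not given here, so the usage sketch below relies on an assumption: scikit-learn's `sklearn.datasets.make_low_rank_matrix`, which takes the same parameters except `chunk_size`, is used as a stand-in. It generates a small matrix and measures how much of the squared singular value mass falls in the first effective_rank components.

```python
import numpy as np
from sklearn.datasets import make_low_rank_matrix  # stand-in, see assumption above

effective_rank = 5

# A small tail_strength keeps most of the signal in the low rank part.
X = make_low_rank_matrix(
    n_samples=50,
    n_features=20,
    effective_rank=effective_rank,
    tail_strength=0.1,
    random_state=0,  # an int gives reproducible output across calls
)

singular_values = np.linalg.svd(X, compute_uv=False)
# Share of the squared singular values held by the leading components.
share = np.sum(singular_values[:effective_rank] ** 2) / np.sum(singular_values ** 2)

print(X.shape)
print(float(share))
```

Raising tail_strength toward 1.0 shifts weight from the bell-shaped part to the slowly decaying tail, so the leading components account for a smaller share.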