6.5. Unsupervised dimensionality reduction
If your data has a large number of features, it may be useful to reduce it
with an unsupervised step prior to the supervised steps. Many of the
Unsupervised learning methods implement a transform method that can be used
to reduce the dimensionality. Below we discuss three specific examples of
this pattern that are heavily used.
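As a minimal sketch of this pattern, the example below chains an unsupervised reduction step with a supervised estimator in a pipeline. The synthetic dataset, the choice of PCA and LogisticRegression, and the value n_components=10 are illustrative assumptions, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Synthetic data (assumption for illustration): 200 samples, 50 features.
X, y = make_classification(
    n_samples=200, n_features=50, n_informative=5, random_state=0
)

# Unsupervised reduction step followed by a supervised classifier.
pipe = make_pipeline(PCA(n_components=10), LogisticRegression(max_iter=1000))
pipe.fit(X, y)

# The fitted reduction step exposes transform, which maps 50 features to 10.
X_reduced = pipe.named_steps["pca"].transform(X)
print(X_reduced.shape)  # (200, 10)
```

Putting the reduction step inside the pipeline means it is refit on each training split during cross-validation, so the held-out data does not influence the learned projection.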
6.5.1. PCA: principal component analysis
decomposition.PCA looks for combinations of features that capture the
variance of the original features well. See Decomposing signals in components (matrix factorization problems).
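The following sketch shows decomposition.PCA used on its own. The data is synthetic and built for illustration: 20 observed features generated from 3 latent factors plus a small amount of noise, so that 3 components should capture nearly all of the variance.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)

# Synthetic data (assumption): 20 features driven by 3 latent factors.
latent = rng.normal(size=(100, 3))
mixing = rng.normal(size=(3, 20))
X = latent @ mixing + 0.01 * rng.normal(size=(100, 20))

pca = PCA(n_components=3)
X_pca = pca.fit_transform(X)

print(X_pca.shape)  # (100, 3)
# Fraction of the total variance retained by the 3 components.
print(pca.explained_variance_ratio_.sum())
```

On real data the retained variance is usually lower; inspecting explained_variance_ratio_ is one common way to choose n_components.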
6.5.2. Random projections
The random_projection module provides several tools for data reduction by
random projections. See the relevant section of the documentation:
Random Projection.
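A short sketch of two transformers from this module, GaussianRandomProjection and SparseRandomProjection, is given below. The input data is random noise generated for illustration, and n_components=100 is an arbitrary choice.

```python
import numpy as np
from sklearn.random_projection import (
    GaussianRandomProjection,
    SparseRandomProjection,
)

rng = np.random.RandomState(0)
# Synthetic data (assumption): 50 samples in a 1000-dimensional space.
X = rng.normal(size=(50, 1000))

# Dense Gaussian random projection matrix.
gaussian_rp = GaussianRandomProjection(n_components=100, random_state=0)
X_gaussian = gaussian_rp.fit_transform(X)

# Sparse random projection matrix, cheaper to store and apply.
sparse_rp = SparseRandomProjection(n_components=100, random_state=0)
X_sparse = sparse_rp.fit_transform(X)

print(X_gaussian.shape)  # (50, 100)
print(X_sparse.shape)    # (50, 100)
```

Unlike PCA, the projection matrix does not depend on the values in X; fit only uses the shape of the data to draw the random matrix.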
6.5.3. Feature agglomeration
cluster.FeatureAgglomeration applies Hierarchical clustering to group
together features that behave similarly.
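The sketch below illustrates cluster.FeatureAgglomeration on synthetic data. The data and the choice of n_clusters=5 are assumptions made for the example. With the default pooling function, each output column is the mean of the features assigned to one cluster.

```python
import numpy as np
from sklearn.cluster import FeatureAgglomeration

rng = np.random.RandomState(0)
# Synthetic data (assumption): 100 samples, 30 features.
X = rng.normal(size=(100, 30))

# Merge the 30 features into 5 groups via hierarchical clustering.
agglo = FeatureAgglomeration(n_clusters=5)
X_agglo = agglo.fit_transform(X)

print(X_agglo.shape)        # (100, 5)
# labels_ holds the cluster index assigned to each original feature.
print(agglo.labels_.shape)  # (30,)
```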