Principal components analysis using pandas dataframe

Most sklearn objects work with pandas DataFrames just fine. Would something like this work for you?

    import pandas as pd
    import numpy as np
    from sklearn.decomposition import PCA

    df = pd.DataFrame(data=np.random.normal(0, 1, (20, 10)))
    pca = PCA(n_components=5)
    pca.fit(df)  # components_ is only available after fitting

Once the model is fitted, you can access the components themselves with pca.components_
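As a rough sketch of the round trip from a DataFrame through PCA and back, the example below fits the model, projects the rows, and wraps the result in a new DataFrame that keeps the original index. The column names (`x0`, `x1`, ...) and the `PC1`, `PC2`, ... labels are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

# Hypothetical data: 20 observations of 10 features with made-up names.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.normal(0, 1, (20, 10)),
    columns=[f"x{i}" for i in range(10)],
)

pca = PCA(n_components=5)
scores = pca.fit_transform(df)  # plain NumPy array of shape (20, 5)

# Wrap the projected data in a DataFrame, reusing the original row index.
pc_labels = [f"PC{i + 1}" for i in range(pca.n_components_)]
scores_df = pd.DataFrame(scores, index=df.index, columns=pc_labels)

print(scores_df.head())
print(pca.components_.shape)  # (n_components, n_features)
```

By default `fit_transform` returns a NumPy array rather than a DataFrame, which is why the index and column labels are reattached by hand here.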

Recovering features names of explained_variance_ratio_ in PCA with sklearn

This information is in the components_ attribute of the fitted PCA object. As described in the documentation, pca.components_ is an array of shape [n_components, n_features], so to see how the components are linearly related to the different features you have to:

    import pandas as pd
    import …

Note: each coefficient is the weight of a particular feature in a particular component. The rows are unit-length direction vectors, so the values are not correlation coefficients in the strict sense.
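One way this might look in practice is to label the rows of `pca.components_` with component names and the columns with the DataFrame's feature names. This is a minimal sketch, not the truncated code from the answer; the feature names and the `PC1`, `PC2` labels are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

# Hypothetical data set with four named features.
rng = np.random.default_rng(0)
feature_names = ["height", "width", "depth", "weight"]
df = pd.DataFrame(rng.normal(size=(50, 4)), columns=feature_names)

pca = PCA(n_components=2).fit(df)

# Rows are components, columns are the original features.
pc_labels = [f"PC{i + 1}" for i in range(pca.n_components_)]
loadings = pd.DataFrame(pca.components_, index=pc_labels, columns=df.columns)

# Pair each component with its share of the explained variance.
explained = pd.Series(pca.explained_variance_ratio_, index=pc_labels)

# Feature with the largest absolute weight in each component.
top_feature = loadings.abs().idxmax(axis=1)

print(loadings)
print(explained)
print(top_feature)
```

Reading across a row of `loadings` shows which features contribute most to that component, and `explained` ties each row back to its entry in `explained_variance_ratio_`.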

Principal component analysis in Python

Months later, here’s a small class PCA, and a picture:

    #!/usr/bin/env python
    """ a small class for Principal Component Analysis
    Usage:
        p = PCA( A, fraction=0.90 )
    In:
        A: an array of e.g. 1000 observations x 20 variables, 1000 rows x 20 columns
        fraction: use principal components that account for e.g. 90 % of the …
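The class body is cut off above, so the following is only a sketch of the general idea the docstring describes: keep the smallest number of principal components whose cumulative variance reaches a given fraction. It uses NumPy's SVD on the centered data. The function name `pca_by_fraction` and its return values are assumptions, not the answer's actual interface.

```python
import numpy as np


def pca_by_fraction(A, fraction=0.90):
    """Keep the fewest components whose variance share reaches `fraction`.

    Hypothetical helper; returns (components, scores, kept_ratio).
    """
    A = np.asarray(A, dtype=float)
    centered = A - A.mean(axis=0)  # PCA works on mean-centered columns

    # Rows of Vt are the principal directions, ordered by singular value.
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)

    var_ratio = s**2 / np.sum(s**2)  # variance share of each component
    cumulative = np.cumsum(var_ratio)

    # First position where the cumulative share reaches the target.
    n_keep = int(np.searchsorted(cumulative, fraction)) + 1
    n_keep = min(n_keep, len(s))  # guard against floating-point shortfall

    components = Vt[:n_keep]
    scores = centered @ components.T  # observations in component space
    return components, scores, var_ratio[:n_keep]


# Example: 1000 observations x 20 variables with unequal column scales.
rng = np.random.default_rng(0)
A = rng.normal(size=(1000, 20)) * np.linspace(1.0, 5.0, 20)

components, scores, kept_ratio = pca_by_fraction(A, fraction=0.90)
print(components.shape, scores.shape, kept_ratio.sum())
```

sklearn offers a comparable shortcut: passing a float between 0 and 1 as `n_components` to `sklearn.decomposition.PCA` selects the number of components by explained variance.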