PCA Calculator – Principal Component Analysis
Upload or paste your data and instantly compute principal components, eigenvalues, explained variance, scores and loadings.
1. Input data
Paste a numeric data matrix (rows = observations, columns = variables). Use commas or tabs as separators. Optionally include a header row and/or a first column with observation labels.
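As an illustration of this input format, here is a minimal NumPy sketch of how such a pasted block might be parsed (a hypothetical helper, not the calculator's actual code; `parse_matrix` and its flags are assumed names):

```python
import numpy as np

def parse_matrix(text, has_header=False, has_labels=False):
    """Parse a comma- or tab-separated text block into a numeric matrix.

    Rows are observations, columns are variables. Optionally skip a
    header row and/or a first column of observation labels.
    """
    rows = [line.replace("\t", ",").split(",")
            for line in text.strip().splitlines() if line.strip()]
    if has_header:
        rows = rows[1:]
    if has_labels:
        rows = [r[1:] for r in rows]
    return np.array([[float(v) for v in r] for r in rows])

X = parse_matrix("x,y\n1,2\n3,4\n5,6", has_header=True)  # 3x2 matrix
```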
2. Options
Target cumulative explained variance (e.g. 90%) – used to suggest how many components to keep.
Run PCA
How this PCA calculator works
This tool performs a standard Principal Component Analysis (PCA) on your numeric data matrix. You can choose between:
- Covariance PCA (center only) – appropriate when variables are on comparable scales.
- Correlation PCA (standardize) – recommended when variables have different units or variances.
Step-by-step algorithm
- Build the data matrix \(X\) of size \(n \times p\) (n observations, p variables).
- Center each column: subtract its mean \(\mu_j\).
- Optionally standardize each column by its standard deviation \(s_j\): \(z_{ij} = (x_{ij} - \mu_j)/s_j\).
- Compute the covariance or correlation matrix: \[ S = \frac{1}{n-1} X_c^\top X_c \]
- Perform the eigen-decomposition of \(S\): \[ S v_k = \lambda_k v_k \] where \(\lambda_k\) are the eigenvalues and \(v_k\) the eigenvectors (loadings).
- Sort eigenvalues in descending order and reorder eigenvectors accordingly.
- Compute the scores (principal components): \[ Z = X_c V \] where \(V\) is the matrix of sorted eigenvectors.
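Taken together, the steps above can be sketched in a few lines of NumPy (an illustrative implementation under the definitions given here, not the calculator's actual code):

```python
import numpy as np

def pca(X, standardize=False):
    """PCA via eigen-decomposition of the covariance/correlation matrix.

    Returns eigenvalues (descending), eigenvectors as columns (loadings),
    and scores. standardize=True gives correlation PCA.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    Xc = X - X.mean(axis=0)                # center each column
    if standardize:
        Xc = Xc / X.std(axis=0, ddof=1)    # z-scores -> correlation PCA
    S = (Xc.T @ Xc) / (n - 1)              # covariance or correlation matrix
    eigvals, eigvecs = np.linalg.eigh(S)   # symmetric eigen-decomposition
    order = np.argsort(eigvals)[::-1]      # sort eigenvalues descending
    eigvals, V = eigvals[order], eigvecs[:, order]
    Z = Xc @ V                             # scores (principal components)
    return eigvals, V, Z
```

Note that the variance of the k-th score column equals \(\lambda_k\), which is a convenient sanity check on any implementation.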
Explained variance
Each eigenvalue \(\lambda_k\) measures the variance captured by component \(k\). The proportion of variance explained by component \(k\) is:
\[ \mathrm{PVE}_k = \frac{\lambda_k}{\sum_{j=1}^{p} \lambda_j} \]
The calculator uses your target cumulative variance (e.g. 90%) to suggest how many components to retain.
Interpreting loadings and scores
- Loadings show how strongly each original variable contributes to a component.
- Scores are the coordinates of each observation in the reduced-dimensional space.
- Observations close together in the first 2–3 components are similar in terms of the original variables.
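For example, the dominant variable on each component is the one with the largest absolute loading. A small sketch with hypothetical loadings (the variable names and values are invented for illustration):

```python
import numpy as np

# Hypothetical loadings: 3 variables (rows) x 2 components (columns).
V = np.array([[ 0.70,  0.10],
              [ 0.69, -0.15],
              [ 0.20,  0.98]])
variables = ["height", "weight", "age"]

# Report the variable with the largest absolute loading per component.
for k in range(V.shape[1]):
    top = variables[int(np.argmax(np.abs(V[:, k])))]
    print(f"PC{k + 1}: strongest contribution from {top}")
```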
When to use PCA
- To reduce dimensionality before clustering or regression.
- To visualize high-dimensional data in 2D or 3D.
- To remove multicollinearity between predictors.
Frequently asked questions
Do I need to normalize my data?
If all variables are measured in the same units and have similar variance, centering may be enough. If not, select Standardize (z-scores) so that each variable has mean 0 and variance 1.
What if my dataset has missing values?
This calculator currently ignores rows that contain non-numeric values in any variable. For serious analysis, consider imputing missing values (e.g. mean imputation, k-NN, or model-based methods) before running PCA.
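The row-dropping behavior described above amounts to keeping only rows where every value is a finite number (a minimal sketch, assuming non-numeric entries were parsed as NaN):

```python
import numpy as np

def drop_incomplete_rows(X):
    """Keep only rows whose values are all finite numbers."""
    X = np.asarray(X, dtype=float)
    return X[np.isfinite(X).all(axis=1)]

X = np.array([[1.0, 2.0],
              [np.nan, 3.0],   # dropped: contains a missing value
              [4.0, 5.0]])
X_clean = drop_incomplete_rows(X)  # keeps rows 1 and 3
```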
Is this PCA the same as in R, Python, or SPSS?
Yes, the core math is the same: eigen-decomposition of the covariance or correlation matrix. Minor numerical differences can arise from rounding, implementation details, and the arbitrary sign of eigenvectors, but results should closely match functions like prcomp in R or sklearn.decomposition.PCA in Python.
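As a quick check of that claim, the two computational routes can be compared directly in NumPy: eigen-decomposition of the covariance matrix (as described above) versus singular value decomposition of the centered data matrix, which is what prcomp and scikit-learn use internally. The eigenvalues should agree to machine precision:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(30, 4))
n = X.shape[0]
Xc = X - X.mean(axis=0)

# Route 1: eigen-decomposition of the covariance matrix.
eigvals = np.sort(np.linalg.eigvalsh(Xc.T @ Xc / (n - 1)))[::-1]

# Route 2: SVD of the centered matrix; variances are
# squared singular values divided by n - 1.
sing = np.linalg.svd(Xc, compute_uv=False)
eigvals_svd = sing**2 / (n - 1)

assert np.allclose(eigvals, eigvals_svd)  # same variances either way
```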