PCA Calculator – Principal Component Analysis

Upload or paste your data and instantly compute principal components, eigenvalues, explained variance, scores and loadings.

1. Input data

Paste a numeric data matrix (rows = observations, columns = variables). Use commas or tabs as separators. Optionally include a header row and/or a first column with observation labels.
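
For example, a small paste with a header row and a first column of observation labels could look like this (the numbers are made up):

  id,height,weight,age
  s1,172,68,34
  s2,181,80,29
  s3,165,54,41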

2. Options

Choose Covariance PCA (center only) or Correlation PCA (standardize to z-scores), and set a target cumulative explained variance; the target is used to suggest how many components to retain. Click Run PCA to compute the results.

How this PCA calculator works

This tool performs a standard Principal Component Analysis (PCA) on your numeric data matrix. You can choose between:

  • Covariance PCA (center only) – appropriate when variables are on comparable scales.
  • Correlation PCA (standardize) – recommended when variables have different units or variances.

Step-by-step algorithm

  1. Build the data matrix \(X\) of size \(n \times p\) (\(n\) observations, \(p\) variables).
  2. Center each column: subtract its mean \(\mu_j\).
  3. Optionally standardize each column by its standard deviation \(s_j\): \(z_{ij} = (x_{ij} - \mu_j)/s_j\).
  4. Compute the covariance matrix (the correlation matrix, if the columns were standardized):
    \[ S = \frac{1}{n-1} X_c^\top X_c \]
    where \(X_c\) is the centered (or standardized) data matrix.
  5. Eigen-decomposition of \(S\):
    \[ S v_k = \lambda_k v_k \]
    where \(\lambda_k\) are eigenvalues and \(v_k\) the eigenvectors (loadings).
  6. Sort eigenvalues in descending order and reorder eigenvectors accordingly.
  7. Compute scores (principal components):
    \[ Z = X_c V \]
    where \(V\) is the matrix of eigenvectors.
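
All seven steps fit in a few lines of NumPy. The following is a minimal sketch, not this calculator's actual source code; the function name pca and the standardize flag are illustrative:

  import numpy as np

  def pca(X, standardize=False):
      # X: (n, p) numeric array; rows = observations, columns = variables.
      X = np.asarray(X, dtype=float)
      Xc = X - X.mean(axis=0)                  # step 2: center each column
      if standardize:
          Xc = Xc / X.std(axis=0, ddof=1)      # step 3: divide by sample SD
      S = (Xc.T @ Xc) / (X.shape[0] - 1)       # step 4: covariance/correlation matrix
      eigvals, V = np.linalg.eigh(S)           # step 5: eigen-decomposition
      order = np.argsort(eigvals)[::-1]        # step 6: sort descending
      eigvals, V = eigvals[order], V[:, order]
      Z = Xc @ V                               # step 7: scores
      return eigvals, V, Z

np.linalg.eigh is used because \(S\) is symmetric, which makes it faster and more numerically stable than a general eigensolver.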

Explained variance

Each eigenvalue \(\lambda_k\) measures the variance captured by component \(k\). The proportion of variance explained is:

\[ \text{Var}_k = \frac{\lambda_k}{\sum_{j=1}^{p} \lambda_j}, \quad \text{CumVar}_k = \sum_{i=1}^{k} \text{Var}_i \]

The calculator uses your target cumulative variance (e.g. 90%) to suggest how many components to retain.
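
Continuing the NumPy sketch above (with some data matrix X), the suggestion rule is a cumulative sum plus a threshold check; the 0.90 target is just an example:

  eigvals, V, Z = pca(X, standardize=True)
  var_ratio = eigvals / eigvals.sum()            # Var_k for each component
  cum_var = np.cumsum(var_ratio)                 # CumVar_k
  k = int(np.searchsorted(cum_var, 0.90)) + 1    # smallest k with CumVar_k >= 90%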

Interpreting loadings and scores

  • Loadings show how strongly each original variable contributes to a component.
  • Scores are the coordinates of each observation in the reduced-dimensional space.
  • Observations close together in the first 2–3 components are similar in terms of the original variables.
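
As a quick illustration using the sketch above with your data matrix X, you can rank the variables driving the first component by the absolute size of their loadings (the names in cols are hypothetical):

  cols = ["height", "weight", "age"]             # hypothetical variable names
  eigvals, V, Z = pca(X, standardize=True)
  pc1 = V[:, 0]                                  # loadings of the first component
  for j in np.argsort(-np.abs(pc1)):
      print(f"{cols[j]:>8}: {pc1[j]:+.3f}")
  print(Z[:, :2])                                # observation coordinates for a 2-D plot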

When to use PCA

  • To reduce dimensionality before clustering or regression.
  • To visualize high-dimensional data in 2D or 3D.
  • To replace correlated predictors with uncorrelated components, mitigating multicollinearity.

Frequently asked questions

Do I need to normalize my data?

If all variables are measured in the same units and have similar variance, centering may be enough. If not, select Standardize (z-scores) so that each variable has mean 0 and variance 1.

What if my dataset has missing values?

This calculator currently ignores rows that contain non-numeric values in any variable. For serious analysis, consider imputing missing values (e.g. mean imputation, k-NN, or model-based methods) before running PCA.
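
If you preprocess in Python, a minimal pandas sketch of both strategies (the file name and data are illustrative):

  import pandas as pd

  df = pd.read_csv("data.csv")
  complete = df.dropna()                           # what this calculator effectively does
  imputed = df.fillna(df.mean(numeric_only=True))  # simple mean imputation instead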

Is this PCA the same as in R, Python, or SPSS?

Yes, the core math is the same: an eigen-decomposition of the covariance or correlation matrix. Minor numerical differences can arise from rounding and implementation details (R's prcomp and Python's sklearn.decomposition.PCA obtain the equivalent result via a singular value decomposition of the data matrix), and eigenvectors are only defined up to sign, so individual components may come out flipped. Apart from that, the results should closely match.
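
If you want to check the agreement yourself, here is one way with scikit-learn (covariance PCA, since PCA centers but does not standardize), reusing the pca sketch from above:

  import numpy as np
  from sklearn.decomposition import PCA

  X = np.random.default_rng(0).normal(size=(50, 4))
  eigvals, V, Z = pca(X)                         # eigen-decomposition sketch above
  skl = PCA().fit(X)
  # Components agree up to sign, so compare absolute values.
  assert np.allclose(np.abs(skl.components_.T), np.abs(V))
  assert np.allclose(skl.explained_variance_, eigvals)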