Covariance Matrix Calculator

Compute the covariance matrix and correlation matrix from raw multivariate data. Supports sample and population covariance, summary statistics and PCA-ready outputs for statistics, data science and finance.

Multivariate variance & correlation

Paste or type your multivariate data and this tool computes the covariance matrix and, optionally, the corresponding correlation matrix. Choose between sample and population covariance and get summary statistics that are ready for PCA, portfolio theory, multivariate normal models and more.

sample & population covariance · correlation matrix · summary statistics · CSV-style input · PCA-ready

Data input & configuration

Separate values by commas, semicolons or whitespace. Each row must contain the same number of numeric values.

Variable names (optional): comma-separated. If left blank, the calculator uses X1, X2, ….
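A minimal sketch of how input in this format could be parsed, assuming only the rules stated above (commas, semicolons or whitespace as separators, equal-length numeric rows, fallback names X1, X2, …). The function names `parse_rows` and `default_names` are hypothetical and are not taken from the calculator's own code; skipping blank lines is likewise an assumption.

```python
import re

def parse_rows(text):
    """Parse pasted text into a list of rows of floats.

    Values may be separated by commas, semicolons or whitespace.
    Blank lines are skipped (an assumption of this sketch).
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        tokens = [t for t in re.split(r"[,;\s]+", line) if t]
        rows.append([float(t) for t in tokens])  # float() rejects non-numeric tokens
    if not rows:
        raise ValueError("no data rows found")
    width = len(rows[0])
    for number, row in enumerate(rows, start=1):
        if len(row) != width:
            raise ValueError(f"row {number} has {len(row)} values, expected {width}")
    return rows

def default_names(p):
    """Return the fallback variable names X1, X2, ..., Xp."""
    return [f"X{j}" for j in range(1, p + 1)]

data = parse_rows("1, 2; 3\n4 5 6\n")
print(data)                          # [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
print(default_names(len(data[0])))   # ['X1', 'X2', 'X3']
```

Rows of unequal length or non-numeric tokens raise `ValueError`, which mirrors the requirement that every row contain the same number of numeric values.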

Covariance type

Use sample covariance when estimating from data (default).

Additional outputs

What is a covariance matrix?

Suppose you have a random vector \[ X = (X_1, X_2, \dots, X_p)^\top. \] The covariance matrix of \(X\) is the \(p \times p\) matrix \[ \Sigma = \operatorname{Cov}(X) = \begin{bmatrix} \operatorname{Var}(X_1) & \operatorname{Cov}(X_1, X_2) & \dots & \operatorname{Cov}(X_1, X_p) \\ \operatorname{Cov}(X_2, X_1) & \operatorname{Var}(X_2) & \dots & \operatorname{Cov}(X_2, X_p) \\ \vdots & \vdots & \ddots & \vdots \\ \operatorname{Cov}(X_p, X_1) & \operatorname{Cov}(X_p, X_2) & \dots & \operatorname{Var}(X_p) \end{bmatrix}. \]

It compactly summarises all pairwise linear relationships between your variables: variances on the diagonal, covariances off the diagonal.

Sample vs population covariance matrix

In practice you estimate the covariance matrix from an \(n \times p\) data matrix with \(n\) observations (rows) and \(p\) variables (columns). Let \(x_{i j}\) be the value of variable \(j\) in observation \(i\), and let \(\bar{x}_j\) be the sample mean of variable \(j\).

Sample covariance matrix

For \(n\) observations and \(p\) variables, the sample covariance between variables \(j\) and \(k\) is \[ s_{j k} = \frac{1}{n - 1} \sum_{i=1}^n (x_{i j} - \bar{x}_j)\,(x_{i k} - \bar{x}_k). \] The sample covariance matrix \(S\) collects all \(s_{j k}\) in a \(p \times p\) matrix.

Dividing by \(n - 1\) gives an unbiased estimator of the population covariance when data are independent and identically distributed.
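The sample formula can be sketched in a few lines of plain Python. This is an illustration of the formula for \(s_{j k}\), not the calculator's actual implementation; the function name `sample_covariance_matrix` and the example data are made up for this sketch.

```python
def sample_covariance_matrix(rows):
    """Sample covariance matrix of `rows` (n observations x p variables).

    Each entry is the sum of centred cross-products divided by n - 1.
    """
    n = len(rows)
    if n < 2:
        raise ValueError("at least two observations are required")
    p = len(rows[0])
    # Column means x̄_j.
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    # Centred data x_ij - x̄_j.
    centred = [[row[j] - means[j] for j in range(p)] for row in rows]
    cov = [[0.0] * p for _ in range(p)]
    for j in range(p):
        for k in range(j, p):
            total = sum(c[j] * c[k] for c in centred)
            cov[j][k] = cov[k][j] = total / (n - 1)  # symmetric by construction
    return cov

rows = [[1.0, 2.0], [2.0, 4.0], [3.0, 9.0]]
print(sample_covariance_matrix(rows))  # [[1.0, 3.5], [3.5, 13.0]]
```

Only the upper triangle is computed and then mirrored, since \(s_{j k} = s_{k j}\).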

Population covariance matrix

If you can treat the data as the entire population, you may use \[ \sigma_{j k} = \frac{1}{n} \sum_{i=1}^n (x_{i j} - \bar{x}_j)\,(x_{i k} - \bar{x}_k), \] i.e. divide by \(n\) instead of \(n - 1\). This is the population covariance.
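As a quick check on the two divisors: both formulas share the same sum of centred cross-products, so \(\sigma_{j k} = \frac{n - 1}{n}\, s_{j k}\). A small worked example with made-up data, \(n = 3\) observations \((1, 2)\), \((2, 4)\), \((3, 9)\) with means \(\bar{x}_1 = 2\) and \(\bar{x}_2 = 5\), gives \[ \sum_{i=1}^{3} (x_{i 1} - 2)\,(x_{i 2} - 5) = (-1)(-3) + (0)(-1) + (1)(4) = 7, \qquad s_{1 2} = \frac{7}{2} = 3.5, \qquad \sigma_{1 2} = \frac{7}{3} \approx 2.33. \] The gap between the two estimates shrinks as \(n\) grows, because \(\frac{n - 1}{n} \to 1\).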

From covariance matrix to correlation matrix

The covariance between variables depends on the measurement units (for example, metres vs kilometres). To remove the effect of scale you can compute the correlation matrix, whose entries are the Pearson correlation coefficients: \[ \rho_{j k} = \frac{\operatorname{Cov}(X_j, X_k)}{\sqrt{\operatorname{Var}(X_j)\operatorname{Var}(X_k)}}. \]

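In matrix form, if \(D = \operatorname{diag}\bigl(\sqrt{\operatorname{Var}(X_1)}, \dots, \sqrt{\operatorname{Var}(X_p)}\bigr)\) is the diagonal matrix of standard deviations, the correlation matrix is \[ R = D^{-1} \Sigma D^{-1}. \] The diagonal entries of \(R\) are all 1, and each off-diagonal element lies between −1 and +1. The correlation is undefined for any variable with zero variance, because \(D\) is then not invertible.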
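The conversion \(R = D^{-1} \Sigma D^{-1}\) amounts to dividing each entry by the product of the two corresponding standard deviations. A minimal sketch, with the hypothetical function name `correlation_from_covariance` and an invented 2 × 2 covariance matrix:

```python
import math

def correlation_from_covariance(cov):
    """Convert a covariance matrix to a correlation matrix (R = D^-1 Σ D^-1)."""
    p = len(cov)
    # Standard deviations are the square roots of the diagonal (the variances).
    std = [math.sqrt(cov[j][j]) for j in range(p)]
    if any(s == 0.0 for s in std):
        raise ValueError("correlation is undefined for a zero-variance variable")
    return [[cov[j][k] / (std[j] * std[k]) for k in range(p)] for j in range(p)]

cov = [[4.0, 3.0], [3.0, 9.0]]
print(correlation_from_covariance(cov))  # [[1.0, 0.5], [0.5, 1.0]]
```

Here the standard deviations are 2 and 3, so the off-diagonal correlation is \(3 / (2 \cdot 3) = 0.5\).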

Applications of the covariance matrix

  • Principal component analysis (PCA) — PCA diagonalises the covariance matrix to find directions of maximum variance.
  • Portfolio theory — in quantitative finance, the covariance matrix of asset returns is central to risk modelling and optimisation.
  • Multivariate normal models — the covariance matrix parameterises the spread and shape of multivariate Gaussian distributions.
  • State estimation & Kalman filtering — process and measurement noise covariances are key design inputs.
  • Machine learning & data preprocessing — covariance is used in whitening, feature scaling and understanding feature interactions.
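To make the PCA application concrete, the sketch below approximates the first principal component of a covariance matrix. It uses power iteration as a simple stand-in; PCA implementations typically rely on a full eigen-decomposition or SVD instead. The function names, the starting vector and the example matrix are assumptions of this sketch, and power iteration can stall if the starting vector happens to be orthogonal to the leading eigenvector.

```python
import math

def matvec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

def leading_component(cov, iterations=1000, tol=1e-12):
    """Approximate the largest eigenvalue and a unit eigenvector of a
    symmetric positive semi-definite matrix by power iteration."""
    p = len(cov)
    # Non-uniform start (1, 2, ..., p), normalised to unit length.
    v = [float(j + 1) for j in range(p)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    eigenvalue = 0.0
    for _ in range(iterations):
        w = matvec(cov, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:          # v lies in the null space; nothing to amplify
            return 0.0, v
        v = [x / norm for x in w]
        # Rayleigh quotient v^T Σ v estimates the eigenvalue for unit v.
        new_eigenvalue = sum(a * b for a, b in zip(v, matvec(cov, v)))
        converged = abs(new_eigenvalue - eigenvalue) < tol
        eigenvalue = new_eigenvalue
        if converged:
            break
    return eigenvalue, v

cov = [[2.0, 1.0], [1.0, 2.0]]
variance, direction = leading_component(cov)
total_variance = sum(cov[j][j] for j in range(len(cov)))
print(variance, direction)
print("share of total variance:", variance / total_variance)
```

For this matrix the exact leading eigenvalue is 3 with direction \((1, 1)/\sqrt{2}\), so the first component carries three quarters of the total variance of 4; the iteration should approach those values to within floating-point tolerance.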

Numerical considerations

  • If the number of variables \(p\) is close to or larger than the number of observations \(n\), the sample covariance matrix becomes unstable. Because its rank is at most \(n - 1\), it is singular whenever \(p \ge n\).
  • Strong collinearity (nearly linear relationships between variables) can cause near-singularity and large condition numbers, which affect matrix inversion and eigen-decomposition.
  • For high-dimensional or ill-conditioned problems, consider regularised estimators (for example, shrinkage covariance) rather than the plain sample covariance.
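One common form of regularisation shrinks the sample covariance toward a scaled identity matrix, \((1 - \alpha)\, S + \alpha\, \mu I\) with \(\mu = \operatorname{tr}(S) / p\). The sketch below illustrates that idea only; the function names and the hand-picked \(\alpha = 0.5\) are hypothetical, and data-driven methods such as Ledoit-Wolf estimate the shrinkage intensity from the data instead.

```python
def shrink_covariance(cov, alpha):
    """Shrink `cov` toward a scaled identity: (1 - alpha) * S + alpha * mu * I,
    where mu = trace(S) / p is the average variance."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    p = len(cov)
    mu = sum(cov[j][j] for j in range(p)) / p
    return [[(1.0 - alpha) * cov[j][k] + (alpha * mu if j == k else 0.0)
             for k in range(p)] for j in range(p)]

def det2(m):
    """Determinant of a 2 x 2 matrix (enough for this small demonstration)."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# A singular covariance matrix: the two variables are perfectly collinear.
singular = [[1.0, 2.0], [2.0, 4.0]]
shrunk = shrink_covariance(singular, alpha=0.5)
print(det2(singular))  # 0.0
print(shrunk)          # [[1.75, 1.0], [1.0, 3.25]]
print(det2(shrunk))    # 4.6875
```

Shrinkage damps the off-diagonal entries and pulls the variances toward their average, which moves the determinant away from zero and makes the matrix safely invertible, at the cost of some bias in the estimate.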

Changelog
Version: 0.1.0-draft
Last code update: 2026-01-19
0.1.0-draft · 2026-01-19
  • Initial audit spec draft generated from HTML extraction (review required).
  • Verify formulas match the calculator engine and convert any text-only formulas to LaTeX.
  • Confirm sources are authoritative and relevant to the calculator methodology.
Verified by Ugo Candido on 2026-01-19