What is the difference between sample and population covariance?

Population covariance divides the sum of cross-deviations by N, the total number of observations in the population. Sample covariance divides by N − 1 instead, which corrects the bias that arises because the means are estimated from the same sample. Use sample covariance whenever your data are a sample drawn from a larger population.
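
As a quick illustration of how much the choice matters, the two divisors differ only by a constant factor when both are applied to the same data set of N observations: \[ \text{sample covariance} = \frac{N}{N - 1} \times \text{population covariance}. \] For N = 10 the factor is 10/9 ≈ 1.11, while for N = 1000 it is about 1.001, so the distinction matters mainly for small data sets.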

How should I format the input data?

Each row should be one observation, and each column should be one variable. You can separate values within a row by commas, semicolons or whitespace. All rows must have the same number of numeric entries and you need at least two variables and two observations.
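
The input rules above can be sketched in a few lines of Python. This is only an illustration of the stated format, not the calculator's actual parser; the function name `parse_table` is hypothetical, and the sketch assumes a decimal point (not a decimal comma) in every number.

```python
import re


def parse_table(text):
    """Parse rows of numbers separated by commas, semicolons or whitespace.

    Hypothetical helper: returns a list of rows, each a list of floats.
    """
    rows = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        # Split on any run of commas, semicolons or whitespace.
        tokens = [t for t in re.split(r"[,;\s]+", line.strip()) if t]
        try:
            rows.append([float(t) for t in tokens])
        except ValueError as exc:
            raise ValueError(f"line {line_no}: non-numeric value") from exc

    if len(rows) < 2:
        raise ValueError("need at least two observations (rows)")
    width = len(rows[0])
    if width < 2:
        raise ValueError("need at least two variables (columns)")
    if any(len(row) != width for row in rows):
        raise ValueError("all rows must have the same number of values")
    return rows


# Mixed separators are accepted as long as every row has the same width.
table = parse_table("1, 2\n2; 3\n3 7\n")
```

Because commas are treated as separators, a value such as `1,5` would be read as two numbers, which is why the sketch assumes decimal points.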

When should I use the covariance matrix instead of the correlation matrix?

The covariance matrix keeps the original units and is needed for computations such as multivariate normal densities, Kalman filters, and portfolio optimisation. The correlation matrix standardises variances to 1 and is useful to compare strength and direction of linear relationships independently of the scale of each variable.

Why can the covariance matrix be singular or ill-conditioned?

If one variable is a linear combination of others or the number of observations is too small relative to the number of variables, the covariance matrix can become singular or nearly singular. In that case, some multivariate methods such as inversion or PCA need regularisation or dimensionality reduction.
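
A minimal example of the first cause, with made-up variables: if \(X_2 = 2 X_1\) and \(\operatorname{Var}(X_1) = \sigma^2\), then \[ \Sigma = \sigma^2 \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}, \qquad \det \Sigma = \sigma^4 (1 \cdot 4 - 2 \cdot 2) = 0, \] so \(\Sigma\) has no inverse.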

Covariance Matrix Calculator

What is a covariance matrix?

A covariance matrix is a square matrix that summarises the covariances between all pairs of components of a random vector. Its diagonal entries are variances of individual variables and off-diagonal entries are covariances between variables.

What does this covariance matrix calculator do?

The calculator takes tabular data with multiple variables, computes the mean of each variable, and then returns the sample or population covariance matrix and, optionally, the corresponding correlation matrix. It also reports basic summary statistics such as the number of observations and the variance of each variable.

Compute the covariance matrix and correlation matrix from raw multivariate data. Supports sample and population covariance, summary statistics and PCA-ready outputs for statistics, data science and finance.


Multivariate variance & correlation

Paste or type your multivariate data and this tool computes the covariance matrix and, optionally, the corresponding correlation matrix. Choose between sample and population covariance and get summary statistics that are ready for PCA, portfolio theory, multivariate normal models and more.


Data input & configuration

Data table (one row per observation)

Separate values by commas, semicolons or whitespace. Each row must contain the same number of numeric values.

Variable labels (optional)

Comma-separated names. If left blank, the calculator uses X1, X2, ….

Covariance type

Use sample covariance when estimating from data (default).

Additional outputs

Compute correlation matrix

Show summary statistics

Results

Covariance matrix

Correlation matrix

The correlation between variables i and j is cov(i, j) / (σ_i σ_j). Diagonal entries are 1.0.


This tool uses standard numerical formulas for sample and population covariance. For extremely high-dimensional data or numerically ill-conditioned problems, consider using specialised numerical libraries with regularisation and high-precision linear algebra.

What is a covariance matrix?

Suppose you have a random vector \[ X = (X_1, X_2, \dots, X_p)^\top. \] The covariance matrix of \(X\) is the \(p \times p\) matrix \[ \Sigma = \operatorname{Cov}(X) = \begin{bmatrix} \operatorname{Var}(X_1) & \operatorname{Cov}(X_1, X_2) & \dots & \operatorname{Cov}(X_1, X_p) \\ \operatorname{Cov}(X_2, X_1) & \operatorname{Var}(X_2) & \dots & \operatorname{Cov}(X_2, X_p) \\ \vdots & \vdots & \ddots & \vdots \\ \operatorname{Cov}(X_p, X_1) & \operatorname{Cov}(X_p, X_2) & \dots & \operatorname{Var}(X_p) \end{bmatrix}. \]

It compactly summarises all pairwise linear relationships between your variables: variances on the diagonal, covariances off the diagonal.

Sample vs population covariance matrix

In practice you estimate covariance from a data matrix \(X\) with \(n\) observations (rows) and \(p\) variables (columns). Let \(x_{i j}\) be the value of variable \(j\) in observation \(i\), and let \(\bar{x}_j\) be the sample mean of variable \(j\).

Sample covariance matrix

For \(n\) observations and \(p\) variables, the sample covariance between variables \(j\) and \(k\) is \[ s_{j k} = \frac{1}{n - 1} \sum_{i=1}^n (x_{i j} - \bar{x}_j)\,(x_{i k} - \bar{x}_k). \] The sample covariance matrix \(S\) collects all \(s_{j k}\) in a \(p \times p\) matrix.

Dividing by \(n - 1\) gives an unbiased estimator of the population covariance when data are independent and identically distributed.
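
As a small worked illustration (the numbers are invented for this example), take \(n = 3\) observations of \(p = 2\) variables: \((1, 2)\), \((2, 3)\) and \((3, 7)\). The column means are \(\bar{x}_1 = 2\) and \(\bar{x}_2 = 4\), so the deviations are \((-1, 0, 1)\) for the first variable and \((-2, -1, 3)\) for the second, and \[ s_{1 1} = \frac{(-1)^2 + 0^2 + 1^2}{3 - 1} = 1, \quad s_{2 2} = \frac{(-2)^2 + (-1)^2 + 3^2}{3 - 1} = 7, \quad s_{1 2} = \frac{(-1)(-2) + (0)(-1) + (1)(3)}{3 - 1} = 2.5. \] The sample covariance matrix is therefore \[ S = \begin{bmatrix} 1 & 2.5 \\ 2.5 & 7 \end{bmatrix}. \]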

Population covariance matrix

If you can treat the data as the entire population, you may use \[ \sigma_{j k} = \frac{1}{n} \sum_{i=1}^n (x_{i j} - \bar{x}_j)\,(x_{i k} - \bar{x}_k), \] i.e. divide by \(n\) instead of \(n - 1\). This is the population covariance.
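
Both formulas can be sketched with one small Python function that differs only in the divisor. This is a plain standard-library illustration under the definitions above, not the calculator's own code; the names `column_means` and `covariance_matrix` and the `sample` flag are hypothetical.

```python
def column_means(rows):
    """Mean of each variable (column) in a list of observation rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def covariance_matrix(rows, sample=True):
    """Return the p x p covariance matrix of n observation rows.

    sample=True divides by n - 1 (sample covariance);
    sample=False divides by n (population covariance).
    """
    n = len(rows)
    p = len(rows[0])
    divisor = n - 1 if sample else n
    if divisor <= 0:
        raise ValueError("not enough observations for this divisor")

    means = column_means(rows)
    # Subtract each column mean so every variable is centred at zero.
    centered = [[x - m for x, m in zip(row, means)] for row in rows]

    return [
        [sum(r[j] * r[k] for r in centered) / divisor for k in range(p)]
        for j in range(p)
    ]


# Made-up data: 3 observations of 2 variables.
data = [[1.0, 2.0], [2.0, 3.0], [3.0, 7.0]]
print(covariance_matrix(data))                # [[1.0, 2.5], [2.5, 7.0]]
population = covariance_matrix(data, sample=False)
```

For this data the population version gives entries 2/3, 5/3 and 14/3, that is, the sample entries multiplied by (n − 1)/n = 2/3.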

From covariance matrix to correlation matrix

The covariance between variables depends on the measurement units (for example, metres vs kilometres). To remove the effect of scale you can compute the correlation matrix, whose entries are the Pearson correlation coefficients: \[ \rho_{j k} = \frac{\operatorname{Cov}(X_j, X_k)}{\sqrt{\operatorname{Var}(X_j)\operatorname{Var}(X_k)}}. \]

In matrix form, if \(D\) is the diagonal matrix of standard deviations, the correlation matrix is \[ R = D^{-1} \Sigma D^{-1}. \] The diagonal entries of \(R\) are all 1, and each off-diagonal element lies between −1 and +1.
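
The conversion \(R = D^{-1} \Sigma D^{-1}\) amounts to dividing each entry by the two corresponding standard deviations. A minimal Python sketch follows; the function name `correlation_from_covariance` is hypothetical, and it assumes every variance is strictly positive, since a constant variable has no defined correlation.

```python
import math


def correlation_from_covariance(cov):
    """Convert a covariance matrix (list of lists) to a correlation matrix."""
    p = len(cov)
    # Standard deviations are the square roots of the diagonal entries.
    std = [math.sqrt(cov[j][j]) for j in range(p)]
    if any(s == 0.0 for s in std):
        raise ValueError("a variable with zero variance has no correlation")
    return [
        [cov[j][k] / (std[j] * std[k]) for k in range(p)]
        for j in range(p)
    ]


# Made-up covariance matrix with variances 1 and 7 and covariance 2.5.
corr = correlation_from_covariance([[1.0, 2.5], [2.5, 7.0]])
```

Here the off-diagonal entry is 2.5 / √7 ≈ 0.945, and the diagonal entries equal 1 up to floating-point rounding.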

Applications of the covariance matrix

Principal component analysis (PCA) — PCA diagonalises the covariance matrix to find directions of maximum variance.
Portfolio theory — in quantitative finance, the covariance matrix of asset returns is central to risk modelling and optimisation.
Multivariate normal models — the covariance matrix parameterises the spread and shape of multivariate Gaussian distributions.
State estimation & Kalman filtering — process and measurement noise covariances are key design inputs.
Machine learning & data preprocessing — covariance is used in whitening, feature scaling and understanding feature interactions.
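
To make the PCA item concrete, the first principal component is the eigenvector of the covariance matrix with the largest eigenvalue. The sketch below finds it by power iteration, which is one simple technique chosen here for illustration; production PCA code normally uses a full eigen-decomposition or SVD from a numerical library. The function name `leading_component`, the start vector and the stopping rule are all assumptions of this example.

```python
import math


def leading_component(cov, iterations=500, tol=1e-12):
    """Estimate the largest eigenvalue and its unit eigenvector of a
    symmetric positive semi-definite matrix by power iteration."""
    p = len(cov)
    # Uneven start vector; assumed not orthogonal to the leading eigenvector.
    v = [1.0 / (i + 1) for i in range(p)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]

    eigenvalue = 0.0
    for _ in range(iterations):
        # Multiply by the covariance matrix and renormalise.
        w = [sum(cov[j][k] * v[k] for k in range(p)) for j in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("matrix maps the current vector to zero")
        w = [x / norm for x in w]
        # Rayleigh quotient w^T C w estimates the variance along w.
        estimate = sum(
            w[j] * sum(cov[j][k] * w[k] for k in range(p)) for j in range(p)
        )
        converged = abs(estimate - eigenvalue) < tol
        v, eigenvalue = w, estimate
        if converged:
            break
    return eigenvalue, v


# Made-up covariance matrix whose eigenvalues are 3 and 1.
variance, direction = leading_component([[2.0, 1.0], [1.0, 2.0]])
```

For this matrix the result approaches a variance of 3 along the direction (1, 1)/√2. Convergence slows down when the two largest eigenvalues are close to each other.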

Numerical considerations

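If the number of variables \(p\) is at least the number of observations \(n\), the sample covariance matrix is singular, because centring the data leaves it with rank at most \(n - 1\). When \(p\) is smaller than but close to \(n\), the estimate is usually ill-conditioned and unstable.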
Strong collinearity (nearly linear relationships between variables) can cause near-singularity and large condition numbers, which affect matrix inversion and eigen-decomposition.
For high-dimensional or ill-conditioned problems, consider regularised estimators (for example, shrinkage covariance) rather than the plain sample covariance.
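
One very simple form of shrinkage blends the sample covariance with a scaled identity matrix. The Python sketch below uses a fixed, hand-picked intensity purely for illustration; the name `shrink_to_identity` and the default value 0.1 are assumptions of this example, and practical shrinkage estimators choose the intensity from the data.

```python
def shrink_to_identity(cov, intensity=0.1):
    """Return (1 - intensity) * cov + intensity * m * I,
    where m is the average variance (trace / p)."""
    if not 0.0 <= intensity <= 1.0:
        raise ValueError("intensity must lie between 0 and 1")
    p = len(cov)
    mean_variance = sum(cov[j][j] for j in range(p)) / p
    return [
        [
            (1.0 - intensity) * cov[j][k]
            + (intensity * mean_variance if j == k else 0.0)
            for k in range(p)
        ]
        for j in range(p)
    ]


# Made-up singular matrix: the second variable is twice the first.
singular = [[1.0, 2.0], [2.0, 4.0]]
shrunk = shrink_to_identity(singular, intensity=0.1)
determinant = shrunk[0][0] * shrunk[1][1] - shrunk[0][1] * shrunk[1][0]
```

The shrunk matrix keeps the same total variance (trace) as the original but has a strictly positive determinant, so it can be inverted. The price is a deliberate bias toward uncorrelated variables with equal variance.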



Changelog

Version: 0.1.0-draft
Last code update: 2026-01-19

0.1.0-draft · 2026-01-19

Initial audit spec draft generated from HTML extraction (review required).
Verify formulas match the calculator engine and convert any text-only formulas to LaTeX.
Confirm sources are authoritative and relevant to the calculator methodology.

Verified by Ugo Candido on 2026-01-19

Good practices when using covariance

Check for obvious data entry errors or unit mismatches before computing covariance.
Standardise or scale variables when comparing magnitudes across very different units.
Inspect the correlation matrix to understand strong positive or negative relationships.
Be cautious when the number of variables is large relative to the sample size.
For downstream inversion or PCA, consider regularisation if the matrix is nearly singular.
