G-test Calculator (Likelihood-Ratio Test)

This calculator implements the statistical G-test (likelihood-ratio test) for categorical data. It is not related to medical or prenatal screening products sometimes called “G-test”.

Use it to compute the G statistic, degrees of freedom and p-value for goodness-of-fit problems and contingency tables (tests of independence).

G-test calculator

All inputs are counts (frequencies). Use comma or dot as decimal separator; counts are rounded to non-negative numbers.

Between 2 and 10 categories.

For “custom expected”, the calculator rescales your expected counts to match the total sample size.

df = k − 1 − m, where k is the number of categories and m is the number of parameters estimated from the data. For many textbook problems m = 0.

Output uses the chi-square approximation to obtain the p-value.

G-test formula (likelihood-ratio test)

The G-test compares observed counts with expected counts under a null hypothesis. The test statistic is

\( G = 2 \sum_i O_i \ln\left(\dfrac{O_i}{E_i}\right) \)

  • \( O_i \) = observed count in cell i
  • \( E_i \) = expected count in cell i under the null hypothesis

For contingency tables, the sum runs over all cells \((i, j)\) and the same formula applies:

\( G = 2 \sum_{i,j} O_{ij} \ln\left(\dfrac{O_{ij}}{E_{ij}}\right) \)
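The formula can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's own code: the function name `g_statistic` and the sample counts are made up for the example, and it assumes the usual convention that a cell with \( O_i = 0 \) contributes nothing to the sum. For a contingency table, the cells can be flattened into one list first.

```python
import math

def g_statistic(observed, expected):
    """Return G = 2 * sum(O * ln(O / E)) over all cells.

    Cells with O == 0 contribute 0 (the limit of O * ln(O / E) as O -> 0).
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    total = 0.0
    for o, e in zip(observed, expected):
        if o < 0 or e <= 0:
            raise ValueError("need O >= 0 and E > 0 in every cell")
        if o > 0:
            total += o * math.log(o / e)
    return 2.0 * total

# Illustrative counts (invented for this sketch): three categories,
# equal expected counts.
g = g_statistic([10, 20, 30], [20, 20, 20])
print(round(g, 4))
```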

Under standard conditions (independent observations, not-too-small expected counts), G is approximately chi-square distributed with the same degrees of freedom as the corresponding chi-square test.
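To turn G into a p-value, one evaluates the upper tail of the chi-square distribution at G. The sketch below does this with the Python standard library only, using the recurrence \( Q(a+1, x) = Q(a, x) + x^a e^{-x} / \Gamma(a+1) \) for the regularized upper incomplete gamma function, which is valid for integer degrees of freedom. The helper name `chi2_sf` is hypothetical; in practice a library routine such as `scipy.stats.chi2.sf` would normally be used instead.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable, integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        a = 1.0
        q = math.exp(-h)              # Q(1, h)
    else:
        a = 0.5
        q = math.erfc(math.sqrt(h))   # Q(1/2, h)
    # Step a up to df / 2 with Q(a + 1, h) = Q(a, h) + h^a e^{-h} / Gamma(a + 1).
    while a < df / 2.0:
        q += math.exp(a * math.log(h) - h - math.lgamma(a + 1.0))
        a += 1.0
    return q

# The usual 5% critical value for df = 1 is about 3.8415.
p = chi2_sf(3.8415, 1)
```

A small p-value indicates that the observed counts deviate from the expected counts more than chance alone would typically produce under the null hypothesis.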

G-test for goodness-of-fit

In a goodness-of-fit setting you test whether observed frequencies follow a specific theoretical distribution (uniform, binomial, Poisson, etc.) or specified probabilities. You provide:

  • observed counts \( O_1, \dots, O_k \)
  • expected pattern under the null \( E_1, \dots, E_k \) (or probabilities)

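The degrees of freedom are typically \( k - 1 - m \), where \( k \) is the number of categories and \( m \) is the number of parameters estimated from the data (for example, a Poisson rate or a binomial success probability estimated from the sample). When the null probabilities are fully specified in advance, \( m = 0 \).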
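A goodness-of-fit computation along these lines might look as follows. This is a sketch under stated assumptions: the function name `g_goodness_of_fit` and its parameters are hypothetical, the expected pattern is given as positive weights (probabilities, ratios, or counts) that are rescaled to the observed total, and the sample counts are invented for illustration.

```python
import math

def g_goodness_of_fit(observed, expected_weights, estimated_params=0):
    """Return (G, df, expected) for a goodness-of-fit G-test.

    expected_weights may be probabilities, ratios, or counts; they are
    rescaled so that the expected counts sum to the observed total.
    """
    k = len(observed)
    if k < 2 or k != len(expected_weights):
        raise ValueError("need at least 2 categories and matching lengths")
    if min(observed) < 0 or min(expected_weights) <= 0:
        raise ValueError("need O >= 0 and positive expected weights")
    n = sum(observed)
    if n <= 0:
        raise ValueError("total observed count must be positive")
    weight_total = sum(expected_weights)
    expected = [n * w / weight_total for w in expected_weights]
    # Cells with O == 0 contribute nothing to the sum.
    g = 2.0 * sum(o * math.log(o / e)
                  for o, e in zip(observed, expected) if o > 0)
    df = k - 1 - estimated_params
    if df < 1:
        raise ValueError("degrees of freedom must be at least 1")
    return g, df, expected

# Illustrative data: test a 1:1:2 ratio with no estimated parameters.
g, df, expected = g_goodness_of_fit([10, 20, 30], [1, 1, 2])
```

Here the weights 1:1:2 are rescaled to expected counts of 15, 15 and 30, and with three categories and no estimated parameters the test has 2 degrees of freedom.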

G-test of independence (contingency table)

For an \( r \times c \) contingency table of counts, the null hypothesis is that the row and column variables are independent. The expected count in cell \((i,j)\) is

\( E_{ij} = \dfrac{(\text{row total}_i) \cdot (\text{column total}_j)}{\text{grand total}} \)

The degrees of freedom are \( \text{df} = (r - 1)(c - 1) \), exactly as for the Pearson chi-square test of independence.
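The expected counts and degrees of freedom for a table can be computed as in the sketch below. The function name `g_test_independence` and the 2 × 2 counts are assumptions made for illustration; the table is taken to be a list of equal-length rows of non-negative counts.

```python
import math

def g_test_independence(table):
    """Return (G, df, expected) for an r x c table of counts."""
    rows, cols = len(table), len(table[0])
    if rows < 2 or cols < 2 or any(len(r) != cols for r in table):
        raise ValueError("need a rectangular table with at least 2 rows and 2 columns")
    row_totals = [sum(r) for r in table]
    col_totals = [sum(table[i][j] for i in range(rows)) for j in range(cols)]
    n = sum(row_totals)
    if n <= 0:
        raise ValueError("grand total must be positive")
    # E_ij = (row total i) * (column total j) / grand total
    expected = [[row_totals[i] * col_totals[j] / n for j in range(cols)]
                for i in range(rows)]
    total = 0.0
    for i in range(rows):
        for j in range(cols):
            o = table[i][j]
            if o > 0:  # empty cells contribute nothing
                total += o * math.log(o / expected[i][j])
    return 2.0 * total, (rows - 1) * (cols - 1), expected

# Illustrative 2 x 2 table (invented counts).
g, df, expected = g_test_independence([[10, 20], [30, 40]])
```

For this table the row totals are 30 and 70, the column totals are 40 and 60, and the expected counts come out as 12, 18, 28 and 42 with 1 degree of freedom.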

G-test vs chi-square test

The Pearson chi-square test uses the statistic \( \chi^2 = \sum (O - E)^2 / E \), whereas the G-test uses the likelihood-ratio statistic \( G = 2 \sum O \ln(O/E) \). For large samples they are asymptotically equivalent and usually lead to very similar p-values.

  • The G-test arises naturally from likelihood-ratio principles and is convenient in log-linear models and generalized linear modeling.
  • The chi-square test is more familiar and historically widespread in introductory courses.
  • Both tests rely on a large-sample chi-square approximation; for small samples, exact or Monte Carlo methods are often recommended.
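To see how close the two statistics can be, the sketch below computes both on the same counts. The function name `pearson_and_g` and the data are hypothetical, chosen so that the observed counts lie close to the expected ones.

```python
import math

def pearson_and_g(observed, expected):
    """Return (Pearson chi-square, G) for matching lists of counts."""
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    g = 2.0 * sum(o * math.log(o / e)
                  for o, e in zip(observed, expected) if o > 0)
    return chi2, g

# Illustrative data: 100 observations split 48/52 against an expected 50/50.
chi2, g = pearson_and_g([48, 52], [50, 50])
print(f"chi-square = {chi2:.4f}, G = {g:.4f}")
```

When observed and expected counts are close, as here, the two values agree to several decimal places; they tend to diverge more when some cells deviate strongly or expected counts are small.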

Assumptions and common pitfalls

  • Independence: Each observation contributes to exactly one cell and does not depend on other observations.
  • Expected counts: Expected cell counts should not be too small (rules of thumb often require most \( E_i \) ≥ 5).
  • Sparse tables: With many structural zeros (cells that cannot occur by design) or many very small counts, the chi-square approximation may be unreliable; consider exact tests or specialized modeling.
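A quick screening of the expected counts can flag tables where the approximation deserves a second look. The helper below is only a sketch: the name `expected_count_report` is hypothetical, and the default threshold of 5 is the rule of thumb mentioned above, not a strict criterion.

```python
def expected_count_report(expected, threshold=5.0):
    """Summarize how many expected counts fall below a rule-of-thumb threshold."""
    cells = list(expected)
    if not cells:
        raise ValueError("need at least one expected count")
    small = [e for e in cells if e < threshold]
    return {
        "cells": len(cells),
        "below_threshold": len(small),
        "share_below": len(small) / len(cells),
        "min_expected": min(cells),
    }

# Illustrative expected counts: one of four cells is below 5.
report = expected_count_report([12.0, 18.0, 3.0, 7.0])
```

A large share of small expected counts suggests merging categories or switching to an exact or Monte Carlo method.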

Important: This page concerns the statistical G-test for categorical data. It is not intended for medical diagnostics or prenatal screening and does not replace professional advice.

G-test – frequently asked questions