G-test Calculator (Likelihood-Ratio Test)
This calculator implements the statistical G-test (likelihood-ratio test) for categorical data. It is not related to medical or prenatal screening products sometimes called “G-test”.
Use it to compute the G statistic, degrees of freedom and p-value for goodness-of-fit problems and contingency tables (tests of independence).
G-test calculator
All inputs are counts (frequencies). Use comma or dot as decimal separator; counts are rounded to non-negative numbers.
Between 2 and 10 categories.
For “custom expected”, the calculator rescales your expected counts to match the total sample size.
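df = k − 1 − parameters, where k is the number of categories and "parameters" is the number of parameters estimated from the data. For many textbook problems no parameters are estimated (the value is 0), so df = k − 1.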
Output uses the chi-square approximation to obtain the p-value.
G-test formula (likelihood-ratio test)
The G-test compares observed counts with expected counts under a null hypothesis. The test statistic is
\( G = 2 \sum_i O_i \ln\left(\dfrac{O_i}{E_i}\right) \)
- \( O_i \) = observed count in cell i (a cell with \( O_i = 0 \) contributes 0 to the sum)
- \( E_i \) = expected count in cell i under the null hypothesis
For contingency tables, the sum runs over all cells \((i, j)\) and the same formula applies:
\( G = 2 \sum_{i,j} O_{ij} \ln\left(\dfrac{O_{ij}}{E_{ij}}\right) \)
Under the null hypothesis and standard conditions (independent observations, expected counts that are not too small), G is approximately chi-square distributed with the same degrees of freedom as the corresponding Pearson chi-square test.
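The formula and the chi-square approximation can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the calculator's actual implementation: the function names `g_statistic` and `chi2_sf` and the example counts are assumptions made for this sketch. The upper-tail probability is computed from the regularized incomplete gamma function (series expansion for small arguments, continued fraction otherwise).

```python
import math

def g_statistic(observed, expected):
    """Return G = 2 * sum(O * ln(O / E)) over all cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    total = 0.0
    for o, e in zip(observed, expected):
        if o > 0:  # a cell with O = 0 contributes 0 (limit of x * ln x)
            if e <= 0:
                raise ValueError("expected must be positive where observed > 0")
            total += o * math.log(o / e)
    return 2.0 * total

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable, df > 0."""
    if df <= 0:
        raise ValueError("df must be positive")
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    log_prefix = -z + a * math.log(z) - math.lgamma(a)
    if z < a + 1.0:
        # Series for the regularized lower incomplete gamma function P(a, z).
        term = total = 1.0 / a
        n = a
        for _ in range(10000):
            n += 1.0
            term *= z / n
            total += term
            if abs(term) < abs(total) * 1e-15:
                break
        return 1.0 - total * math.exp(log_prefix)
    # Continued fraction (modified Lentz method) for the upper function Q(a, z).
    tiny = 1e-300
    b = z + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, 10000):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        if abs(d) < tiny:
            d = tiny
        c = b + an / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < 1e-14:
            break
    return math.exp(log_prefix) * h

# Hypothetical example: three categories, uniform null hypothesis.
observed = [10, 20, 30]
expected = [20.0, 20.0, 20.0]
g = g_statistic(observed, expected)
p_value = chi2_sf(g, df=len(observed) - 1)
print(f"G = {g:.4f}, df = {len(observed) - 1}, p = {p_value:.4f}")
```

If SciPy is available, `scipy.stats.chi2.sf(g, df)` should give the same tail probability and is the more usual choice in practice.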
G-test for goodness-of-fit
In a goodness-of-fit setting you test whether observed frequencies follow a specific theoretical distribution (uniform, binomial, Poisson, etc.) or specified probabilities. You provide:
- observed counts \( O_1, \dots, O_k \)
- expected counts under the null \( E_1, \dots, E_k \) (or the corresponding probabilities)
The degrees of freedom are typically \( k - 1 - m \), where \( k \) is the number of categories and \( m \) is the number of parameters estimated from the data (for example, estimating a mean or rate).
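As a rough sketch of the goodness-of-fit procedure, the snippet below rescales an expected pattern (counts, ratios or probabilities) so that it sums to the observed total, then computes G and \( k - 1 - m \). The function name `goodness_of_fit_g`, its parameter `estimated_params` (standing in for \( m \)), and the counts are hypothetical and chosen only for illustration.

```python
import math

def goodness_of_fit_g(observed, expected_pattern, estimated_params=0):
    """Return (G, df, expected) for a goodness-of-fit G-test.

    expected_pattern may be counts, ratios or probabilities; it is
    rescaled so that the expected counts sum to the observed total.
    """
    if len(observed) != len(expected_pattern):
        raise ValueError("observed and expected_pattern must match in length")
    n = sum(observed)
    scale = n / sum(expected_pattern)
    expected = [e * scale for e in expected_pattern]
    g = 2.0 * sum(
        o * math.log(o / e) for o, e in zip(observed, expected) if o > 0
    )
    df = len(observed) - 1 - estimated_params  # k - 1 - m
    if df < 1:
        raise ValueError("degrees of freedom must be at least 1")
    return g, df, expected

# Hypothetical counts tested against a 2:1:1 ratio, no estimated parameters.
g, df, expected = goodness_of_fit_g([50, 30, 20], [2, 1, 1])
print(f"expected = {expected}, G = {g:.4f}, df = {df}")
```

Passing `estimated_params=1` would model the case where one parameter, such as a Poisson rate, was estimated from the same data before the expected counts were computed.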
G-test of independence (contingency table)
For an \( r \times c \) contingency table of counts, the null hypothesis is that the row and column variables are independent. The expected count in cell \((i,j)\) is
\( E_{ij} = \dfrac{(\text{row total}_i) \cdot (\text{column total}_j)}{\text{grand total}} \)
The degrees of freedom are \( \text{df} = (r - 1)(c - 1) \), exactly as for the Pearson chi-square test of independence.
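The expected-count formula and the degrees of freedom for a contingency table can be illustrated as follows. This is a minimal sketch assuming the table is given as a list of equal-length rows; the function name `independence_g` and the example tables are made up for illustration.

```python
import math

def independence_g(table):
    """Return (G, df) for an r x c table of counts given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            # E_ij = (row total_i * column total_j) / grand total
            e = row_totals[i] * col_totals[j] / grand_total
            if o > 0:  # empty cells contribute 0 to the sum
                total += o * math.log(o / e)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return 2.0 * total, df

# Hypothetical 2 x 2 table with a visible association.
g, df = independence_g([[10, 20], [20, 10]])
print(f"G = {g:.4f}, df = {df}")

# A table whose rows are exactly proportional gives G = 0.
g_null, _ = independence_g([[10, 20], [20, 40]])
print(f"G under exact independence = {g_null:.4f}")
```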
G-test vs chi-square test
The Pearson chi-square test uses the statistic \( \chi^2 = \sum (O - E)^2 / E \), whereas the G-test uses the likelihood-ratio statistic \( G = 2 \sum O \ln(O/E) \). The two statistics are asymptotically equivalent under the null hypothesis, so for large samples they usually lead to very similar p-values.
- The G-test arises naturally from likelihood-ratio principles and is convenient in log-linear models and generalized linear modeling.
- The chi-square test is more familiar and historically widespread in introductory courses.
- For small samples, both tests rely on approximations; exact or Monte Carlo methods are often recommended.
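To see how close the two statistics can be, the following sketch evaluates both on a large sample near the null and on a small sample further from it. The counts are invented for illustration, and the helper names `pearson_chi2` and `g_statistic` are assumptions of this example.

```python
import math

def pearson_chi2(observed, expected):
    """Pearson statistic: sum((O - E)^2 / E)."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def g_statistic(observed, expected):
    """Likelihood-ratio statistic: 2 * sum(O * ln(O / E))."""
    return 2.0 * sum(
        o * math.log(o / e) for o, e in zip(observed, expected) if o > 0
    )

# Large sample, small deviation from the null: the statistics nearly agree.
chi2_large = pearson_chi2([480, 520], [500.0, 500.0])
g_large = g_statistic([480, 520], [500.0, 500.0])

# Small sample, larger relative deviation: the gap is more noticeable.
chi2_small = pearson_chi2([2, 8], [5.0, 5.0])
g_small = g_statistic([2, 8], [5.0, 5.0])

print(f"large sample: chi2 = {chi2_large:.4f}, G = {g_large:.4f}")
print(f"small sample: chi2 = {chi2_small:.4f}, G = {g_small:.4f}")
```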
Assumptions and common pitfalls
- Independence: Each observation contributes to exactly one cell and does not depend on other observations.
- Expected counts: Expected cell counts should not be too small (rules of thumb often require most \( E_i \) ≥ 5).
- Sparse tables: Very sparse tables, or tables with many structural zeros, can make the chi-square approximation unreliable; consider exact tests or specialized modeling.
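A quick way to screen for the expected-count pitfall is to list the cells that fall below a chosen threshold. The helper below is only a sketch: the name `small_expected_cells`, the default threshold of 5 (taken from the rule of thumb above) and the example values are assumptions, and the result is a prompt to consider exact or Monte Carlo methods rather than a formal criterion.

```python
def small_expected_cells(expected, threshold=5.0):
    """Return the indices of cells whose expected count is below threshold."""
    return [i for i, e in enumerate(expected) if e < threshold]

# Hypothetical expected counts for four categories.
expected = [12.0, 4.5, 7.0, 0.8]
flagged = small_expected_cells(expected)
share = len(flagged) / len(expected)
print(f"cells below 5: {flagged} ({share:.0%} of all cells)")
```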
Important: This page concerns the statistical G-test for categorical data. It is not intended for medical diagnostics or prenatal screening and does not replace professional advice.