Allele Frequency Calculator
This allele frequency calculator helps students, educators, and researchers compute allele frequencies (p and q), expected genotype frequencies under Hardy–Weinberg equilibrium (HWE), heterozygosity, minor allele frequency (MAF), a chi-square HWE test (when genotype counts are provided), and a 95% confidence interval for p, all in a WCAG-compliant, mobile-first interface.
Note: The HWE test is computed only when observed genotype counts are provided, N > 0, and every expected genotype count is positive.
Data Source and Methodology
Authoritative sources:
- G.H. Hardy (1908), Mendelian proportions in a mixed population, Science 28(706):49–50.
- W. Weinberg (1908), Über den Nachweis der Vererbung beim Menschen, Jahreshefte des Vereins für vaterländische Naturkunde in Württemberg.
- J.F. Crow and M. Kimura (1970), An Introduction to Population Genetics Theory. Harper & Row.
- Wigginton, Cutler, Abecasis (2005), A note on exact tests of Hardy–Weinberg equilibrium, AJHG 76:887–893.
- Nature Education: Hardy–Weinberg Equilibrium.
All calculations follow the formulas described in these sources.
The Formula Explained
If AA, Aa, aa are genotype counts and N = AA + Aa + aa:
Allele frequencies: $$p = \frac{2\,AA + Aa}{2N}, \quad q = 1 - p$$
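As a minimal sketch of the allele-frequency formula above, the following Python function counts two A alleles per AA individual and one per Aa individual. The function name `allele_frequencies` and its argument names are illustrative assumptions, not part of the calculator itself.

```python
from typing import Tuple


def allele_frequencies(n_AA: int, n_Aa: int, n_aa: int) -> Tuple[float, float]:
    """Return (p, q) for alleles A and a from diploid genotype counts.

    Hypothetical helper for illustration; not the calculator's own code.
    """
    if min(n_AA, n_Aa, n_aa) < 0:
        raise ValueError("genotype counts must be non-negative")
    n = n_AA + n_Aa + n_aa  # number of individuals, N
    if n == 0:
        raise ValueError("at least one individual is required")
    # Each AA individual carries two A alleles, each Aa individual carries one.
    p = (2 * n_AA + n_Aa) / (2 * n)
    return p, 1.0 - p


if __name__ == "__main__":
    p, q = allele_frequencies(36, 48, 16)
    print(f"p = {p:.2f}, q = {q:.2f}")
```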
Hardy–Weinberg expected genotype frequencies: $$P(AA) = p^2,\quad P(Aa) = 2pq,\quad P(aa) = q^2$$
Expected counts: $$E[AA] = N p^2,\quad E[Aa] = N (2pq),\quad E[aa] = N q^2$$
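The expected counts can be sketched in a few lines once p and N are known. The helper below, `expected_genotype_counts`, is a hypothetical name chosen for this example; it assumes a bi-allelic locus and simply scales the Hardy–Weinberg proportions by N.

```python
from typing import Dict


def expected_genotype_counts(p: float, n_individuals: float) -> Dict[str, float]:
    """Expected AA, Aa, aa counts under HWE for allele frequency p and N individuals.

    Hypothetical helper for illustration only.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if n_individuals < 0:
        raise ValueError("N must be non-negative")
    q = 1.0 - p
    return {
        "AA": n_individuals * p * p,      # N p^2
        "Aa": n_individuals * 2 * p * q,  # N (2pq)
        "aa": n_individuals * q * q,      # N q^2
    }


if __name__ == "__main__":
    print(expected_genotype_counts(0.6, 100))
```

The three expected counts always sum to N, because p² + 2pq + q² = (p + q)² = 1.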
Heterozygosity: $$H = 2pq$$
Goodness-of-fit statistic (chi-square): $$\chi^2 = \sum_{g\in\{AA,Aa,aa\}} \frac{(O_g - E_g)^2}{E_g} \quad \text{with 1 degree of freedom}$$
For df = 1, the p-value is: $$p\text{-value} = \operatorname{erfc}\!\left(\sqrt{\chi^2/2}\right)$$
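The chi-square test and its df = 1 p-value can be sketched with the standard library alone, since `math.erfc` gives the upper tail directly. The function name `hwe_chi_square` is an assumption made for this example, and the guard against non-positive expected counts is one possible reading of "valid expected counts".

```python
import math
from typing import Tuple


def hwe_chi_square(n_AA: int, n_Aa: int, n_aa: int) -> Tuple[float, float]:
    """Return (chi2, p_value) for a 1-df chi-square test of HWE.

    Hypothetical helper; assumes a bi-allelic locus with genotype counts.
    """
    n = n_AA + n_Aa + n_aa
    if n <= 0:
        raise ValueError("N must be positive")
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = 1.0 - p
    observed = (n_AA, n_Aa, n_aa)
    expected = (n * p * p, n * 2 * p * q, n * q * q)
    # A monomorphic locus (p = 0 or p = 1) gives zero expected counts,
    # so the statistic is undefined there.
    if any(e <= 0 for e in expected):
        raise ValueError("all expected counts must be positive")
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    print(hwe_chi_square(36, 48, 16))  # counts that match HWE expectations
    print(hwe_chi_square(50, 20, 30))  # strong heterozygote deficit
```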
Wilson 95% CI for p using k A-alleles in n = 2N trials: $$\hat{p}_W = \frac{\hat{p} + z^2/(2n)}{1 + z^2/n},\quad \text{margin} = \frac{z}{1+z^2/n}\sqrt{\frac{\hat{p}(1-\hat{p})}{n} + \frac{z^2}{4n^2}}$$ Here $$\hat{p} = k/n,\quad z \approx 1.96,\quad \text{CI} = \hat{p}_W \pm \text{margin}$$
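A minimal sketch of the Wilson interval, written directly from the center and margin expressions above. The name `wilson_interval` and the default z = 1.96 are assumptions for illustration; k is the number of A alleles and n = 2N is the number of chromosomes.

```python
import math
from typing import Tuple


def wilson_interval(k: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval (lower, upper) for k successes in n trials.

    Hypothetical helper; z = 1.96 is the assumed 95% normal quantile.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= k <= n:
        raise ValueError("k must lie between 0 and n")
    p_hat = k / n
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    margin = (z / denom) * math.sqrt(p_hat * (1.0 - p_hat) / n + z * z / (4 * n * n))
    return center - margin, center + margin


if __name__ == "__main__":
    lower, upper = wilson_interval(k=120, n=200)
    print(f"95% CI for p: ({lower:.3f}, {upper:.3f})")
```

Unlike the simple Wald interval, the Wilson interval stays inside [0, 1] (up to rounding) even when k is 0 or n, which matters for rare alleles.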
Glossary of Variables
- AA, Aa, aa: observed genotype counts.
- N: sample size in individuals (N = AA + Aa + aa; for allele counts, N = (A + a)/2).
- p: frequency of allele A; q: frequency of allele a (q = 1 − p).
- MAF: minor allele frequency, min(p, q).
- Expected counts: E[AA] = N p^2, E[Aa] = N(2pq), E[aa] = N q^2 (assuming HWE).
- Heterozygosity (H): 2pq, the expected fraction of heterozygotes under HWE.
- Chi-square and p-value: measure agreement between observed and expected genotypes under HWE (df = 1).
- 95% CI for p: Wilson interval treating 2N chromosomes as trials and number of A alleles as successes.
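The two summary quantities in the glossary, MAF and heterozygosity, follow from p alone. A short sketch, with the hypothetical name `maf_and_heterozygosity`:

```python
from typing import Tuple


def maf_and_heterozygosity(p: float) -> Tuple[float, float]:
    """Return (MAF, H) for a bi-allelic locus with allele frequency p.

    Hypothetical helper for illustration only.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    q = 1.0 - p
    maf = min(p, q)              # frequency of the rarer allele
    heterozygosity = 2 * p * q   # expected fraction of heterozygotes under HWE
    return maf, heterozygosity


if __name__ == "__main__":
    print(maf_and_heterozygosity(0.6))
```

Heterozygosity peaks at 0.5 when p = q = 0.5 and falls to 0 as either allele approaches fixation.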
How It Works: A Step-by-Step Example
Suppose you observed AA = 36, Aa = 48, aa = 16 (N = 100). Then:
- Compute p: p = (2×36 + 48) / (2×100) = 0.60; q = 0.40.
- Expected genotype frequencies: AA = p^2 = 0.36, Aa = 2pq = 0.48, aa = q^2 = 0.16.
- Expected counts: E[AA] = 36, E[Aa] = 48, E[aa] = 16.
- HWE test: χ² = Σ (O−E)²/E = 0; p-value = 1 → no evidence against HWE.
- MAF = min(0.60, 0.40) = 0.40; Heterozygosity = 2pq = 0.48. The Wilson 95% CI for p, with k = 120 A alleles among n = 200 chromosomes, is approximately 0.531 to 0.665.
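For the Wilson interval in this example, the arithmetic can be worked through as follows, assuming z = 1.96 and rounding intermediate values for display:

$$n = 2N = 200,\quad k = 2 \times 36 + 48 = 120,\quad \hat{p} = 120/200 = 0.60,\quad z^2/n = 3.8416/200 \approx 0.0192$$
$$\hat{p}_W = \frac{0.60 + 0.0096}{1.0192} \approx 0.598,\quad \text{margin} = \frac{1.96}{1.0192}\sqrt{\frac{0.60 \times 0.40}{200} + \frac{3.8416}{4 \times 200^2}} \approx 0.067$$
$$\text{95\% CI} \approx (0.598 - 0.067,\ 0.598 + 0.067) = (0.531,\ 0.665)$$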
Frequently Asked Questions (FAQ)
What input should I use: genotype counts, allele counts, or p?
Use genotype counts to compute everything, including an HWE test. If you only have allele counts, you can compute p and expected genotype frequencies but not a valid HWE test, because the test compares observed genotype counts with the expected ones. If you already know p and N, you can generate expected counts and heterozygosity.
When is the chi-square HWE test appropriate?
It is appropriate for bi-allelic loci with sufficiently large expected counts (common rule of thumb: all expected ≥ 5). For small samples or rare alleles, consider exact tests (e.g., Wigginton et al., 2005).
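The rule of thumb can be checked before running the chi-square test. The sketch below uses a hypothetical function name, `chi_square_is_reasonable`, and treats the threshold of 5 as a configurable assumption rather than a hard rule.

```python
def chi_square_is_reasonable(n_AA: int, n_Aa: int, n_aa: int,
                             min_expected: float = 5.0) -> bool:
    """True if every HWE expected genotype count is at least min_expected.

    Hypothetical helper; the default threshold of 5 is the common rule of thumb.
    """
    n = n_AA + n_Aa + n_aa
    if n <= 0:
        return False
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, n * 2 * p * q, n * q * q)
    return min(expected) >= min_expected


if __name__ == "__main__":
    print(chi_square_is_reasonable(36, 48, 16))  # common alleles
    print(chi_square_is_reasonable(95, 5, 0))    # rare allele a
```

When this check fails, as in the second call where the expected number of aa individuals is far below 5, an exact test is the safer choice.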
How is the 95% CI for p calculated?
We use a Wilson score interval that treats the 2N chromosomes as binomial trials, where the number of A alleles is k = 2×AA + Aa.
Can I change allele labels?
This tool uses A and a for clarity. You can map them to your locus alleles (e.g., reference vs alternate).
Why do I get “even sum required” with allele counts?
In diploid organisms, the total number of alleles is 2×N. Therefore, A + a must be even to correspond to an integer number of individuals.
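A validation of this kind might look like the following sketch. The function name `individuals_from_allele_counts` and the exact error wording are assumptions for illustration, not the calculator's actual code.

```python
def individuals_from_allele_counts(n_A: int, n_a: int) -> int:
    """Return N, the number of diploid individuals implied by allele counts.

    Hypothetical helper; raises ValueError when A + a is odd.
    """
    if n_A < 0 or n_a < 0:
        raise ValueError("allele counts must be non-negative")
    total = n_A + n_a
    if total == 0:
        raise ValueError("at least one allele is required")
    if total % 2 != 0:
        # An odd total cannot come from a whole number of diploid individuals.
        raise ValueError("even sum required: A + a must equal 2N")
    return total // 2


if __name__ == "__main__":
    print(individuals_from_allele_counts(120, 80))
```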
Is deviation from HWE always biologically meaningful?
No. Deviations can arise from technical issues (e.g., genotyping error) or sampling noise, not only from evolutionary forces. Always consider the study context and repeat data quality checks before drawing biological conclusions.