Compare two sample averages with statistical confidence. Choose the right model for samples and variance. Review p values, intervals, assumptions, and interpretation in seconds.
| Example | Mean 1 | Mean 2 | Std. deviation 1 | Std. deviation 2 | n1 | n2 | Model | P value | Decision |
|---|---|---|---|---|---|---|---|---|---|
| Equal variance example | 81.200 | 76.100 | 10.300 | 9.800 | 40 | 38 | Independent two-sample t test with equal variances | 0.028150 | Reject the null hypothesis at the selected significance level. |
| Welch example | 52.400 | 47.100 | 8.200 | 7.500 | 35 | 32 | Welch two-sample t test with unequal variances | 0.007437 | Reject the null hypothesis at the selected significance level. |
| Known deviation example | 101.500 | 98.700 | 4.200 | 3.900 | 50 | 45 | Two-sample z test using known standard deviations | 0.000755 | Reject the null hypothesis at the selected significance level. |
Observed mean difference: d = mean1 − mean2
Tested difference: d* = d − Δ0, where Δ0 is the hypothesized difference.
Pooled variance (equal variance model): Sp² = [(n1 − 1)s1² + (n2 − 1)s2²] / (n1 + n2 − 2)
Standard error (equal variance model): SE = Sp × √(1/n1 + 1/n2)
Test statistic (equal variance model): t = d* / SE
Degrees of freedom (equal variance model): df = n1 + n2 − 2
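The pooled formulas can be sketched in a few lines of Python. This is a minimal illustration rather than the calculator's own code; the function name `pooled_t_statistic` and its parameter names are assumptions made for this example, and the inputs are taken from the "Equal variance example" row of the table.

```python
import math

def pooled_t_statistic(mean1, mean2, s1, s2, n1, n2, delta0=0.0):
    """Return (t, df, se) for the equal variance two-sample t test.

    Hypothetical helper: the name and signature are illustrative only.
    """
    d_star = (mean1 - mean2) - delta0  # tested difference d* = d - delta0
    # Pooled variance Sp^2 weights each sample variance by its degrees of freedom.
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    se = math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2)  # SE = Sp * sqrt(1/n1 + 1/n2)
    return d_star / se, n1 + n2 - 2, se

# Inputs from the "Equal variance example" row.
t_stat, df, se = pooled_t_statistic(81.2, 76.1, 10.3, 9.8, 40, 38)
print(f"t = {t_stat:.3f}, df = {df}, SE = {se:.3f}")
```

The function stops at the test statistic on purpose: turning t into a p value needs the t distribution's cumulative probability, which is a separate step.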
Standard error (Welch model): SE = √(s1²/n1 + s2²/n2)
Test statistic (Welch model): t = d* / SE
Welch degrees of freedom: df = (s1²/n1 + s2²/n2)² / [((s1²/n1)² / (n1 − 1)) + ((s2²/n2)² / (n2 − 1))]
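The Welch formulas translate to code in much the same way. The sketch below assumes a hypothetical helper named `welch_t_statistic` and uses the inputs from the "Welch example" row of the table. Note that the resulting degrees of freedom are generally not a whole number.

```python
import math

def welch_t_statistic(mean1, mean2, s1, s2, n1, n2, delta0=0.0):
    """Return (t, df, se) for the Welch two-sample t test.

    Hypothetical helper: the name and signature are illustrative only.
    """
    v1, v2 = s1**2 / n1, s2**2 / n2          # per-group variance of the mean
    se = math.sqrt(v1 + v2)                  # unpooled standard error
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    t = ((mean1 - mean2) - delta0) / se
    return t, df, se

# Inputs from the "Welch example" row.
welch_t, welch_df, welch_se = welch_t_statistic(52.4, 47.1, 8.2, 7.5, 35, 32)
print(f"t = {welch_t:.3f}, df = {welch_df:.2f}, SE = {welch_se:.3f}")
```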
Standard error (z test): SE = √(σ1²/n1 + σ2²/n2), where σ1 and σ2 are the known population standard deviations.
Test statistic (z test): z = d* / SE
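Because the z test uses the standard normal distribution, it can be sketched with Python's standard library alone. The helper name `two_sample_z` is an assumption for this example; the inputs come from the "Known deviation example" row, and the two-sided p value should land close to the 0.000755 shown in the table.

```python
import math
from statistics import NormalDist

def two_sample_z(mean1, mean2, sigma1, sigma2, n1, n2, delta0=0.0):
    """Return (z, two_sided_p) for a two-sample z test with known deviations.

    Hypothetical helper: the name and signature are illustrative only.
    """
    se = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    z = ((mean1 - mean2) - delta0) / se
    cdf = NormalDist().cdf(z)                # cumulative probability at z
    return z, 2 * min(cdf, 1 - cdf)          # two-sided: twice the smaller tail

# Inputs from the "Known deviation example" row.
z_stat, z_p = two_sample_z(101.5, 98.7, 4.2, 3.9, 50, 45)
print(f"z = {z_stat:.3f}, two-sided p = {z_p:.6f}")
```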
Two-sided p value = 2 × smaller tail area.
Left-tailed p value = cumulative probability at the test statistic.
Right-tailed p value = 1 − cumulative probability at the test statistic.
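The three tail rules can be sketched as one small function. To keep the example free of third-party dependencies, the t distribution's cumulative probability is approximated here by integrating its density with Simpson's rule. That is a stand-in technique chosen for this illustration, not necessarily what the calculator does; in practice a statistics library would normally supply the CDF. The names `t_cdf` and `p_value` are hypothetical.

```python
import math

def t_cdf(x, df, steps=2000):
    """Approximate Student's t CDF by Simpson's rule (steps must be even).

    Works for non-integer df, which the Welch model produces.
    """
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    a = abs(x)
    h = a / steps
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3                     # area between 0 and |x|
    return 0.5 + area if x >= 0 else 0.5 - area

def p_value(stat, df, tail="two-sided"):
    """Apply the left, right, or two-sided tail rule to a t statistic."""
    cdf = t_cdf(stat, df)
    if tail == "left":
        return cdf
    if tail == "right":
        return 1 - cdf
    return 2 * min(cdf, 1 - cdf)             # twice the smaller tail area

# t and df from the equal variance example (t is about 2.238 with df = 76).
p_two = p_value(2.238, 76)
p_left = p_value(2.238, 76, "left")
p_right = p_value(2.238, 76, "right")
print(f"two-sided = {p_two:.6f}, left = {p_left:.6f}, right = {p_right:.6f}")
```

With these inputs the two-sided value should sit close to the 0.028150 reported in the table, and the left and right tails always sum to 1.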
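Confidence interval for the difference: d ± critical value × SE, where the critical value comes from the t distribution for the two t models and from the standard normal distribution for the z test.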
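As a worked illustration of the interval formula, the sketch below builds an interval for the z test, where the critical value is available from the standard library. The 95% confidence level is an assumption made for this example, since the page does not state which level the table rows used, and `z_interval` is a hypothetical helper name. For the t models the same formula applies with a t critical value instead.

```python
import math
from statistics import NormalDist

def z_interval(mean1, mean2, sigma1, sigma2, n1, n2, confidence=0.95):
    """Return (low, high) for the mean difference under the z model.

    Hypothetical helper; the 95% default is an assumed confidence level.
    """
    d = mean1 - mean2
    se = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    alpha = 1 - confidence
    critical = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 at 95%
    return d - critical * se, d + critical * se

# Inputs from the "Known deviation example" row.
low, high = z_interval(101.5, 98.7, 4.2, 3.9, 50, 45)
print(f"95% interval for the difference: ({low:.3f}, {high:.3f})")
```

Here the interval lies entirely above 0, which agrees with the table's decision to reject a hypothesized difference of zero.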
This calculator compares two means and measures whether the observed gap is large enough to question the null hypothesis. It supports the three most common settings for this task. Use the equal variance model when both groups are believed to share a common spread. Use Welch when spreads may differ. Use the z test when the population deviations are known in advance.
The p value depends on the standard error. The standard error depends on sample size and variability. A small difference can become significant when the error is small. A larger difference may remain non-significant when variability is high. Model choice changes both the standard error and the reference distribution. That is why the same means can lead to slightly different p values under different assumptions.
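A short numerical sketch shows how much the standard error can move with the model. The inputs below are made-up values, not taken from the table, chosen so that the smaller group has the larger spread; the function names are hypothetical.

```python
import math

def pooled_se(s1, s2, n1, n2):
    """Standard error under the equal variance model."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    return math.sqrt(sp2 * (1 / n1 + 1 / n2))

def welch_se(s1, s2, n1, n2):
    """Standard error under the Welch model."""
    return math.sqrt(s1**2 / n1 + s2**2 / n2)

# Hypothetical, deliberately unbalanced inputs: small noisy group, large quiet group.
s1, n1 = 15.0, 10
s2, n2 = 5.0, 100

se_pooled = pooled_se(s1, s2, n1, n2)
se_welch = welch_se(s1, s2, n1, n2)
print(f"pooled SE = {se_pooled:.3f}, Welch SE = {se_welch:.3f}")
```

In this setup the pooled standard error is less than half the Welch one, so the same mean difference would give a much larger test statistic under the equal variance model. With balanced groups and similar spreads the two values are nearly identical.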
Start with the observed mean difference. Then review the p value. If it falls below your alpha level, the data provide evidence against the null claim. Next, inspect the confidence interval. If the interval excludes the hypothesized difference, that supports the same conclusion. The effect size adds practical context. A statistically significant result can still have a very small real impact. Use the graph to see where the test statistic falls under the null distribution.
The p value is the probability of seeing a mean difference at least as extreme as the observed one if the null hypothesis were true. Smaller values indicate stronger evidence against that null claim.
Choose Welch when group spreads may differ or sample sizes are unbalanced. It is often the safer default because it adjusts the standard error and degrees of freedom.
The hypothesized difference lets you test whether the true difference equals a value other than zero. For a standard comparison of equal means, enter 0.
The tail options reflect different research questions. Use two-sided for any difference, left-tailed when the alternative is a smaller first mean, and right-tailed when the alternative is a larger first mean.
The confidence interval gives a plausible range for the true mean difference. Narrow intervals suggest more precision. Wide intervals indicate more uncertainty.
A small p value does not imply a large effect. It shows statistical evidence, not effect size. Review Cohen's d, Hedges' g, and the raw mean difference as well.
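For readers who want to see the effect sizes computed, the sketch below uses common textbook definitions: Cohen's d as the raw mean difference divided by the pooled standard deviation, and Hedges' g as d multiplied by the small-sample correction 1 − 3/(4(n1 + n2) − 9). Whether the calculator uses exactly these variants is an assumption, and `effect_sizes` is a hypothetical helper name.

```python
import math

def effect_sizes(mean1, mean2, s1, s2, n1, n2):
    """Return (cohens_d, hedges_g) using the pooled standard deviation.

    Assumed definitions: d = (mean1 - mean2) / Sp and
    g = d * (1 - 3 / (4 * (n1 + n2) - 9)).
    """
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    d = (mean1 - mean2) / sp
    g = d * (1 - 3 / (4 * (n1 + n2) - 9))    # shrinks d slightly for small samples
    return d, g

# Inputs from the "Equal variance example" row.
cohens_d, hedges_g = effect_sizes(81.2, 76.1, 10.3, 9.8, 40, 38)
print(f"Cohen's d = {cohens_d:.3f}, Hedges' g = {hedges_g:.3f}")
```

For that row both values come out near 0.5, a moderate standardized difference, even though the p value alone says nothing about magnitude.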
Small samples can be used, but interpret the results carefully. Small samples can produce unstable variance estimates and wide intervals. Assumptions matter more when data are limited.
Check that observations are independent, that measurement quality is reasonable, and that the distributional assumptions are roughly met. Also confirm whether equal variance is a fair choice before using that model.