where:
- X̄ is the sample mean,
- μ is the population mean,
- σ is the population standard deviation, and
- n is the sample size.
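As a minimal sketch, the symbols above can be combined in a few lines of Python. It assumes the usual one-sample form of the statistic, z = (X̄ − μ) / (σ / √n), and a two-sided test. The function name `one_sample_z` and the height figures are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z(sample_mean, pop_mean, pop_sd, n):
    """Return (z, two-sided p-value) for a one-sample Z-test."""
    standard_error = pop_sd / sqrt(n)            # sigma / sqrt(n)
    z = (sample_mean - pop_mean) / standard_error
    # Two-sided p-value from the standard normal distribution
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical data: 36 people, sample mean 172 cm,
# national mean 170 cm, known population sd 6 cm
z, p = one_sample_z(sample_mean=172, pop_mean=170, pop_sd=6, n=36)
print(f"z = {z:.2f}, p = {p:.4f}")  # z = 2.00, p = 0.0455
```

With these made-up numbers the p-value falls just below 0.05, so the sample mean would be judged significantly different from the national average at the 5% level.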
Types of Z-tests
There are three main types of Z-tests:
- One-Sample Z-test: This test compares the mean of a single sample to a known population mean. It is used when you want to assess whether the sample mean significantly deviates from the population mean, assuming the population variance is known. For example, a one-sample z-test might be used to determine if the average height of a group of more than 30 people differs from the known national average height.
- Two-Sample Z-test: This test compares the means of two independent samples to determine if there is a significant difference between them. It is used when both samples are large and the population variances are known. An example of this would be comparing the average test scores of students from two different schools to see if there is a significant difference in performance between the two schools.
- Proportion Z-test: This test compares a sample proportion to a known population proportion, or compares the proportions of two samples with each other. It is used to evaluate whether the observed proportion differs significantly from what the population proportion would lead you to expect. For instance, a proportion Z-test might be used to compare the proportion of voters favoring a particular candidate in a sample to the proportion observed in previous elections.
There are additional variations of the test, such as the paired Z-test, the Z-test for regression coefficients, and the Z-test for differences in means.
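The two-sample and proportion variants differ from the one-sample test mainly in how the standard error is built. The sketch below assumes the standard textbook formulas (independent samples with known standard deviations, and a one-sample proportion test against a hypothesised value `p0`). All function names and numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p(z):
    """Two-sided p-value for a Z statistic."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

def two_sample_z(mean1, mean2, sd1, sd2, n1, n2):
    """Z statistic for H0: both population means are equal (known sds)."""
    standard_error = sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2) / standard_error

def one_proportion_z(successes, n, p0):
    """Z statistic for H0: the population proportion equals p0."""
    p_hat = successes / n
    standard_error = sqrt(p0 * (1 - p0) / n)  # uses p0, the value under H0
    return (p_hat - p0) / standard_error

# Hypothetical test scores from two schools
z_means = two_sample_z(mean1=78, mean2=75, sd1=10, sd2=12, n1=100, n2=144)
# Hypothetical poll: 220 of 400 voters favor a candidate, past share 50%
z_prop = one_proportion_z(successes=220, n=400, p0=0.5)

print(f"two-sample: z = {z_means:.3f}, p = {two_sided_p(z_means):.4f}")
print(f"proportion: z = {z_prop:.3f}, p = {two_sided_p(z_prop):.4f}")
```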
Assumptions of the Z-test
The Z-test relies on certain assumptions to provide valid results:
- Known Population Variance: The Z-test assumes that the population variance is known. This is a key distinction from the t-test, where the population variance is typically unknown. The known variance allows for using the z-distribution to assess the significance of the test statistic.
- Large Sample Size: The Z-test assumes a large sample size; a common rule of thumb is more than 30 observations. With larger samples, the sampling distribution of the sample mean approaches a normal distribution, even if the original data are not normally distributed, according to the Central Limit Theorem.
- Normal Distribution of the Population: The data are assumed to be drawn from a normally distributed population. This assumption is less critical for large samples but still important when the sample size is moderate.
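The Central Limit Theorem claim in the large-sample assumption can be checked with a small simulation. This is an illustrative sketch, not part of the test itself: it draws repeated samples from a strongly skewed exponential population (mean 1, standard deviation 1, an arbitrary choice) and tracks how the distribution of sample means changes with n. The helper names are hypothetical.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)  # fixed seed so repeated runs give the same numbers

def sample_means(n, repeats=2000):
    """Means of `repeats` samples of size n from an exponential population."""
    return [sum(random.expovariate(1.0) for _ in range(n)) / n
            for _ in range(repeats)]

def skewness(values):
    """Moment-based skewness; close to 0 for a symmetric, normal-like shape."""
    m, s = mean(values), stdev(values)
    return mean([((v - m) / s) ** 3 for v in values])

results = {}
for n in (2, 30, 200):
    means = sample_means(n)
    results[n] = (mean(means), stdev(means), skewness(means))
    center, spread, skew = results[n]
    print(f"n={n:>3}: mean={center:.3f}, sd={spread:.3f} "
          f"(theory {1 / sqrt(n):.3f}), skewness={skew:.2f}")
```

As n grows, the spread of the sample means should track σ/√n and the skewness should shrink toward zero, which is the behavior the Z-test relies on.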
Key Differences Between t-tests and Z-tests
The t-test and Z-test are used to compare sample statistics to population parameters, but they differ in their underlying assumptions, applications, and the conditions under which they are most appropriate. Let us analyze and understand the differences between the two tests:
Sample size considerations
- t-test: The t-test is typically used when the sample size is small, generally less than 30, where the Central Limit Theorem cannot be relied on to make the sampling distribution of the mean approximately normal. It remains valid for larger samples, where its results approach those of the Z-test.
- Z-test: The Z-test is used when the sample size is large, typically greater than 30. In large samples, the sampling distribution of the mean is approximately normal, which justifies using the Z-test.
Population variance knowledge
- t-test: The t-test is used when the population variance is unknown. Instead of the population variance, the sample variance is used to calculate the test statistic. The t-distribution, which has heavier tails than the normal distribution, accounts for the additional uncertainty due to estimating the population variance.
- Z-test: The Z-test requires that the population variance is known. This is a key assumption because it allows the use of the standard normal distribution to assess the test statistic. Because the variance does not have to be estimated from the sample, the statistic carries no extra estimation uncertainty, and the critical values are smaller than the corresponding t critical values.
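To make the contrast concrete, the sketch below computes both statistics from the same sample. The numerators are identical; only the standard deviation in the denominator changes. The data, the hypothesised mean `mu_0`, and the "known" population standard deviation `known_sigma` are all made-up values for illustration.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample of 10 measurements
sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.5, 4.7, 5.9, 5.3, 5.0]
mu_0 = 5.0          # hypothesised population mean (assumption)
known_sigma = 0.5   # population sd, assumed known for the Z-test

n = len(sample)
x_bar = mean(sample)

# Z statistic: uses the known population standard deviation
z_stat = (x_bar - mu_0) / (known_sigma / sqrt(n))

# t statistic: replaces sigma with the sample sd (n - 1 in the denominator)
s = stdev(sample)
t_stat = (x_bar - mu_0) / (s / sqrt(n))

print(f"x_bar = {x_bar:.2f}, s = {s:.3f}")
print(f"z = {z_stat:.3f}  (compare with the standard normal)")
print(f"t = {t_stat:.3f}  (compare with a t-distribution, {n - 1} df)")
```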
Distribution assumptions
- t-test: The t-test assumes that the data within each group are approximately normally distributed. This is particularly important when dealing with small sample sizes. The test statistic in a t-test follows a t-distribution, which has wider tails than the normal distribution. This accounts for the additional variability and uncertainty when estimating the population standard deviation from a small sample.
- Z-test: The Z-test assumes that the data are normally distributed or that the sample size is large enough for the Central Limit Theorem to apply. The Central Limit Theorem ensures that, for large samples, the sampling distribution of the mean is approximately normal, even if the underlying data are not perfectly normal.
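The heavier tails of the t-distribution show up directly in its critical values. The sketch below compares the two-sided 5% critical value of the standard normal distribution with t critical values for a few degrees of freedom. Python's standard library has no t-distribution, so the t values are hard-coded from standard t-tables (rounded to three decimals) rather than computed.

```python
from statistics import NormalDist

# Two-sided 5% critical value of the standard normal distribution
z_crit = NormalDist().inv_cdf(0.975)

# Two-sided 5% t critical values by degrees of freedom,
# copied from standard t-tables (not computed here)
t_crit_table = {5: 2.571, 10: 2.228, 30: 2.042, 100: 1.984}

for df, t_crit in t_crit_table.items():
    print(f"df={df:>3}: t critical = {t_crit:.3f}, "
          f"z critical = {z_crit:.3f}, ratio = {t_crit / z_crit:.3f}")
```

The t critical value is always larger than the z critical value, and the gap narrows as the degrees of freedom increase, which is why the two tests give similar answers for large samples.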
Practical applications and use cases
- t-test: The t-test is commonly used in small-sample studies, such as pilot studies, where the population variance is unknown. Examples include comparing the effectiveness of two treatments in a small group or assessing changes within the same group over time.
- Z-test: The Z-test is used in large-sample studies or when dealing with well-established populations where the variance is known. It is often applied in quality control, survey analysis, and large-scale experimental studies.
Here is a table with the key differences:
Key differences between t-test and Z-test. Image by Author.
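The selection rule described in this section can be written as a short helper. This is only a sketch of the rule of thumb used in this tutorial (known variance and more than 30 observations point to a Z-test, anything else to a t-test). The function name `choose_test` and the `threshold` parameter are hypothetical, and borderline cases still call for judgment about the data.

```python
def choose_test(n, variance_known, threshold=30):
    """Rule-of-thumb choice between a Z-test and a t-test.

    n              -- sample size
    variance_known -- True if the population variance is known
    threshold      -- sample size above which a sample counts as "large"
    """
    if variance_known and n > threshold:
        return "Z-test"
    # Unknown variance or a small sample: fall back to the t-test
    return "t-test"

print(choose_test(n=500, variance_known=True))   # Z-test
print(choose_test(n=12, variance_known=False))   # t-test
print(choose_test(n=500, variance_known=False))  # t-test
```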
Conclusion
This tutorial introduced you to hypothesis testing and two commonly used tests: t-tests and Z-tests. We covered each test’s definition, types, and assumptions, and then compared the two on sample size, knowledge of the population variance, distributional assumptions, and typical use cases. Knowing which test suits which scenario lets you apply hypothesis testing with more confidence.
After solidifying the statistical concepts behind hypothesis testing with our Introduction to Statistics course, I would encourage you to put them into practice in the tool of your choice using the following resources:
- Hypothesis Testing in Python course
- Hypothesis Testing in R course
- Hypothesis Testing (chi-square test) in Excel tutorial
Happy learning!