The t-Test Paired Two Sample for Means tool performs a paired two-sample Student's t-Test to determine whether the null hypothesis (that the means of two populations are equal) can be rejected. This test does not assume that the variances of both populations are equal. Paired t-tests are typically used to compare the means of a population before and after some treatment, e.g. two samples of math scores from students before and after a lesson.

The result of this tool is a calculated t-value.  This value can be negative or positive, depending on the data.  Assuming that the population means are equal:

  • If t < 0, P(T <= t) one-tail is the probability that a value of the t-Statistic would be observed that is more negative than t.
  • If t > 0, P(T <= t) one-tail is the probability that a value of the t-Statistic would be observed that is more positive than t.
  • P(T <= t) two-tail is the probability that a value of the t-Statistic would be observed that is larger in absolute value than t.

The example datasets below were taken from a population of 10 students.  The students were given the same test at the beginning and end of the school year.  Use the Paired t-Test to determine if the average score of the 2nd test has improved over the average score of the 1st test.

To run the t-test:

  1. On the XLMiner Analysis ToolPak pane, click t-Test Paired Two Sample for Means.
  2. Enter A2:A11 for Variable 1 Range.  This is our first set of values, the values recorded at the beginning of the school year. 
  3. Enter B2:B11 for Variable 2 Range.  This is our second set of values, the values recorded at the end of the school year.    
  4. Enter "0" for Hypothesized Mean Difference.  This means that we are testing whether the means of the two samples are equal.
  5. Uncheck Labels since we did not include the column headings in our Variable 1 and 2 Ranges.  
  6. Keep the Alpha  = 0.05.
  7. Enter D1 for the Output Range.
  8. Click OK.


The results are below.


  • Cells E4 and F4 contain the mean of each sample, Variable 1 = Beginning and Variable 2 = End. 
  • Cells E5 and F5 contain the variance of each sample. 
  • Cells E6 and F6 contain the number of observations in each sample. 
  • Cell E7 contains the Pearson Correlation which indicates that the two variables are rather closely correlated.
  • Cell E8 contains our entry for the Hypothesized Mean Difference.
  • Cell E9 contains the degrees of freedom, 10 – 1 = 9.
  • Cell E10 contains the result of the actual t-test.  We will compare this value to the t-Critical two-tail statistic.  Note:  Use a one-tail test if you have a direction in your hypothesis, e.g. if testing that a value is above or below some level.
  • In this example, P(T <= t) two-tail (0.0000321) is the probability, assuming the population means are equal, of observing a t-Statistic larger in absolute value than the one obtained (7.633).  The absolute value of the t-Statistic also exceeds the Critical t value (2.26).  Since the p-value is less than our alpha, 0.05, we reject the null hypothesis that the means of the two samples are equal.
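The reported two-tail probability can be checked independently of the spreadsheet. The sketch below is only an illustration: it takes the t-Statistic (7.633), the degrees of freedom (9) and alpha (0.05) from the example, and approximates the tail area of the t-distribution with Simpson's rule. The function names `t_pdf` and `two_tail_p` are made up for this sketch and are not part of the tool.

```python
import math

def t_pdf(x: float, df: int) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tail_p(t: float, df: int, steps: int = 2000) -> float:
    """Approximate P(|T| > |t|) by integrating the density over [0, |t|]."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3       # P(0 < T < |t|)
    return 2 * (0.5 - central_half)    # both tails, by symmetry

# Values reported in the example output above.
t_stat, df, alpha = 7.633, 9, 0.05
p = two_tail_p(t_stat, df)
reject_null = p < alpha
print(f"two-tail p = {p:.7f}, reject null hypothesis: {reject_null}")
```

Numerical integration is used here only to keep the sketch free of third-party libraries; a statistics package would normally supply the t-distribution directly.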

The paired sample t-test, sometimes called the dependent sample t-test, is a statistical procedure used to determine whether the mean difference between two sets of observations is zero. In a paired sample t-test, each subject or entity is measured twice, resulting in pairs of observations. Common applications of the paired sample t-test include case-control studies or repeated-measures designs. Suppose you are interested in evaluating the effectiveness of a company training program. One approach you might consider would be to measure the performance of a sample of employees before and after completing the program, and analyze the differences using a paired sample t-test.


Like many statistical procedures, the paired sample t-test has two competing hypotheses, the null hypothesis and the alternative hypothesis. The null hypothesis assumes that the true mean difference between the paired samples is zero. Under this model, all observable differences are explained by random variation. Conversely, the alternative hypothesis assumes that the true mean difference between the paired samples is not equal to zero. The alternative hypothesis can take one of several forms depending on the expected outcome. If the direction of the difference does not matter, a two-tailed hypothesis is used. Otherwise, an upper-tailed or lower-tailed hypothesis can be used to increase the power of the test. The null hypothesis remains the same for each type of alternative hypothesis. The paired sample t-test hypotheses are formally defined below:

  • The null hypothesis (\(H_0\)) assumes that the true mean difference (\(\mu_d\)) is equal to zero.
  • The two-tailed alternative hypothesis (\(H_1\)) assumes that \(\mu_d\) is not equal to zero.
  • The upper-tailed alternative hypothesis (\(H_1\)) assumes that \(\mu_d\) is greater than zero.
  • The lower-tailed alternative hypothesis (\(H_1\)) assumes that \(\mu_d\) is less than zero.

The mathematical representations of the null and alternative hypotheses are defined below:

  • \(H_0:\ \mu_d\ =\ 0\)
  • \(H_1:\ \mu_d\ \ne\ 0\)    (two-tailed)
  • \(H_1:\ \mu_d\ >\ 0\)    (upper-tailed)
  • \(H_1:\ \mu_d\ <\ 0\)    (lower-tailed)

Note. It is important to remember that hypotheses are never about data; they are about the processes which produce the data. In the formulas above, the value of \(\mu_d\) is unknown. The goal of hypothesis testing is to determine the hypothesis (null or alternative) with which the data are more consistent.


As a parametric procedure (a procedure which estimates unknown parameters), the paired sample t-test makes several assumptions. Although t-tests are quite robust, it is good practice to evaluate the degree of deviation from these assumptions in order to assess the quality of the results. In a paired sample t-test, the observations are defined as the differences between two sets of values, and each assumption refers to these differences, not the original data values. The paired sample t-test has four main assumptions:

  • The dependent variable must be continuous (interval/ratio).
  • The observations are independent of one another.
  • The dependent variable should be approximately normally distributed.
  • The dependent variable should not contain any outliers.

Level of Measurement

The paired sample t-test requires the sample data to be numeric and continuous, as it is based on the normal distribution. Continuous data can take on any value within a range (income, height, weight, etc.). The opposite of continuous data is discrete data, which can only take on a few values (Low, Medium, High, etc.). Occasionally, discrete data can be used to approximate a continuous scale, such as with Likert-type scales.


Independence

Independence of observations is usually not testable, but can be reasonably assumed if the data collection process was random without replacement. In our example, it is reasonable to assume that the participating employees are independent of one another.


Normality

To test the assumption of normality, a variety of methods are available, but the simplest is to inspect the data visually using a tool like a histogram (Figure 1). Real-world data are almost never perfectly normal, so this assumption can be considered reasonably met if the shape looks approximately symmetric and bell-shaped. The data in the example figure below is approximately normally distributed.

Figure 1. Histogram of an approximately normally distributed variable.
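When no plotting tool is at hand, a rough text histogram is often enough for the visual check described above. The following is a minimal sketch using only the Python standard library; the paired differences are invented for illustration and the helper name `text_histogram` is hypothetical.

```python
from collections import Counter

def text_histogram(values, bin_width):
    """Print one row of '#' marks per non-empty bin and return the bin counts.

    Each bin is identified by its lower edge; empty bins are skipped.
    """
    counts = Counter((v // bin_width) * bin_width for v in values)
    for lower in sorted(counts):
        print(f"[{lower:>3}, {lower + bin_width:>3}) {'#' * counts[lower]}")
    return dict(counts)

# Hypothetical paired differences (second measurement minus first).
differences = [6, 5, 5, 8, 2, 6, 6, 4, 9, 4]
bin_counts = text_histogram(differences, bin_width=2)
```

With so few observations the shape is only suggestive; the point is to look for obvious skew or multiple peaks in the differences, not in the original scores.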


Outliers

Outliers are rare values that appear far away from the majority of the data. Outliers can bias the results and potentially lead to incorrect conclusions if not handled properly. One method for dealing with outliers is to simply remove them. However, removing data points can introduce other types of bias into the results, and potentially result in losing critical information. If outliers seem to have a lot of influence on the results, a nonparametric test such as the Wilcoxon Signed Rank Test may be appropriate to use instead. Outliers can be identified visually using a boxplot (Figure 2).

Figure 2. Boxplots of a variable without outliers (left) and with an outlier (right).
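Boxplots commonly flag points that lie more than 1.5 times the interquartile range beyond the quartiles. That rule is not stated in the text above, so treat it as an assumption of this sketch, along with the invented data and the hypothetical helper name `iqr_outliers`. Quartile conventions also differ between tools, so the fences may not match a given plotting package exactly.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low_fence or v > high_fence]

# Hypothetical paired differences, without and with one extreme value.
clean = [6, 5, 5, 8, 2, 6, 6, 4, 9, 4]
contaminated = [6, 5, 5, 8, 2, 6, 6, 4, 9, 30]

print(iqr_outliers(clean))         # no points flagged
print(iqr_outliers(contaminated))  # the extreme value is flagged
```

A flagged point is a prompt to investigate, not an automatic reason to delete it, for the reasons given above.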


The procedure for a paired sample t-test can be summed up in four steps. The symbols to be used are defined below:

  • \(D\ =\ \)Differences between two paired samples
  • \(d_i\ =\ \)The \(i^{th}\) observation in \(D\)
  • \(n\ =\ \)The sample size
  • \(\overline{d}\ =\ \)The sample mean of the differences
  • \(\hat{\sigma}\ =\ \)The sample standard deviation of the differences
  • \(T\ =\ \)A random variable following a t-distribution with (\(n\ -\ 1\)) degrees of freedom
  • \(t\ =\ \)The t-statistic (t-test statistic) for a paired sample t-test
  • \(p\ =\ \)The \(p\)-value (probability value) for the t-statistic.

The four steps are listed below:

  1. Calculate the sample mean.
     \(\overline{d}\ =\ \cfrac{d_1\ +\ d_2\ +\ \cdots\ +\ d_n}{n}\)
  2. Calculate the sample standard deviation.
     \(\hat{\sigma}\ =\ \sqrt{\cfrac{(d_1\ -\ \overline{d})^2\ +\ (d_2\ -\ \overline{d})^2\ +\ \cdots\ +\ (d_n\ -\ \overline{d})^2}{n\ -\ 1}}\)
  3. Calculate the test statistic.
     \(t\ =\ \cfrac{\overline{d}\ -\ 0}{\hat{\sigma}/\sqrt{n}}\)
  4. Calculate the probability of observing the test statistic under the null hypothesis. This value is obtained by comparing t to a t-distribution with (\(n\ -\ 1\)) degrees of freedom. This can be done by looking up the value in a table, such as those found in many statistical textbooks, or with statistical software for more accurate results.
     \(p\ =\ 2\ \cdot\ Pr(T\ >\ |t|)\)    (two-tailed)
     \(p\ =\ Pr(T\ >\ t)\)    (upper-tailed)
     \(p\ =\ Pr(T\ <\ t)\)    (lower-tailed)

The p-value is then used to determine whether the results provide sufficient evidence to reject the null hypothesis in favor of the alternative hypothesis.
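The steps can be sketched in a few lines of Python. The scores below are hypothetical and exist only to make the arithmetic concrete, and the differences are taken as second measurement minus first, which is a choice made for this sketch. For the final step the sketch takes the table-lookup route and compares \(|t|\) with 2.26, the two-tailed critical value for 9 degrees of freedom at alpha = 0.05 quoted in the spreadsheet example earlier.

```python
import math
import statistics

# Hypothetical scores for 10 subjects, invented for illustration only.
before = [72, 65, 80, 58, 90, 77, 69, 84, 61, 75]
after = [78, 70, 85, 66, 92, 83, 75, 88, 70, 79]

# D: the paired differences (here: after minus before).
d = [a - b for a, b in zip(after, before)]
n = len(d)

# Step 1: sample mean of the differences.
d_bar = statistics.mean(d)

# Step 2: sample standard deviation of the differences (n - 1 denominator).
sigma_hat = statistics.stdev(d)

# Step 3: test statistic under the null hypothesis mu_d = 0.
t_stat = (d_bar - 0) / (sigma_hat / math.sqrt(n))

# Step 4 (table lookup): two-tailed critical value for n - 1 = 9 degrees
# of freedom at alpha = 0.05, as quoted in the spreadsheet example.
T_CRITICAL = 2.26
reject_null = abs(t_stat) > T_CRITICAL

print(f"mean difference = {d_bar:.2f}, sd = {sigma_hat:.3f}, t = {t_stat:.3f}")
print(f"reject null hypothesis: {reject_null}")
```

Comparing against a critical value gives the same decision as comparing the p-value with alpha, but it does not report how small the p-value is; statistical software is needed for that.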


There are two types of significance to consider when interpreting the results of a paired sample t-test: statistical significance and practical significance.

Statistical Significance

Statistical significance is determined by looking at the p-value. The p-value gives the probability of observing a result at least as extreme as the one obtained, assuming the null hypothesis is true. Thus, a low p-value indicates decreased support for the null hypothesis. However, the possibility that the null hypothesis is true and that we simply obtained a very rare result can never be ruled out completely. The cutoff value for determining statistical significance is ultimately decided on by the researcher, but usually a value of .05 or less is chosen. This corresponds to a 5% (or less) chance of obtaining a result at least as extreme as the one that was observed if the null hypothesis were true.

Practical Significance

Practical significance depends on the subject matter. It is not uncommon, especially with large sample sizes, to observe a result that is statistically significant but not practically significant. In most cases, both types of significance are required in order to draw meaningful conclusions.

