How to Conduct a Proportion Test in R

Intro

During analysis, we often need to test a sample proportion against a theoretical or known proportion to see if there has been a change. For example, let’s say we conduct a survey at the end of a course every semester to see if students enjoyed the class. We may know that over the past 5 years, 73% of students said they enjoyed the class. We can then use this information to conduct a proportion test on the next semester and see if the level of enjoyment has changed. In this article, we will learn how to conduct a proportion test in R.

Loading the Data

For our tutorial, let’s create some fake data. Below we simulate the answers of 40 students, where “yes” means the student enjoyed the class and “no” means they disliked it.

set.seed(1)

choices = c("yes", "no") # Named choices to avoid masking base R's options()
samp = sample(choices, size = 40, replace = TRUE)

str(samp)

##  chr [1:40] "yes" "no" "yes" "yes" "no" "yes" "yes" "yes" "no" "no" "yes" ...
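
Before running any test, it can help to check how many “yes” answers the sample contains. The snippet below is a minimal sketch that recreates the sample so it runs on its own; the names n.yes and prop.yes are illustrative and not part of the original code.

```r
# Recreate the sample from above so this snippet runs on its own
set.seed(1)
samp = sample(c("yes", "no"), size = 40, replace = TRUE)

n.yes = sum(samp == "yes")      # number of "yes" answers
prop.yes = mean(samp == "yes")  # share of "yes" answers

n.yes
prop.yes
```

With the seed above and a recent version of R, these should come out to 24 and 0.6, the same values reported in the test output later in this article. Older versions of R use a different sampling algorithm, so the numbers may differ there.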

Conducting the Z-Test

We start with a two-sided z-test. This test tells us whether our sample proportion differs significantly from our theoretical proportion. The null hypothesis is that they are equal.

We need a few pieces of information: the number of “yes” answers in our sample, the size of our sample (n), and the theoretical proportion (p). We pass all of these to the prop.test function, and we can see the result below. Setting correct = FALSE turns off the continuity correction.

p = .73 # We will assume we have this from previous years
n = length(samp)

yes.answers = samp[samp == 'yes'] # Get all the yes's

prop.test(
  length(yes.answers),
  n = n,
  p = p,
  alternative = "two.sided",
  correct = FALSE)

## 
##  1-sample proportions test without continuity correction
## 
## data:  length(yes.answers) out of n, null probability p
## X-squared = 3.4297, df = 1, p-value = 0.06403
## alternative hypothesis: true p is not equal to 0.73
## 95 percent confidence interval:
##  0.4459589 0.7365167
## sample estimates:
##   p 
## 0.6

From the test above, we get a p-value of 0.06403, so we fail to reject the null hypothesis at the .05 or .01 levels. Thus, we do not have evidence that the proportion of students who enjoyed the class is significantly different from 73%.
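
The output reports an X-squared statistic rather than a z statistic. For a single proportion without continuity correction, the two are equivalent: X-squared is the square of z. The sketch below computes the z-test by hand from the counts in the output (24 “yes” answers out of 40); the names p0, p.hat, se, and z are illustrative.

```r
# Counts taken from the output above: 24 "yes" answers out of 40 students
x  = 24
n  = 40
p0 = 0.73   # theoretical proportion under the null hypothesis

p.hat = x / n                    # sample proportion
se    = sqrt(p0 * (1 - p0) / n)  # standard error under the null hypothesis
z     = (p.hat - p0) / se        # z statistic

z^2                  # should match X-squared from prop.test
2 * pnorm(-abs(z))   # two-sided p-value from the normal distribution
```

Squaring z gives about 3.4297, and the two-sided p-value is about 0.064, in line with what prop.test reports.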

Let’s briefly look at two more examples. Above we tested whether the sample proportion was not equal to our theoretical proportion. Below, we test whether the sample proportion is less than or greater than the theoretical proportion.

prop.test(
  length(yes.answers),
  n = n,
  p = p,
  alternative = "less",
  correct = FALSE)

## 
##  1-sample proportions test without continuity correction
## 
## data:  length(yes.answers) out of n, null probability p
## X-squared = 3.4297, df = 1, p-value = 0.03202
## alternative hypothesis: true p is less than 0.73
## 95 percent confidence interval:
##  0.0000000 0.7171352
## sample estimates:
##   p 
## 0.6

prop.test(
  length(yes.answers),
  n = n,
  p = p,
  alternative = "greater",
  correct = FALSE)

## 
##  1-sample proportions test without continuity correction
## 
## data:  length(yes.answers) out of n, null probability p
## X-squared = 3.4297, df = 1, p-value = 0.968
## alternative hypothesis: true p is greater than 0.73
## 95 percent confidence interval:
##  0.4701942 1.0000000
## sample estimates:
##   p 
## 0.6
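
To compare the three alternatives side by side, we can pull the p-value out of each result instead of reading the printed output. This is a small sketch that uses the counts from the output above; the names alternatives and p.values are illustrative.

```r
x = 24    # "yes" answers, from the output above
n = 40    # sample size
p = 0.73  # theoretical proportion

alternatives = c("two.sided", "less", "greater")

# Run the same test once per alternative and keep only the p-value
p.values = sapply(alternatives, function(alt) {
  prop.test(x, n = n, p = p, alternative = alt, correct = FALSE)$p.value
})

round(p.values, 5)
```

The “less” p-value is half of the two-sided one, and the “less” and “greater” p-values add up to 1. Only the “less” test falls below the .05 level here, so the choice of alternative should be made before looking at the data.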

Conducting a Binomial Test

Another proportion test we can conduct is an exact binomial test. R has a function called binom.test for this. Let’s look at some examples using the same problem as above. We will test the same hypotheses: not equal, less than, and greater than.

binom.test(length(yes.answers), n = n, p = p)

## 
##  Exact binomial test
## 
## data:  length(yes.answers) and n
## number of successes = 24, number of trials = 40, p-value = 0.07442
## alternative hypothesis: true probability of success is not equal to 0.73
## 95 percent confidence interval:
##  0.4332671 0.7513500
## sample estimates:
## probability of success 
##                    0.6

binom.test(length(yes.answers), n = n, p = p, alternative = "less")

## 
##  Exact binomial test
## 
## data:  length(yes.answers) and n
## number of successes = 24, number of trials = 40, p-value = 0.05092
## alternative hypothesis: true probability of success is less than 0.73
## 95 percent confidence interval:
##  0.0000000 0.7305962
## sample estimates:
## probability of success 
##                    0.6

binom.test(length(yes.answers), n = n, p = p, alternative = "greater")

## 
##  Exact binomial test
## 
## data:  length(yes.answers) and n
## number of successes = 24, number of trials = 40, p-value = 0.9754
## alternative hypothesis: true probability of success is greater than 0.73
## 95 percent confidence interval:
##  0.4577833 1.0000000
## sample estimates:
## probability of success 
##                    0.6
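
The one-sided p-values above are tail probabilities of the binomial distribution, so we can reproduce them with pbinom. The sketch below uses the counts from the output (24 successes in 40 trials); the names p.less and p.greater are illustrative.

```r
x = 24    # number of successes, from the output above
n = 40    # number of trials
p = 0.73  # theoretical proportion

# "less": probability of seeing 24 or fewer successes if p is 0.73
p.less = pbinom(x, size = n, prob = p)

# "greater": probability of seeing 24 or more successes if p is 0.73
p.greater = pbinom(x - 1, size = n, prob = p, lower.tail = FALSE)

p.less
p.greater
```

These should agree with the p-values of 0.05092 and 0.9754 reported by binom.test. Because the exact test does not rely on a normal approximation, its p-values differ slightly from those of prop.test.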

05.21.2021
