Real-world data often has missing values. R represents these as NA when it detects them (depending on context, missing values may also show up as empty strings or 0s). Removing them is not always advisable (see Imputing), but there are times when you will want to remove these values in R. In this article, we will learn how to remove NAs from a data frame.
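Before removing anything, it can help to check where the missing values are. The sketch below is only an illustration: the `raw` data frame and its columns are made up for this example, and treating empty strings as missing is an assumption that depends on your data.

```r
# Hypothetical data where missing values appear both as NA and as empty strings
raw = data.frame(
  id = c(1, 2, 3),
  name = c("a", "", NA),
  stringsAsFactors = FALSE
)

# Count the NAs that R already recognises, per column
colSums(is.na(raw))

# Recode empty strings as NA so they are treated as missing too
# (an assumption: only do this if "" really means "missing" in your data)
raw$name[!is.na(raw$name) & raw$name == ""] = NA

# Count again: the empty string is now included
colSums(is.na(raw))
```

Once everything that counts as missing is an NA, the functions that work on NAs will see all of it.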
We can use the na.omit function in R, which removes every row that contains an NA and returns a new data frame.
```r
df = data.frame(
  x = c(1, NA, 3, 4),
  y = c(1, 2, NA, 4)
)

df
#    x  y
# 1  1  1
# 2 NA  2
# 3  3 NA
# 4  4  4

new.df = na.omit(df)

new.df
#   x y
# 1 1 1
# 4 4 4
```
You can see that we now only have two rows left: half of the data was dropped, even though each removed row was missing only one value. This is why you shouldn't always drop rows with NAs. Imputing is a good alternative.
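As a rough sketch of what imputing could look like, the example below fills each NA with the mean of its column instead of dropping the row. Mean imputation is just one simple strategy picked here for illustration, not a recommendation, and `imputed.df` is a name introduced for this example.

```r
# Same example data frame as before
df = data.frame(
  x = c(1, NA, 3, 4),
  y = c(1, 2, NA, 4)
)

# Replace each NA with the mean of its column
# (mean imputation, assumed here purely for illustration)
imputed.df = df
for (col in names(imputed.df)) {
  missing = is.na(imputed.df[[col]])
  imputed.df[[col]][missing] = mean(imputed.df[[col]], na.rm = TRUE)
}

# All four rows are kept and no NAs remain
nrow(imputed.df)
any(is.na(imputed.df))
```

The trade-off is that the filled-in values are estimates, so whether this beats dropping rows depends on how much data is missing and what you plan to do with it.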