Often you would like to compare variables that have different ranges. For example, you may want to compare houses based on square footage and cost. Square footage may be in the thousands, while cost may be in the hundreds of thousands. You can compare these more easily by normalizing the data. In this article, we will learn how to normalize data, that is, create z-scores, in R.
Following from our example, we can use the scale function to normalize our data. This creates relative scores based on the mean and standard deviation: each value has the mean subtracted from it and is then divided by the standard deviation. Let's see an example.
house.sq.ft = c(1000, 800, 1500, 13000)
house.cost = c(100000, 150000, 200000, 180000)

scale(house.sq.ft)
#            [,1]
# [1,] -0.5161753
# [2,] -0.5497477
# [3,] -0.4322444
# [4,]  1.4981673

scale(house.cost)
#            [,1]
# [1,] -1.3220429
# [2,] -0.1724404
# [3,]  0.9771621
# [4,]  0.5173211
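Here we can see that, after we use scale, each value is scored by how far it is from its respective mean, measured in standard deviations. Now we can compare the two features. For instance, the fourth house is about 1.5 standard deviations above the mean in square footage but only about 0.5 standard deviations above the mean in cost.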
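To make the calculation behind scale explicit, here is a minimal sketch that computes the z-scores by hand and checks them against scale. The helper name z.score and the data frame houses.z are hypothetical names introduced only for this illustration; mean, sd, scale, as.vector, and all.equal are base R functions.

```r
# Same illustrative data as in the example above
house.sq.ft <- c(1000, 800, 1500, 13000)
house.cost  <- c(100000, 150000, 200000, 180000)

# Hypothetical helper: z-score = (value - mean) / sample standard deviation
z.score <- function(x) (x - mean(x)) / sd(x)

# scale() returns a one-column matrix with extra attributes;
# as.vector() drops them so the two results can be compared directly
sq.ft.matches <- all.equal(z.score(house.sq.ft), as.vector(scale(house.sq.ft)))
cost.matches  <- all.equal(z.score(house.cost), as.vector(scale(house.cost)))
print(c(sq.ft.matches, cost.matches))

# Put both normalized features side by side, one row per house
houses.z <- data.frame(
  sq.ft.z = z.score(house.sq.ft),
  cost.z  = z.score(house.cost)
)
print(round(houses.z, 2))
```

Both comparisons should come back TRUE, since scale with its default arguments centers on the mean and divides by the sample standard deviation. Keeping the two columns in one data frame is simply a convenient way to read the scores for each house across a row.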