Mean Imputation in R (Example)
This tutorial explains how to perform a mean imputation in the R programming language.
Example Data
vec <- c(4, NA, 7, 5, 7, 1, 6, 3, NA, 5) # Create example vector
Our example data is a simple numeric vector containing two NA values. Of course, the same approach can be applied to a column of a data frame.
Imputing Missing Values by Mean
In order to impute the NA values in our data by the mean, we can use the is.na function and the mean function as follows:
vec[is.na(vec)] <- mean(vec[!is.na(vec)]) # Mean imputation
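As a possible alternative, the same replacement can be written with the na.rm argument of the mean function, which drops NA values before averaging. The sketch below works on a fresh copy of the example vector; the name vec2 is introduced here purely for illustration.

```r
vec2 <- c(4, NA, 7, 5, 7, 1, 6, 3, NA, 5)      # Copy of the example vector (illustrative name)
vec2[is.na(vec2)] <- mean(vec2, na.rm = TRUE)  # Mean of the observed values only
vec2                                           # Print imputed copy
```

Both versions should give the same result; na.rm = TRUE simply avoids subsetting the vector by hand.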
Our updated vector without missing data looks as follows:
vec # Print updated vector
# 4.00 4.75 7.00 5.00 7.00 1.00 6.00 3.00 4.75 5.00
The mean of the observed (i.e. non-missing) values of our vector is 4.75, so both NA values were replaced by 4.75.
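The same idea carries over to a data frame column. The following is a minimal sketch under stated assumptions: the data frame data and its column names x1 and x2 are hypothetical and only serve to illustrate the indexing.

```r
# Hypothetical data frame; the names data, x1 and x2 are illustrative assumptions
data <- data.frame(x1 = c(4, NA, 7, 5, 7, 1, 6, 3, NA, 5),
                   x2 = letters[1:10])

data$x1[is.na(data$x1)] <- mean(data$x1, na.rm = TRUE)  # Impute NA values in column x1

data$x1        # Print imputed column
mean(data$x1)  # The column mean is unchanged by mean imputation
```

Only the column x1 is modified; all other columns of the data frame are left as they are.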