Mode Imputation in R (Example)

This tutorial explains how to impute missing values by the mode in the R programming language.

Create Function for Computation of Mode in R

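Base R does not provide a built-in function for the statistical mode, i.e. the most frequent value. The existing mode() function returns the storage mode of an object instead. For that reason, we need to create our own function: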

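my_mode <- function(x) {                                     # Create mode function
  unique_x <- unique(x)                                      # Distinct values of x
  counts <- tabulate(match(x, unique_x))                     # Frequency of each distinct value
  unique_x[which.max(counts)]                                # Most frequent value (first one if tied)
}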
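
As a quick check, the function can be called on a few small vectors. The inputs below are made up for illustration. The sketch also shows two properties that follow from the implementation: in case of a tie, which.max() picks the first maximum, so the value that appears first in x is returned; and NA is counted like any other value, so it should be removed beforehand.

```r
my_mode <- function(x) {                 # Same mode function as above
  unique_x <- unique(x)
  unique_x[which.max(tabulate(match(x, unique_x)))]
}

my_mode(c(2, 3, 3, 1))                   # 3 occurs most often
# [1] 3

my_mode(c("b", "a", "a", "b"))           # Tie: "b" appears first and is returned
# [1] "b"

my_mode(c(1, NA, NA))                    # NA is counted as a value of its own
# [1] NA
```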

Example Data

Our example data is a factor vector that contains two missing values:

vec <- factor(c(4, NA, 7, 5, 7, 1, 6, 3, NA, 5, 5))          # Create example vector

Mode Imputation in R

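Now we can replace the missing values by the mode of the observed values. Note that we pass only the non-missing values to my_mode, because the function would otherwise count NA as a value of its own: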

vec[is.na(vec)] <- my_mode(vec[!is.na(vec)])                 # Mode imputation
vec                                                          # Print imputed vector
# [1] 4 5 7 5 7 1 6 3 5 5 5
# Levels: 1 3 4 5 6 7
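
If the substitution is needed more than once, it may be convenient to wrap it in a small helper and to compare the number of missing values before and after. The following is only a sketch: impute_mode is a hypothetical name chosen for this example and is not part of base R.

```r
my_mode <- function(x) {                 # Same mode function as above
  unique_x <- unique(x)
  unique_x[which.max(tabulate(match(x, unique_x)))]
}

impute_mode <- function(x) {             # Hypothetical helper, not part of base R
  x[is.na(x)] <- my_mode(x[!is.na(x)])   # Replace NA by mode of observed values
  x
}

vec <- factor(c(4, NA, 7, 5, 7, 1, 6, 3, NA, 5, 5))
vec_imp <- impute_mode(vec)

sum(is.na(vec))                          # Missing values before imputation
# [1] 2
sum(is.na(vec_imp))                      # Missing values after imputation
# [1] 0
table(vec_imp)                           # Count of level 5 grows from 3 to 5
```

Keep in mind that mode imputation inflates the frequency of the most common category, as the table of the imputed vector shows.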

Note that we imputed a simple categorical vector in this example. However, we could apply the same R code to the column of a more complex data frame as well.
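
A minimal sketch of the data frame case could look as follows. The data frame and its column names (group, value) are made up for illustration; only the factor column group contains missing values.

```r
my_mode <- function(x) {                 # Same mode function as above
  unique_x <- unique(x)
  unique_x[which.max(tabulate(match(x, unique_x)))]
}

data <- data.frame(                      # Hypothetical example data frame
  group = factor(c("a", NA, "b", "b", "c", NA)),
  value = c(10, 20, 30, 40, 50, 60)
)

data$group[is.na(data$group)] <- my_mode(data$group[!is.na(data$group)])  # Impute column
data$group                               # Print imputed column
# [1] a b b b c b
# Levels: a b c
```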
