Avoid NA Values when Summarizing data.table in R (Example Code)
In this article you’ll learn how to exclude NA values when summarizing a data.table by group in the R programming language.
Preparing the Example
install.packages("data.table")    # Install data.table package
library("data.table")             # Load data.table
my_dt <- data.table(A = c(NA, 1:10, NA),    # Example data
                    B = 101:112,
                    GR = rep(LETTERS[1:4], each = 3))
my_dt                                       # Display example data.table
#      A   B GR
#  1: NA 101  A
#  2:  1 102  A
#  3:  2 103  A
#  4:  3 104  B
#  5:  4 105  B
#  6:  5 106  B
#  7:  6 107  C
#  8:  7 108  C
#  9:  8 109  C
# 10:  9 110  D
# 11: 10 111  D
# 12: NA 112  D
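Before summarizing, it can help to see which groups actually contain missing values. The following is a minimal sketch, not part of the original example; the names na_counts and n_na are chosen here purely for illustration.

```r
library(data.table)

# Same example data as in this section
my_dt <- data.table(A = c(NA, 1:10, NA),
                    B = 101:112,
                    GR = rep(LETTERS[1:4], each = 3))

# Count the missing values of column A within each group
# (na_counts and n_na are illustrative names)
na_counts <- my_dt[, .(n_na = sum(is.na(A))), by = GR]
na_counts
# n_na is 1 for groups A and D, and 0 for groups B and C
```

Groups A and D each hold one NA in column A, so these are the groups whose sums depend on how missing values are handled.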
Example: Summarize data.table & Remove NA
my_dt_new <- my_dt[, lapply(.SD, sum, na.rm = TRUE), by = GR]    # Summarize data.table
my_dt_new
#    GR  A   B
# 1:  A  3 306
# 2:  B 12 315
# 3:  C 21 324
# 4:  D 19 333
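To see why na.rm = TRUE matters here, the sketch below runs the same grouped sum without it, and then shows one possible way to restrict the summary to a single column with .SDcols. The object names sums_with_na and sums_a_only are illustrative and do not appear in the original example.

```r
library(data.table)

# Same example data as in this article
my_dt <- data.table(A = c(NA, 1:10, NA),
                    B = 101:112,
                    GR = rep(LETTERS[1:4], each = 3))

# Without na.rm = TRUE, sum() returns NA for any group that contains an NA
sums_with_na <- my_dt[, lapply(.SD, sum), by = GR]
sums_with_na
# Column A is NA for groups A and D; column B is unaffected

# With na.rm = TRUE, limited to column A via .SDcols
sums_a_only <- my_dt[, lapply(.SD, sum, na.rm = TRUE), by = GR, .SDcols = "A"]
sums_a_only
# Column A holds 3, 12, 21 and 19 for groups A to D
```

Note that na.rm = TRUE drops the missing values only inside each call to sum(); the rows of my_dt themselves are left unchanged.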