R Merging Unequal Data Frames & Replacing NA with Zero (Example Code)
On this page, I’ll show how to join two unequal data frames (data frames that share only some of their IDs) and replace the resulting missing values with zero in the R programming language.
Creation of Example Data
df1 <- data.frame(IDs = 1:6,        # Example data frame No 1
                  col1 = 13:8,
                  col2 = 8:3,
                  col3 = 11:16)
df1                                 # Structure of first data
#   IDs col1 col2 col3
# 1   1   13    8   11
# 2   2   12    7   12
# 3   3   11    6   13
# 4   4   10    5   14
# 5   5    9    4   15
# 6   6    8    3   16

df2 <- data.frame(IDs = 4:9,        # Example data frame No 2
                  var1 = 15:20,
                  var2 = 10:5)
df2                                 # Structure of second data
#   IDs var1 var2
# 1   4   15   10
# 2   5   16    9
# 3   6   17    8
# 4   7   18    7
# 5   8   19    6
# 6   9   20    5
Example: Merge & Replace NA with 0
df_all <- merge(df1, df2,           # Merging df1 & df2
                by = "IDs",
                all = TRUE)
df_all                              # Printing merged data
#   IDs col1 col2 col3 var1 var2
# 1   1   13    8   11   NA   NA
# 2   2   12    7   12   NA   NA
# 3   3   11    6   13   NA   NA
# 4   4   10    5   14   15   10
# 5   5    9    4   15   16    9
# 6   6    8    3   16   17    8
# 7   7   NA   NA   NA   18    7
# 8   8   NA   NA   NA   19    6
# 9   9   NA   NA   NA   20    5
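Before replacing anything, it can help to check how many missing values the merge actually produced. The following is a minimal sketch of such a check; the object name na_counts is only an illustrative choice, and the example data is recreated so that the snippet runs on its own.

```r
# Recreate the example data so this snippet is self-contained
df1 <- data.frame(IDs = 1:6, col1 = 13:8, col2 = 8:3, col3 = 11:16)
df2 <- data.frame(IDs = 4:9, var1 = 15:20, var2 = 10:5)
df_all <- merge(df1, df2, by = "IDs", all = TRUE)

na_counts <- colSums(is.na(df_all)) # Count NA values per column (illustrative name)
na_counts
#  IDs col1 col2 col3 var1 var2
#    0    3    3    3    3    3
```

Each non-key column contains three NA values: the col columns for the IDs found only in df2, and the var columns for the IDs found only in df1.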
df_all[is.na(df_all)] <- 0          # Replace NA with zero
df_all                              # Printing merged data after replacing NA with zero
#   IDs col1 col2 col3 var1 var2
# 1   1   13    8   11    0    0
# 2   2   12    7   12    0    0
# 3   3   11    6   13    0    0
# 4   4   10    5   14   15   10
# 5   5    9    4   15   16    9
# 6   6    8    3   16   17    8
# 7   7    0    0    0   18    7
# 8   8    0    0    0   19    6
# 9   9    0    0    0   20    5
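The code above replaces every NA in the merged data frame. If zero is only a sensible default for some columns, the replacement can be restricted to those columns. This is a hedged sketch rather than part of the original example: the names df_sel and cols_to_fill, and the choice of var1 and var2, are assumptions made for illustration.

```r
# Recreate the example data so this snippet is self-contained
df1 <- data.frame(IDs = 1:6, col1 = 13:8, col2 = 8:3, col3 = 11:16)
df2 <- data.frame(IDs = 4:9, var1 = 15:20, var2 = 10:5)
df_sel <- merge(df1, df2, by = "IDs", all = TRUE)

cols_to_fill <- c("var1", "var2")   # Hypothetical choice of columns to fill

# Replace NA with zero in the selected columns only
df_sel[cols_to_fill][is.na(df_sel[cols_to_fill])] <- 0

colSums(is.na(df_sel))              # Remaining NA values per column
#  IDs col1 col2 col3 var1 var2
#    0    3    3    3    0    0
```

In this variant, col1 to col3 keep their NA values for the IDs 7 to 9, while var1 and var2 are filled with zero for the IDs 1 to 3.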