Retain Unique Rows Based On Selected Variables in R (Example Code)
In this tutorial, I'll explain how to keep only the rows of a data frame that are not duplicated in selected columns in the R programming language.
Creation of Example Data
my_df <- data.frame(col1 = rep(1:3, each = 3),           # Construct data frame
                    col2 = rep(LETTERS[1:3], each = 3),
                    col3 = letters[9:1],
                    col4 = 9:1)
my_df                                                    # Show example data in console
#   col1 col2 col3 col4
# 1    1    A    i    9
# 2    1    A    h    8
# 3    1    A    g    7
# 4    2    B    f    6
# 5    2    B    e    5
# 6    2    B    d    4
# 7    3    C    c    3
# 8    3    C    b    2
# 9    3    C    a    1
Example: Applying duplicated() Function to Retain Only Unique Rows of Data Frame
my_df[!duplicated(my_df[ , c("col1", "col2")]), ]        # Using duplicated() function
#   col1 col2 col3 col4
# 1    1    A    i    9
# 4    2    B    f    6
# 7    3    C    c    3
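To see why this works, it can help to look at the logical vector that duplicated() returns before it is negated. The following is a minimal sketch rather than part of the original example: it rebuilds the same data, stores the selected column names in a hypothetical helper object called key_cols, and also shows the fromLast argument of duplicated(), which keeps the last row of each group instead of the first.

```r
# Rebuild the example data so that this sketch runs on its own
my_df <- data.frame(col1 = rep(1:3, each = 3),
                    col2 = rep(LETTERS[1:3], each = 3),
                    col3 = letters[9:1],
                    col4 = 9:1)

key_cols <- c("col1", "col2")                   # Hypothetical name for the selected columns

is_dup <- duplicated(my_df[ , key_cols])        # TRUE for each repeat of an earlier col1/col2 pair
is_dup
# [1] FALSE  TRUE  TRUE FALSE  TRUE  TRUE FALSE  TRUE  TRUE

first_rows <- my_df[!is_dup, ]                  # Keeps the first row of each pair (rows 1, 4, 7)

# fromLast = TRUE scans from the bottom, so the last row of each pair is kept
last_rows <- my_df[!duplicated(my_df[ , key_cols], fromLast = TRUE), ]
last_rows
#   col1 col2 col3 col4
# 3    1    A    g    7
# 6    2    B    d    4
# 9    3    C    a    1
```

Which row survives matters whenever the non-selected columns (here col3 and col4) differ within a group, so it is worth sorting the data first if a specific row per group is needed.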