R Select Unique Rows of Data Frame Based On Certain Variables (Example Code)
In this article you’ll learn how to remove rows of a data frame that are duplicated in specific columns in the R programming language.
Creating Example Data
my_df <- data.frame(first_ID = c("a", "a", "b", "b", "c"),    # Create data frame
                    second_ID = c("a", "a", "b", "c", "c"),
                    values = 1:5)
my_df                                                         # Showing example data
#   first_ID second_ID values
# 1        a         a      1
# 2        a         a      2
# 3        b         b      3
# 4        b         c      4
# 5        c         c      5
Example: Remove Rows That Are Duplicated in Specific Columns
my_df_new <- my_df[!duplicated(                               # Remove duplicated rows
  my_df[ , c("first_ID", "second_ID")]), ]
my_df_new                                                     # Printing the updated data
#   first_ID second_ID values
# 1        a         a      1
# 3        b         b      3
# 4        b         c      4
# 5        c         c      5
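To see why the code above keeps rows 1, 3, 4, and 5, it may help to look at the logical vector that duplicated() returns for the two ID columns. The following is a minimal sketch in base R; the names id_cols, is_dup, and my_df_last are introduced here for illustration only and do not appear in the original example. It also shows the fromLast argument of duplicated(), which is one possible way to keep the last occurrence of each ID combination instead of the first.

```r
# Recreate the example data from above
my_df <- data.frame(first_ID = c("a", "a", "b", "b", "c"),
                    second_ID = c("a", "a", "b", "c", "c"),
                    values = 1:5)

# Columns that define what counts as a duplicate (illustrative helper name)
id_cols <- c("first_ID", "second_ID")

# TRUE marks rows that repeat an ID combination seen in an earlier row
is_dup <- duplicated(my_df[ , id_cols])
is_dup
# [1] FALSE  TRUE FALSE FALSE FALSE

# Keep the last occurrence of each combination instead of the first
my_df_last <- my_df[!duplicated(my_df[ , id_cols], fromLast = TRUE), ]
my_df_last
#   first_ID second_ID values
# 2        a         a      2
# 3        b         b      3
# 4        b         c      4
# 5        c         c      5
```

Only row 2 repeats an earlier combination of first_ID and second_ID, so negating the vector with ! drops exactly that row. With fromLast = TRUE the search runs from the bottom of the data frame, so row 1 is flagged instead and row 2 is kept.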