R Select Unique Rows of Data Frame Based On Certain Variables (Example Code)

In this article you'll learn how to keep only the unique rows of a data frame, where uniqueness is judged by specific columns only, in the R programming language.

Introduction of Example Data

my_df <- data.frame(first_ID = c("a", "a", "b", "b", "c"),  # Create data frame
                    second_ID = c("a", "a", "b", "c", "c"),
                    values = 1:5)
my_df                                                       # Showing example data
#   first_ID second_ID values
# 1        a         a      1
# 2        a         a      2
# 3        b         b      3
# 4        b         c      4
# 5        c         c      5

Example: Delete Rows that are Duplicated in Specific Columns

my_df_new <- my_df[!duplicated(                             # Remove duplicated rows
  my_df[ , c("first_ID", "second_ID")]), ]
my_df_new                                                   # Printing the updated data
#   first_ID second_ID values
# 1        a         a      1
# 3        b         b      3
# 4        b         c      4
# 5        c         c      5
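
To see why row 2 was dropped, it can help to look at the logical vector that duplicated() returns for the two ID columns. The sketch below rebuilds the example data so that it runs on its own. The names id_cols, dup_flags and my_df_last are only illustrative and not part of the original example. It also shows the fromLast argument of duplicated(), which keeps the last occurrence of each ID combination instead of the first.

```r
my_df <- data.frame(first_ID = c("a", "a", "b", "b", "c"),  # Recreate example data
                    second_ID = c("a", "a", "b", "c", "c"),
                    values = 1:5)

id_cols <- c("first_ID", "second_ID")                       # Illustrative name: columns to compare

dup_flags <- duplicated(my_df[ , id_cols])                  # TRUE marks a repeated ID combination
dup_flags
# [1] FALSE  TRUE FALSE FALSE FALSE

my_df_last <- my_df[!duplicated(my_df[ , id_cols],          # Keep last occurrence instead of first
                                fromLast = TRUE), ]
my_df_last
#   first_ID second_ID values
# 2        a         a      2
# 3        b         b      3
# 4        b         c      4
# 5        c         c      5
```

Which occurrence to keep depends on your data. If the values column differs between duplicated rows, as it does here, the choice between the default and fromLast = TRUE changes the result.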
