Display Unique Rows & Values in a data.table in R (2 Examples)

This tutorial illustrates how to get the unique values of certain column combinations and how to remove duplicate rows from a data.table object in R.

Setting up the Examples

Install and load data.table.

install.packages("data.table")                                                    # Install data.table package
library("data.table")                                                             # Load data.table

Load the iris dataset for the examples.

data(iris)                                                                        # Load iris data set
head(iris)
#   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
# 1          5.1         3.5          1.4         0.2  setosa
# 2          4.9         3.0          1.4         0.2  setosa
# 3          4.7         3.2          1.3         0.2  setosa
# 4          4.6         3.1          1.5         0.2  setosa
# 5          5.0         3.6          1.4         0.2  setosa
# 6          5.4         3.9          1.7         0.4  setosa
iris_dt <- data.table::copy(iris)                                                 # Replicate iris data set
setDT(iris_dt)                                                                    # Convert iris_dt to a data.table by reference

Example 1: Unique Values of a Column

For this example, we add a column called Sepal.Length.class to the iris data.table. Sepal.Length.class is a factor variable that uses cut() to divide Sepal.Length into four right-closed intervals: (4,4.5], (4.5,5], (5,5.5], and (5.5,8].

iris_dt_2 <- iris_dt[, Sepal.Length.class := cut(Sepal.Length,
                                                 breaks = c(4, 4.5, 5, 5.5, 8))]  # Create new column Sepal.Length.class

table(iris_dt_2$Sepal.Length.class)                                               # Table new column Sepal.Length.class
# (4,4.5] (4.5,5] (5,5.5] (5.5,8] 
#       5      27      27      91
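
One detail worth keeping in mind: := adds the column by reference, so the assignment above also changes iris_dt, and iris_dt_2 is the same object rather than an independent copy. The following is a minimal sketch of that behavior; the names dt_a, dt_b, dt_c, flag, and flag2 are hypothetical and only serve the illustration.

```r
library(data.table)

dt_a <- as.data.table(iris)                    # Fresh data.table built from iris
dt_b <- dt_a[, flag := Sepal.Length > 5]       # := modifies dt_a in place; dt_b is the same object
"flag" %in% names(dt_a)                        # TRUE: the original gained the column too

dt_c <- copy(dt_a)[, flag2 := TRUE]            # copy() first, so only the copy gets flag2
"flag2" %in% names(dt_a)                       # FALSE: dt_a is untouched
```

If the original data.table should stay unchanged, wrapping it in copy() before using := is one way to achieve that.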

The following line displays the unique values of Sepal.Length.class within each group of Species, using the by argument. Because the expression is unnamed, data.table labels the result column V1.

iris_dt_2[, unique(Sepal.Length.class), by = Species]                             # Show unique values of Sepal.Length.class by Species
#       Species      V1
# 1:     setosa (5,5.5]
# 2:     setosa (4.5,5]
# 3:     setosa (4,4.5]
# 4:     setosa (5.5,8]
# 5: versicolor (5.5,8]
# 6: versicolor (5,5.5]
# 7: versicolor (4.5,5]
# 8:  virginica (5.5,8]
# 9:  virginica (4.5,5]
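
As a possible variation, the result column can be given a name, and uniqueN() can count the distinct values per group instead of listing them. This is a small sketch that rebuilds the data so it runs on its own; the names dt, uniq_named, and n_classes are assumptions made for the illustration.

```r
library(data.table)

dt <- as.data.table(iris)                                                     # iris as a data.table
dt[, Sepal.Length.class := cut(Sepal.Length, breaks = c(4, 4.5, 5, 5.5, 8))]  # Same classes as above

uniq_named <- dt[, .(Sepal.Length.class = unique(Sepal.Length.class)),
                 by = Species]                                                # Named column instead of V1
n_classes <- dt[, .(n_classes = uniqueN(Sepal.Length.class)), by = Species]   # Number of distinct classes per Species
n_classes$n_classes                                                           # 4 3 3 (setosa, versicolor, virginica)
```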

Example 2: Unique Rows

In this example, we remove duplicate rows from the iris data.table. We first select the columns Sepal.Length.class and Species and then reduce the data to the unique combinations of these two variables.

iris_dt_3 <- unique(iris_dt_2[, list(Sepal.Length.class, Species)])               # Unique rows for columns Sepal.Length.class and Species
iris_dt_3
#    Sepal.Length.class    Species
# 1:            (5,5.5]     setosa
# 2:            (4.5,5]     setosa
# 3:            (4,4.5]     setosa
# 4:            (5.5,8]     setosa
# 5:            (5.5,8] versicolor
# 6:            (5,5.5] versicolor
# 7:            (4.5,5] versicolor
# 8:            (5.5,8]  virginica
# 9:            (4.5,5]  virginica
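
If the remaining columns should be kept, the by argument of unique() offers an alternative: it checks uniqueness only on the named columns and returns the first row of each combination with all columns. The sketch below rebuilds the data so it is self-contained; dt and first_rows are hypothetical names.

```r
library(data.table)

dt <- as.data.table(iris)                                                     # iris as a data.table
dt[, Sepal.Length.class := cut(Sepal.Length, breaks = c(4, 4.5, 5, 5.5, 8))]  # Same classes as above

first_rows <- unique(dt, by = c("Sepal.Length.class", "Species"))             # First row per combination, all columns kept
dim(first_rows)                                                               # 9 rows, 6 columns
```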

Applied to the complete dataset iris_dt_2, unique() compares all columns. We can compare the dimensions of the complete data with those of the data reduced to its unique rows.

dim(iris_dt_2)                                                                    # Dimension of original data
# [1] 150   6
dim(unique(iris_dt_2))                                                            # Dimension of data with unique rows
# [1] 149   6

The number of rows drops from 150 to 149, so the data contains exactly one duplicate row.
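
To see which row is duplicated, duplicated() can be used as a row filter. The following is a minimal sketch on the plain iris columns (the class column is derived from Sepal.Length, so it does not change which rows are duplicates); dt, dup_later, and dup_all are hypothetical names.

```r
library(data.table)

dt <- as.data.table(iris)                                         # iris as a data.table

dup_later <- dt[duplicated(dt)]                                   # Only the later copy of each duplicated row
dup_all <- dt[duplicated(dt) | duplicated(dt, fromLast = TRUE)]   # Every copy of each duplicated row

nrow(dup_later)                                                   # 1
nrow(dup_all)                                                     # 2
```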


Note: This article was created in collaboration with Anna-Lena Wölwer. Anna-Lena is a researcher and programmer who creates tutorials on statistical methodology as well as on the R programming language. You may find more info about Anna-Lena and her other articles on her profile page.
