Group Data Based On Two Variables in R (Example Code)

On this page, I’ll show how to group a data set by two columns and calculate a summary statistic for each group, using the dplyr package in the R programming language.

Example Data

data(iris)                      # Load example data
iris_new <- iris                # Duplicate example data
iris_new$subgroup <- letters[1:3]  # Add second grouping variable (a, b, c are recycled)
head(iris_new)                  # Display head of example data
#   Sepal.Length Sepal.Width Petal.Length Petal.Width Species subgroup
# 1          5.1         3.5          1.4         0.2  setosa        a
# 2          4.9         3.0          1.4         0.2  setosa        b
# 3          4.7         3.2          1.3         0.2  setosa        c
# 4          4.6         3.1          1.5         0.2  setosa        a
# 5          5.0         3.6          1.4         0.2  setosa        b
# 6          5.4         3.9          1.7         0.4  setosa        c
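
Because letters[1:3] has only three elements, R recycles it across all 150 rows of the iris data, so the subgroups cut across the three species. As a quick check before grouping, the following sketch counts the rows in each combination of Species and subgroup. It uses only base R, and the object name group_sizes is my own choice for illustration.

```r
data(iris)                                  # Load example data
iris_new <- iris                            # Duplicate example data
iris_new$subgroup <- letters[1:3]           # a, b, c recycled over all rows

group_sizes <- table(iris_new$Species,      # Count rows per combination
                     iris_new$subgroup)     # of Species and subgroup
group_sizes                                 # Display the counts
```

Each of the nine combinations contains either 16 or 17 rows, which is why the group means in the next example are based on slightly different sample sizes.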

Example: Grouping Data by Multiple Variables

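install.packages("dplyr")       # Install dplyr package (only needed once)
library("dplyr")                # Load dplyr package
iris_grouped <- iris_new %>%    # Start from the example data
  group_by(Species, subgroup) %>%                 # Group by both variables
  dplyr::summarise(my_mean = mean(Sepal.Length),  # Mean of Sepal.Length per group
                   .groups = "drop") %>%          # Drop grouping (requires dplyr >= 1.0.0)
  as.data.frame()               # Convert tibble to data frame
iris_grouped                    # Display grouped data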
#      Species subgroup  my_mean
# 1     setosa        a 5.052941
# 2     setosa        b 5.011765
# 3     setosa        c 4.950000
# 4 versicolor        a 5.770588
# 5 versicolor        b 6.018750
# 6 versicolor        c 6.023529
# 7  virginica        a 6.756250
# 8  virginica        b 6.447059
# 9  virginica        c 6.570588
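
If you want to cross-check the result without dplyr, the same group means can be computed with the base R function aggregate(). This is only a sketch of an alternative, not part of the dplyr workflow above. The object name iris_agg is my own choice, and the sorting and renaming steps are there just to make the result comparable to iris_grouped.

```r
data(iris)                                  # Load example data
iris_new <- iris                            # Duplicate example data
iris_new$subgroup <- letters[1:3]           # Add second grouping variable

iris_agg <- aggregate(Sepal.Length ~ Species + subgroup,  # Mean by two variables
                      data = iris_new,
                      FUN = mean)
iris_agg <- iris_agg[order(iris_agg$Species,              # Sort by Species first,
                           iris_agg$subgroup), ]          # then by subgroup
names(iris_agg)[3] <- "my_mean"             # Rename the summary column
rownames(iris_agg) <- NULL                  # Reset row names after sorting
iris_agg                                    # Display aggregated data
```

The formula Sepal.Length ~ Species + subgroup puts the variable to summarise on the left and the grouping variables on the right, so further grouping columns can be added with another + term.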
