Group Data Based On Two Variables in R (Example Code)
On this page, I’ll show how to group a data set by multiple columns in the R programming language.
Example Data
data(iris)                                   # Load example data
iris_new <- iris                             # Modify example data
iris_new$subgroup <- letters[1:3]
head(iris_new)                               # Display head of example data
#   Sepal.Length Sepal.Width Petal.Length Petal.Width Species subgroup
# 1          5.1         3.5          1.4         0.2  setosa        a
# 2          4.9         3.0          1.4         0.2  setosa        b
# 3          4.7         3.2          1.3         0.2  setosa        c
# 4          4.6         3.1          1.5         0.2  setosa        a
# 5          5.0         3.6          1.4         0.2  setosa        b
# 6          5.4         3.9          1.7         0.4  setosa        c
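Because letters[1:3] is much shorter than the 150 rows of iris, R recycles it down the new column. As a quick check (an addition, not part of the original example), a contingency table shows how many rows fall into each Species/subgroup combination. The name group_sizes is only illustrative.

```r
data(iris)
iris_new <- iris
iris_new$subgroup <- letters[1:3]            # Recycled to all 150 rows (150 is a multiple of 3)

# Count the rows in every Species/subgroup combination
group_sizes <- table(iris_new$Species, iris_new$subgroup)
group_sizes
```

The groups are not all the same size (each holds 16 or 17 rows), which is worth keeping in mind when comparing group means.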
Example: Grouping Data by Multiple Variables
install.packages("dplyr")                    # Install dplyr package
library("dplyr")                             # Load dplyr package
iris_grouped <- iris_new %>%                 # Grouping data
  group_by(Species, subgroup) %>%
  dplyr::summarise(my_mean = mean(Sepal.Length)) %>%
  as.data.frame()
iris_grouped                                 # Displaying new data
#      Species subgroup  my_mean
# 1     setosa        a 5.052941
# 2     setosa        b 5.011765
# 3     setosa        c 4.950000
# 4 versicolor        a 5.770588
# 5 versicolor        b 6.018750
# 6 versicolor        c 6.023529
# 7  virginica        a 6.756250
# 8  virginica        b 6.447059
# 9  virginica        c 6.570588
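If installing dplyr is not an option, the same grouped mean can be sketched in base R with aggregate(). This is an alternative to the dplyr code above, not the approach the tutorial uses. The object name iris_agg is an assumption made for illustration, and the rows are reordered only so that the layout matches the dplyr result.

```r
data(iris)
iris_new <- iris
iris_new$subgroup <- letters[1:3]

# Mean of Sepal.Length for each Species/subgroup combination (base R only)
iris_agg <- aggregate(Sepal.Length ~ Species + subgroup,
                      data = iris_new,
                      FUN = mean)
names(iris_agg)[3] <- "my_mean"              # Match the column name used with dplyr

# Sort by Species, then subgroup, and reset the row names
iris_agg <- iris_agg[order(iris_agg$Species, iris_agg$subgroup), ]
rownames(iris_agg) <- NULL
iris_agg
```

Additional grouping columns can be added on the right-hand side of the formula, just as further variables can be passed to group_by().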