R Extract Top N Highest Values of Column by Group (Example Code)
In this tutorial, I’ll explain how to extract the N highest values within each group of a data frame column in the R programming language.
Example Data
data(iris)                                      # Loading iris
head(iris)
#   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
# 1          5.1         3.5          1.4         0.2  setosa
# 2          4.9         3.0          1.4         0.2  setosa
# 3          4.7         3.2          1.3         0.2  setosa
# 4          4.6         3.1          1.5         0.2  setosa
# 5          5.0         3.6          1.4         0.2  setosa
# 6          5.4         3.9          1.7         0.4  setosa
Example: Select Top N Highest Values of Data Frame Column by Group
iris_highest <- iris[order(iris$Sepal.Length,   # Sorting values of data
                           decreasing = TRUE), ]
iris_highest <- Reduce(rbind,                   # Extracting highest values by group
                       by(iris_highest,
                          iris_highest["Species"],
                          head,
                          n = 2))
iris_highest                                    # Showing updated data in RStudio
#     Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
# 15           5.8         4.0          1.2         0.2     setosa
# 16           5.7         4.4          1.5         0.4     setosa
# 51           7.0         3.2          4.7         1.4 versicolor
# 53           6.9         3.1          4.9         1.5 versicolor
# 132          7.9         3.8          6.4         2.0  virginica
# 118          7.7         3.8          6.7         2.2  virginica
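If you need this for other columns or a different N, the two steps above can be wrapped in a small function. The following is only a sketch in base R: the name top_n_by_group and its arguments are hypothetical and not part of the original example, and it uses split() with lapply() instead of by() to form the groups. Unlike the example above, it resets the row names, so the original row numbers are not kept.

```r
# Hypothetical helper (not from the original tutorial): returns the n rows
# with the highest values of `value_col` within each level of `group_col`.
top_n_by_group <- function(data, value_col, group_col, n = 2) {
  # Sort the whole data frame by the value column, largest first
  sorted <- data[order(data[[value_col]], decreasing = TRUE), ]
  # Split into one data frame per group and keep the first n rows of each
  pieces <- lapply(split(sorted, sorted[[group_col]]), head, n = n)
  # Stack the pieces back into a single data frame
  result <- do.call(rbind, pieces)
  rownames(result) <- NULL                      # Resetting row names to 1, 2, 3, ...
  result
}

iris_top3 <- top_n_by_group(iris,               # Top 3 sepal lengths per species
                            "Sepal.Length",
                            "Species",
                            n = 3)
iris_top3
```

Note that head() simply keeps the first n rows of each sorted group, so if several rows tie for the last place, only the ones that come first in the sorted data are returned.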
Further Resources & Related Articles
Below, you may find some additional resources on topics such as counting and groups.