GroupBy Two & Three Group Columns of pandas DataFrame in Python (2 Examples)
In this Python post you’ll learn how to group the values in a pandas DataFrame by two or more columns.
Preparing the Examples
import pandas as pd # Load pandas library
my_df = pd.DataFrame({'A':range(19, 28),                  # Constructing a pandas DataFrame
                      'B':[6, 7, 3, 9, 1, 3, 8, 8, 9],
                      'C':range(20, 11, - 1),
                      'GRP_a':['gr1', 'gr1', 'gr2', 'gr3', 'gr1', 'gr2', 'gr2', 'gr3', 'gr3'],
                      'GRP_b':['x', 'x', 'x', 'x', 'y', 'y', 'y', 'y', 'y'],
                      'GRP_c':['a', 'b', 'c', 'c', 'a', 'b', 'b', 'a', 'a']})
print(my_df)
#     A  B   C GRP_a GRP_b GRP_c
# 0  19  6  20   gr1     x     a
# 1  20  7  19   gr1     x     b
# 2  21  3  18   gr2     x     c
# 3  22  9  17   gr3     x     c
# 4  23  1  16   gr1     y     a
# 5  24  3  15   gr2     y     b
# 6  25  8  14   gr2     y     b
# 7  26  8  13   gr3     y     a
# 8  27  9  12   gr3     y     a
Example 1: Calculate Sum by Two Group Indicators
print(my_df.groupby(['GRP_a', 'GRP_b']).sum(numeric_only = True))  # Computing the column sums by two groups
#               A   B   C
# GRP_a GRP_b
# gr1   x      39  13  39
#       y      23   1  16
# gr2   x      21   3  18
#       y      49  11  29
# gr3   x      22   9  17
#       y      53  17  25
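The result above carries the two group columns as a MultiIndex. If ordinary columns are preferred, one possible variation is to pass as_index = False to groupby. The following is a minimal sketch that rebuilds the example data so it runs on its own; the name sums_flat and the explicit selection of the columns A, B, and C are illustrative choices, not part of the original example.

```python
import pandas as pd                                        # Load pandas library

my_df = pd.DataFrame({'A':range(19, 28),                   # Same example data as above
                      'B':[6, 7, 3, 9, 1, 3, 8, 8, 9],
                      'C':range(20, 11, - 1),
                      'GRP_a':['gr1', 'gr1', 'gr2', 'gr3', 'gr1', 'gr2', 'gr2', 'gr3', 'gr3'],
                      'GRP_b':['x', 'x', 'x', 'x', 'y', 'y', 'y', 'y', 'y'],
                      'GRP_c':['a', 'b', 'c', 'c', 'a', 'b', 'b', 'a', 'a']})

# as_index = False keeps GRP_a and GRP_b as regular columns instead of index levels;
# selecting A, B, and C restricts the sum to the numeric columns
sums_flat = my_df.groupby(['GRP_a', 'GRP_b'], as_index = False)[['A', 'B', 'C']].sum()
print(sums_flat)
```

The same layout can usually be obtained by calling reset_index() on the grouped result instead.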
Example 2: Calculate Mean Value by Three Group Indicators
print(my_df.groupby(['GRP_a', 'GRP_b', 'GRP_c']).mean())  # Computing the column means by three groups
#                       A    B     C
# GRP_a GRP_b GRP_c
# gr1   x     a      19.0  6.0  20.0
#             b      20.0  7.0  19.0
#       y     a      23.0  1.0  16.0
# gr2   x     c      21.0  3.0  18.0
#       y     b      24.5  5.5  14.5
# gr3   x     c      22.0  9.0  17.0
#       y     a      26.5  8.5  12.5
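Only seven of the possible group combinations appear in the output, because groupby returns only the combinations that occur in the data. As a rough sketch of how the grouped result might be inspected further, the code below counts the rows per combination and looks up the means of a single group by its index tuple. The names grouped, sizes, and means are hypothetical helpers introduced for illustration.

```python
import pandas as pd                                        # Load pandas library

my_df = pd.DataFrame({'A':range(19, 28),                   # Same example data as above
                      'B':[6, 7, 3, 9, 1, 3, 8, 8, 9],
                      'C':range(20, 11, - 1),
                      'GRP_a':['gr1', 'gr1', 'gr2', 'gr3', 'gr1', 'gr2', 'gr2', 'gr3', 'gr3'],
                      'GRP_b':['x', 'x', 'x', 'x', 'y', 'y', 'y', 'y', 'y'],
                      'GRP_c':['a', 'b', 'c', 'c', 'a', 'b', 'b', 'a', 'a']})

grouped = my_df.groupby(['GRP_a', 'GRP_b', 'GRP_c'])       # Group by three columns

sizes = grouped.size()                                     # Number of rows in each observed combination
print(sizes)

means = grouped[['A', 'B', 'C']].mean()                    # Column means by three groups
print(means.loc[('gr2', 'y', 'b')])                        # Means of one group, selected by index tuple
```

Checking the group sizes first can help when interpreting the means, since several groups here consist of a single row.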
Related Tutorials & Further Resources
Below, you may find some additional resources on topics such as groups, descriptive statistics, counting, and text elements:
- Mean Imputation of Columns in pandas DataFrame in Python
- Rearrange Columns of pandas DataFrame in Python
- Count Distinct Values by Group of pandas DataFrame Column in Python
- Join Text of Two Columns in pandas DataFrame in Python
- Ordering pandas DataFrame Rows by Multiple Columns in Python
- Python Subset Multiple Columns of Pandas DataFrame