How to Use the pandas Library in Python Programming (3 Examples)
This page shows how to use the pandas library in the Python programming language, with three examples: adding a column to a DataFrame, removing rows, and computing the mean of a column.
Setting up the Examples
import pandas as pd # Load pandas library
my_df = pd.DataFrame({"A":range(3, 12),              # Construct pandas DataFrame in Python
                      "B":["a", "x", "b", "y", "y", "c", "y", "d", "x"],
                      "C":range(1, 10)})
print(my_df)
#     A  B  C
# 0   3  a  1
# 1   4  x  2
# 2   5  b  3
# 3   6  y  4
# 4   7  y  5
# 5   8  c  6
# 6   9  y  7
# 7  10  d  8
# 8  11  x  9
Example 1: Appending New Variable to pandas DataFrame in Python
D = ["d", "h", "h", "a", "h", "d", "a", "d", "d"]    # Constructing new column
print(D)
# ['d', 'h', 'h', 'a', 'h', 'd', 'a', 'd', 'd']
my_df1 = my_df.assign(D = D)                         # Adding new column to DataFrame
print(my_df1)
#     A  B  C  D
# 0   3  a  1  d
# 1   4  x  2  h
# 2   5  b  3  h
# 3   6  y  4  a
# 4   7  y  5  h
# 5   8  c  6  d
# 6   9  y  7  a
# 7  10  d  8  d
# 8  11  x  9  d
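One point worth checking after this example is that assign() returns a new DataFrame and leaves the original untouched. The sketch below rebuilds the example data so it runs on its own, and also shows bracket assignment on a copy as one possible alternative. The name my_df1b is hypothetical and only used for illustration.

```python
import pandas as pd

# Same example data as in the setup section
my_df = pd.DataFrame({"A": range(3, 12),
                      "B": ["a", "x", "b", "y", "y", "c", "y", "d", "x"],
                      "C": range(1, 10)})
D = ["d", "h", "h", "a", "h", "d", "a", "d", "d"]

my_df1 = my_df.assign(D=D)        # Returns a new DataFrame that includes column D
print(list(my_df.columns))        # The original DataFrame still has only A, B, C
print(list(my_df1.columns))       # The new DataFrame has A, B, C, D

my_df1b = my_df.copy()            # Hypothetical name: explicit copy of the original
my_df1b["D"] = D                  # Bracket assignment adds the column to the copy
print(my_df1b.equals(my_df1))     # Compare both results
```

Whether to prefer assign() or bracket assignment is largely a matter of style: assign() fits well into method chains, while bracket assignment modifies the object it is applied to.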
Example 2: Removing Rows of pandas DataFrame in Python
my_df2 = my_df[my_df.B != "y"]                       # Dropping rows of DataFrame
print(my_df2)
#     A  B  C
# 0   3  a  1
# 1   4  x  2
# 2   5  b  3
# 5   8  c  6
# 7  10  d  8
# 8  11  x  9
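Note that the filtered DataFrame keeps the original index labels (0, 1, 2, 5, 7, 8), so the index has gaps. If a consecutive index is preferred, one option is reset_index() with drop=True, sketched below on the same example data. The name my_df2_reset is hypothetical.

```python
import pandas as pd

# Same example data as in the setup section
my_df = pd.DataFrame({"A": range(3, 12),
                      "B": ["a", "x", "b", "y", "y", "c", "y", "d", "x"],
                      "C": range(1, 10)})

my_df2 = my_df[my_df.B != "y"]                  # Keep rows where B is not "y"
print(list(my_df2.index))                       # Original index labels are kept

my_df2_reset = my_df2.reset_index(drop=True)    # Renumber rows; drop the old index
print(list(my_df2_reset.index))
```

Without drop=True, reset_index() would keep the old labels as an additional column named index.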
Example 3: Computing Mean of pandas DataFrame Variable in Python
my_df_mean = my_df["C"].mean()                       # Calculate mean of column
print(my_df_mean)
# 5.0
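The same approach can be extended to several columns at once. As a minimal sketch on the example data, selecting the numeric columns A and C first and then calling mean() returns one mean per column. The name my_df_means is hypothetical.

```python
import pandas as pd

# Same example data as in the setup section
my_df = pd.DataFrame({"A": range(3, 12),
                      "B": ["a", "x", "b", "y", "y", "c", "y", "d", "x"],
                      "C": range(1, 10)})

my_df_means = my_df[["A", "C"]].mean()    # Column-wise means of the numeric columns
print(my_df_means["A"])                   # Mean of A (values 3 to 11)
# 7.0
print(my_df_means["C"])                   # Mean of C (values 1 to 9)
# 5.0
```

Column B is left out of the selection because it contains strings, for which a mean is not defined.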