Compare Column Names of Two pandas DataFrames in Python (2 Examples)
In this article, you’ll learn how to compare the column names (headers) of two pandas DataFrames in Python and identify the differences between them.
Preparing the Examples
import pandas as pd # Load pandas library
df_A = pd.DataFrame({'A':range(1, 6),              # Construct two pandas DataFrames
                     'B':[2, 5, 1, 3, 9],
                     'C':range(15, 10, -1)})
print(df_A)
#    A  B   C
# 0  1  2  15
# 1  2  5  14
# 2  3  1  13
# 3  4  3  12
# 4  5  9  11
df_B = pd.DataFrame({'A':range(1, 5),
                     'C':[5, 5, 1, 3],
                     'D':[5, 8, 9, 4],
                     'E':range(15, 11, -1)})
print(df_B)
#    A  C  D   E
# 0  1  5  5  15
# 1  2  5  8  14
# 2  3  1  9  13
# 3  4  3  4  12
Example 1: Identify Column Names Contained in Only One of the pandas DataFrames
print(df_A.columns.difference(df_B.columns))       # Only in df_A
# Index(['B'], dtype='object')
print(df_B.columns.difference(df_A.columns))       # Only in df_B
# Index(['D', 'E'], dtype='object')
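The difference() method is one-directional, so each call above covers only one side of the comparison. As a complementary sketch (not part of the original examples), the symmetric_difference() method of a pandas Index should return the column names that occur in exactly one of the two DataFrames in a single call. The DataFrames are rebuilt here so the snippet runs on its own; the name only_in_one is chosen purely for illustration.

```python
import pandas as pd

# Rebuild the two example DataFrames so this snippet is self-contained
df_A = pd.DataFrame({'A': range(1, 6),
                     'B': [2, 5, 1, 3, 9],
                     'C': range(15, 10, -1)})
df_B = pd.DataFrame({'A': range(1, 5),
                     'C': [5, 5, 1, 3],
                     'D': [5, 8, 9, 4],
                     'E': range(15, 11, -1)})

# Column names found in exactly one of the two DataFrames
# (only_in_one is an illustrative name, not from the original article)
only_in_one = df_A.columns.symmetric_difference(df_B.columns)
print(list(only_in_one))
# ['B', 'D', 'E']
```

Note that this result does not say which DataFrame each name came from; the two difference() calls remain the better choice when that matters.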
Example 2: Identify Column Names Contained in Both of the pandas DataFrames
print(df_A.columns.intersection(df_B.columns))
# Index(['A', 'C'], dtype='object')
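One possible use of the shared column names, sketched here as an assumption about a typical follow-up step rather than something the original examples cover, is to restrict both DataFrames to the columns they have in common before comparing or combining them. The names common, df_A_common, and df_B_common are illustrative only.

```python
import pandas as pd

# Rebuild the two example DataFrames so this snippet is self-contained
df_A = pd.DataFrame({'A': range(1, 6),
                     'B': [2, 5, 1, 3, 9],
                     'C': range(15, 10, -1)})
df_B = pd.DataFrame({'A': range(1, 5),
                     'C': [5, 5, 1, 3],
                     'D': [5, 8, 9, 4],
                     'E': range(15, 11, -1)})

# Column names present in both DataFrames
common = df_A.columns.intersection(df_B.columns)

# Keep only the shared columns in each DataFrame (illustrative names)
df_A_common = df_A[common]
df_B_common = df_B[common]

print(df_A_common.shape)
# (5, 2)
print(df_B_common.shape)
# (4, 2)
```

The row counts still differ (5 versus 4), since only the columns are aligned by this step.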