Compare Column Names of Two pandas DataFrames in Python (2 Examples)

In this article, you’ll learn how to compare the column names (headers) of two pandas DataFrames in Python and identify which names appear in only one DataFrame and which appear in both.

Preparing the Examples

import pandas as pd                             # Load pandas library
df_A = pd.DataFrame({'A':range(1, 6),           # Construct two pandas DataFrames
                     'B':[2, 5, 1, 3, 9],
                     'C':range(15, 10, -1)})
print(df_A)
#    A  B   C
# 0  1  2  15
# 1  2  5  14
# 2  3  1  13
# 3  4  3  12
# 4  5  9  11
df_B = pd.DataFrame({'A':range(1, 5),
                     'C':[5, 5, 1, 3],
                     'D':[5, 8, 9, 4],
                     'E':range(15, 11, -1)})
print(df_B)
#    A  C  D   E
# 0  1  5  5  15
# 1  2  5  8  14
# 2  3  1  9  13
# 3  4  3  4  12

Example 1: Identify Column Names Contained in Only One of the pandas DataFrames

print(df_A.columns.difference(df_B.columns))    # Only in df_A
# Index(['B'], dtype='object')
print(df_B.columns.difference(df_A.columns))    # Only in df_B
# Index(['D', 'E'], dtype='object')
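
If you only need to know which column names are not shared, regardless of which DataFrame they come from, the two calls above can be combined into one. The following is a minimal sketch using the symmetric_difference method of the pandas Index class; the variable name only_one is chosen just for this illustration, and the result is converted to a list so that the printed output does not depend on the dtype display of your pandas version.

```python
import pandas as pd                             # Load pandas library

# Recreate the two example DataFrames so that this snippet runs on its own
df_A = pd.DataFrame({'A': range(1, 6),
                     'B': [2, 5, 1, 3, 9],
                     'C': range(15, 10, -1)})
df_B = pd.DataFrame({'A': range(1, 5),
                     'C': [5, 5, 1, 3],
                     'D': [5, 8, 9, 4],
                     'E': range(15, 11, -1)})

# Column names that occur in exactly one of the two DataFrames
only_one = df_A.columns.symmetric_difference(df_B.columns)
print(list(only_one))
# ['B', 'D', 'E']
```

Note that the result does not tell you which DataFrame each name belongs to. If that matters, the two separate difference calls shown above are the better choice.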

Example 2: Identify Column Names Contained in Both of the pandas DataFrames

print(df_A.columns.intersection(df_B.columns))
# Index(['A', 'C'], dtype='object')
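
A common follow-up step is to check whether the two headers are identical and, if they are not, to reduce both DataFrames to the columns they share. The sketch below shows one possible way to do this with the equals method of the pandas Index class; the names shared, same_header, df_A_shared, and df_B_shared are introduced only for this illustration.

```python
import pandas as pd                             # Load pandas library

# Recreate the two example DataFrames so that this snippet runs on its own
df_A = pd.DataFrame({'A': range(1, 6),
                     'B': [2, 5, 1, 3, 9],
                     'C': range(15, 10, -1)})
df_B = pd.DataFrame({'A': range(1, 5),
                     'C': [5, 5, 1, 3],
                     'D': [5, 8, 9, 4],
                     'E': range(15, 11, -1)})

# Check whether both headers contain the same names in the same order
same_header = df_A.columns.equals(df_B.columns)
print(same_header)
# False

# Keep only the columns that exist in both DataFrames
shared = df_A.columns.intersection(df_B.columns)
df_A_shared = df_A[shared]
df_B_shared = df_B[shared]
print(list(df_A_shared.columns))
# ['A', 'C']
print(df_A_shared.columns.equals(df_B_shared.columns))
# True
```

After subsetting, both DataFrames have the same header, which can make later steps such as row-wise concatenation easier to reason about.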
