Compare Differences Between Two pandas DataFrames in Python (Example Code)

This post explains how to identify the rows that differ between two pandas DataFrames in Python.

Preparing the Example

import pandas as pd                      # Import pandas library to Python
df_A = pd.DataFrame({'A':range(1, 6),    # Construct two pandas DataFrames
                     'B':[2, 5, 1, 3, 9],
                     'C':range(15, 10, -1)})
print(df_A)
#    A  B   C
# 0  1  2  15
# 1  2  5  14
# 2  3  1  13
# 3  4  3  12
# 4  5  9  11
df_B = pd.DataFrame({'A':range(1, 5),
                     'B':[5, 5, 1, 3],
                     'C':range(15, 11, -1)})
print(df_B)
#    A  B   C
# 0  1  5  15
# 1  2  5  14
# 2  3  1  13
# 3  4  3  12
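
Before comparing row by row, it can help to confirm that the two DataFrames actually differ. The following is a minimal sketch, not part of the original example: it rebuilds the two DataFrames so that it runs on its own, then uses DataFrame.equals and the shape attribute for a quick check.

```python
import pandas as pd

# Rebuild the two example DataFrames so the snippet is self-contained
df_A = pd.DataFrame({'A': range(1, 6),
                     'B': [2, 5, 1, 3, 9],
                     'C': range(15, 10, -1)})
df_B = pd.DataFrame({'A': range(1, 5),
                     'B': [5, 5, 1, 3],
                     'C': range(15, 11, -1)})

same = df_A.equals(df_B)          # True only if shape and all values match
print(same)
# False
print(df_A.shape, df_B.shape)     # (rows, columns) of each DataFrame
# (5, 3) (4, 3)
```

Since the shapes already differ, the DataFrames cannot be equal. The check says nothing about which rows differ, though.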

Example: Return Different Rows Between Two pandas DataFrames in Python

df_diff = df_A.merge(df_B,               # Identify different rows
                     indicator=True,     # Add the _merge column
                     how='outer').loc[lambda x: x['_merge'] != 'both']  # Drop rows found in both
print(df_diff)
#    A  B   C      _merge
# 0  1  2  15   left_only
# 4  5  9  11   left_only
# 5  1  5  15  right_only
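
When no join columns are specified, merge joins on all columns that the two DataFrames share, and indicator=True adds the _merge column that records where each row came from. The sketch below wraps this approach in a helper and splits the result by origin. The helper name diff_rows is an assumption made for illustration and is not a pandas function. Because the order of rows in an outer merge may depend on the pandas version, the sketch resets the index instead of relying on specific index labels.

```python
import pandas as pd

def diff_rows(left, right):
    """Return the rows that appear in only one of the two DataFrames.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Outer merge on all shared columns; _merge marks each row's origin
    merged = left.merge(right, how='outer', indicator=True)
    only_one = merged[merged['_merge'] != 'both']
    return only_one.reset_index(drop=True)

# Rebuild the two example DataFrames so the snippet is self-contained
df_A = pd.DataFrame({'A': range(1, 6),
                     'B': [2, 5, 1, 3, 9],
                     'C': range(15, 10, -1)})
df_B = pd.DataFrame({'A': range(1, 5),
                     'B': [5, 5, 1, 3],
                     'C': range(15, 11, -1)})

df_diff = diff_rows(df_A, df_B)

# Split the differing rows by the DataFrame they come from
left_only = df_diff[df_diff['_merge'] == 'left_only'].drop(columns='_merge')
right_only = df_diff[df_diff['_merge'] == 'right_only'].drop(columns='_merge')

print(len(left_only), len(right_only))
# 2 1
```

If either DataFrame contains duplicate rows, the merge can multiply them, so it may be worth dropping duplicates first.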
