Compare Differences Between Two pandas DataFrames in Python (Example Code)
This post explains how to identify the rows that differ between two pandas DataFrames in Python.
Preparing the Example
import pandas as pd                                 # Import pandas library to Python
df_A = pd.DataFrame({'A':range(1, 6),               # Construct two pandas DataFrames
                     'B':[2, 5, 1, 3, 9],
                     'C':range(15, 10, - 1)})
print(df_A)
#    A  B   C
# 0  1  2  15
# 1  2  5  14
# 2  3  1  13
# 3  4  3  12
# 4  5  9  11
df_B = pd.DataFrame({'A':range(1, 5),
                     'B':[5, 5, 1, 3],
                     'C':range(15, 11, - 1)})
print(df_B)
#    A  B   C
# 0  1  5  15
# 1  2  5  14
# 2  3  1  13
# 3  4  3  12
Example: Return Different Rows Between Two pandas DataFrames in Python
df_diff = df_A.merge(df_B,                          # Identify different rows
                     indicator = True,
                     how = 'outer').loc[lambda x : x['_merge'] != 'both']
print(df_diff)
#    A  B   C      _merge
# 0  1  2  15   left_only
# 4  5  9  11   left_only
# 5  1  5  15  right_only
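The same idea can be wrapped in a small function that returns the rows found only in the first DataFrame and the rows found only in the second one separately. The following is a minimal sketch, not a definitive implementation: the function name split_differences and the variables only_in_A and only_in_B are made up for illustration, and the sketch assumes both DataFrames share the same column names, so that merge() joins on all of them.

```python
import pandas as pd

# Rebuild the two example DataFrames so this snippet runs on its own
df_A = pd.DataFrame({'A': range(1, 6),
                     'B': [2, 5, 1, 3, 9],
                     'C': range(15, 10, -1)})
df_B = pd.DataFrame({'A': range(1, 5),
                     'B': [5, 5, 1, 3],
                     'C': range(15, 11, -1)})


def split_differences(left, right):
    """Hypothetical helper: return (rows only in left, rows only in right)."""
    # Outer merge on all shared columns; indicator=True adds the '_merge' column
    merged = left.merge(right, how='outer', indicator=True)

    # Keep each side's unmatched rows and drop the helper column again
    only_left = merged.loc[merged['_merge'] == 'left_only'].drop(columns='_merge')
    only_right = merged.loc[merged['_merge'] == 'right_only'].drop(columns='_merge')

    # Reset the index so the result does not depend on the merge's row order
    return only_left.reset_index(drop=True), only_right.reset_index(drop=True)


only_in_A, only_in_B = split_differences(df_A, df_B)
print(only_in_A)                                    # Rows that appear only in df_A
print(only_in_B)                                    # Rows that appear only in df_B
```

The index is reset because the row order and index labels of an outer merge can vary between pandas versions, while the set of differing rows stays the same.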