Remove Duplicate Rows in pandas DataFrame in Python (Example Code)

In this tutorial, you’ll learn how to remove duplicate rows from a pandas DataFrame in the Python programming language.

Creation of Example Data

import pandas as pd                               # Import pandas

my_df = pd.DataFrame({'A':[5, 5, 5, 1, 2, 8],    # Construct example DataFrame in Python
                      'B':[5, 5, 1, 8, 9, 2],
                      'C':['a', 'a', 'c', 'd', 'e', 'f']})
print(my_df)                                     # Display example DataFrame in console
#    A  B  C
# 0  5  5  a
# 1  5  5  a
# 2  5  1  c
# 3  1  8  d
# 4  2  9  e
# 5  8  2  f
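
Before removing anything, it can help to check which rows pandas treats as duplicates. The following is a minimal sketch using the duplicated() method on the same example data. The variable names dup_mask and dup_mask_all are chosen only for this illustration.

```python
import pandas as pd                               # Import pandas

my_df = pd.DataFrame({'A':[5, 5, 5, 1, 2, 8],     # Same example DataFrame as above
                      'B':[5, 5, 1, 8, 9, 2],
                      'C':['a', 'a', 'c', 'd', 'e', 'f']})

dup_mask = my_df.duplicated()                     # True for each repeat of an earlier row
print(dup_mask)                                   # Display boolean Series
# 0    False
# 1     True
# 2    False
# 3    False
# 4    False
# 5    False
# dtype: bool

dup_mask_all = my_df.duplicated(keep=False)       # True for every row that has a twin
print(my_df[dup_mask_all])                        # Display all rows involved in duplication
#    A  B  C
# 0  5  5  a
# 1  5  5  a
```

Only row 1 is flagged by default, because row 0 counts as the first occurrence. Rows 0 and 2 share the value 5 in column A, but they differ in columns B and C, so they are not duplicates of each other.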

Example: Removing Duplicate Rows in pandas DataFrame Using drop_duplicates() Function

my_df = my_df.drop_duplicates()                  # Drop duplicate rows (first occurrence is kept)
print(my_df)                                     # Display updated DataFrame
#    A  B  C
# 0  5  5  a
# 2  5  1  c
# 3  1  8  d
# 4  2  9  e
# 5  8  2  f
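
Note that the row index of the result above skips 1, because drop_duplicates() keeps the original index labels by default. The sketch below shows a few optional parameters of drop_duplicates() that may be useful here: ignore_index (assuming pandas 1.0 or later), subset, and keep. The names example_df, dedup_reset, dedup_by_a, and no_repeats are used only for this illustration, and the example data is rebuilt so the block runs on its own.

```python
import pandas as pd                               # Import pandas

example_df = pd.DataFrame({'A':[5, 5, 5, 1, 2, 8],   # Rebuild the original example data
                           'B':[5, 5, 1, 8, 9, 2],
                           'C':['a', 'a', 'c', 'd', 'e', 'f']})

dedup_reset = example_df.drop_duplicates(ignore_index=True)   # Renumber index 0..n-1
print(dedup_reset)
#    A  B  C
# 0  5  5  a
# 1  5  1  c
# 2  1  8  d
# 3  2  9  e
# 4  8  2  f

dedup_by_a = example_df.drop_duplicates(subset=['A'], keep='last')   # Compare column A only, keep last match
print(dedup_by_a)
#    A  B  C
# 2  5  1  c
# 3  1  8  d
# 4  2  9  e
# 5  8  2  f

no_repeats = example_df.drop_duplicates(keep=False)   # Drop every row that has a duplicate
print(no_repeats)
#    A  B  C
# 2  5  1  c
# 3  1  8  d
# 4  2  9  e
# 5  8  2  f
```

With subset=['A'], rows 0, 1, and 2 all count as duplicates because they share A = 5, so only the last of them survives. With keep=False, both copies of the repeated row are removed rather than one.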
