w3resource

Pandas: Remove the duplicates of a specific column in a given dataframe

Pandas Filter: Exercise-5 with Solution

Write a Pandas program to remove the duplicates from the 'WHO region' column of the World alcohol consumption dataset, keeping one row per region.

Test Data:

   Year       WHO region                Country Beverage Types  Display Value
0  1986  Western Pacific               Viet Nam           Wine           0.00
1  1986         Americas                Uruguay          Other           0.50
2  1985           Africa          Côte d'Ivoire           Wine           1.62
3  1986         Americas               Colombia           Beer           4.27
4  1987         Americas  Saint Kitts and Nevis           Beer           1.98

Sample Solution:

Python Code:

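import pandas as pd

# Load the World alcohol consumption data (world_alcohol.csv must be in the working directory)
w_a_con = pd.read_csv('world_alcohol.csv')
print("World alcohol consumption sample data:")
print(w_a_con.head())

# Keep only the first row for each distinct value in the 'WHO region' column
print("\nAfter removing the duplicates of WHO region column:")
print(w_a_con.drop_duplicates(subset='WHO region'))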

Sample Output:

World alcohol consumption sample data:
   Year       WHO region      ...      Beverage Types Display Value
0  1986  Western Pacific      ...                Wine          0.00
1  1986         Americas      ...               Other          0.50
2  1985           Africa      ...                Wine          1.62
3  1986         Americas      ...                Beer          4.27
4  1987         Americas      ...                Beer          1.98

[5 rows x 5 columns]

After removing the duplicates of WHO region column:
    Year             WHO region      ...      Beverage Types Display Value
0   1986        Western Pacific      ...                Wine          0.00
1   1986               Americas      ...               Other          0.50
2   1985                 Africa      ...                Wine          1.62
13  1984  Eastern Mediterranean      ...               Other          0.00
18  1984                 Europe      ...             Spirits          1.62
20  1986        South-East Asia      ...                Wine          0.00

[6 rows x 5 columns]
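
The output above keeps the first row seen for each region, which is the default behaviour of drop_duplicates. The following is a minimal sketch of how the `keep` parameter changes which rows survive. It assumes only the five rows shown under Test Data, rebuilt in memory so that no CSV file is needed; the variable names `df`, `first`, `last`, and `unique_only` are illustrative and not part of the original solution.

```python
import pandas as pd

# The five Test Data rows, rebuilt by hand (an assumption for illustration;
# the real exercise reads the full world_alcohol.csv file instead)
df = pd.DataFrame({
    "Year": [1986, 1986, 1985, 1986, 1987],
    "WHO region": ["Western Pacific", "Americas", "Africa", "Americas", "Americas"],
    "Country": ["Viet Nam", "Uruguay", "Côte d'Ivoire", "Colombia", "Saint Kitts and Nevis"],
    "Beverage Types": ["Wine", "Other", "Wine", "Beer", "Beer"],
    "Display Value": [0.00, 0.50, 1.62, 4.27, 1.98],
})

# keep="first" is the default: the first row of each region is retained
first = df.drop_duplicates(subset="WHO region")

# keep="last": the last row of each region is retained
last = df.drop_duplicates(subset="WHO region", keep="last")

# keep=False: every region that occurs more than once is dropped entirely
unique_only = df.drop_duplicates(subset="WHO region", keep=False)

print(first.index.tolist())        # [0, 1, 2]
print(last.index.tolist())         # [0, 2, 4]
print(unique_only.index.tolist())  # [0, 2]
```

In each case the original row labels are preserved, which is why the sample output above shows index values such as 13, 18 and 20 rather than a fresh 0 to 5 range.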

Click to download world_alcohol.csv


Python: Tips of the Day

Merging two dicts in Python 3.5+ with a single expression

Example:

# How to merge two dictionaries
# in Python 3.5+

x = {'p': 1, 'q': 3}
y = {'q': 5, 'r': 8}

z = {**x, **y}

# z is now {'p': 1, 'q': 5, 'r': 8}: for the shared key 'q', the value from y wins

z = dict(x, **y)
print(z)

Output:

{'p': 1, 'q': 5, 'r': 8}
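
The two forms in the tip are not fully interchangeable. As a small sketch, using the same `x` and `y` plus two hypothetical dictionaries `a` and `b` with integer keys (introduced here only for illustration): unpacking with `{**a, **b}` accepts any hashable key, while `dict(a, **b)` passes the second dictionary as keyword arguments and so requires string keys.

```python
x = {'p': 1, 'q': 3}
y = {'q': 5, 'r': 8}

# Unpacking builds a new dict; later values overwrite earlier ones
merged = {**x, **y}
print(merged)  # {'p': 1, 'q': 5, 'r': 8}

# The inputs are left unchanged
print(x)  # {'p': 1, 'q': 3}

# Hypothetical dictionaries with non-string keys
a = {1: 'one'}
b = {2: 'two'}

# Unpacking works with any hashable keys
unpacked = {**a, **b}
print(unpacked)  # {1: 'one', 2: 'two'}

# dict(a, **b) treats b's keys as keyword names, so integer keys are rejected
try:
    dict(a, **b)
    keyword_merge_ok = True
except TypeError:
    keyword_merge_ok = False
print(keyword_merge_ok)  # False
```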