w3resource

Pandas Series: drop_duplicates() function

Remove duplicate values from a Pandas Series

The drop_duplicates() function returns a Pandas Series with duplicate values removed.

Syntax:

Series.drop_duplicates(keep='first', inplace=False)

Parameters:

Name: keep
Description: Determines which duplicates (if any) to keep.
  • ‘first’ : Drop duplicates except for the first occurrence.
  • ‘last’ : Drop duplicates except for the last occurrence.
  • False : Drop every occurrence of any duplicated value.
Type: {‘first’, ‘last’, False}
Default Value: ‘first’
Required / Optional: Optional

Name: inplace
Description: If True, performs the operation in place and returns None.
Type: bool
Default Value: False
Required / Optional: Optional

Returns: Series or None
Series with duplicates dropped, or None if inplace=True.
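
Example - A minimal sketch of the two return behaviours described above. The variable names deduped and result are illustrative and are not part of the pandas API:

Python-Pandas Code:

```python
import pandas as pd

s = pd.Series(['cat', 'cow', 'cat'], name='animal')

# Default call: a new Series is returned and s is left unchanged.
deduped = s.drop_duplicates()
print(type(deduped).__name__)   # Series
print(len(s), len(deduped))     # 3 2

# In-place call: s itself is modified and the call returns None.
result = s.drop_duplicates(inplace=True)
print(result)                   # None
print(len(s))                   # 2
```

Because the in-place form returns None, assigning its result back to s would replace the Series with None.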

Example - Generate a Series with duplicated entries:

Python-Pandas Code:

import pandas as pd

# Build a Series in which 'cat' appears three times
s = pd.Series(['cat', 'cow', 'cat', 'dog', 'cat', 'fox'],
              name='animal')
s

Output:

0    cat
1    cow
2    cat
3    dog
4    cat
5    fox
Name: animal, dtype: object

Example - The ‘keep’ parameter controls which occurrence in each set of duplicated entries is retained. The default value, ‘first’, keeps the first occurrence:

Python-Pandas Code:

import pandas as pd

s = pd.Series(['cat', 'cow', 'cat', 'dog', 'cat', 'fox'],
              name='animal')
# keep defaults to 'first': the first occurrence of each value is retained
s.drop_duplicates()

Output:

0    cat
1    cow
3    dog
5    fox
Name: animal, dtype: object
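
Example - As the output above shows, the surviving entries keep their original index labels (0, 1, 3, 5). If consecutive labels are wanted, one option is to chain reset_index(drop=True). This is a sketch of that follow-up step, and the names unique and renumbered are illustrative:

Python-Pandas Code:

```python
import pandas as pd

s = pd.Series(['cat', 'cow', 'cat', 'dog', 'cat', 'fox'],
              name='animal')

# drop_duplicates() preserves the labels of the rows it keeps
unique = s.drop_duplicates()
print(unique.index.tolist())        # [0, 1, 3, 5]

# reset_index(drop=True) discards the old labels and renumbers from 0
renumbered = unique.reset_index(drop=True)
print(renumbered.index.tolist())    # [0, 1, 2, 3]
print(renumbered.tolist())          # ['cat', 'cow', 'dog', 'fox']
```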

Example - The value ‘last’ for parameter ‘keep’ keeps the last occurrence for each set of duplicated entries:

Python-Pandas Code:

import pandas as pd

s = pd.Series(['cat', 'cow', 'cat', 'dog', 'cat', 'fox'],
              name='animal')
# keep='last': only the final occurrence of each value is retained
s.drop_duplicates(keep='last')

Output:

1    cow
3    dog
4    cat
5    fox
Name: animal, dtype: object
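
Example - The same selection can be expressed with the boolean mask returned by Series.duplicated(), which marks the entries that drop_duplicates() would discard. The sketch below compares the two approaches for keep=‘last’. The names mask, via_mask and via_method are illustrative:

Python-Pandas Code:

```python
import pandas as pd

s = pd.Series(['cat', 'cow', 'cat', 'dog', 'cat', 'fox'],
              name='animal')

# duplicated(keep='last') is True for every occurrence except the last one
mask = s.duplicated(keep='last')
print(mask.tolist())   # [True, False, True, False, False, False]

# Selecting the rows that are NOT marked gives the same result
via_mask = s[~mask]
via_method = s.drop_duplicates(keep='last')
print(via_mask.equals(via_method))   # True
```

The mask form can be useful when the duplicated rows themselves need to be inspected before they are dropped.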

Example - The value False for parameter ‘keep’ discards every entry whose value occurs more than once. Setting ‘inplace’ to True modifies the Series in place, and the call returns None:

Python-Pandas Code:

import pandas as pd

s = pd.Series(['cat', 'cow', 'cat', 'dog', 'cat', 'fox'],
              name='animal')
# keep=False drops every repeated value; inplace=True modifies s and returns None
s.drop_duplicates(keep=False, inplace=True)
s

Output:

1    cow
3    dog
5    fox
Name: animal, dtype: object
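
Example - With keep=False, the result contains only the values that occur exactly once. The sketch below uses the non-inplace form and cross-checks it against value_counts(). The names singles, counts and once are illustrative:

Python-Pandas Code:

```python
import pandas as pd

s = pd.Series(['cat', 'cow', 'cat', 'dog', 'cat', 'fox'],
              name='animal')

# Without inplace=True the original Series is left untouched
singles = s.drop_duplicates(keep=False)
print(singles.tolist())   # ['cow', 'dog', 'fox']
print(len(s))             # 6

# Cross-check: values whose count is exactly 1
counts = s.value_counts()
once = sorted(counts[counts == 1].index)
print(once)               # ['cow', 'dog', 'fox']
```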
