w3resource

Pandas Series: replace() function

Replace Pandas series values given in to_replace with value

The replace() function is used to replace values given in to_replace with value.

Values of the Series are replaced with other values dynamically. This differs from updating with .loc or .iloc, which require you to specify a location to update with some value.
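
To make the contrast concrete, here is a minimal sketch (the sample data and variable names are illustrative and not part of the original page) that performs the same update once by value with replace() and once by location with .loc:

```python
import pandas as pd

s = pd.Series([0, 2, 0, 4, 0])

# Value-based: replace() finds every 0 without being told where it is.
by_value = s.replace(0, 6)

# Location-based: .loc needs the index labels of the cells to update.
by_location = s.copy()
by_location.loc[[0, 2, 4]] = 6

print(by_value.tolist())
print(by_location.tolist())
```

Both calls produce [6, 2, 6, 4, 6], but only the .loc version required knowing that the zeros sit at labels 0, 2 and 4.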

Syntax:

Series.replace(self, to_replace=None, value=None, inplace=False, limit=None, regex=False, method='pad')

Parameters:

to_replace
  Description: Values that will be replaced. Accepted forms:
    • numeric, str or regex
    • list of str, regex, or numeric
    • dict
    • None
  Type: str, regex, list, dict, Series, int, float, or None
  Default Value: None
  Required / Optional: Optional

value
  Description: Value to replace any values matching to_replace with. For a DataFrame, a dict of values can be used to specify which value to use for each column (columns not in the dict will not be filled). Regular expressions, strings, and lists or dicts of such objects are also allowed.
  Type: scalar, dict, list, str, regex
  Default Value: None
  Required / Optional: Optional

inplace
  Description: If True, performs the replacement in place. Note: this will modify any other views on this object (e.g. a column from a DataFrame). Returns the caller if this is True.
  Type: bool
  Default Value: False
  Required / Optional: Optional

limit
  Description: Maximum size gap to forward or backward fill.
  Type: int
  Default Value: None
  Required / Optional: Optional

regex
  Description: Whether to interpret to_replace and/or value as regular expressions. If this is True, then to_replace must be a string. Alternatively, this could be a regular expression or a list, dict, or array of regular expressions, in which case to_replace must be None.
  Type: bool or same types as to_replace
  Default Value: False
  Required / Optional: Optional

method
  Description: The method to use for replacement when to_replace is a scalar, list or tuple and value is None.
  Type: {'pad', 'ffill', 'bfill', None}
  Default Value: 'pad'
  Required / Optional: Optional

Returns: Series - Object after replacement.
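
As a small illustration of the return value and the inplace flag described above, the following sketch (variable names are illustrative) shows that the default call leaves the original Series untouched, while inplace=True changes it:

```python
import pandas as pd

s = pd.Series([0, 2, 3, 4, 5])

# Default (inplace=False): a new Series is returned, s is left untouched.
replaced = s.replace(0, 6)
original_after_call = s.tolist()

# inplace=True: s itself is modified.
s.replace(0, 6, inplace=True)
modified_in_place = s.tolist()

print(replaced.tolist())
print(original_after_call)
print(modified_in_place)
```

The first and third lines printed are [6, 2, 3, 4, 5]; the second is still [0, 2, 3, 4, 5].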

Raises:
AssertionError

  • If regex is not a bool and to_replace is not None.

TypeError

  • If to_replace is a dict and value is not a list, dict, ndarray, or Series
  • If to_replace is None and regex is not compilable into a regular expression or is a list, dict, ndarray, or Series.
  • When replacing multiple bool or datetime64 objects and the arguments to to_replace do not match the type of the value being replaced

ValueError

  • If a list or an ndarray is passed to to_replace and value but they are not the same length.
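
The ValueError case can be reproduced with a short sketch such as the one below. The sample data is illustrative, and the exact error message may vary between pandas versions, so it is printed rather than assumed:

```python
import pandas as pd

s = pd.Series([0, 2, 3, 4, 5])

try:
    # Two values to replace, but only one replacement value in the list.
    s.replace([0, 2], [10])
    raised = False
except ValueError as exc:
    raised = True
    print(f"ValueError: {exc}")
```

Passing lists of equal length, for example s.replace([0, 2], [10, 20]), avoids the error.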

 

Example - Scalar 'to_replace' and 'value':

Python-Pandas Code:

import numpy as np
import pandas as pd
s = pd.Series([0, 2, 3, 4, 5])
s.replace(0, 6)

Output:

0    6
1    2
2    3
3    4
4    5
dtype: int64

Python-Pandas Code:

import numpy as np
import pandas as pd
df = pd.DataFrame({'X': [0, 2, 3, 4, 5],
                   'Y': [6, 7, 8, 9, 1],
                   'Z': ['p', 'q', 'r', 's', 't']})
df.replace(0, 5)

Output:

  X	Y	Z
0	5	6	p
1	2	7	q
2	3	8	r
3	4	9	s
4	5	1	t

Example - List-like 'to_replace':

Python-Pandas Code:

import numpy as np
import pandas as pd
df = pd.DataFrame({'X': [0, 2, 3, 4, 5],
                   'Y': [6, 7, 8, 9, 1],
                   'Z': ['p', 'q', 'r', 's', 't']})
df.replace([0, 2, 3, 4], 5)

Output:

  X	Y	Z
0	5	6	p
1	5	7	q
2	5	8	r
3	5	9	s
4	5	1	t

Python-Pandas Code:

import numpy as np
import pandas as pd
df = pd.DataFrame({'X': [0, 2, 3, 4, 5],
                   'Y': [6, 7, 8, 9, 1],
                   'Z': ['p', 'q', 'r', 's', 't']})
df.replace([0, 2, 3, 4], [4, 3, 2, 1])

Output:

  X	Y	Z
0	4	6	p
1	3	7	q
2	2	8	r
3	1	9	s
4	5	1	t

Python-Pandas Code:

import numpy as np
import pandas as pd
s = pd.Series([0, 2, 3, 4, 5])
s.replace([2, 3], method='bfill')

Output:

0    0
1    4
2    4
3    4
4    5
dtype: int64

Example - dict-like 'to_replace':

Python-Pandas Code:

import numpy as np
import pandas as pd
df = pd.DataFrame({'X': [0, 2, 3, 4, 5],
                   'Y': [6, 7, 8, 9, 1],
                   'Z': ['p', 'q', 'r', 's', 't']})
df.replace({0: 20, 1: 80})

Output:

  X	Y	Z
0	20	6	p
1	2	7	q
2	3	8	r
3	4	9	s
4	5	80	t

Python-Pandas Code:

import numpy as np
import pandas as pd
df = pd.DataFrame({'X': [0, 2, 3, 4, 5],
                   'Y': [6, 7, 8, 9, 1],
                   'Z': ['p', 'q', 'r', 's', 't']})
df.replace({'X': 0, 'Y': 6}, 80)

Output:

  X	Y	Z
0	80	80	p
1	2	7	q
2	3	8	r
3	4	9	s
4	5	1	t

Python-Pandas Code:

import numpy as np
import pandas as pd
df = pd.DataFrame({'X': [0, 2, 3, 4, 5],
                   'Y': [6, 7, 8, 9, 1],
                   'Z': ['p', 'q', 'r', 's', 't']})
df.replace({'X': {0: 100, 3: 200}})

Output:

  X	Y	Z
0	100	6	p
1	2	7	q
2	200	8	r
3	4	9	s
4	5	1	t
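
The dict examples above use a DataFrame. Since this page is about Series.replace(), here is a minimal sketch of the same idea applied to a Series (illustrative data, reusing the values of column X):

```python
import pandas as pd

s = pd.Series([0, 2, 3, 4, 5])

# Each key is a value to look for; the mapped value is its replacement.
result = s.replace({0: 100, 3: 200})
print(result.tolist())
```

This prints [100, 2, 200, 4, 5]. For a Series there are no column names, so the dict maps old values directly to new ones.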

Example - Regular expression 'to_replace':

Python-Pandas Code:

import numpy as np
import pandas as pd
df = pd.DataFrame({'X': ['bbb', 'fff', 'bii'],
                   'Y': ['abc', 'brr', 'pqr']})
df.replace(to_replace=r'^ba.$', value='new', regex=True)

Output (unchanged, because no value in the DataFrame matches ^ba.$):

   X	 Y
0	bbb	abc
1	fff	brr
2	bii	pqr
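
A minimal sketch with hypothetical data (not taken from the original example) in which the same pattern does find matches:

```python
import pandas as pd

df = pd.DataFrame({'X': ['bat', 'fff', 'bait'],
                   'Y': ['abc', 'bar', 'pqr']})

# ^ba.$ matches strings of exactly three characters that start with 'ba',
# so 'bat' and 'bar' are replaced while 'bait' (four characters) is not.
result = df.replace(to_replace=r'^ba.$', value='new', regex=True)
print(result)
```

Column X becomes new, fff, bait and column Y becomes abc, new, pqr.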

Python-Pandas Code:

import numpy as np
import pandas as pd
df = pd.DataFrame({'X': ['bbb', 'fff', 'bii'],
                   'Y': ['abc', 'brr', 'pqr']})
df.replace({'X': r'^ba.$'}, {'X': 'new'}, regex=True)

Output (unchanged, because no value in column X matches ^ba.$):

    X	  Y
0	bbb	abc
1	fff	brr
2	bii	pqr

Python-Pandas Code:

import numpy as np
import pandas as pd
df = pd.DataFrame({'X': ['bbb', 'fff', 'bii'],
                   'Y': ['abc', 'brr', 'pqr']})
df.replace(regex=r'^ba.$', value='new')

Output (unchanged, because no value in the DataFrame matches ^ba.$):

    X	  Y
0	bbb	abc
1	fff	brr
2	bii	pqr

Python-Pandas Code:

import numpy as np
import pandas as pd
df = pd.DataFrame({'X': ['bbb', 'fff', 'bii'],
                   'Y': ['abc', 'brr', 'pqr']})
df.replace(regex={r'^ba.$': 'new', 'fff': 'pqr'})

Output:

    X	  Y
0	bbb	abc
1	pqr	brr
2	bii	pqr

Python-Pandas Code:

import numpy as np
import pandas as pd
df = pd.DataFrame({'X': ['bbb', 'fff', 'bii'],
                   'Y': ['abc', 'brr', 'pqr']})
df.replace(regex=[r'^ba.$', 'fff'], value='new')

Output:

    X	  Y
0	bbb	abc
1	new	brr
2	bii	pqr

Note that when replacing multiple bool or datetime64 objects, the data types in the to_replace parameter must match the data type of the value being replaced:

Python-Pandas Code:

import numpy as np
import pandas as pd
df = pd.DataFrame({'X': [True, False, True],
                   'Y': [False, True, False]})
df.replace({'a string': 'new value', True: False})  # raises TypeError

Output:

Traceback (most recent call last):
    ...
TypeError: Cannot compare types 'ndarray(dtype=bool)' and 'str'

This raises a TypeError because one of the dict keys is not of the correct type for replacement.

Compare the behavior of s.replace({'p': None}) and s.replace('p', None) to understand the peculiarities of the to_replace parameter:

Python-Pandas Code:

import numpy as np
import pandas as pd
s = pd.Series([10, 'p', 'p', 'q', 'p'])

When a dict is passed as to_replace, the values in the dict act as the value parameter.

Example - s.replace({'p': None}) is equivalent to s.replace(to_replace={'p': None}, value=None, method=None):

Python-Pandas Code:

import numpy as np
import pandas as pd
s = pd.Series([10, 'p', 'p', 'q', 'p'])
s.replace({'p': None})

Output:

0      10
1    None
2    None
3       q
4    None
dtype: object

When value=None and to_replace is a scalar, list or tuple, replace() uses the method parameter (default 'pad') to do the replacement. This is why, in the example below, the 'p' values are replaced by 10 in rows 1 and 2 and by 'q' in row 4.

Example - The command s.replace('p', None) is actually equivalent to s.replace(to_replace='p', value=None, method='pad'):

Python-Pandas Code:

import numpy as np
import pandas as pd
s = pd.Series([10, 'p', 'p', 'q', 'p'])
s.replace('p', None)

Output:

0    10
1    10
2    10
3     q
4     q
dtype: object
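
'pad' is another name for forward fill, so the result above can also be obtained by spelling the fill out explicitly. The sketch below is an assumed equivalent formulation using mask() and ffill(), which the original example does not use; it may be preferable on newer pandas versions where the method parameter is deprecated:

```python
import pandas as pd

s = pd.Series([10, 'p', 'p', 'q', 'p'])

# Blank out every 'p', then fill each gap with the last preceding value.
filled = s.mask(s == 'p').ffill()
print(filled.tolist())
```

This prints [10, 10, 10, 'q', 'q'], matching the output of s.replace('p', None) shown above.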
