w3resource

Pandas Series: interpolate() function

Fill NA/missing values in a Pandas series

The interpolate() function fills NaN values in a Series with values estimated from the surrounding data, using the interpolation method you choose (linear by default).

Syntax:

Series.interpolate(method='linear', axis=0, limit=None, inplace=False, limit_direction='forward', limit_area=None, downcast=None, **kwargs)

Parameters:

Name Description Type/Default Value Required / Optional
method Interpolation technique to use. One of:
  • ‘linear’: Ignore the index and treat the values as equally spaced. This is the only method supported on MultiIndexes.
  • ‘time’: Works on daily and higher resolution data to interpolate given length of interval.
  • ‘index’, ‘values’: use the actual numerical values of the index.
  • ‘pad’: Fill in NaNs using existing values.
  • ‘nearest’, ‘zero’, ‘slinear’, ‘quadratic’, ‘cubic’, ‘spline’, ‘barycentric’, ‘polynomial’: Passed to scipy.interpolate.interp1d. These methods use the numerical values of the index. Both ‘polynomial’ and ‘spline’ require that you also specify an order (int), e.g. df.interpolate(method='polynomial', order=5).
  • ‘krogh’, ‘piecewise_polynomial’, ‘spline’, ‘pchip’, ‘akima’: Wrappers around the SciPy interpolation methods of similar names. See Notes.
  • ‘from_derivatives’: Refers to scipy.interpolate.BPoly.from_derivatives which replaces ‘piecewise_polynomial’ interpolation method in scipy 0.18.
New in version 0.18.1: Added support for the ‘akima’ method. Added interpolate method ‘from_derivatives’ which replaces ‘piecewise_polynomial’ in SciPy 0.18; backwards-compatible with SciPy < 0.18
str
Default Value: ‘linear’
Optional
axis Axis to interpolate along. {0 or ‘index’, 1 or ‘columns’, None}
Default Value: 0
Optional
limit Maximum number of consecutive NaNs to fill. Must be greater than 0. int Optional
inplace Update the data in place if possible. bool
Default Value: False
Optional
limit_direction If limit is specified, consecutive NaNs will be filled in this direction. {‘forward’, ‘backward’, ‘both’}
Default Value: ‘forward’
Optional
limit_area If limit is specified, consecutive NaNs will be filled with this restriction.
  • None: No fill restriction.
  • ‘inside’: Only fill NaNs surrounded by valid values (interpolate).
  • ‘outside’: Only fill NaNs outside valid values (extrapolate).
{None, ‘inside’, ‘outside’}
Default Value: None
Optional
downcast Downcast dtypes if possible. ‘infer’ or None
Default Value: None
Optional
**kwargs Keyword arguments to pass on to the interpolating function. Optional
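
The way limit, limit_direction and limit_area interact is easier to see on a small Series. The following is a minimal sketch, assuming a recent pandas version and the default linear method; the Series values and the variable names are chosen only for illustration.

```python
import numpy as np
import pandas as pd

s = pd.Series([np.nan, np.nan, 5, np.nan, np.nan, np.nan, 13, np.nan, np.nan])

# Forward (the default): at most one NaN is filled after each valid value.
forward = s.interpolate(limit=1)

# Both directions: at most one NaN is filled on each side of a valid value.
both = s.interpolate(limit=1, limit_direction='both')

# Same as above, but only NaNs that lie between valid values are filled.
inside = s.interpolate(limit=1, limit_direction='both', limit_area='inside')

# Show the three results side by side with the original.
print(pd.DataFrame({'original': s, 'forward': forward,
                    'both': both, 'inside': inside}))
```

With limit=1 the long gap in the middle is never filled completely, and limit_area='inside' leaves the leading and trailing NaNs untouched.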

Returns: Series or DataFrame - Returns the same object type as the caller, interpolated at some or all NaN values.
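
As a quick check of the return type, the sketch below calls interpolate() with default arguments on a Series and on a one-column DataFrame built from it. The column name 'v' is a placeholder chosen for this illustration.

```python
import numpy as np
import pandas as pd

s = pd.Series([0, 2, np.nan, 5])
filled = s.interpolate()

# A Series in gives a Series out; the caller keeps its NaN
# because inplace defaults to False.
print(type(filled).__name__)
print(s.isna().sum(), filled.isna().sum())

# The same call on a DataFrame gives a DataFrame back.
frame_filled = s.to_frame('v').interpolate()
print(type(frame_filled).__name__)
```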

Notes

The ‘krogh’, ‘piecewise_polynomial’, ‘spline’, ‘pchip’ and ‘akima’ methods are wrappers around the respective SciPy implementations of similar names. These use the actual numerical values of the index.
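
To see what "using the actual numerical values of the index" means in practice, the sketch below compares ‘linear’ with ‘index’ and ‘time’. These two methods also honor the index spacing but, as far as this sketch assumes, do not need SciPy. The data and variable names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Unevenly spaced numeric index: the missing point sits at 1, close to 0.
s = pd.Series([0.0, np.nan, 10.0], index=[0, 1, 10])
by_position = s.interpolate(method='linear')  # treats values as equally spaced
by_index = s.interpolate(method='index')      # weights by distance along the index

# Unevenly spaced dates: one day before the missing point, two days after it.
t = pd.Series([1.0, np.nan, 4.0],
              index=pd.to_datetime(['2021-01-01', '2021-01-02', '2021-01-04']))
by_time = t.interpolate(method='time')

print(by_position[1], by_index[1], by_time.iloc[1])
```

‘linear’ places the missing value halfway between its neighbours (5.0), while ‘index’ gives 1.0 because index 1 is one tenth of the way from 0 to 10. With ‘time’ the missing value is 2.0, one third of the way from 1.0 to 4.0.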

Example - Filling in NaN in a Series via linear interpolation:

Python-Pandas Code:

import numpy as np
import pandas as pd
s = pd.Series([0, 2, np.nan, 5])
s

Output:

0    0.0
1    2.0
2    NaN
3    5.0
dtype: float64

Python-Pandas Code:

import numpy as np
import pandas as pd
s = pd.Series([0, 2, np.nan, 5])
s.interpolate()

Output:

0    0.0
1    2.0
2    3.5
3    5.0
dtype: float64

Example - Filling in NaN in a Series by padding, but filling at most two consecutive NaN at a time:

Python-Pandas Code:

import numpy as np
import pandas as pd
s = pd.Series([np.nan, "single_one", np.nan,
               "fill_two_more", np.nan, np.nan,
               3.71, np.nan])
s

Output:

0              NaN
1       single_one
2              NaN
3    fill_two_more
4              NaN
5              NaN
6             3.71
7              NaN
dtype: object

Python-Pandas Code:

import numpy as np
import pandas as pd
s = pd.Series([np.nan, "single_one", np.nan,
               "fill_two_more", np.nan, np.nan,
               3.71, np.nan])
s.interpolate(method='pad', limit=2)

Output:

0              NaN
1       single_one
2       single_one
3    fill_two_more
4    fill_two_more
5    fill_two_more
6             3.71
7             3.71
dtype: object
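
Padding does not compute new values; it only carries the last valid value forward. In newer pandas versions this kind of fill is usually written with Series.ffill(). The following is a minimal sketch of that alternative, assuming ffill() with a limit argument is available in your pandas version.

```python
import numpy as np
import pandas as pd

s = pd.Series([np.nan, "single_one", np.nan,
               "fill_two_more", np.nan, np.nan,
               3.71, np.nan])

# Forward-fill, carrying each valid value over at most two consecutive NaNs.
padded = s.ffill(limit=2)
print(padded)
```

The first entry stays NaN because there is no earlier value to carry forward.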

Example - Filling in NaN in a Series via polynomial interpolation or splines: Both ‘polynomial’ and ‘spline’ methods require that you also specify an order (int):

Python-Pandas Code:

import numpy as np
import pandas as pd
s = pd.Series([0, 4, np.nan, 8])
s.interpolate(method='polynomial', order=2)

Output:

0    0.000000
1    4.000000
2    6.666667
3    8.000000
dtype: float64
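
The value 6.666667 can be checked independently. With three known points and order=2 there is exactly one quadratic through them, so fitting that quadratic with NumPy and evaluating it at the missing position should give the same number. This is only a cross-check under that assumption, not a description of what pandas calls internally.

```python
import numpy as np

# Known points of the Series [0, 4, NaN, 8] as (index, value) pairs.
x_known = np.array([0.0, 1.0, 3.0])
y_known = np.array([0.0, 4.0, 8.0])

# Three points determine exactly one polynomial of degree 2.
coeffs = np.polyfit(x_known, y_known, deg=2)

# Evaluate that polynomial at the missing position, index 2.
estimate = np.polyval(coeffs, 2.0)
print(round(float(estimate), 6))
```

Solving by hand gives the same quadratic, f(x) = -(2/3)x^2 + (14/3)x, and f(2) = 20/3.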

Example - Fill the DataFrame forward (that is, going down) along each column using linear interpolation:

Note how the last entry in column ‘p’ is interpolated differently, because there is no entry after it to use for interpolation. Note how the first entry in column ‘q’ remains NaN, because there is no entry before it to use for interpolation.

Python-Pandas Code:

import numpy as np
import pandas as pd
df = pd.DataFrame([(0.0, np.nan, -2.0, 2.0),
                   (np.nan, 3.0, np.nan, np.nan),
                   (2.0, 3.0, np.nan, 7.0),
                   (np.nan, 4.0, -4.0, 16.0)],
                  columns=list('pqrs'))
df

Output:

     p    q    r     s
0  0.0  NaN -2.0   2.0
1  NaN  3.0  NaN   NaN
2  2.0  3.0  NaN   7.0
3  NaN  4.0 -4.0  16.0

Python-Pandas Code:

import numpy as np
import pandas as pd
df = pd.DataFrame([(0.0, np.nan, -2.0, 2.0),
                   (np.nan, 3.0, np.nan, np.nan),
                   (2.0, 3.0, np.nan, 7.0),
                   (np.nan, 4.0, -4.0, 16.0)],
                  columns=list('pqrs'))
df.interpolate(method='linear', limit_direction='forward', axis=0)

Output:

     p    q         r     s
0  0.0  NaN -2.000000   2.0
1  1.0  3.0 -2.666667   4.5
2  2.0  3.0 -3.333333   7.0
3  2.0  4.0 -4.000000  16.0
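
The leading NaN in column ‘q’ survives only because the fill direction is forward. The sketch below reuses the same DataFrame and shows two variations: limit_direction='both', which also fills leading NaNs, and axis=1, which interpolates across each row instead of down each column. The variable names are chosen for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([(0.0, np.nan, -2.0, 2.0),
                   (np.nan, 3.0, np.nan, np.nan),
                   (2.0, 3.0, np.nan, 7.0),
                   (np.nan, 4.0, -4.0, 16.0)],
                  columns=list('pqrs'))

# Fill in both directions so the leading NaN in column 'q' is filled as well.
both = df.interpolate(method='linear', limit_direction='both', axis=0)

# Interpolate across each row (left to right) instead of down each column.
across = df.interpolate(method='linear', axis=1)

print(both)
print(across)
```

In the first result the first entry of ‘q’ takes the nearest valid value, 3.0. In the second, the first entry of ‘q’ becomes -1.0, halfway between its row neighbours 0.0 and -2.0.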

Example - Using polynomial interpolation:

Python-Pandas Code:

import numpy as np
import pandas as pd
df = pd.DataFrame([(0.0, np.nan, -2.0, 2.0),
                   (np.nan, 3.0, np.nan, np.nan),
                   (2.0, 3.0, np.nan, 7.0),
                   (np.nan, 4.0, -4.0, 16.0)],
                  columns=list('pqrs'))
df['s'].interpolate(method='polynomial', order=2)

Output:

0     2.000000
1     2.333333
2     7.000000
3    16.000000
Name: s, dtype: float64
