w3resource

Pandas Series: rolling() function

Rolling window calculations in Pandas

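The rolling() function provides rolling window calculations: it returns a window object, and an aggregation such as sum() or mean() called on that object is evaluated once for each position of the moving window.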

Syntax:

Series.rolling(self, window, min_periods=None, center=False, win_type=None, on=None, axis=0, closed=None)

Parameters:

Name | Description | Type / Default Value | Required / Optional
window | Size of the moving window. This is the number of observations used for calculating the statistic. Each window will be a fixed size. If it is an offset, then this will be the time period of each window. Each window will be variable-sized, based on the observations included in the time period. This is only valid for datetimelike indexes. | int or offset | Required
min_periods | Minimum number of observations in the window required to have a value (otherwise the result is NA). For a window that is specified by an offset, min_periods defaults to 1. Otherwise, min_periods defaults to the size of the window. | int, Default Value: None | Optional
center | Set the labels at the center of the window. | bool, Default Value: False | Optional
win_type | Provide a window type. If None, all points are evenly weighted. If given, it must name a valid scipy.signal window function (for example ‘triang’). | str, Default Value: None | Optional
on | For a DataFrame, a datetime-like column on which to calculate the rolling window, rather than the DataFrame’s index. A provided integer column is ignored and excluded from the result, since an integer index is not used to calculate the rolling window. | str | Optional
axis | Axis along which the window rolls. | int or str, Default Value: 0 | Optional
closed | Make the interval closed on the ‘right’, ‘left’, ‘both’ or ‘neither’ endpoints. For offset-based windows, it defaults to ‘right’. For fixed windows, it defaults to ‘both’. Remaining cases are not implemented for fixed windows. | str, Default Value: None | Optional
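
The effect of the center parameter is easiest to see side by side. The following is a minimal sketch, assuming the same values as the examples on this page held in a Series; the variable names right_labelled and center_labelled are illustrative only, and min_periods=1 is chosen here so that every row gets a value.

```python
import numpy as np
import pandas as pd

# Same values as the examples on this page, held in a Series.
s = pd.Series([0, 2, 4, np.nan, 6], name='Q')

# Default (center=False): each result is labelled at the right edge of its window.
right_labelled = s.rolling(3, min_periods=1).sum()

# center=True: each result is labelled at the middle of its window.
center_labelled = s.rolling(3, min_periods=1, center=True).sum()

print(pd.DataFrame({'right': right_labelled, 'center': center_labelled}))
```

With center=True the same window sums appear one row earlier, because the label of a 3-element window moves from its last element to its middle element.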

Returns: a Window or Rolling object, sub-classed for the particular operation
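
As a rough illustration of the return value, the sketch below keeps the returned object in a variable (the name roller is an assumption for this example) and then calls two aggregations on it. Nothing is computed until an aggregation method is called.

```python
import numpy as np
import pandas as pd

s = pd.Series([0, 2, 4, np.nan, 6], name='Q')

# rolling() itself only describes the window; it returns a window object.
roller = s.rolling(2)
print(type(roller).__name__)

# Aggregations are methods of the returned object.
rolling_mean = roller.mean()
rolling_max = roller.max()
print(rolling_mean)
print(rolling_max)
```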

Example:

Python-Pandas Code:

import numpy as np
import pandas as pd
df = pd.DataFrame({'Q': [0, 2, 4, np.nan, 6]})
df

Output:

    Q
0	0.0
1	2.0
2	4.0
3	NaN
4	6.0

Example - Rolling sum with a window length of 2, using the ‘triang’ window type:

Python-Pandas Code:

import numpy as np
import pandas as pd
df = pd.DataFrame({'Q': [0, 2, 4, np.nan, 6]})
df.rolling(2, win_type='triang').sum()

Output:

    Q
0	NaN
1	1.0
2	3.0
3	NaN
4	NaN
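
To see where these numbers come from, the weighted sum can be reproduced by hand. This is only a sketch: it assumes that a triangular window of length 2 gives each of the two points a weight of 0.5 (which is consistent with the output above), and it uses rolling().apply() with NumPy instead of win_type, so SciPy is not needed.

```python
import numpy as np
import pandas as pd

s = pd.Series([0, 2, 4, np.nan, 6], name='Q')

# Assumed weights for a triangular window of length 2: both points weigh 0.5.
weights = np.array([0.5, 0.5])

# Weighted sum of each complete window; raw=True passes plain NumPy arrays.
# Windows with fewer than 2 non-NaN observations yield NaN without calling the function.
weighted = s.rolling(2).apply(lambda w: np.dot(w, weights), raw=True)
print(weighted)
```

Row 1 is 0.5 * 0 + 0.5 * 2 = 1.0 and row 2 is 0.5 * 2 + 0.5 * 4 = 3.0; every other window is incomplete or contains NaN.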

Example - Rolling sum with a window length of 1, min_periods defaults to the window length:

Python-Pandas Code:

import numpy as np
import pandas as pd
df = pd.DataFrame({'Q': [0, 2, 4, np.nan, 6]})
df.rolling(1).sum()

Output:

    Q
0	0.0
1	2.0
2	4.0
3	NaN
4	6.0

Example - Rolling sum with a window length of 2 and min_periods explicitly set to 1:

Python-Pandas Code:

import numpy as np
import pandas as pd
df = pd.DataFrame({'Q': [0, 2, 4, np.nan, 6]})
df.rolling(2, min_periods=1).sum()

Output:

    Q
0	0.0
1	2.0
2	6.0
3	4.0
4	6.0
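
For comparison, the short sketch below runs the same window length of 2 with and without an explicit min_periods. The variable names strict and lenient are illustrative only.

```python
import numpy as np
import pandas as pd

s = pd.Series([0, 2, 4, np.nan, 6], name='Q')

# min_periods defaults to the window length (2): any window that is
# incomplete or contains a NaN produces NaN.
strict = s.rolling(2).sum()

# min_periods=1: one valid observation in the window is enough.
lenient = s.rolling(2, min_periods=1).sum()

print(pd.DataFrame({'default': strict, 'min_periods=1': lenient}))
```

Rows 0, 3 and 4 are NaN with the default setting but receive a value once a single valid observation is accepted.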

Example - A ragged (meaning not-a-regular frequency), time-indexed DataFrame:

Python-Pandas Code:

import numpy as np
import pandas as pd
df = pd.DataFrame({'Q': [0, 2, 4, np.nan, 6]},
                  index = [pd.Timestamp('20190201 09:00:00'),
                           pd.Timestamp('20190201 09:00:02'),
                           pd.Timestamp('20190201 09:00:03'),
                           pd.Timestamp('20190201 09:00:05'),
                           pd.Timestamp('20190201 09:00:06')])
df

Output:

                      Q
2019-02-01 09:00:00	0.0
2019-02-01 09:00:02	2.0
2019-02-01 09:00:03	4.0
2019-02-01 09:00:05	NaN
2019-02-01 09:00:06	6.0

Example - In contrast to an integer rolling window, an offset window rolls a variable-length window corresponding to the time period:

For offset-based windows, min_periods defaults to 1.

Python-Pandas Code:

import numpy as np
import pandas as pd
df = pd.DataFrame({'Q': [0, 2, 4, np.nan, 6]},
                  index = [pd.Timestamp('20190201 09:00:00'),
                           pd.Timestamp('20190201 09:00:02'),
                           pd.Timestamp('20190201 09:00:03'),
                           pd.Timestamp('20190201 09:00:05'),
                           pd.Timestamp('20190201 09:00:06')])
df.rolling('2s').sum()

Output:

                      Q
2019-02-01 09:00:00	0.0
2019-02-01 09:00:02	2.0
2019-02-01 09:00:03	6.0
2019-02-01 09:00:05	NaN
2019-02-01 09:00:06	6.0
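
The closed parameter decides whether the start of each time window is included. The sketch below is one possible illustration, assuming the same timestamps and values as above held in a Series; right_closed and both_closed are names chosen for this example.

```python
import numpy as np
import pandas as pd

idx = pd.to_datetime(['2019-02-01 09:00:00',
                      '2019-02-01 09:00:02',
                      '2019-02-01 09:00:03',
                      '2019-02-01 09:00:05',
                      '2019-02-01 09:00:06'])
s = pd.Series([0, 2, 4, np.nan, 6], index=idx, name='Q')

# Default for offset windows: the window (t - 2s, t] excludes its start point.
right_closed = s.rolling('2s').sum()

# closed='both': the window [t - 2s, t] includes its start point as well.
both_closed = s.rolling('2s', closed='both').sum()

print(pd.DataFrame({'right': right_closed, 'both': both_closed}))
```

The two results differ only at 09:00:05: with closed='both' the observation at 09:00:03 falls inside the window, so the sum is 4.0 instead of NaN.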
