Pandas: Fill missing values in time series data
74. Fill Missing Values in Time Series Data
Write a Pandas program to fill missing values in time series data.
From Wikipedia: in the mathematical field of numerical analysis, interpolation is a type of estimation, a method of constructing new data points within the range of a discrete set of known data points. In pandas, DataFrame.interpolate() fills NaN values this way, using linear interpolation by default.
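To make the definition concrete, here is a minimal sketch of linear interpolation between two known points. The helper name `linear_interpolate` and the sample values are illustrative assumptions, not part of the exercise; `np.interp` is NumPy's built-in piecewise-linear equivalent.

```python
import numpy as np

def linear_interpolate(x, x0, y0, x1, y1):
    """Estimate y at x on the straight line through (x0, y0) and (x1, y1)."""
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

# Known values 7.0 at position 0 and 10.0 at position 2;
# estimate the missing value at position 1 (hypothetical sample data).
estimate = linear_interpolate(1, 0, 7.0, 2, 10.0)
print(estimate)  # 8.5

# np.interp computes the same piecewise-linear estimate.
numpy_estimate = float(np.interp(1, [0, 2], [7.0, 10.0]))
print(numpy_estimate)  # 8.5
```

The estimate lies halfway between the two known values because the unknown point sits halfway between their positions.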
Sample Solution:
Python Code:
import pandas as pd
import numpy as np
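# Two columns with gaps (NaN) to fill
sdata = {"c1": [120, 130, 140, 150, np.nan, 170], "c2": [7, np.nan, 10, np.nan, 5.5, 16.5]}
df = pd.DataFrame(sdata)
# pd.util.testing.makeDateIndex() is deprecated and unavailable in recent pandas releases;
# pd.bdate_range builds the same business-day index starting 2000-01-03
df.index = pd.bdate_range("2000-01-03", periods=6)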
print("Original DataFrame:")
print(df)
print("\nDataFrame after interpolate:")
# interpolate() defaults to method="linear", which treats rows as equally spaced
print(df.interpolate())
Sample Output:
Original DataFrame:
               c1    c2
2000-01-03  120.0   7.0
2000-01-04  130.0   NaN
2000-01-05  140.0  10.0
2000-01-06  150.0   NaN
2000-01-07    NaN   5.5
2000-01-10  170.0  16.5
DataFrame after interpolate:
               c1     c2
2000-01-03  120.0   7.00
2000-01-04  130.0   8.50
2000-01-05  140.0  10.00
2000-01-06  150.0   7.75
2000-01-07  160.0   5.50
2000-01-10  170.0  16.50
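Note that the default linear method ignores the index, so the weekend gap between 2000-01-07 and 2000-01-10 has no effect on the result above. As a hedged sketch, the following compares the default with `method="time"`, which weights each estimate by the elapsed time between timestamps. The variable names `by_position` and `by_time` are illustrative, and the index is rebuilt here with `pd.bdate_range` as an assumption about the intended business-day dates.

```python
import numpy as np
import pandas as pd

sdata = {"c1": [120, 130, 140, 150, np.nan, 170],
         "c2": [7, np.nan, 10, np.nan, 5.5, 16.5]}
# Business-day index: 2000-01-03 .. 2000-01-07, then 2000-01-10 (weekend skipped)
df = pd.DataFrame(sdata, index=pd.bdate_range("2000-01-03", periods=6))

# Default: rows are treated as equally spaced, regardless of their dates
by_position = df.interpolate()

# Time-aware: the estimate depends on how far each date is from its neighbors
by_time = df.interpolate(method="time")

friday = pd.Timestamp("2000-01-07")
print(by_position.loc[friday, "c1"])  # halfway between 150 and 170
print(by_time.loc[friday, "c1"])      # one day into a four-day gap
```

For c1, the missing Friday value sits one day after 150 and three days before 170, so the time-based estimate is closer to 150 than the positional one. The c2 gaps have evenly spaced neighbors, so both methods agree there.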
For more practice, solve these related problems:
- Write a Pandas program to interpolate missing values in a time series DataFrame and then plot the resulting series.
- Write a Pandas program to fill missing time series data using forward fill and then compare with backward fill results.
- Write a Pandas program to apply linear interpolation to missing values in a time-indexed DataFrame and then compute the difference from original data.
- Write a Pandas program to fill missing values in a time series by using a custom interpolation method based on moving averages.
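As a starting point for the forward fill versus backward fill problem above, one possible sketch is shown below. It reuses the c2 values from the sample data; the Series and column names are illustrative assumptions, and this is only one way to lay out the comparison.

```python
import numpy as np
import pandas as pd

# c2 values from the sample data on a business-day index
s = pd.Series([7, np.nan, 10, np.nan, 5.5, 16.5],
              index=pd.bdate_range("2000-01-03", periods=6), name="c2")

# ffill carries the last known value forward; bfill pulls the next known value back
compare = pd.DataFrame({
    "original": s,
    "ffill": s.ffill(),
    "bfill": s.bfill(),
})
print(compare)
```

The two columns differ only on the rows that were missing: forward fill repeats the earlier neighbor, backward fill repeats the later one.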
