Pandas: Clean object column with mixed data of a given DataFrame using regular expression
Pandas: DataFrame Exercise-76 with Solution
Write a Pandas program to clean an object column that contains mixed data (numbers and currency-formatted strings) in a given DataFrame using a regular expression.
Sample Solution:
Python Code:
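import pandas as pd

# Sample data: "purchase" mixes floats with strings, some prefixed by "$"
d = {"agent": ["a001", "a002", "a003", "a003", "a004"], "purchase": [4500.00, 7500.00, "$3000.25", "$1250.35", "9000.00"]}
df = pd.DataFrame(d)
print("Original dataframe:")
print(df)
# Show the Python type of each value in the mixed column
print("\nData Types:")
print(df["purchase"].apply(type))
# Remove "$" and "," from the string values, then cast the whole column to float
df["purchase"] = df["purchase"].replace(r"[$,]", "", regex=True).astype("float")
print("\nNew Data Types:")
print(df["purchase"].apply(type))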
Sample Output:
Original dataframe:
  agent  purchase
0  a001      4500
1  a002      7500
2  a003  $3000.25
3  a003  $1250.35
4  a004   9000.00

Data Types:
0    <class 'float'>
1    <class 'float'>
2      <class 'str'>
3      <class 'str'>
4      <class 'str'>
Name: purchase, dtype: object

New Data Types:
0    <class 'float'>
1    <class 'float'>
2    <class 'float'>
3    <class 'float'>
4    <class 'float'>
Name: purchase, dtype: object
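Alternative sketch:
The solution above raises an error from astype("float") if any value still cannot be parsed after the "$" and "," characters are removed. One possible variant, shown here only as a sketch, strips the same characters with Series.str.replace and converts with pd.to_numeric(errors="coerce"), which turns unparseable entries into NaN instead of failing. The extra row (agent "a005" with the value "n/a") is a hypothetical addition for illustration and is not part of the original exercise.

```python
import pandas as pd

# Exercise data plus one hypothetical unparseable entry ("n/a") for illustration
d = {
    "agent": ["a001", "a002", "a003", "a003", "a004", "a005"],
    "purchase": [4500.00, 7500.00, "$3000.25", "$1250.35", "9000.00", "n/a"],
}
df = pd.DataFrame(d)

# Cast every value to str so the .str accessor applies to all rows,
# then strip "$" and "," using the same character class as the solution.
cleaned = df["purchase"].astype(str).str.replace(r"[$,]", "", regex=True)

# errors="coerce" replaces values that cannot be parsed with NaN
df["purchase"] = pd.to_numeric(cleaned, errors="coerce")

print(df["purchase"].dtype)
print(df)
```

Whether silently coercing bad values to NaN is preferable to raising an error depends on the data: coercion keeps the pipeline running, while the original astype("float") surfaces unexpected values immediately.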