Cleaning text data using str.replace() in Pandas
Pandas: Data Cleaning and Preprocessing Exercise-8 with Solution
Write a Pandas program that handles text data with str.replace().
This exercise shows how to clean text data by replacing specific substrings in a column using str.replace().
Sample Solution:
Code:
import pandas as pd
# Create a sample DataFrame with messy text
df = pd.DataFrame({
    'Product': ['$50-Discount', '$100-Off', '$200-Rebate']
})
# Clean the text by removing special characters like '$' and '-'
df['Product_Cleaned'] = df['Product'].str.replace('[$-]', '', regex=True)
# Output the result
print(df)
Output:
        Product  Product_Cleaned
0  $50-Discount       50Discount
1      $100-Off           100Off
2   $200-Rebate        200Rebate
Explanation:
- Created a DataFrame with text data that contains special characters.
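- Used str.replace() with the regular expression [$-] and regex=True. The pattern is a character class that matches either $ or -, and each match is replaced with an empty string, which removes those characters from the 'Product' column.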
- Added a new column 'Product_Cleaned' with the cleaned text.
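As a point of comparison, the same cleanup can be sketched without regular expressions by chaining two literal replacements with regex=False. This is a minimal illustration rather than part of the original solution; the variable names regex_cleaned and literal_cleaned are made up for the example. Passing regex explicitly is a reasonable habit because the default changed to regex=False in pandas 2.0, and a bare '$' treated as a regex is an end-of-string anchor rather than a dollar sign.

```python
import pandas as pd

# Same sample data as in the solution above
df = pd.DataFrame({
    'Product': ['$50-Discount', '$100-Off', '$200-Rebate']
})

# Regex approach from the solution: inside a character class,
# '$' and '-' are matched as literal characters
regex_cleaned = df['Product'].str.replace('[$-]', '', regex=True)

# Literal approach: regex=False treats each pattern as plain text,
# so no escaping is needed for '$'
literal_cleaned = (
    df['Product']
    .str.replace('$', '', regex=False)
    .str.replace('-', '', regex=False)
)

# Both approaches produce the same cleaned values
print(regex_cleaned.equals(literal_cleaned))
print(literal_cleaned.tolist())
```

The character class is more compact when several characters have to go at once, while the literal form avoids surprises from regex metacharacters when only one or two fixed substrings are involved.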