# Pandas - Detecting and removing outliers in a DataFrame using Z-score

## Pandas: Data Cleaning and Preprocessing Exercise-5 with Solution

Write a Pandas program to handle outliers in a DataFrame using the Z-score method.

This exercise demonstrates how to identify and remove outliers from a DataFrame using the Z-score method.
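For reference, a Z-score expresses how far a value lies from the column mean, measured in standard deviations. The notation below is an assumption added for illustration: $x_i$ is one value, $\bar{x}$ is the column mean, $n$ is the number of rows, and $s$ is the sample standard deviation, which is what pandas `Series.std()` computes by default (`ddof=1`).

$$
z_i = \frac{x_i - \bar{x}}{s}, \qquad s = \sqrt{\frac{1}{n-1}\sum_{j=1}^{n}\left(x_j - \bar{x}\right)^2}
$$

A row is treated as an outlier when $|z_i|$ exceeds a chosen threshold (2 in this exercise).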

**Sample Solution:**

**Code:**

```
import pandas as pd

# Create a sample DataFrame with an intended outlier
df = pd.DataFrame({
    'Name': ['David', 'Annabel', 'Charlie', 'David'],
    'Age': [25, 30, 22, 99]  # 99 is the intended outlier
})

# Calculate Z-scores: (value - mean) / sample standard deviation
mean_age = df['Age'].mean()
std_age = df['Age'].std()  # pandas uses ddof=1 by default
df['Z_Score'] = (df['Age'] - mean_age) / std_age

# Keep rows whose Z-score lies within [-2, 2]; anything outside is an outlier.
# Note: with only four rows, 99 has a Z-score of about 1.49, so it is kept.
df_no_outliers = df[df['Z_Score'].abs() <= 2]

# Drop the helper Z_Score column
df_no_outliers = df_no_outliers.drop(columns='Z_Score')

# Output the result
print(df_no_outliers)
```

Output:

          Name  Age
    0    David   25
    1  Annabel   30
    2  Charlie   22
    3    David   99

**Explanation:**

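- Created a DataFrame with an intended outlier (99) in the 'Age' column.
- Calculated a Z-score for each age: the value minus the column mean (44), divided by the sample standard deviation (about 36.8).
- Kept only the rows whose absolute Z-score is at most 2. With only four rows, 99 has a Z-score of about 1.49, so it passes the filter and still appears in the output. The same filter removes such a value only when the sample is large enough, or the threshold low enough, for its Z-score to exceed the cutoff.
- Dropped the helper 'Z_Score' column and printed the filtered DataFrame.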
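The steps above can be wrapped in a small reusable helper. The sketch below is an illustration rather than part of the original solution: the function name `remove_outliers_zscore`, its `threshold` parameter, and the ten-row `ages` sample are all hypothetical. It uses a larger sample because, with the sample standard deviation, no value among n rows can have an absolute Z-score above (n - 1) / sqrt(n). That bound is 1.5 for four rows, so a threshold of 2 cannot flag anything in the four-row example.

```python
import pandas as pd


def remove_outliers_zscore(df, column, threshold=2.0):
    """Return the rows of df whose Z-score in `column` is within +/- threshold.

    Hypothetical helper for illustration; not part of pandas.
    """
    values = df[column]
    std = values.std()  # sample standard deviation (ddof=1), as in the solution
    # A zero or undefined spread means no Z-scores can be computed; keep all rows.
    if pd.isna(std) or std == 0:
        return df.copy()
    z_scores = (values - values.mean()) / std
    return df[z_scores.abs() <= threshold]


# Hypothetical larger sample: nine typical ages plus the value 99
ages = pd.DataFrame({'Age': [25, 30, 22, 27, 31, 24, 28, 26, 29, 99]})
cleaned = remove_outliers_zscore(ages, 'Age')
print(cleaned)  # 99 has a Z-score of about 2.83 here, so its row is dropped

# The original four-row sample: 99 has a Z-score of about 1.49 and is kept
small = pd.DataFrame({'Age': [25, 30, 22, 99]})
print(remove_outliers_zscore(small, 'Age'))
```

Because the Z-score is computed without adding a column to the DataFrame, there is no helper column to drop afterwards. The threshold is a judgment call: 2 and 3 are common choices, and a smaller sample generally needs a lower cutoff for any value to be flagged.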
