Handling class imbalance using random oversampling in Pandas
Pandas: Machine Learning Integration Exercise-13 with Solution
Write a Pandas program to handle class imbalance using random oversampling.
This exercise shows how to handle class imbalance using random oversampling with the RandomOverSampler class from Imbalanced-learn.
Sample Solution:
Code:
import pandas as pd
from imblearn.over_sampling import RandomOverSampler
# Load the dataset
df = pd.read_csv('data.csv')
# Split into features and target
X = df.drop('Target', axis=1)
y = df['Target']
# Initialize the RandomOverSampler
ros = RandomOverSampler(random_state=42)
# Apply random oversampling to balance the target classes
X_resampled, y_resampled = ros.fit_resample(X, y)
# Output the resampled dataset
print(pd.concat([X_resampled, y_resampled], axis=1))
Output:
   ID      Name   Age  Gender   Salary  Target
0   1      Sara  25.0  Female  50000.0       0
1   2    Ophrah  30.0    Male  60000.0       1
2   3    Torben  22.0    Male  70000.0       0
3   4  Masaharu  35.0    Male  80000.0       1
4   5      Kaya   NaN  Female  55000.0       0
5   6   Abaddon  29.0    Male      NaN       1
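The output above has three rows in each Target class, so the data is already balanced. A quick class count makes this visible. The sketch below retypes the six rows from the output into a DataFrame (the contents of data.csv itself are not shown in the exercise, so this is an assumption) and counts the rows per class. With equal counts, RandomOverSampler with its default settings should have no rows to add.

```python
import pandas as pd

# Rows retyped from the sample output above; missing values are entered as None.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "Name": ["Sara", "Ophrah", "Torben", "Masaharu", "Kaya", "Abaddon"],
    "Age": [25.0, 30.0, 22.0, 35.0, None, 29.0],
    "Gender": ["Female", "Male", "Male", "Male", "Female", "Male"],
    "Salary": [50000.0, 60000.0, 70000.0, 80000.0, 55000.0, None],
    "Target": [0, 1, 0, 1, 0, 1],
})

# Count the rows in each target class.
class_counts = df["Target"].value_counts().sort_index()
print(class_counts)

# The target is balanced when every class has the same number of rows.
is_balanced = class_counts.nunique() == 1
print("Already balanced:", is_balanced)
```

Running the same count on y before and on y_resampled after resampling is a simple way to confirm what the oversampler changed.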
Explanation:
- Loaded the dataset using Pandas.
- Split the data into features (X) and target (y).
- Initialized RandomOverSampler from Imbalanced-learn, which balances the classes by randomly duplicating rows of the minority class until it has as many rows as the majority class.
- Applied oversampling with fit_resample and displayed the resampled dataset.
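To show what random oversampling does to the data, here is a minimal sketch of the same idea in plain Pandas. It is not the Imbalanced-learn implementation. The toy data and the helper random_oversample are hypothetical, introduced only for illustration.

```python
import pandas as pd

# Hypothetical toy data: class 0 has four rows, class 1 has two.
df = pd.DataFrame({
    "Age": [25, 30, 22, 35, 41, 29],
    "Salary": [50000, 60000, 70000, 80000, 55000, 65000],
    "Target": [0, 0, 0, 0, 1, 1],
})


def random_oversample(frame, target, random_state=42):
    """Duplicate randomly chosen rows of each smaller class
    until every class has as many rows as the largest one."""
    counts = frame[target].value_counts()
    largest = counts.max()
    parts = [frame]  # keep all original rows
    for label, count in counts.items():
        if count < largest:
            # Draw the missing rows with replacement from this class.
            extra = frame[frame[target] == label].sample(
                n=largest - count, replace=True, random_state=random_state
            )
            parts.append(extra)
    return pd.concat(parts, ignore_index=True)


balanced = random_oversample(df, "Target")
print(df["Target"].value_counts().to_dict())
print(balanced["Target"].value_counts().to_dict())
```

The original rows are kept unchanged and only copies of minority-class rows are appended, so no new information is created. For that reason, resampling is usually applied to the training split only, never to the data used for evaluation.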