Pandas: Split a given dataset, group by two columns and convert other columns of the dataframe into a dictionary with column header as key
Pandas Grouping and Aggregating: Split-Apply-Combine Exercise-26 with Solution
Write a Pandas program to split a given dataset, group by two columns and convert other columns of the dataframe into a dictionary with column header as key.
Test Data:
    school_code class name           date_Of_Birth age height weight address
S1  s001        V     Alberto Franco 15/05/2002    12  173    35     street1
S2  s002        V     Gino Mcneill   17/05/2002    12  192    32     street2
S3  s003        VI    Ryan Parkes    16/02/1999    13  186    33     street3
S4  s001        VI    Eesha Hinton   25/09/1998    13  167    30     street1
S5  s002        V     Gino Mcneill   11/05/2002    14  151    31     street2
S6  s004        VI    David Parkes   15/09/1997    12  159    32     street4
Sample Solution:
Python Code:
import pandas as pd

# Show every row and column when printing the DataFrame
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)

df = pd.DataFrame({
    'school_code': ['s001','s002','s003','s001','s002','s004'],
    'class': ['V', 'V', 'VI', 'VI', 'V', 'VI'],
    'name': ['Alberto Franco','Gino Mcneill','Ryan Parkes', 'Eesha Hinton', 'Gino Mcneill', 'David Parkes'],
    'date_Of_Birth ': ['15/05/2002','17/05/2002','16/02/1999','25/09/1998','11/05/2002','15/09/1997'],
    'age': [12, 12, 13, 13, 14, 12],
    'height': [173, 192, 186, 167, 151, 159],
    'weight': [35, 32, 33, 30, 31, 32],
    'address': ['street1', 'street2', 'street3', 'street1', 'street2', 'street4']},
    index=['S1', 'S2', 'S3', 'S4', 'S5', 'S6'])

print("Original DataFrame:")
print(df)

dict_data_list = list()
# gg is the (school_code, class) key tuple; dd holds the rows of that group
for gg, dd in df.groupby(['school_code','class']):
    group = dict(zip(['school_code','class'], gg))
    ocolumns_list = list()
    for _, data in dd.iterrows():
        # Drop the grouping columns, then store the rest keyed by column header
        data = data.drop(labels=['school_code','class'])
        ocolumns_list.append(data.to_dict())
    group['other_columns'] = ocolumns_list
    dict_data_list.append(group)

print(dict_data_list)
Sample Output:
Original DataFrame:
   school_code class           name date_Of_Birth   age  height  weight  \
S1        s001     V Alberto Franco     15/05/2002   12     173      35
S2        s002     V   Gino Mcneill     17/05/2002   12     192      32
S3        s003    VI    Ryan Parkes     16/02/1999   13     186      33
S4        s001    VI   Eesha Hinton     25/09/1998   13     167      30
S5        s002     V   Gino Mcneill     11/05/2002   14     151      31
S6        s004    VI   David Parkes     15/09/1997   12     159      32

    address
S1  street1
S2  street2
S3  street3
S4  street1
S5  street2
S6  street4
[{'school_code': 's001', 'class': 'V', 'other_columns': [{'name': 'Alberto Franco', 'date_Of_Birth ': '15/05/2002', 'age': 12, 'height': 173, 'weight': 35, 'address': 'street1'}]},
{'school_code': 's001', 'class': 'VI', 'other_columns': [{'name': 'Eesha Hinton', 'date_Of_Birth ': '25/09/1998', 'age': 13, 'height': 167, 'weight': 30, 'address': 'street1'}]},
{'school_code': 's002', 'class': 'V', 'other_columns': [{'name': 'Gino Mcneill', 'date_Of_Birth ': '17/05/2002', 'age': 12, 'height': 192, 'weight': 32, 'address': 'street2'}, {'name': 'Gino Mcneill', 'date_Of_Birth ': '11/05/2002', 'age': 14, 'height': 151, 'weight': 31, 'address': 'street2'}]},
{'school_code': 's003', 'class': 'VI', 'other_columns': [{'name': 'Ryan Parkes', 'date_Of_Birth ': '16/02/1999', 'age': 13, 'height': 186, 'weight': 33, 'address': 'street3'}]},
{'school_code': 's004', 'class': 'VI', 'other_columns': [{'name': 'David Parkes', 'date_Of_Birth ': '15/09/1997', 'age': 12, 'height': 159, 'weight': 32, 'address': 'street4'}]}]
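Alternative sketch: the inner iterrows() loop in the solution can probably be replaced by a single to_dict('records') call on each group. The version below is a minimal sketch, not part of the original solution; the helper name group_to_records and the trimmed-down sample columns are assumptions made for illustration.

```python
import pandas as pd

# Reduced copy of the exercise data (fewer columns, chosen for brevity)
df = pd.DataFrame({
    'school_code': ['s001', 's002', 's003', 's001', 's002', 's004'],
    'class': ['V', 'V', 'VI', 'VI', 'V', 'VI'],
    'name': ['Alberto Franco', 'Gino Mcneill', 'Ryan Parkes',
             'Eesha Hinton', 'Gino Mcneill', 'David Parkes'],
    'age': [12, 12, 13, 13, 14, 12]},
    index=['S1', 'S2', 'S3', 'S4', 'S5', 'S6'])


def group_to_records(frame, keys):
    """Hypothetical helper: one dict per group, remaining columns as row dicts."""
    result = []
    # With a list of two keys, each group key arrives as a tuple
    for key_values, rows in frame.groupby(keys):
        entry = dict(zip(keys, key_values))
        # Drop the grouping columns and convert each remaining row to a dict
        entry['other_columns'] = rows.drop(columns=keys).to_dict('records')
        result.append(entry)
    return result


records = group_to_records(df, ['school_code', 'class'])
print(records)
```

Groups come back sorted by key, as in the original solution, so the entry for ('s002', 'V') again holds two row dictionaries.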