Pandas: Split a given dataset, group by two columns and convert other columns of the dataframe into a dictionary with column header as key

Last update on September 08 2025 12:46:11 (UTC/GMT +8 hours)

26. Grouping by Two Columns and Converting to Dictionary

Write a Pandas program to split a given dataset, group it by two columns, and convert the remaining columns of the DataFrame into a dictionary with the column headers as keys.

Test Data:

   school_code class            name date_Of_Birth  age  height  weight  address
S1        s001     V  Alberto Franco    15/05/2002   12     173      35  street1
S2        s002     V    Gino Mcneill    17/05/2002   12     192      32  street2
S3        s003    VI     Ryan Parkes    16/02/1999   13     186      33  street3
S4        s001    VI    Eesha Hinton    25/09/1998   13     167      30  street1
S5        s002     V    Gino Mcneill    11/05/2002   14     151      31  street2
S6        s004    VI    David Parkes    15/09/1997   12     159      32  street4

Sample Solution:

Python Code:

import pandas as pd
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
df = pd.DataFrame({
    'school_code': ['s001','s002','s003','s001','s002','s004'],
    'class': ['V', 'V', 'VI', 'VI', 'V', 'VI'],
    'name': ['Alberto Franco','Gino Mcneill','Ryan Parkes', 'Eesha Hinton', 'Gino Mcneill', 'David Parkes'],
    'date_Of_Birth ': ['15/05/2002','17/05/2002','16/02/1999','25/09/1998','11/05/2002','15/09/1997'],
    'age': [12, 12, 13, 13, 14, 12],
    'height': [173, 192, 186, 167, 151, 159],
    'weight': [35, 32, 33, 30, 31, 32],
    'address': ['street1', 'street2', 'street3', 'street1', 'street2', 'street4']},
    index=['S1', 'S2', 'S3', 'S4', 'S5', 'S6'])
print("Original DataFrame:")
print(df)
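# Collect one dictionary per (school_code, class) group
dict_data_list = []

for gg, dd in df.groupby(['school_code','class']):
    # gg is the tuple of group key values; pair each value with its column name
    group = dict(zip(['school_code','class'], gg))
    ocolumns_list = []
    for _, data in dd.iterrows():
        # Drop the grouping columns so only the remaining columns are stored
        data = data.drop(labels=['school_code','class'])
        ocolumns_list.append(data.to_dict())
    group['other_columns'] = ocolumns_list
    dict_data_list.append(group)

print(dict_data_list)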

Sample Output:

Original DataFrame:
   school_code class            name date_Of_Birth   age  height  weight  \
S1        s001     V  Alberto Franco     15/05/2002   12     173      35   
S2        s002     V    Gino Mcneill     17/05/2002   12     192      32   
S3        s003    VI     Ryan Parkes     16/02/1999   13     186      33   
S4        s001    VI    Eesha Hinton     25/09/1998   13     167      30   
S5        s002     V    Gino Mcneill     11/05/2002   14     151      31   
S6        s004    VI    David Parkes     15/09/1997   12     159      32   

    address  
S1  street1  
S2  street2  
S3  street3  
S4  street1  
S5  street2  
S6  street4  
[{'school_code': 's001', 'class': 'V', 'other_columns': [{'name': 'Alberto Franco', 'date_Of_Birth ': '15/05/2002', 'age': 12, 'height': 173, 'weight': 35, 'address': 'street1'}]},
 {'school_code': 's001', 'class': 'VI', 'other_columns': [{'name': 'Eesha Hinton', 'date_Of_Birth ': '25/09/1998', 'age': 13, 'height': 167, 'weight': 30, 'address': 'street1'}]},
 {'school_code': 's002', 'class': 'V', 'other_columns': [{'name': 'Gino Mcneill', 'date_Of_Birth ': '17/05/2002', 'age': 12, 'height': 192, 'weight': 32, 'address': 'street2'}, {'name': 'Gino Mcneill', 'date_Of_Birth ': '11/05/2002', 'age': 14, 'height': 151, 'weight': 31, 'address': 'street2'}]},
 {'school_code': 's003', 'class': 'VI', 'other_columns': [{'name': 'Ryan Parkes', 'date_Of_Birth ': '16/02/1999', 'age': 13, 'height': 186, 'weight': 33, 'address': 'street3'}]},
 {'school_code': 's004', 'class': 'VI', 'other_columns': [{'name': 'David Parkes', 'date_Of_Birth ': '15/09/1997', 'age': 12, 'height': 159, 'weight': 32, 'address': 'street4'}]}]
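
Alternative Sketch:

The inner iterrows() loop in the solution can likely be replaced by a single to_dict(orient='records') call on each group. The following is a minimal sketch of that idea, not the reference solution: the helper name group_to_records is hypothetical, and the DataFrame is a trimmed copy of the test data (fewer columns) to keep the example short.

```python
import pandas as pd

# Trimmed copy of the test data (assumption: fewer columns, same rows)
df = pd.DataFrame({
    'school_code': ['s001', 's002', 's003', 's001', 's002', 's004'],
    'class': ['V', 'V', 'VI', 'VI', 'V', 'VI'],
    'name': ['Alberto Franco', 'Gino Mcneill', 'Ryan Parkes',
             'Eesha Hinton', 'Gino Mcneill', 'David Parkes'],
    'age': [12, 12, 13, 13, 14, 12]},
    index=['S1', 'S2', 'S3', 'S4', 'S5', 'S6'])


def group_to_records(frame, keys):
    """Hypothetical helper: one dict per group, remaining columns as records."""
    result = []
    for values, rows in frame.groupby(keys):
        # values is the tuple of group key values, in the same order as keys
        entry = dict(zip(keys, values))
        # Drop the grouping columns, then turn each remaining row into a dict
        entry['other_columns'] = rows.drop(columns=keys).to_dict(orient='records')
        result.append(entry)
    return result


records = group_to_records(df, ['school_code', 'class'])
print(records)
```

The result has the same shape as the sample output: one dictionary per (school_code, class) pair, with the remaining columns listed under 'other_columns'. Note that the solution's DataFrame stores the date of birth under the key 'date_Of_Birth ' (with a trailing space); if that is unwanted, df.columns = df.columns.str.strip() before grouping would normalize the headers.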

For more Practice: Solve these Related Problems:

Write a Pandas program to group a dataframe by two columns and then convert the remaining columns into a dictionary with column headers as keys.
Write a Pandas program to split the dataframe by two keys and then create a dictionary for each group containing the aggregated values.
Write a Pandas program to group by school and class, and then aggregate the other columns into a dictionary for each group.
Write a Pandas program to group the dataframe by multiple columns and then use the apply() method to output each group as a dictionary.
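
As a starting point for the apply() exercise above, here is one possible sketch, offered as an assumption rather than an official answer. It selects the non-grouping columns before calling apply(), so each group passed to the function excludes the grouping columns. The DataFrame is a trimmed copy of the test data, and the variable names are illustrative.

```python
import pandas as pd

# Trimmed copy of the test data (assumption: fewer columns, same rows)
df = pd.DataFrame({
    'school_code': ['s001', 's002', 's003', 's001', 's002', 's004'],
    'class': ['V', 'V', 'VI', 'VI', 'V', 'VI'],
    'name': ['Alberto Franco', 'Gino Mcneill', 'Ryan Parkes',
             'Eesha Hinton', 'Gino Mcneill', 'David Parkes'],
    'weight': [35, 32, 33, 30, 31, 32]},
    index=['S1', 'S2', 'S3', 'S4', 'S5', 'S6'])

keys = ['school_code', 'class']
other_cols = [col for col in df.columns if col not in keys]

# apply() receives each group's remaining columns as a DataFrame and
# returns a list of row dictionaries; the result is a Series indexed
# by (school_code, class)
grouped = df.groupby(keys)[other_cols].apply(
    lambda g: g.to_dict(orient='records'))

# Combine the group keys and the per-group records into one dict each
result = [dict(zip(keys, idx), other_columns=recs)
          for idx, recs in grouped.items()]
print(result)
```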
