w3resource

Pandas: Split the specified given dataframe into groups based on school code and class

Pandas Grouping and Aggregating: Split-Apply-Combine Exercise-3 with Solution

Write a Pandas program to split the following given dataframe into groups based on school code and class.

Test Data:

   school class            name date_Of_Birth   age  height  weight  address
S1   s001     V  Alberto Franco     15/05/2002   12    173      35  street1
S2   s002     V    Gino Mcneill     17/05/2002   12    192      32  street2
S3   s003    VI     Ryan Parkes     16/02/1999   13    186      33  street3
S4   s001    VI    Eesha Hinton     25/09/1998   13    167      30  street1
S5   s002     V    Gino Mcneill     11/05/2002   14    151      31  street2
S6   s004    VI    David Parkes     15/09/1997   12    159      32  street4

Sample Solution:

Python Code :

import pandas as pd
pd.set_option('display.max_rows', None)
#pd.set_option('display.max_columns', None)
student_data = pd.DataFrame({
    'school_code': ['s001','s002','s003','s001','s002','s004'],
    'class': ['V', 'V', 'VI', 'VI', 'V', 'VI'],
    'name': ['Alberto Franco','Gino Mcneill','Ryan Parkes', 'Eesha Hinton', 'Gino Mcneill', 'David Parkes'],
    'date_Of_Birth ': ['15/05/2002','17/05/2002','16/02/1999','25/09/1998','11/05/2002','15/09/1997'],
    'age': [12, 12, 13, 13, 14, 12],
    'height': [173, 192, 186, 167, 151, 159],
    'weight': [35, 32, 33, 30, 31, 32],
    'address': ['street1', 'street2', 'street3', 'street1', 'street2', 'street4']},
    index=['S1', 'S2', 'S3', 'S4', 'S5', 'S6'])
print("Original DataFrame:")
print(student_data)
print('\nSplit the said data on school_code, class wise:')
result = student_data.groupby(['school_code', 'class'])
for name,group in result:
    print("\nGroup:")
    print(name)
    print(group)

Sample Output:

Original DataFrame:
   school_code class            name   ...    height  weight  address
S1        s001     V  Alberto Franco   ...      173      35  street1
S2        s002     V    Gino Mcneill   ...      192      32  street2
S3        s003    VI     Ryan Parkes   ...      186      33  street3
S4        s001    VI    Eesha Hinton   ...      167      30  street1
S5        s002     V    Gino Mcneill   ...      151      31  street2
S6        s004    VI    David Parkes   ...      159      32  street4

[6 rows x 8 columns]

Split the said data on school_code, class wise:

Group:
('s001', 'V')
   school_code class            name   ...    height  weight  address
S1        s001     V  Alberto Franco   ...      173      35  street1

[1 rows x 8 columns]

Group:
('s001', 'VI')
   school_code class          name   ...    height  weight  address
S4        s001    VI  Eesha Hinton   ...      167      30  street1

[1 rows x 8 columns]

Group:
('s002', 'V')
   school_code class          name   ...    height  weight  address
S2        s002     V  Gino Mcneill   ...      192      32  street2
S5        s002     V  Gino Mcneill   ...      151      31  street2

[2 rows x 8 columns]

Group:
('s003', 'VI')
   school_code class         name   ...    height  weight  address
S3        s003    VI  Ryan Parkes   ...      186      33  street3

[1 rows x 8 columns]

Group:
('s004', 'VI')
   school_code class          name   ...    height  weight  address
S6        s004    VI  David Parkes   ...      159      32  street4

[1 rows x 8 columns]                  

Python Code Editor:


Have another way to solve this solution? Contribute your code (and comments) through Disqus.

Previous: Write a Pandas program to split the following dataframe by school code and get mean, min, and max value of age for each school.
Next: Write a Pandas program to split the following dataframe into groups based on school code and cast grouping as a list.

What is the difficulty level of this exercise?

Test your Python skills with w3resource's quiz



Python: Tips of the Day

Returns True if there are duplicate values in a flat list, False otherwise

Example:

def tips_duplicates(lst):
  return len(lst) != len(set(lst))

x = [2, 4, 6, 8, 4, 2]
y = [1, 3, 5, 7, 9]
print(tips_duplicates(x))
print(tips_duplicates(y))

Output:

True
False