w3resource

Pandas: Groupby to find first dates for each group

Pandas Grouping and Aggregating: Split-Apply-Combine Exercise-31 with Solution

Write a Pandas program to split the following dataset using group by on 'salesman_id' and find the first order date for each group.

Test Data:

    ord_no  purch_amt    ord_date  customer_id  salesman_id
0    70001     150.50  2012-10-05         3005         5002
1    70009     270.65  2012-09-10         3001         5005
2    70002      65.26  2012-10-05         3002         5001
3    70004     110.50  2012-08-17         3009         5003
4    70007     948.50  2012-09-10         3005         5002
5    70005    2400.60  2012-07-27         3007         5001
6    70008    5760.00  2012-09-10         3002         5001
7    70010    1983.43  2012-10-10         3004         5004
8    70003    2480.40  2012-10-10         3009         5003
9    70012     250.45  2012-06-27         3008         5002
10   70011      75.29  2012-08-17         3003         5004
11   70013    3045.60  2012-04-25         3002         5001
 

Sample Solution:

Python Code:

import pandas as pd
# Show every row when printing the DataFrame.
pd.set_option('display.max_rows', None)
#pd.set_option('display.max_columns', None)
df = pd.DataFrame({
'ord_no':[70001,70009,70002,70004,70007,70005,70008,70010,70003,70012,70011,70013],
'purch_amt':[150.5,270.65,65.26,110.5,948.5,2400.6,5760,1983.43,2480.4,250.45, 75.29,3045.6],
'ord_date': ['2012-10-05','2012-09-10','2012-10-05','2012-08-17','2012-09-10','2012-07-27','2012-09-10','2012-10-10','2012-10-10','2012-06-27','2012-08-17','2012-04-25'],
'customer_id':[3005,3001,3002,3009,3005,3007,3002,3004,3009,3008,3003,3002],
'salesman_id': [5002,5005,5001,5003,5002,5001,5001,5004,5003,5002,5004,5001]})
print("Original Orders DataFrame:")
print(df)
print("\nGroupby to find first order date for each group(salesman_id):")
# ord_date holds ISO 8601 strings (YYYY-MM-DD), so the smallest string in
# each group is also the earliest date; the result keeps dtype object.
result = df.groupby('salesman_id')['ord_date'].min()
print(result)

Sample Output:

Original Orders DataFrame:
    ord_no  purch_amt    ord_date  customer_id  salesman_id
0    70001     150.50  2012-10-05         3005         5002
1    70009     270.65  2012-09-10         3001         5005
2    70002      65.26  2012-10-05         3002         5001
3    70004     110.50  2012-08-17         3009         5003
4    70007     948.50  2012-09-10         3005         5002
5    70005    2400.60  2012-07-27         3007         5001
6    70008    5760.00  2012-09-10         3002         5001
7    70010    1983.43  2012-10-10         3004         5004
8    70003    2480.40  2012-10-10         3009         5003
9    70012     250.45  2012-06-27         3008         5002
10   70011      75.29  2012-08-17         3003         5004
11   70013    3045.60  2012-04-25         3002         5001

Groupby to find first order date for each group(salesman_id):
salesman_id
5001    2012-04-25
5002    2012-06-27
5003    2012-08-17
5004    2012-08-17
5005    2012-09-10
Name: ord_date, dtype: object
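
The solution above compares the dates as strings, which works only because every value uses the YYYY-MM-DD layout. As a minimal sketch of an alternative (not part of the original solution), the dates can be parsed with pd.to_datetime before grouping, and idxmin() can then be used to fetch the whole first-order row for each salesman. The names orders, first_dates and first_rows are chosen here for illustration, and the frame is trimmed to the three columns the example needs.

```python
import pandas as pd

# Same orders as in the solution above, trimmed to the columns used here.
orders = pd.DataFrame({
    'ord_no': [70001, 70009, 70002, 70004, 70007, 70005,
               70008, 70010, 70003, 70012, 70011, 70013],
    'ord_date': ['2012-10-05', '2012-09-10', '2012-10-05', '2012-08-17',
                 '2012-09-10', '2012-07-27', '2012-09-10', '2012-10-10',
                 '2012-10-10', '2012-06-27', '2012-08-17', '2012-04-25'],
    'salesman_id': [5002, 5005, 5001, 5003, 5002, 5001,
                    5001, 5004, 5003, 5002, 5004, 5001],
})

# Parse the strings into timestamps so min() compares dates, not text.
orders['ord_date'] = pd.to_datetime(orders['ord_date'], format='%Y-%m-%d')

# Earliest order date per salesman, now with a datetime dtype.
first_dates = orders.groupby('salesman_id')['ord_date'].min()
print(first_dates)

# idxmin() gives the row label of the earliest date in each group,
# which lets us select the complete first-order row per salesman.
first_rows = orders.loc[orders.groupby('salesman_id')['ord_date'].idxmin()]
print(first_rows)
```

Parsing the column first also guards against inputs in other layouts (for example DD-MM-YYYY), where a plain string comparison would no longer follow calendar order.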


Python: Tips of the Day

Negative Indexing:

Python sequences support negative indexing. Positive indices count from 0 at the start of the sequence, while negative indices count from -1 at the end.

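name = "Welcome"
print(name[0])         # first character
print(name[-1])        # last character
print(name[0:3])       # first three characters
print(name[-1:-4:-1])  # last three characters, in reverse order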

Output:

W
e
Wel
emo