w3resource

Pandas: Merge all values in two datasets, with matching records from both sides where available

Pandas Joining and merging DataFrame: Exercise-7 with Solution

Write a Pandas program to join the two dataframes with matching records from both sides where available.

Test Data:

student_data1:
  student_id              name  marks
0         S1  Danniella Fenton    200
1         S2      Ryder Storey    210
2         S3      Bryce Jensen    190
3         S4         Ed Bernal    222
4         S5       Kwame Morin    199
student_data2:
  student_id              name  marks
0         S4  Scarlette Fisher    201
1         S5  Carla Williamson    200
2         S6       Dante Morse    198
3         S7    Kaiser William    219
4         S8   Madeeha Preston    201

Sample Solution:

Python Code :

import pandas as pd
student_data1 = pd.DataFrame({
        'student_id': ['S1', 'S2', 'S3', 'S4', 'S5'],
         'name': ['Danniella Fenton', 'Ryder Storey', 'Bryce Jensen', 'Ed Bernal', 'Kwame Morin'], 
        'marks': [200, 210, 190, 222, 199]})

student_data2 = pd.DataFrame({
        'student_id': ['S4', 'S5', 'S6', 'S7', 'S8'],
        'name': ['Scarlette Fisher', 'Carla Williamson', 'Dante Morse', 'Kaiser William', 'Madeeha Preston'], 
        'marks': [201, 200, 198, 219, 201]})

print("Original DataFrames:")
print(student_data1)
print(student_data2)
merged_data = pd.merge(student_data1, student_data2, on='student_id', how='outer')
print("Merged data (outer join):")
print(merged_data)

Sample Output:

Original DataFrames:
  student_id              name  marks
0         S1  Danniella Fenton    200
1         S2      Ryder Storey    210
2         S3      Bryce Jensen    190
3         S4         Ed Bernal    222
4         S5       Kwame Morin    199
  student_id              name  marks
0         S4  Scarlette Fisher    201
1         S5  Carla Williamson    200
2         S6       Dante Morse    198
3         S7    Kaiser William    219
4         S8   Madeeha Preston    201
Merged data (outer join):
  student_id            name_x  marks_x            name_y  marks_y
0         S1  Danniella Fenton    200.0               NaN      NaN
1         S2      Ryder Storey    210.0               NaN      NaN
2         S3      Bryce Jensen    190.0               NaN      NaN
3         S4         Ed Bernal    222.0  Scarlette Fisher    201.0
4         S5       Kwame Morin    199.0  Carla Williamson    200.0
5         S6               NaN      NaN       Dante Morse    198.0
6         S7               NaN      NaN    Kaiser William    219.0
7         S8               NaN      NaN   Madeeha Preston    201.0        

Python Code Editor:


Have another way to solve this solution? Contribute your code (and comments) through Disqus.

Previous: Write a Pandas program to join the two dataframes using the common column of both dataframes.
Next: Write a Pandas program to join (left join) the two dataframes using keys from left dataframe only.

What is the difficulty level of this exercise?

Test your Python skills with w3resource's quiz



Python: Tips of the Day

How to sort a Python dict by value

Example:

x1 = {'a': 5, 'b': 7, 'c': 9, 'd': 1}

sorted(x1.items(), key=lambda x: x[1])
[('d', 1), ('c', 9), ('b', 7), ('a', 5)]

# Or:

import operator
print(sorted(x1.items(), key=operator.itemgetter(1)))

Output:

[('d', 1), ('a', 5), ('b', 7), ('c', 9)]