Pandas: Extract the unique sentences from a given column of a given DataFrame

Pandas: String and Regular Expression Exercise-39 with Solution

Write a Pandas program to extract the unique sentences from a given column of a given DataFrame.

Sample Solution:

Python Code :

import pandas as pd
import re

df = pd.DataFrame({
    'company_code': ['Abcd','EFGF', 'zefsalf', 'sdfslew', 'zekfsdf'],
    'date_of_sale': ['12/05/2002','16/02/1999','05/09/1998','12/02/2022','15/09/1997'],
    'address': ['9910 Surrey Avenue\n9910 Surrey Avenue','92 N. Bishop Avenue','9910 Golden Star Avenue', '102 Dunbar St.\n102 Dunbar St.', '17 West Livingston Court']
})

print("Original DataFrame:")
print(df)

def find_unique_sentence(str1):
    # Match each full line that is not repeated later in the string,
    # so every distinct line is returned exactly once.
    result = re.findall(r'(?sm)(^[^\r\n]+$)(?!.*^\1$)', str1)
    return result

# Apply the function to every value in the 'address' column.
df['unique_sentence'] = df['address'].apply(find_unique_sentence)
print("\nExtract unique sentences :")
print(df)

Sample Output:

Original DataFrame:
  company_code                   ...                                                   address
0         Abcd                   ...                    9910 Surrey Avenue\n9910 Surrey Avenue
1         EFGF                   ...                                       92 N. Bishop Avenue
2      zefsalf                   ...                                   9910 Golden Star Avenue
3      sdfslew                   ...                            102 Dunbar St.\n102 Dunbar St.
4      zekfsdf                   ...                                  17 West Livingston Court

[5 rows x 3 columns]

Extract unique sentences :
  company_code             ...                         unique_sentence
0         Abcd             ...                    [9910 Surrey Avenue]
1         EFGF             ...                   [92 N. Bishop Avenue]
2      zefsalf             ...               [9910 Golden Star Avenue]
3      sdfslew             ...                        [102 Dunbar St.]
4      zekfsdf             ...              [17 West Livingston Court]

[5 rows x 4 columns]
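
How the pattern behaves on a single string:

The sketch below runs the same regular expression on a plain multi-line string, outside of any DataFrame, to show which lines it keeps. It also includes an order-preserving alternative that does not use a regular expression. The names unique_lines_regex, unique_lines_ordered and sample are hypothetical and are not part of the original solution.

```python
import re

# Same pattern as the solution: a full line that is not repeated
# anywhere later in the string.
PATTERN = re.compile(r'(?sm)(^[^\r\n]+$)(?!.*^\1$)')


def unique_lines_regex(text):
    """Return each distinct line once, at the position of its LAST occurrence."""
    return PATTERN.findall(text)


def unique_lines_ordered(text):
    """Hypothetical alternative: keep each distinct line at its FIRST occurrence."""
    # dict.fromkeys drops duplicates while preserving insertion order.
    return list(dict.fromkeys(line for line in text.splitlines() if line))


sample = "A St.\nB Rd.\nA St."
print(unique_lines_regex(sample))    # ['B Rd.', 'A St.']
print(unique_lines_ordered(sample))  # ['A St.', 'B Rd.']
```

Because the negative lookahead rejects a line whenever an identical line appears later, the regular expression keeps the last copy of each duplicate, so the order of the result can differ from the order of first appearance. For the sample DataFrame both functions give the same lists, and either one could be passed to df['address'].apply(...).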

