w3resource

Pandas: Extract words starting with capital letters from a given column of a given DataFrame

Pandas: String and Regular Expression Exercise-40 with Solution

Write a Pandas program to extract the words that start with a capital letter from a given column of a given DataFrame.

Sample Solution:

Python Code:

import pandas as pd
import re
df = pd.DataFrame({
    'company_code': ['Abcd','EFGF', 'zefsalf', 'sdfslew', 'zekfsdf'],
    'date_of_sale': ['12/05/2002','16/02/1999','05/09/1998','12/02/2022','15/09/1997'],
    'address': ['9910 Surrey Avenue','92 N. Bishop Avenue','9910 Golden Star Avenue', '102 Dunbar St.', '17 West Livingston Court']
})

print("Original DataFrame:")
print(df)

def find_capital_word(str1):
    # Match words that start with a capital letter followed by at least one
    # more word character, so a single-letter word such as "N" is skipped.
    return re.findall(r'\b[A-Z]\w+', str1)

df['caps_word_in'] = df['address'].apply(find_capital_word)
print("\nExtract words starting with capital letters from the addresses:")
print(df)

Sample Output:

Original DataFrame:
  company_code date_of_sale                   address
0         Abcd   12/05/2002        9910 Surrey Avenue
1         EFGF   16/02/1999       92 N. Bishop Avenue
2      zefsalf   05/09/1998   9910 Golden Star Avenue
3      sdfslew   12/02/2022            102 Dunbar St.
4      zekfsdf   15/09/1997  17 West Livingston Court

Extract words starting with capital letters from the addresses:
  company_code            ...                           caps_word_in
0         Abcd            ...                       [Surrey, Avenue]
1         EFGF            ...                       [Bishop, Avenue]
2      zefsalf            ...                 [Golden, Star, Avenue]
3      sdfslew            ...                           [Dunbar, St]
4      zekfsdf            ...              [West, Livingston, Court]

[5 rows x 4 columns]
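
Alternative sketch:

The same result can probably be obtained without a helper function by using the vectorized string accessor Series.str.findall. The snippet below is a minimal sketch rather than part of the original solution: it uses a reduced three-row DataFrame, and the second column name, caps_word_incl_single, is a hypothetical name introduced only for illustration. That second column assumes single-letter words such as "N" should also count, which the pattern in the solution above does not do.

```python
import pandas as pd

# Reduced sample taken from the 'address' column used in the exercise.
df = pd.DataFrame({
    'address': ['9910 Surrey Avenue', '92 N. Bishop Avenue', '102 Dunbar St.']
})

# Same pattern as the solution: a capital letter followed by
# one or more word characters.
df['caps_word_in'] = df['address'].str.findall(r'\b[A-Z]\w+')

# Variant (an assumption about intent): \w* allows zero further characters,
# so single-letter capitalized words such as "N" are kept as well.
df['caps_word_incl_single'] = df['address'].str.findall(r'\b[A-Z]\w*')

print(df[['caps_word_in', 'caps_word_incl_single']])
```

Which pattern is appropriate depends on whether an initial like "N." should be treated as a word in the address data.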
