w3resource

Pandas: Extract only the numbers from a specified column of a given DataFrame

Pandas: String and Regular Expression Exercise-27 with Solution

Write a Pandas program to extract only the numbers from a specified column of a given DataFrame.

Sample Solution:

Python Code:

import pandas as pd
import re

pd.set_option('display.max_columns', 10)
df = pd.DataFrame({
    'company_code': ['c0001','c0002','c0003', 'c0003', 'c0004'],
    'address': ['7277 Surrey Ave.','920 N. Bishop Ave.','9910 Golden Star St.', '25 Dunbar St.', '17 West Livingston Court']
    })
print("Original DataFrame:")
print(df)

def find_number(text):
    # Collect every run of digits in the text and join the runs with spaces.
    num = re.findall(r'[0-9]+', text)
    return " ".join(num)

# Apply the helper to each address and store the result in a new column.
df['number'] = df['address'].apply(find_number)
print("\nExtracting numbers from dataframe columns:")
print(df)

Sample Output:

Original DataFrame:
  company_code                   address
0        c0001          7277 Surrey Ave.
1        c0002        920 N. Bishop Ave.
2        c0003      9910 Golden Star St.
3        c0003             25 Dunbar St.
4        c0004  17 West Livingston Court

Extracting numbers from dataframe columns:
  company_code                   address number
0        c0001          7277 Surrey Ave.   7277
1        c0002        920 N. Bishop Ave.    920
2        c0003      9910 Golden Star St.   9910
3        c0003             25 Dunbar St.     25
4        c0004  17 West Livingston Court     17
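
Alternative approach (a sketch, not part of the original solution):

The solution above calls a Python helper once per row. The same column can probably be built with the vectorized string methods on the Series itself: str.findall followed by str.join should match the helper, while str.extract keeps only the first run of digits. The column name first_number and the two addresses in the extra Series are hypothetical, added only to show how the two methods differ when a row has several numbers or none.

```python
import pandas as pd

df = pd.DataFrame({
    'company_code': ['c0001', 'c0002', 'c0003', 'c0003', 'c0004'],
    'address': ['7277 Surrey Ave.', '920 N. Bishop Ave.', '9910 Golden Star St.',
                '25 Dunbar St.', '17 West Livingston Court']
})

# Every run of digits in each address, joined by a space
# (intended to give the same column as the find_number helper).
df['number'] = df['address'].str.findall(r'[0-9]+').str.join(' ')

# Only the first run of digits; expand=False returns a Series.
# 'first_number' is a hypothetical column name used for illustration.
df['first_number'] = df['address'].str.extract(r'([0-9]+)', expand=False)

print(df[['address', 'number', 'first_number']])

# Hypothetical addresses (not in the original data) showing the difference.
extra = pd.Series(['Apt 4, 12 Main St.', 'No number here'])
all_numbers = extra.str.findall(r'[0-9]+').str.join(' ')
first_only = extra.str.extract(r'([0-9]+)', expand=False)
print(all_numbers.tolist())
print(first_only.tolist())
```

For the sample addresses both columns hold the same values, because each address contains exactly one number. When a row has no digits, the joined version yields an empty string and str.extract yields a missing value, so the choice depends on how missing numbers should be represented.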
