w3resource

Pandas: Extract email from a specified column of string type of a given DataFrame

Pandas: String and Regular Expression Exercise-24 with Solution

Write a Pandas program to extract email addresses from a specified string column of a given DataFrame.

Sample Solution:

Python Code :

import pandas as pd
import re
pd.set_option('display.max_columns', 10)
df = pd.DataFrame({
    'name_email': ['Alberto Franco [email protected]','Gino Mcneill [email protected]','Ryan Parkes [email protected]', 'Eesha Hinton', 'Gino Mcneill [email protected]']
    })
print("Original DataFrame:")
print(df)
def find_email(text):
    # Return every email-like token in the text, comma-separated ('' if none)
    email = re.findall(r'[\w\.-]+@[\w\.-]+', str(text))
    return ",".join(email)
df['email'] = df['name_email'].apply(find_email)
print("\nExtracting email from dataframe columns:")
print(df)

Sample Output:

Original DataFrame:
                    name_email
0  Alberto Franco [email protected]
1    Gino Mcneill [email protected]
2        Ryan Parkes [email protected]
3                 Eesha Hinton
4   Gino Mcneill [email protected]

Extracting email from dataframe columns:
                    name_email          email
0  Alberto Franco [email protected]   [email protected]
1    Gino Mcneill [email protected]   [email protected]
2        Ryan Parkes [email protected]      [email protected]
3                 Eesha Hinton               
4   Gino Mcneill [email protected]  [email protected]
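
The same extraction can also be written with pandas' vectorized string methods instead of apply. The following is a minimal sketch, not the original solution: the example.com addresses are hypothetical placeholders, and EMAIL_PATTERN is a name introduced here for illustration. The pattern is deliberately loose and matches email-like tokens rather than validating addresses.

```python
import pandas as pd

# Hypothetical sample data; the addresses below are placeholders.
df = pd.DataFrame({
    'name_email': [
        'Alberto Franco alberto@example.com',
        'Ryan Parkes ryan@example.com ryan@example.net',
        'Eesha Hinton',
    ]
})

# Loose pattern: one or more word/dot/hyphen characters on each side of "@".
EMAIL_PATTERN = r'[\w.-]+@[\w.-]+'

# str.findall returns a list of matches per row;
# str.join collapses each list into one comma-separated string.
df['email'] = df['name_email'].str.findall(EMAIL_PATTERN).str.join(',')
print(df)
```

Rows without a match end up with an empty string, and rows with several addresses get them joined by commas, which mirrors the behavior of find_email above.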

Python: Tips of the Day

Python: Cache results with decorators

Decorators offer a convenient way to cache function results in Python. Caching saves time and resources when an expensive function is called repeatedly with the same arguments.

To use it, import lru_cache from the functools module and decorate your function with @lru_cache.

from functools import lru_cache

@lru_cache(maxsize=None)
def fibo(a):
    if a <= 1:
        return a
    else:
        return fibo(a-1) + fibo(a-2)

for i in range(20):
    print(fibo(i), end="|")

print("\n\n", fibo.cache_info())

Output:

0|1|1|2|3|5|8|13|21|34|55|89|144|233|377|610|987|1597|2584|4181|

 CacheInfo(hits=36, misses=20, maxsize=None, currsize=20)
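
To make the saving concrete, the sketch below counts how often the function body actually runs with and without the cache. The names fibo_plain, fibo_cached, and calls are hypothetical and introduced only for this illustration.

```python
from functools import lru_cache

# Counts how many times each function body executes.
calls = {"plain": 0, "cached": 0}

def fibo_plain(n):
    calls["plain"] += 1
    return n if n <= 1 else fibo_plain(n - 1) + fibo_plain(n - 2)

@lru_cache(maxsize=None)
def fibo_cached(n):
    # The body only runs on a cache miss, so each n is computed once.
    calls["cached"] += 1
    return n if n <= 1 else fibo_cached(n - 1) + fibo_cached(n - 2)

result_plain = fibo_plain(20)
result_cached = fibo_cached(20)
print(result_plain, result_cached)
print(calls)
```

Both calls return the same value, but the uncached version recomputes the same subproblems many times while the cached one evaluates each argument from 0 to 20 exactly once.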