
# Pandas: Extract only non alphanumeric characters from the specified column of a given DataFrame

## Pandas: String and Regular Expression Exercise-30 with Solution

Write a Pandas program to extract only the non-alphanumeric characters from a specified column of a given DataFrame.

Sample Solution:

Python Code:

import pandas as pd
import re

pd.set_option('display.max_columns', 10)

df = pd.DataFrame({
    'company_code': ['c0001#', '[email protected]^2', '$c0003', 'c0003', '&c0004'],
    'year': ['year 1800', 'year 1700', 'year 2300', 'year 1900', 'year 2200']
})
print("Original DataFrame:")
print(df)

def find_nonalpha(text):
    # Return every character that is not an ASCII letter, a digit, or a space.
    return re.findall(r"[^A-Za-z0-9 ]", text)

df['nonalpha'] = df['company_code'].apply(find_nonalpha)
print("\nExtracting only non alphanumeric characters from company_code:")
print(df)

Sample Output:

Original DataFrame:
  company_code       year
0       c0001#  year 1800
1      [email protected]^2  year 1700
2       $c0003  year 2300
3        c0003  year 1900
4       &c0004  year 2200

Extracting only non alphanumeric characters from company_code:
  company_code       year nonalpha
0       c0001#  year 1800      [#]
1      [email protected]^2  year 1700   [@, ^]
2       $c0003  year 2300      [$]
3        c0003  year 1900       []
4       &c0004  year 2200      [&]
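
The same extraction can also be written without a Python-level helper function. The following is a minimal sketch, assuming pandas' vectorized string accessor `Series.str.findall`. The frame `demo` and its values are hypothetical and chosen only for illustration, and the `nonalpha_str` column is an optional extra that is not part of the exercise.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration.
demo = pd.DataFrame({'company_code': ['c0001#', 'c0002', '&c0004%']})

# Same character class as find_nonalpha: anything that is not an
# ASCII letter, an ASCII digit, or a space.
pattern = r'[^A-Za-z0-9 ]'

# Series.str.findall runs re.findall on every element of the column.
demo['nonalpha'] = demo['company_code'].str.findall(pattern)

# Optionally collapse each list of matches into a single string.
demo['nonalpha_str'] = demo['nonalpha'].str.join('')

print(demo)
```

Rows with no match get an empty list (and an empty string after joining), which mirrors the `[]` shown in the sample output.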

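What counts as "non alphanumeric" depends on the character class chosen. The sketch below uses only the standard `re` module and a hypothetical test string to compare the explicit ASCII class from the solution with the shorthand `[^\w\s]`, which some readers may reach for instead. The two are not equivalent.

```python
import re

# Hypothetical test string: underscore, an accented letter (U+00E9), and '#'.
text = 'a_b \u00e9 c#1'

# Explicit ASCII class from the solution: the underscore and the accented
# letter are both reported, because neither is in A-Z, a-z, 0-9, or space.
explicit = re.findall(r'[^A-Za-z0-9 ]', text)

# Shorthand class: for str patterns, \w is Unicode-aware and also matches
# the underscore, so only '#' is reported here.
word_based = re.findall(r'[^\w\s]', text)

print(explicit)
print(word_based)
```

If the column may contain underscores or non-ASCII letters, it is worth deciding up front which of the two behaviours is wanted.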

## Python: Tips of the Day

Python: Cache results with decorators

Decorators offer a simple way to cache function results in Python. Caching saves time and resources when an expensive function is called repeatedly with the same arguments.

The implementation is straightforward: import lru_cache from the functools module and decorate your function with @lru_cache.

from functools import lru_cache

@lru_cache(maxsize=None)
def fibo(a):
    if a <= 1:
        return a
    else:
        return fibo(a-1) + fibo(a-2)

for i in range(20):
    print(fibo(i), end="|")

print("\n\n", fibo.cache_info())

Output:

0|1|1|2|3|5|8|13|21|34|55|89|144|233|377|610|987|1597|2584|4181|

CacheInfo(hits=36, misses=20, maxsize=None, currsize=20)
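
To make the effect of the cache visible, the sketch below counts how often the function body actually runs. The global counter `calls` is a hypothetical addition for illustration only. `cache_info()` and `cache_clear()` are the methods that `functools.lru_cache` attaches to the decorated function.

```python
from functools import lru_cache

calls = 0  # hypothetical counter: how many times the body really executes

@lru_cache(maxsize=None)
def fibo(n):
    global calls
    calls += 1
    return n if n <= 1 else fibo(n - 1) + fibo(n - 2)

fibo(30)
print(calls)              # 31: the body runs once per distinct argument 0..30
print(fibo.cache_info())

fibo.cache_clear()        # empties the cache and resets the hit/miss statistics
print(fibo.cache_info())
```

Without the decorator, the same call would re-enter the function body over a million times, since every subproblem is recomputed from scratch.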