w3resource

Pandas: Count occurrences of a specified substring in a DataFrame column

Pandas: String and Regular Expression Exercise-6 with Solution

Write a Pandas program to count the occurrences of a specified substring in a DataFrame column.

Sample Solution:

Python Code:

import pandas as pd

df = pd.DataFrame({
    'name_code': ['c001', 'c002', 'c022', 'c2002', 'c2222'],
    'date_of_birth': ['12/05/2002', '16/02/1999', '25/09/1998', '12/02/2022', '15/09/1997'],
    'age': [18.5, 21.2, 22.5, 22, 23]
})
print("Original DataFrame:")
print(df)
print("\nCount occurrences of 2 in name_code column:")
# Call str.count on each value of name_code and store the results in a new column
df['count'] = list(map(lambda x: x.count("2"), df['name_code']))
print(df)

Sample Output:

Original DataFrame:
  name_code date_of_birth   age
0      c001    12/05/2002  18.5
1      c002    16/02/1999  21.2
2      c022    25/09/1998  22.5
3     c2002    12/02/2022  22.0
4     c2222    15/09/1997  23.0

Count occurrences of 2 in name_code column:
  name_code date_of_birth   age  count
0      c001    12/05/2002  18.5      0
1      c002    16/02/1999  21.2      1
2      c022    25/09/1998  22.5      2
3     c2002    12/02/2022  22.0      2
4     c2222    15/09/1997  23.0      4
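
The same counts can also be produced with the vectorized Series.str.count accessor instead of map and a lambda. This is only a sketch of an alternative, not the exercise's own solution. Because Series.str.count interprets its argument as a regular expression, the substring is passed through re.escape so that it is matched literally. The `versions` Series is a hypothetical example added to show why escaping matters.

```python
import re
import pandas as pd

df = pd.DataFrame({'name_code': ['c001', 'c002', 'c022', 'c2002', 'c2222']})

# Series.str.count treats its argument as a regular expression,
# so escape the substring to make sure it is matched literally.
substring = "2"
df['count'] = df['name_code'].str.count(re.escape(substring))
print(df)

# Hypothetical data: "." is a regex metacharacter that matches any character.
versions = pd.Series(['1.2.3', '10.0'])
literal_dots = versions.str.count(re.escape('.'))  # counts only real dots
any_char = versions.str.count('.')                 # counts every character
print(literal_dots.tolist(), any_char.tolist())
```

For a plain digit such as "2" the escaping changes nothing, but it keeps the count correct when the substring contains characters such as ".", "+" or "(".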

Python: Tips of the Day

Python: Cache results with decorators

Python can cache a function's results with a decorator. Caching saves time and resources when an expensive function is called repeatedly with the same arguments.

To use it, import lru_cache from the functools module and decorate your function with @lru_cache.

from functools import lru_cache

@lru_cache(maxsize=None)
def fibo(a):
    if a <= 1:
        return a
    else:
        return fibo(a-1) + fibo(a-2)

for i in range(20):
    print(fibo(i), end="|")

print("\n\n", fibo.cache_info())

Output:

0|1|1|2|3|5|8|13|21|34|55|89|144|233|377|610|987|1597|2584|4181|

 CacheInfo(hits=36, misses=20, maxsize=None, currsize=20)
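
To make the saving concrete, the sketch below counts how many times the function body actually runs with and without the cache. The names `fibo_plain`, `fibo_cached` and the `calls` dictionary are illustrative and not part of the tip above.

```python
from functools import lru_cache

# Track how many times each function body is executed.
calls = {"plain": 0, "cached": 0}

def fibo_plain(n):
    calls["plain"] += 1
    return n if n <= 1 else fibo_plain(n - 1) + fibo_plain(n - 2)

@lru_cache(maxsize=None)
def fibo_cached(n):
    # The body runs only on a cache miss; repeated arguments are served from the cache.
    calls["cached"] += 1
    return n if n <= 1 else fibo_cached(n - 1) + fibo_cached(n - 2)

result_plain = fibo_plain(20)
result_cached = fibo_cached(20)

print(result_plain, result_cached)  # both 6765
print(calls)                        # {'plain': 21891, 'cached': 21}
```

Both versions return the same value, but the uncached version recomputes the same subproblems thousands of times, while the cached version evaluates each argument from 0 to 20 once.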