w3resource

Pandas Series: str.contains() function

Series-str.contains() function

The str.contains() function tests whether a pattern or regex is contained within each string of a Series or Index.

It returns a boolean Series or Index indicating, for each element, whether the pattern or regex was found.

Syntax:

Series.str.contains(pat, case=True, flags=0, na=nan, regex=True)

Parameters:

| Name | Description | Type / Default Value | Required / Optional |
|------|-------------|----------------------|---------------------|
| pat | Character sequence or regular expression. | str | Required |
| case | If True, case sensitive. | bool, Default Value: True | Optional |
| flags | Flags to pass through to the re module, e.g. re.IGNORECASE. | int, Default Value: 0 (no flags) | Optional |
| na | Fill value for missing values. | Default Value: NaN | Optional |
| regex | If True, assumes the pat is a regular expression. If False, treats the pat as a literal string. | bool, Default Value: True | Optional |

Returns: Series or Index of boolean values
A Series or Index of boolean values indicating whether the given pattern is contained within the string of each element of the Series or Index.

Example - Returning a Series of booleans using only a literal pattern:

Python-Pandas Code:

import numpy as np
import pandas as pd
s1 = pd.Series(['Tiger', 'fox', 'house and men', '20', np.nan])
s1.str.contains('ox', regex=False)

Output:

0    False
1     True
2    False
3    False
4      NaN
dtype: object

Example - Returning an Index of booleans using only a literal pattern:

Python-Pandas Code:

import numpy as np
import pandas as pd
ind = pd.Index(['Tiger', 'fox', 'house and men', '20.0', np.nan])
# A literal match on an Index returns an Index of booleans
ind.str.contains('20', regex=False)

Output:

Index([False, False, False, True, nan], dtype='object')

Example - Specifying case sensitivity using case:

Python-Pandas Code:

import numpy as np
import pandas as pd
s1 = pd.Series(['Tiger', 'fox', 'house and men', '20', np.nan])
# case=True (the default) is case sensitive, so 'oX' does not match 'fox'
s1.str.contains('oX', case=True, regex=True)

Output:

0    False
1    False
2    False
3    False
4      NaN
dtype: object
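
For contrast, the sketch below repeats the same search with case=False, which makes the match case insensitive. It reuses the s1 data from the example above; the variable name result is only illustrative.

```python
import numpy as np
import pandas as pd

s1 = pd.Series(['Tiger', 'fox', 'house and men', '20', np.nan])

# case=False ignores letter case, so 'oX' now matches 'fox'.
result = s1.str.contains('oX', case=False, regex=True)
print(result)
```

Only the element 'fox' matches here; how the missing entry is reported depends on the na argument.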

Example - Specifying na to be False instead of NaN replaces NaN values with False. If the Series or Index does not contain NaN values, the resulting dtype is bool; otherwise it is object.

Python-Pandas Code:

import numpy as np
import pandas as pd
s1 = pd.Series(['Tiger', 'fox', 'house and men', '20', np.nan])
# na=False reports the missing entry as False instead of NaN
s1.str.contains('ox', na=False, regex=True)

Output:

0    False
1     True
2    False
3    False
4    False
dtype: bool
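
A common reason to pass na=False is to use the result as a boolean mask. The following is a minimal sketch of that idea, assuming the same s1 data; the names mask and matches are illustrative and not part of the pandas API.

```python
import numpy as np
import pandas as pd

s1 = pd.Series(['Tiger', 'fox', 'house and men', '20', np.nan])

# na=False turns the missing entry into False, so the mask has no NaN.
mask = s1.str.contains('ox', na=False, regex=False)

# Boolean indexing keeps only the elements where the mask is True.
matches = s1[mask]
print(matches)
```

With an object-dtype result that still contains NaN, boolean indexing generally fails, which is why filling missing values first is the safer choice.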

Example - Returning True when either ‘house’ or ‘fox’ occurs in a string:

Python-Pandas Code:

import numpy as np
import pandas as pd
s1 = pd.Series(['Tiger', 'fox', 'house and men', '20', np.nan])
# '|' is regex alternation: match 'house' or 'fox'
s1.str.contains('house|fox', regex=True)

Output:

0    False
1     True
2     True
3    False
4      NaN
dtype: object
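
When the alternatives come from a list rather than a hand-written pattern, one possible approach is to join them with '|' after escaping each term with re.escape, so that regex metacharacters in a term are matched literally. This is a sketch under that assumption; the list words and the variable pattern are hypothetical names.

```python
import re

import numpy as np
import pandas as pd

s1 = pd.Series(['Tiger', 'fox', 'house and men', '20', np.nan])

# Hypothetical list of search terms.
words = ['house', 'fox']

# Escape each term, then join the terms with the regex alternation operator.
pattern = '|'.join(re.escape(w) for w in words)

# na=False reports the missing entry as False.
result = s1.str.contains(pattern, regex=True, na=False)
print(pattern)
print(result)
```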

Example - Ignoring case sensitivity using flags with regex:

Python-Pandas Code:

import re
import numpy as np
import pandas as pd
s1 = pd.Series(['Tiger', 'fox', 'house and men', '20', np.nan])
# re.IGNORECASE lets 'MEN' match 'men'
s1.str.contains('MEN', flags=re.IGNORECASE, regex=True)

Output:

0    False
1    False
2     True
3    False
4      NaN
dtype: object

Example - Matching any digit using a regular expression:

Python-Pandas Code:

import numpy as np
import pandas as pd
s1 = pd.Series(['Tiger', 'fox', 'house and men', '20', np.nan])
# Raw string so that \d reaches the regex engine unchanged
s1.str.contains(r'\d', regex=True)

Output:

0    False
1    False
2    False
3     True
4      NaN
dtype: object

Ensure pat is not a literal pattern when regex is set to True. Note that in the following example one might expect only s2[1] and s2[3] to return True. However, ‘.0’ as a regex matches any character followed by a 0.

Python-Pandas Code:

import pandas as pd
s2 = pd.Series(['60', '60.0', '61', '61.0', '45'])
# As a regex, '.0' means "any character followed by 0"
s2.str.contains('.0', regex=True)

Output:

0     True
1     True
2    False
3     True
4    False
dtype: bool
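
To match ‘.0’ literally, two options are to pass regex=False or to escape the pattern with re.escape. The sketch below shows both on the same s2 data; the variable names literal and escaped are illustrative.

```python
import re

import pandas as pd

s2 = pd.Series(['60', '60.0', '61', '61.0', '45'])

# Option 1: treat the pattern as a plain substring.
literal = s2.str.contains('.0', regex=False)

# Option 2: keep regex=True but escape the dot so it matches only '.'.
escaped = s2.str.contains(re.escape('.0'), regex=True)

print(literal)
print(escaped)
```

Both calls mark only s2[1] and s2[3] as True.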
