w3resource

Pandas Series: str.split() function

Series-str.split() function

The str.split() function splits each string around a given separator (delimiter).

It splits each string in the Series/Index from the beginning (the left), at the specified delimiter string. It is equivalent to Python's built-in str.split().

Syntax:

Series.str.split(self, pat=None, n=-1, expand=False)

Parameters:

Name | Description | Type / Default Value | Required / Optional
pat | String or regular expression to split on. If not specified, split on whitespace. | str, Default Value: None | Optional
n | Limit on the number of splits in the output. None, 0 and -1 are all interpreted as "return all splits". | int, Default Value: -1 (all) | Optional
expand | Expand the split strings into separate columns. If True, return a DataFrame/MultiIndex, expanding dimensionality. If False, return a Series/Index containing lists of strings. | bool, Default Value: False | Optional

Returns: Series, Index, DataFrame or MultiIndex
Type matches caller unless expand=True
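
Example - A minimal sketch of how the return type follows the caller. The two-element Series and Index below are made up for illustration and are not part of the original examples:

```python
import pandas as pd

# Illustrative data, chosen only for this sketch.
s = pd.Series(["a b", "c d"])
idx = pd.Index(["a b", "c d"])

series_lists = s.str.split()          # Series whose elements are lists
frame = s.str.split(expand=True)      # DataFrame, one column per piece
index_lists = idx.str.split()         # Index whose elements are lists
multi = idx.str.split(expand=True)    # MultiIndex, one level per piece

print(type(series_lists).__name__, type(frame).__name__)
print(type(index_lists).__name__, type(multi).__name__)
```

With expand=False the container type is unchanged (Series stays Series, Index stays Index); with expand=True the result gains a dimension.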

Example - In the default setting, the string is split by whitespace:

Python-Pandas Code:

import numpy as np
import pandas as pd
s = pd.Series(["this is my new pen",
               "https://www.w3resource.com/pandas/index.php",
               np.nan])
s.str.split()

Output:

0                         [this, is, my, new, pen]
1    [https://www.w3resource.com/pandas/index.php]
2                                              NaN
dtype: object

Example - The n parameter can be used to limit the number of splits on the delimiter:

Without the n parameter, the outputs of split and rsplit are identical. When n is given they differ, because split counts its splits from the left and rsplit counts them from the right.

Python-Pandas Code:

import numpy as np
import pandas as pd
s = pd.Series(["this is my new pen",
               "https://www.w3resource.com/pandas/index.php",
               np.nan])
s.str.split(n=2)

Output:

0                           [this, is, my new pen]
1    [https://www.w3resource.com/pandas/index.php]
2                                              NaN
dtype: object
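
Example - A short sketch comparing split and rsplit on the first string from the example above, to show where the limited splits are taken from. The variable names are chosen only for this illustration:

```python
import pandas as pd

s = pd.Series(["this is my new pen"])

# Without n, both methods produce the same pieces.
same = s.str.split()[0] == s.str.rsplit()[0]

# With n=2, split makes its two cuts from the left,
# while rsplit makes them from the right.
from_left = s.str.split(n=2)[0]
from_right = s.str.rsplit(n=2)[0]

print(same)        # True
print(from_left)   # ['this', 'is', 'my new pen']
print(from_right)  # ['this is my', 'new', 'pen']
```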

Example - The pat parameter can be used to split by other characters:

Python-Pandas Code:

import numpy as np
import pandas as pd
s = pd.Series(["this is my new pen",
               "https://www.w3resource.com/pandas/index.php",
               np.nan])
s.str.split(pat="/")

Output:

0                                 [this is my new pen]
1    [https:, , www.w3resource.com, pandas, index.php]
2                                                  NaN
dtype: object

Example - When using expand=True, the split elements will expand out into separate columns. If NaN is present, it is propagated throughout the columns during the split:

Python-Pandas Code:

import numpy as np
import pandas as pd
s = pd.Series(["this is my new pen",
               "https://www.w3resource.com/pandas/index.php",
               np.nan])
s.str.split(expand=True)

Output:

                                             0     1     2     3     4
0                                         this    is    my   new   pen
1  https://www.w3resource.com/pandas/index.php  None  None  None  None
2                                          NaN   NaN   NaN   NaN   NaN
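
Example - One possible way to combine n with expand=True so that the result has a predictable number of columns. This is a sketch only; the column names "first" and "rest" are hypothetical labels chosen for this illustration:

```python
import numpy as np
import pandas as pd

s = pd.Series(["this is my new pen",
               "https://www.w3resource.com/pandas/index.php",
               np.nan])

# n=1 allows at most one split per string, so there are at most two columns.
parts = s.str.split(n=1, expand=True)

# Hypothetical column labels, used only to make the result easier to read.
parts.columns = ["first", "rest"]
print(parts)
```

Rows that contain no whitespace keep the whole string in the first column and get a missing value in the second; the NaN row stays missing in both columns.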

Example - Remember to escape special characters when explicitly using regular expressions:

Python-Pandas Code:

import pandas as pd
s = pd.Series(["1+1=2"])
s.str.split(r"\+|=", expand=True)

Output:

   0  1  2
0  1  1  2
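
Example - If escaping by hand is error-prone, one option is to build the pattern with re.escape from Python's standard library. This is a sketch under the assumption that the delimiters are known literal strings; the delimiters list is an illustrative choice:

```python
import re
import pandas as pd

s = pd.Series(["1+1=2"])

# Literal delimiters to split on (illustrative choice).
delimiters = ["+", "="]

# re.escape adds backslashes where the regex syntax needs them;
# "|" joins the escaped delimiters into one alternation pattern.
pattern = "|".join(re.escape(d) for d in delimiters)

result = s.str.split(pattern, expand=True)
print(result)
```

The resulting pattern has the same meaning as the hand-written r"\+|=" used above.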
