Series is a one-dimensional labeled array capable of holding any data type (integers, strings, floating point
numbers, Python objects, etc.). The axis labels are collectively referred to as the index.

In [1]:
import numpy as np
import pandas as pd
In [ ]:
s = pd.Series(data, index=index)

Here, data can be many different things, including:

  • a Python dict
  • an ndarray
  • a scalar value

From ndarray

If data is an ndarray, index must be the same length as data. If no index is passed, one will be created having
values [0, ..., len(data) - 1].

In [2]:
s = pd.Series(np.random.randn(6), index=['p', 'q', 'r', 'n', 't', 'v'])
In [3]:
s
Out[3]:
p   -0.310263
q   -0.703727
r    0.760450
n    0.350622
t    0.195871
v    0.739086
dtype: float64
In [4]:
s.index
Out[4]:
Index(['p', 'q', 'r', 'n', 't', 'v'], dtype='object')
In [5]:
pd.Series(np.random.randn(6))
Out[5]:
0   -1.049184
1   -0.524355
2    0.659975
3   -1.122864
4    1.387395
5    0.514023
dtype: float64
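
To make the length rule concrete, here is a minimal sketch that passes an index shorter than the data and catches the resulting error. The names `values`, `ok`, and `error_message` are illustrative and do not appear in the original notebook.

```python
import numpy as np
import pandas as pd

values = np.arange(3)  # three data points

# An index of matching length is accepted.
ok = pd.Series(values, index=['p', 'q', 'r'])

# An index of a different length is rejected with a ValueError.
error_message = None
try:
    pd.Series(values, index=['p', 'q'])
except ValueError as exc:
    error_message = str(exc)

print(ok.index.tolist())
print(error_message)
```

The exact wording of the error message may differ between pandas versions, so it is printed here rather than compared against a fixed string.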

From dict

Series can be instantiated from dicts:

In [6]:
n = {'q': 1, 'p': 2, 'r': 3}
In [7]:
pd.Series(n)
Out[7]:
q    1
p    2
r    3
dtype: int64

In the example above, if you were on a Python version lower than 3.6 or a Pandas version lower than 0.23,
the Series would be ordered by the lexical order of the dict keys (i.e. ['p', 'q', 'r'] rather than ['q', 'p', 'r']).
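
As a small illustration of the ordering note, the sketch below assumes Python 3.6 or later and pandas 0.23 or later. The names `by_insertion` and `by_label` are illustrative only.

```python
import pandas as pd

n = {'q': 1, 'p': 2, 'r': 3}

# On recent Python and pandas versions the dict's insertion order is kept.
by_insertion = pd.Series(n)

# Sorting by label reproduces the ordering older versions would have given.
by_label = by_insertion.sort_index()

print(by_insertion.index.tolist())
print(by_label.index.tolist())
```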

If an index is passed, the values in data corresponding to the labels in the index will be pulled out.

In [8]:
n = {'p': 2., 'q': 1., 'r': 3.}
In [9]:
pd.Series(n)
Out[9]:
p    2.0
q    1.0
r    3.0
dtype: float64
In [10]:
pd.Series(n, index=['q', 'r', 'n', 'p'])
Out[10]:
q    1.0
r    3.0
n    NaN
p    2.0
dtype: float64
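
One consequence worth seeing in isolation: because a missing label is filled with NaN, a dict of integers can produce a float Series. The sketch below uses illustrative names (`counts`, `full`, `subset`) that are not part of the original notebook.

```python
import pandas as pd

counts = {'p': 2, 'q': 1, 'r': 3}  # integer values only

# Without an index, every key is kept and the values stay integers.
full = pd.Series(counts)

# With an index, only the requested labels are pulled out.
# 'n' is not a key, so its value is NaN and the dtype becomes float64.
subset = pd.Series(counts, index=['q', 'n'])

print(full.dtype)
print(subset.dtype)
print(subset.isna().tolist())
```

Labels present in the dict but absent from the index (here 'p' and 'r') are simply dropped.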

From scalar value

If data is a scalar value, an index must be provided. The value will be repeated to match the length of index.

In [11]:
pd.Series(4., index=['p', 'q', 'r', 'n', 't'])
Out[11]:
p    4.0
q    4.0
r    4.0
n    4.0
t    4.0
dtype: float64

Series is ndarray-like

Series acts very similarly to an ndarray, and is a valid argument to most NumPy functions. However, operations
such as slicing will also slice the index.

In [12]:
import numpy as np
import pandas as pd
In [13]:
s = pd.Series(np.random.randn(6), index=['p', 'q', 'r', 'n', 't', 'v'])
In [14]:
s.iloc[0]
Out[14]:
-1.0264054091334087
In [15]:
s[:4]
Out[15]:
p   -1.026405
q   -0.549446
r    0.105166
n    1.237134
dtype: float64
In [16]:
s[s > s.median()]
Out[16]:
r    0.105166
n    1.237134
v    1.099714
dtype: float64
In [17]:
s.iloc[[5, 4, 3]]
Out[17]:
v    1.099714
t   -0.357001
n    1.237134
dtype: float64
In [18]:
np.exp(s)
Out[18]:
p    0.358293
q    0.577270
r    1.110894
n    3.445726
t    0.699772
v    3.003308
dtype: float64
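
When a Series has string labels, integer keys inside plain `[]` depend on a positional fallback that newer pandas releases warn about. The sketch below uses the explicit `.iloc` (position) and `.loc` (label) accessors instead. It uses fixed values rather than random ones, and the variable names are illustrative.

```python
import pandas as pd

s = pd.Series([10.0, 20.0, 30.0, 40.0], index=['p', 'q', 'r', 'n'])

first = s.iloc[0]              # select by position
by_label = s.loc['r']          # select by label
head = s.iloc[:2]              # positions 0 and 1; the index is sliced too
label_slice = s.loc['q':'r']   # label slices include both endpoints

print(first, by_label)
print(head.index.tolist())
print(label_slice.tolist())
```

Note that a positional slice excludes its stop position, while a label slice includes its stop label.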

Like a NumPy array, a pandas Series has a dtype.

In [19]:
s.dtype
Out[19]:
dtype('float64')

If you need the actual array backing a Series, use Series.array.

In [20]:
s.array
Out[20]:
<PandasArray>
[-1.0264054091334087,  -0.549445701791565, 0.10516552598880402,
  1.2371344986220967, -0.3570011032982442,  1.0997143297297525]
Length: 6, dtype: float64

Accessing the array can be useful when you need to do some operation without the index.
While Series is ndarray-like, if you need an actual ndarray, then use Series.to_numpy().

In [21]:
s.to_numpy()
Out[21]:
array([-1.02640541, -0.5494457 ,  0.10516553,  1.2371345 , -0.3570011 ,
        1.09971433])

Even if the Series is backed by an ExtensionArray, Series.to_numpy() will return a NumPy ndarray.
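
A minimal sketch of that last point, assuming a pandas version that provides the nullable "Int64" extension dtype. The names `nullable`, `as_object`, and `as_float` are illustrative.

```python
import numpy as np
import pandas as pd

# "Int64" (capital I) is a nullable integer dtype backed by an ExtensionArray.
nullable = pd.Series([1, 2, None], dtype="Int64")

# to_numpy() still hands back a plain NumPy ndarray.
as_object = nullable.to_numpy()

# The dtype and na_value arguments control how missing values are converted.
as_float = nullable.to_numpy(dtype="float64", na_value=np.nan)

print(type(nullable.array).__name__)
print(type(as_object).__name__, as_object.dtype)
print(as_float)
```

Passing `dtype` and `na_value` explicitly is useful when downstream NumPy code expects a numeric array rather than one containing pandas' missing-value marker.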

Series is dict-like

A Series is like a fixed-size dict in that you can get and set values by index label:

In [22]:
import numpy as np
import pandas as pd
In [23]:
s = pd.Series(np.random.randn(6), index=['p', 'q', 'r', 'n', 't', 'v'])
In [24]:
s['q']
Out[24]:
-0.9278660706287409
In [25]:
s['n'] = 10.
In [26]:
s
Out[26]:
p    -0.471729
q    -0.927866
r    -0.086945
n    10.000000
t     0.593117
v    -1.245147
dtype: float64
In [27]:
'n' in s
Out[27]:
True
In [28]:
'd' in s
Out[28]:
False

If a label is not contained in the index, an exception is raised:

In [ ]:
s['d']
KeyError: 'd'

Using the get method, a missing label will return None or a specified default:

In [29]:
s.get('d')
In [30]:
s.get('d', np.nan)
Out[30]:
nan
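
The three lookup behaviours can be compared side by side. This is a small sketch with fixed values, and the variable names are illustrative.

```python
import pandas as pd

s = pd.Series([1.0, 2.0], index=['p', 'q'])

present = s.get('p')        # label exists: its value is returned
missing = s.get('d')        # label absent: None is returned
fallback = s.get('d', 0.0)  # label absent: the supplied default is returned

# Square-bracket lookup raises instead of returning a default.
try:
    s['d']
    raised = False
except KeyError:
    raised = True

print(present, missing, fallback, raised)
```

`get` suits optional labels, while `[]` is the better choice when a missing label should be treated as an error.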

Vectorized operations and label alignment with Series

Arithmetic on a Series is vectorized, so looping through it value by value is usually unnecessary. Series can
also be passed into most NumPy methods expecting an ndarray.

In [31]:
s + s
Out[31]:
p    -0.943459
q    -1.855732
r    -0.173890
n    20.000000
t     1.186234
v    -2.490294
dtype: float64
In [32]:
s * 2
Out[32]:
p    -0.943459
q    -1.855732
r    -0.173890
n    20.000000
t     1.186234
v    -2.490294
dtype: float64
In [33]:
np.exp(s)
Out[33]:
p        0.623922
q        0.395397
r        0.916728
n    22026.465795
t        1.809620
v        0.287899
dtype: float64

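A key difference between Series and ndarray is that operations between Series automatically align the data based
on label. The result has the union of the two indexes, and any label missing from either operand is marked NaN.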

In [34]:
s[2:] + s[:-2]
Out[34]:
n    20.00000
p         NaN
q         NaN
r    -0.17389
t         NaN
v         NaN
dtype: float64
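
The alignment behaviour is easier to follow with fixed values. The sketch below adds two Series whose labels only partly overlap, then repeats the addition with `Series.add` and `fill_value=0` so that a label missing on one side counts as zero. The names `a`, `b`, `aligned`, and `filled` are illustrative.

```python
import pandas as pd

a = pd.Series([1, 2, 3], index=['p', 'q', 'r'])
b = pd.Series([10, 20, 30], index=['q', 'r', 'n'])

# Values are matched by label; labels found in only one operand give NaN.
aligned = a + b

# fill_value substitutes 0 for a label that is missing on one side.
filled = a.add(b, fill_value=0)

print(aligned)
print(filled)
```

Only 'q' and 'r' appear in both inputs, so `aligned` holds 12 and 23 for those labels and NaN for 'n' and 'p'.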

Name attribute

Series can also have a name attribute:

In [35]:
s = pd.Series(np.random.randn(6), name='research')
In [36]:
s
Out[36]:
0    0.765820
1   -1.014433
2    1.185444
3   -0.028960
4   -1.748811
5   -1.244340
Name: research, dtype: float64
In [37]:
s.name
Out[37]:
'research'

You can rename a Series with the pandas.Series.rename() method.

In [38]:
s2 = s.rename("search")
In [39]:
s2.name
Out[39]:
'search'

Note that s and s2 refer to different objects.
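
A quick check of that statement, using fixed values in place of random ones:

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0], name='research')
s2 = s.rename('search')

print(s.name)      # the original keeps its name
print(s2.name)     # the renamed Series carries the new name
print(s2 is s)     # rename returned a new object
```

Because rename returns a new Series by default, the result has to be assigned to a variable for the new name to be kept.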