pivot_table() function

In [1]:
import numpy as np
import pandas as pd
In [2]:
df = pd.DataFrame({"P": ["s1", "s1", "s1",
                         "b1", "b1", "b1", "b1"],
                   "Q": ["one", "one", "one",
                         "one", "one", "two", "two"],
                   "R": ["small", "large", "large", "small",
                         "small", "large", "small"],
                   "S": [2, 2, 3, 3, 4, 5, 6],
                   "T": [4, 5, 5, 7, 8, 8, 9]})
df
Out[2]:
P Q R S T
0 s1 one small 2 4
1 s1 one large 2 5
2 s1 one large 3 5
3 b1 one small 3 7
4 b1 one small 4 8
5 b1 two large 5 8
6 b1 two small 6 9

This first example aggregates values by taking the sum.

In [3]:
table = pd.pivot_table(df, values='S', index=['P', 'Q'],
                     columns=['R'], aggfunc=np.sum)
table
Out[3]:
R large small
P Q
b1 one NaN 7.0
two 5.0 6.0
s1 one 5.0 2.0

You can fill missing values using the fill_value parameter:

In [4]:
table = pd.pivot_table(df, values='S', index=['P', 'Q'],
                    columns=['R'], aggfunc=np.sum, fill_value=0)
table
Out[4]:
R large small
P Q
b1 one 0 7
two 5 6
s1 one 5 2

Following example aggregates by taking the mean across multiple columns:

In [5]:
table = pd.pivot_table(df, values=['S', 'T'], index=['P', 'R'],
                    aggfunc={'S': np.mean,
                             'T': np.mean})
table
Out[5]:
S T
P R
b1 large 5.000000 8
small 4.333333 8
s1 large 2.500000 5
small 2.000000 4

You can calculate multiple types of aggregations for any given value column.

In [6]:
table = pd.pivot_table(df, values=['S', 'T'], index=['P', 'R'],
                     aggfunc={'S': np.mean,
                              'T': [min, max, np.mean]})
table
Out[6]:
S T
mean max mean min
P R
b1 large 5.000000 8 8 8
small 4.333333 9 8 7
s1 large 2.500000 5 5 5
small 2.000000 4 4 4