Examples

In [1]:
import numpy as np
import pandas as pd
In [2]:
df = pd.DataFrame({"P": ["f1", "f1", "f1", "f1", "f1",
                         "b1", "b1", "b1", "b1"],
                   "Q": ["one", "one", "one", "two", "two",
                         "one", "one", "two", "two"],
                   "R": ["small", "large", "large", "small",
                         "small", "large", "small", "small",
                         "large"],
                   "S": [1, 2, 2, 3, 3, 4, 5, 6, 7],
                   "T": [2, 4, 5, 5, 6, 6, 8, 9, 9]})
df
Out[2]:
P Q R S T
0 f1 one small 1 2
1 f1 one large 2 4
2 f1 one large 2 5
3 f1 two small 3 5
4 f1 two small 3 6
5 b1 one large 4 6
6 b1 one small 5 8
7 b1 two small 6 9
8 b1 two large 7 9

Pandas: Dataframe - pivot_table.

This first example aggregates values by taking the sum:

In [3]:
table = pd.pivot_table(df, values='S', index=['P', 'Q'],
                    columns=['R'], aggfunc=np.sum)
table
Out[3]:
R large small
P Q
b1 one 4.0 5.0
two 7.0 6.0
f1 one 4.0 1.0
two NaN 6.0

Pandas: Dataframe - This first example aggregates values by taking the sum.

You can also fill missing values using the fill_value parameter:

In [4]:
table = pd.pivot_table(df, values='S', index=['P', 'Q'],
                    columns=['R'], aggfunc=np.sum, fill_value=0)
table
Out[4]:
R large small
P Q
b1 one 4 5
two 7 6
f1 one 4 1
two 0 6

Pandas: Dataframe - You can also fill missing values using the fill_value parameter.

Following example aggregates by taking the mean across multiple columns.

In [5]:
table = pd.pivot_table(df, values=['S', 'T'], index=['P', 'R'],
                    aggfunc={'S': np.mean,
                             'T': np.mean})
table
Out[5]:
S T
P R
b1 large 5.500000 7.500000
small 5.500000 8.500000
f1 large 2.000000 4.500000
small 2.333333 4.333333

You can also calculate multiple types of aggregations for any
given value column:

table = pd.pivot_table(df, values=['S', 'T'], index=['P', 'R'], aggfunc={'S': np.mean, 'T': [min, max, np.mean]}) table