Examples

In [1]:
import numpy as np
import pandas as pd
In [2]:
dtypes = ['int64', 'float64', 'complex128', 'object', 'bool']
data = {t: np.ones(shape=3000).astype(t) for t in dtypes}
In [3]:
df = pd.DataFrame(data)
df.head()
Out[3]:
   int64  float64  complex128 object  bool
0      1      1.0      (1+0j)      1  True
1      1      1.0      (1+0j)      1  True
2      1      1.0      (1+0j)      1  True
3      1      1.0      (1+0j)      1  True
4      1      1.0      (1+0j)      1  True
In [4]:
df.memory_usage()
Out[4]:
Index            80
int64         24000
float64       24000
complex128    48000
object        24000
bool           3000
dtype: int64
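Each per-column figure is simply the dtype's itemsize multiplied by the length; a minimal check of the 48000-byte complex128 row (a sketch, not part of the original example):

```python
import numpy as np
import pandas as pd

s = pd.Series(np.ones(shape=3000).astype('complex128'))

# complex128 stores 16 bytes per element: 16 * 3000 == 48000
assert s.dtype.itemsize == 16
assert s.memory_usage(index=False) == s.dtype.itemsize * len(s) == 48000
```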
In [5]:
df.memory_usage(index=False)
Out[5]:
int64         24000
float64       24000
complex128    48000
object        24000
bool           3000
dtype: int64
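The result is an ordinary Series, so the per-column figures can be summed for a total footprint; a sketch using the same DataFrame (assuming a 64-bit platform, where object pointers take 8 bytes):

```python
import numpy as np
import pandas as pd

dtypes = ['int64', 'float64', 'complex128', 'object', 'bool']
df = pd.DataFrame({t: np.ones(shape=3000).astype(t) for t in dtypes})

# 24000 + 24000 + 48000 + 24000 + 3000 bytes, excluding the index
total = df.memory_usage(index=False).sum()
print(total)  # 123000
```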

By default, the figure for object dtype columns counts only the 8-byte array pointers, not the Python objects they reference. Pass deep=True to include them:

In [6]:
df.memory_usage(deep=True)
Out[6]:
Index            80
int64         24000
float64       24000
complex128    48000
object        96000
bool           3000
dtype: int64
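The jump from 24000 to 96000 bytes for the object column is the pointer array plus sys.getsizeof of every referenced Python float (24 bytes each for 1.0). A sketch of that relationship (assuming CPython on a 64-bit platform):

```python
import sys

import numpy as np
import pandas as pd

obj_col = pd.Series(np.ones(shape=3000).astype('object'))

shallow = obj_col.memory_usage(index=False)          # pointers only
deep = obj_col.memory_usage(index=False, deep=True)  # pointers + objects

# deep=True adds sys.getsizeof of each element on top of the pointer array
assert deep == shallow + sum(sys.getsizeof(v) for v in obj_col)
```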

Use a Categorical for efficient storage of an object-dtype column with many repeated values:

In [7]:
df['object'].astype('category').memory_usage(deep=True)
Out[7]:
3168
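The 3168 bytes are mostly 3000 one-byte category codes: with a single distinct value the codes fit in int8, and only one copy of the value is stored in the categories. A sketch (exact totals vary slightly across pandas versions):

```python
import numpy as np
import pandas as pd

obj_col = pd.Series(np.ones(shape=3000).astype('object'))
cat_col = obj_col.astype('category')

# One code per row, one byte each, plus a single stored category value
assert cat_col.cat.codes.dtype == np.int8
assert len(cat_col.cat.categories) == 1
assert cat_col.memory_usage(deep=True) < obj_col.memory_usage(deep=True)
```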