Examples

In [1]:
import numpy as np
import pandas as pd
In [2]:
df = pd.DataFrame({'num_legs': [2, 4, 8, 0],
                   'num_wings': [2, 0, 0, 0],
                   'num_specimen_seen': [8, 2, 1, 6]},
                  index=['sparrow', 'cat', 'spider', 'snake'])
df
Out[2]:
         num_legs  num_wings  num_specimen_seen
sparrow         2          2                  8
cat             4          0                  2
spider          8          0                  1
snake           0          0                  6

Extract 3 random elements from the Series df['num_legs']. Note that we use random_state
to ensure the reproducibility of the examples.

In [3]:
df['num_legs'].sample(n=3, random_state=1)
Out[3]:
snake      0
spider     8
sparrow    2
Name: num_legs, dtype: int64
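
As a minimal sketch of what random_state provides, the snippet below rebuilds the example frame and draws the same sample twice with the same seed. The names first and second are introduced here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({'num_legs': [2, 4, 8, 0],
                   'num_wings': [2, 0, 0, 0],
                   'num_specimen_seen': [8, 2, 1, 6]},
                  index=['sparrow', 'cat', 'spider', 'snake'])

# Two draws with the same seed select the same rows in the same order.
first = df['num_legs'].sample(n=3, random_state=1)
second = df['num_legs'].sample(n=3, random_state=1)
print(first.equals(second))  # True

# Without replacement (the default), no row is selected twice.
print(first.index.is_unique)  # True
```

Omitting random_state would generally give a different selection on each call.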

A random 50% sample of the DataFrame with replacement:

In [4]:
df.sample(frac=0.5, replace=True, random_state=1)
Out[4]:
       num_legs  num_wings  num_specimen_seen
cat           4          0                  2
snake         0          0                  6
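
The sketch below illustrates how frac relates to the number of rows returned, assuming a reasonably recent pandas release in which frac may exceed 1 when replace=True. The names half and double are hypothetical and used only for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'num_legs': [2, 4, 8, 0],
                   'num_wings': [2, 0, 0, 0],
                   'num_specimen_seen': [8, 2, 1, 6]},
                  index=['sparrow', 'cat', 'spider', 'snake'])

# frac=0.5 of a 4-row frame yields 2 rows.
half = df.sample(frac=0.5, replace=True, random_state=1)
print(len(half))  # 2

# With replacement, the sample may be larger than the frame itself.
double = df.sample(frac=2.0, replace=True, random_state=1)
print(len(double))  # 8

# Upsampling without replacement is rejected.
try:
    df.sample(frac=2.0)
except ValueError as exc:
    print("ValueError:", exc)
```

Because rows are drawn with replacement, the same index label may appear more than once in the result.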

Use a DataFrame column as weights. Rows with larger values in the num_specimen_seen
column are more likely to be sampled.

In [5]:
df.sample(n=2, weights='num_specimen_seen', random_state=1)
Out[5]:
         num_legs  num_wings  num_specimen_seen
sparrow         2          2                  8
snake           0          0                  6
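
To make the effect of the weights more concrete, the following sketch computes the selection probabilities implied by the num_specimen_seen column (pandas normalizes weights that do not sum to 1) and shows that a row with zero weight is never drawn. The names probs and only_winged are assumptions made for this example.

```python
import pandas as pd

df = pd.DataFrame({'num_legs': [2, 4, 8, 0],
                   'num_wings': [2, 0, 0, 0],
                   'num_specimen_seen': [8, 2, 1, 6]},
                  index=['sparrow', 'cat', 'spider', 'snake'])

# Normalized weights: each row's chance of being picked on a single draw.
# The column sums to 17, so sparrow gets 8/17, cat 2/17, and so on.
probs = df['num_specimen_seen'] / df['num_specimen_seen'].sum()
print(probs)

# Only sparrow has a non-zero num_wings value, so with that column as
# weights a single draw can only return sparrow, whatever the seed.
only_winged = df.sample(n=1, weights='num_wings', random_state=1)
print(only_winged.index.tolist())  # ['sparrow']
```

Note that asking for more rows than there are non-zero weights, without replacement, cannot be satisfied, which is why the second draw uses n=1.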