Examples

In [1]:
import numpy as np
import pandas as pd
In [2]:
s = pd.Series([2, 3, 4])
s.describe()
Out[2]:
count    3.0
mean     3.0
std      1.0
min      2.0
25%      2.5
50%      3.0
75%      3.5
max      4.0
dtype: float64
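
Each row of the numeric summary above corresponds to an ordinary Series method. The following is a minimal sketch, not the library's internal implementation; the name `manual` is introduced here only for illustration.

```python
import pandas as pd

s = pd.Series([2, 3, 4])

# Recompute each summary statistic with the matching Series method.
manual = pd.Series({
    "count": float(s.count()),   # number of non-missing values
    "mean": s.mean(),
    "std": s.std(),              # sample standard deviation (ddof=1)
    "min": float(s.min()),
    "25%": s.quantile(0.25),     # linear interpolation between data points
    "50%": s.quantile(0.50),     # the median
    "75%": s.quantile(0.75),
    "max": float(s.max()),
})

print(manual)
```

Note that `std` is the sample standard deviation, which is why three evenly spaced values give exactly 1.0.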

Describing a categorical Series:

In [3]:
s = pd.Series(['p', 'p', 'q', 'r'])
s.describe()
Out[3]:
count     4
unique    3
top       p
freq      2
dtype: object
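
The four rows reported for non-numeric data can be reproduced with `count()`, `nunique()` and `value_counts()`. This is a small sketch under the assumption that one value is clearly the most frequent, as it is here; when several values tie, which one is reported as `top` may vary.

```python
import pandas as pd

s = pd.Series(['p', 'p', 'q', 'r'])

counts = s.value_counts()   # sorted by frequency, most common first

count = s.count()           # non-missing values
unique = s.nunique()        # number of distinct values
top = counts.index[0]       # most frequent value
freq = counts.iloc[0]       # how many times it occurs

print(count, unique, top, freq)
```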

Describing a timestamp Series:

In [4]:
s = pd.Series([
  np.datetime64("2018-02-01"),
  np.datetime64("2019-02-01"),
  np.datetime64("2019-02-01")
])
s.describe()
Out[4]:
count                       3
unique                      2
top       2019-02-01 00:00:00
freq                        2
first     2018-02-01 00:00:00
last      2019-02-01 00:00:00
dtype: object
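
The `unique`, `top`, `freq`, `first` and `last` rows shown above come from older pandas releases. From pandas 2.0 onward, datetime data is summarized like numeric data (count, mean, min, quartiles, max), so the output may look different on your installation. The sketch below computes the same six values by hand and should not depend on the pandas version; the `summary` dictionary is an illustrative name, not part of the pandas API.

```python
import pandas as pd

s = pd.Series(pd.to_datetime(["2018-02-01", "2019-02-01", "2019-02-01"]))

counts = s.value_counts()   # most frequent timestamp first

summary = {
    "count": int(s.count()),
    "unique": int(s.nunique()),
    "top": counts.index[0],        # most frequent timestamp
    "freq": int(counts.iloc[0]),   # how often it occurs
    "first": s.min(),              # earliest timestamp
    "last": s.max(),               # latest timestamp
}

for name, value in summary.items():
    print(name, value)
```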

Describing a DataFrame. By default only numeric fields are returned:

In [5]:
df = pd.DataFrame({'categorical': pd.Categorical(['s','t','u']),
                   'numeric': [2, 3, 4],
                   'object': ['p', 'q', 'r']
                  })
df.describe()
Out[5]:
       numeric
count      3.0
mean       3.0
std        1.0
min        2.0
25%        2.5
50%        3.0
75%        3.5
max        4.0

Describing all columns of a DataFrame regardless of data type:

In [6]:
df.describe(include='all')
Out[6]:
       categorical  numeric object
count            3      3.0      3
unique           3      NaN      3
top              u      NaN      q
freq             1      NaN      1
mean           NaN      3.0    NaN
std            NaN      1.0    NaN
min            NaN      2.0    NaN
25%            NaN      2.5    NaN
50%            NaN      3.0    NaN
75%            NaN      3.5    NaN
max            NaN      4.0    NaN

Describing a column from a DataFrame by accessing it as an attribute:

In [7]:
df.numeric.describe()
Out[7]:
count    3.0
mean     3.0
std      1.0
min      2.0
25%      2.5
50%      3.0
75%      3.5
max      4.0
Name: numeric, dtype: float64
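
Attribute access is convenient, but it only works when the column name is a valid Python identifier that does not clash with an existing DataFrame attribute. The sketch below uses a hypothetical column named `count` (not part of the example above) to show why bracket access is the safer general form.

```python
import pandas as pd

# 'count' is a hypothetical column chosen because it collides
# with the DataFrame.count method.
df = pd.DataFrame({'numeric': [2, 3, 4], 'count': [10, 20, 30]})

# Both forms select the same column, so the summaries are identical.
by_attribute = df.numeric.describe()
by_key = df['numeric'].describe()

# df.count refers to the method, not the column, so use brackets here.
count_summary = df['count'].describe()

print(by_attribute.equals(by_key))
print(count_summary['mean'])
```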

Including only numeric columns in a DataFrame description:

In [8]:
df.describe(include=[np.number])
Out[8]:
       numeric
count      3.0
mean       3.0
std        1.0
min        2.0
25%        2.5
50%        3.0
75%        3.5
max        4.0
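
The `include` and `exclude` arguments follow the same dtype conventions as `DataFrame.select_dtypes`, so that method is a quick way to preview which columns a given filter keeps. A minimal sketch, with `selected`, `described` and `excluded` as illustrative variable names:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'categorical': pd.Categorical(['s', 't', 'u']),
                   'numeric': [2, 3, 4],
                   'object': ['p', 'q', 'r']})

# Columns kept by a numeric-only filter.
selected = list(df.select_dtypes(include=[np.number]).columns)
described = list(df.describe(include=[np.number]).columns)

# Columns kept when numeric columns are excluded instead.
excluded = list(df.select_dtypes(exclude=[np.number]).columns)

print(selected, described, excluded)
```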

Including only string columns in a DataFrame description:

In [9]:
# np.object was removed in NumPy 1.24; use the builtin object instead
df.describe(include=[object])
Out[9]:
       object
count       3
unique      3
top         q
freq        1

Including only categorical columns in a DataFrame description:

In [10]:
df.describe(include=['category'])
Out[10]:
       categorical
count            3
unique           3
top              u
freq             1

Excluding numeric columns from a DataFrame description:

In [11]:
df.describe(exclude=[np.number])
Out[11]:
       categorical object
count            3      3
unique           3      3
top              u      q
freq             1      1

Excluding object columns from a DataFrame description:

In [12]:
# np.object was removed in NumPy 1.24; use the builtin object instead
df.describe(exclude=[object])
Out[12]:
       categorical  numeric
count            3      3.0
unique           3      NaN
top              u      NaN
freq             1      NaN
mean           NaN      3.0
std            NaN      1.0
min            NaN      2.0
25%            NaN      2.5
50%            NaN      3.0
75%            NaN      3.5
max            NaN      4.0