Pandas: Data Manipulation - crosstab() function

Last update on August 19 2022 21:50:33 (UTC/GMT +8 hours)

crosstab() function

The crosstab() function is used to compute a simple cross tabulation of two (or more) factors.

By default, it computes a frequency table of the factors, unless an array of values and an aggregation function are passed.
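The two modes described above can be compared side by side. The following is a minimal sketch; the DataFrame and its column names (city, product, sales) are hypothetical sample data invented for illustration.

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration.
df = pd.DataFrame({
    "city": ["Paris", "Paris", "Rome", "Rome", "Rome"],
    "product": ["A", "B", "A", "A", "B"],
    "sales": [10, 20, 30, 40, 50],
})

# Default behaviour: count how often each (city, product) pair occurs.
counts = pd.crosstab(df["city"], df["product"])

# With values and aggfunc: sum the sales for each (city, product) pair.
totals = pd.crosstab(
    df["city"], df["product"], values=df["sales"], aggfunc="sum"
)

print(counts)
print(totals)
```

In this sketch, the cell for Rome and product A holds 2 in counts (two matching rows) and 70 in totals (30 + 40).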

Syntax:

pandas.crosstab(index, columns, values=None, rownames=None, colnames=None, aggfunc=None, margins=False, margins_name='All', dropna=True, normalize=False)

Parameters:

Name	Description	Type	Default	Required / Optional
index	Values to group by in the rows.	array-like, Series, or list of arrays/Series		Required
columns	Values to group by in the columns.	array-like, Series, or list of arrays/Series		Required
values	Array of values to aggregate according to the factors. Requires aggfunc to be specified.	array-like	Default: None	Optional
rownames	If passed, must match number of row arrays passed.	sequence	Default: None	Optional
colnames	If passed, must match number of column arrays passed.	sequence	Default: None	Optional
aggfunc	Aggregation function applied to values. If specified, requires values to be specified as well.	function	Default: None	Optional
margins	Add row/column margins (subtotals).	bool	Default: False	Optional
margins_name	Name of the row/column that will contain the totals when margins is True.	str	Default: 'All'	Optional
dropna	Do not include columns whose entries are all NaN.	bool	Default: True	Optional
normalize	Normalize by dividing all values by the sum of values. If passed 'all' or True, will normalize over all values. If passed 'index', will normalize over each row. If passed 'columns', will normalize over each column. If margins is True, will also normalize margin values.	bool, {'all', 'index', 'columns'}, or {0,1}	Default: False	Optional
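The margins, margins_name and normalize parameters can be illustrated with a short sketch. The two Series below, and their names row and col, are hypothetical data made up for this example.

```python
import pandas as pd

# Hypothetical factors, invented for this illustration.
a = pd.Series(["x", "x", "y", "y", "y", "y"], name="row")
b = pd.Series(["p", "q", "p", "p", "p", "q"], name="col")

# Add a totals row and column, labelled "Total" instead of the default "All".
with_totals = pd.crosstab(a, b, margins=True, margins_name="Total")

# Normalize over each row: every row of the result sums to 1.
row_share = pd.crosstab(a, b, normalize="index")

# Normalize over all values: the whole table sums to 1.
overall_share = pd.crosstab(a, b, normalize="all")

print(with_totals)
print(row_share)
print(overall_share)
```

Here row y contains three p values and one q value, so its row share for p is 0.75, while its share of all six observations is 0.5.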

Returns: DataFrame - Cross tabulation of the data.

Example:
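A minimal sketch of a cross tabulation over more than two factors, using rownames and colnames to label the axes. The arrays a, b and c are hypothetical data invented for illustration, not taken from the original notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical factors, invented for this illustration.
a = np.array(["bar", "bar", "foo", "foo", "foo"], dtype=object)
b = np.array(["one", "two", "one", "one", "two"], dtype=object)
c = np.array(["dull", "shiny", "dull", "shiny", "dull"], dtype=object)

# One row factor (a) and two column factors (b, c).
# rownames has one entry and colnames has two, matching the arrays passed.
result = pd.crosstab(a, [b, c], rownames=["a"], colnames=["b", "c"])

print(result)
```

Passing a list of arrays for columns produces a two-level column index, so a single cell is addressed with a tuple, for example result.loc["foo", ("one", "shiny")].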

