Pandas: Data Manipulation - get_dummies() function

get_dummies() function

The get_dummies() function converts categorical variables into dummy/indicator variables: each distinct category becomes its own column, which marks the rows that belong to that category.
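As a minimal sketch of that behaviour, the snippet below encodes a small Series. The sample values are made up for illustration, and dtype=int is passed explicitly so the result looks the same regardless of the installed pandas version.

```python
import pandas as pd

# Illustrative data: four observations drawn from three categories.
s = pd.Series(["a", "b", "a", "c"])

# One indicator column is produced per distinct category.
dummies = pd.get_dummies(s, dtype=int)
print(dummies)
#    a  b  c
# 0  1  0  0
# 1  0  1  0
# 2  1  0  0
# 3  0  0  1
```

Each row contains exactly one 1, in the column named after that row's category.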


Syntax:

pandas.get_dummies(data, prefix=None, prefix_sep='_', dummy_na=False, columns=None, sparse=False, drop_first=False, dtype=None)


Parameters:

Name | Description | Type | Default Value | Required / Optional
data | Data from which to get dummy indicators. | array-like, Series, or DataFrame | - | Required
prefix | String to prepend to the names of the generated dummy columns. | str, list of str, or dict of str | None | Optional
prefix_sep | Separator/delimiter to use between the prefix and the category value. A list or dictionary may also be passed, as with prefix. | str | '_' | Optional
dummy_na | Add a column to indicate NaNs; if False, NaNs are ignored. | bool | False | Optional
columns | Column names in the DataFrame to be encoded. If columns is None, all columns with object or category dtype are converted. | list-like | None | Optional
sparse | Whether the dummy-encoded columns should be backed by a SparseArray (True) or a regular NumPy array (False). | bool | False | Optional
drop_first | Whether to get k-1 dummies out of k categorical levels by removing the first level. | bool | False | Optional
dtype | Data type for the new columns. Only a single dtype is allowed. | dtype | None (resolves to np.uint8 in pandas versions before 2.0 and to bool from 2.0 onward) | Optional
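The following sketch shows how several of these parameters interact on a DataFrame. The column names and values are hypothetical, chosen only for illustration, and dtype=int is again passed so the output does not depend on the pandas version.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two categorical columns and one numeric column.
df = pd.DataFrame({
    "color": ["red", "green", np.nan, "red"],
    "size": ["S", "M", "L", "S"],
    "price": [10, 12, 9, 11],
})

# columns=["color"] encodes only that column; "size" and "price" pass through unchanged.
# prefix and prefix_sep control the new column names, and dummy_na=True adds
# an extra indicator column for the missing value.
encoded = pd.get_dummies(
    df, columns=["color"], prefix="c", prefix_sep="=", dummy_na=True, dtype=int
)
print(encoded.columns.tolist())

# drop_first=True keeps k-1 of the k levels; the first level in sorted
# order ("L" here) is the one removed.
reduced = pd.get_dummies(df["size"], drop_first=True, dtype=int)
print(reduced.columns.tolist())
# ['M', 'S']
```

Dropping the first level is a common choice when the dummies feed a linear model, since the full set of k indicator columns always sums to 1 and is therefore collinear with an intercept term.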

Returns: DataFrame - Dummy-coded data.
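The sketch below checks the return type and shows how dtype and sparse affect the returned columns. It assumes a reasonably recent pandas (1.0 or later, where pd.SparseDtype is available at the top level); the input values are illustrative.

```python
import pandas as pd

s = pd.Series(["a", "b", "c", "a"])

# Even for Series input, the result is a DataFrame.
default = pd.get_dummies(s)
print(type(default).__name__)
# The default column dtype depends on the installed pandas version,
# so it is printed here rather than assumed.
print(default.dtypes.unique())

# An explicit dtype applies to every generated column.
as_float = pd.get_dummies(s, dtype=float)

# sparse=True backs each dummy column with a SparseArray.
sparse = pd.get_dummies(s, sparse=True)
print(isinstance(sparse.dtypes.iloc[0], pd.SparseDtype))
```

Passing dtype explicitly is a simple way to keep results consistent across environments that may run different pandas versions.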


