w3resource
Pandas Tutorial

Pandas Series: factorize() function

Encode the object in Pandas

The factorize() function is used to encode the object as an enumerated type or categorical variable.

This method is useful for obtaining a numeric representation of an array when all that matters is identifying distinct values. factorize is available as a top-level function, pandas.factorize(), and as the methods Series.factorize() and Index.factorize().
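As a rough illustration of the three entry points named above, the sketch below runs the same made-up data through each of them. The sample values are an assumption chosen for the example, not data from the original page.

```python
import pandas as pd

# Hypothetical sample data with repeated values.
data = ["b", "b", "a", "c", "b"]
s = pd.Series(data)

# Method on a Series: returns a (codes, uniques) pair.
codes_method, uniques_method = s.factorize()

# Top-level function applied to the same Series.
codes_func, uniques_func = pd.factorize(s)

# Method on an Index built from the same values.
codes_index, uniques_index = pd.Index(data).factorize()

print(codes_method.tolist())   # [0, 0, 1, 2, 0]
print(list(uniques_method))    # ['b', 'a', 'c']
```

Each distinct value gets one integer code, assigned in order of first appearance, and all three calls produce the same codes for this input.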

Syntax:

Series.factorize(self, sort=False, na_sentinel=-1)

Parameters:

Name | Description | Type / Default Value | Required / Optional
sort | Sort uniques and shuffle labels to maintain the relationship. | boolean, default False | Optional
na_sentinel | Value to mark “not found”. | int, default -1 | Optional
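The effect of sort and of the "not found" marker can be seen in a small sketch. The data is made up for illustration. The example relies only on the default sentinel of -1 and does not pass na_sentinel explicitly, because newer pandas releases replaced that keyword with use_na_sentinel; check the signature of your installed version before passing either one.

```python
import pandas as pd

# Hypothetical data containing one missing value (None).
s = pd.Series(["b", None, "a", "c", "b"])

# Default: codes follow order of first appearance, missing values get -1.
codes, uniques = s.factorize()
print(codes.tolist())   # [0, -1, 1, 2, 0]
print(list(uniques))    # ['b', 'a', 'c']

# sort=True: uniques are sorted and the codes are remapped to match.
codes_sorted, uniques_sorted = s.factorize(sort=True)
print(codes_sorted.tolist())   # [1, -1, 0, 2, 1]
print(list(uniques_sorted))    # ['a', 'b', 'c']
```

In both calls the missing value is left out of uniques and keeps the code -1.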

Returns:

  • labels - ndarray
    An integer ndarray that’s an indexer into uniques. uniques.take(labels) will have the same values as values.
  • uniques - ndarray, Index, or Categorical
    The unique valid values. When values is Categorical, uniques is a Categorical. When values is some other pandas object, an Index is returned. Otherwise, a 1-D ndarray is returned.
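The return types described above, and the uniques.take(labels) round trip, can be checked with a short sketch. The inputs are hypothetical, and the round trip is shown only for data without missing values, since a label of -1 would not index back to a missing entry.

```python
import numpy as np
import pandas as pd

# Plain NumPy input: uniques comes back as a 1-D ndarray.
values = np.array(["b", "b", "a", "c", "b"], dtype=object)
labels, uniques = pd.factorize(values)
restored = uniques.take(labels)          # rebuilds the original values
print(restored.tolist())                 # ['b', 'b', 'a', 'c', 'b']

# Series input: uniques comes back as an Index.
_, series_uniques = pd.Series(values).factorize()
print(isinstance(series_uniques, pd.Index))        # True

# Categorical input: uniques comes back as a Categorical.
cat = pd.Categorical(["a", "a", "c"], categories=["a", "b", "c"])
_, cat_uniques = pd.factorize(cat)
print(isinstance(cat_uniques, pd.Categorical))     # True
```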

Example:
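The original notebook is not reproduced here. As a stand-in, the following minimal sketch factorizes a numeric Series with a custom index; the values and index labels are made up for illustration.

```python
import pandas as pd

# Hypothetical numeric data with string index labels.
s = pd.Series([10, 30, 10, 20, 30], index=["v", "w", "x", "y", "z"])

codes, uniques = s.factorize()
print(codes.tolist())     # [0, 1, 0, 2, 1]
print(uniques.tolist())   # [10, 30, 20]

# The codes are positional, so they can be re-attached to the original index.
encoded = pd.Series(codes, index=s.index)
print(encoded["x"])       # 0
```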
