Pandas Series: factorize() function

Encode the object in Pandas

The factorize() function is used to encode the object as an enumerated type or categorical variable.

This method is useful for obtaining a numeric representation of an array when all that matters is identifying distinct values. factorize is available both as a top-level function, pandas.factorize(), and as the methods Series.factorize() and Index.factorize().
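As a minimal sketch of the idea above, the following example encodes a small Series of strings. The sample values and variable names are illustrative assumptions, not taken from the original page.

```python
import pandas as pd

# Illustrative data: five values, three of them distinct.
s = pd.Series(["b", "b", "a", "c", "b"])

# Method form: returns (integer codes, distinct values).
codes, uniques = s.factorize()
print(codes)    # one integer per element, numbered in order of first appearance
print(uniques)  # the distinct values, here as an Index

# Top-level function form, applied to the same Series.
top_codes, top_uniques = pd.factorize(s)
```

Both forms should give the same encoding here: "b" is seen first and gets code 0, "a" gets 1, and "c" gets 2.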


Series.factorize(self, sort=False, na_sentinel=-1)


Name | Description | Type / Default Value | Required / Optional
sort | Sort uniques and shuffle labels to maintain the relationship. | boolean, default False | Optional
na_sentinel | Value to mark “not found”. | int, default -1 | Optional
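The effect of sort and of the “not found” marker can be seen in a short sketch. The data below is an illustrative assumption; the example relies only on the default marker of -1 and does not pass na_sentinel explicitly, since that keyword may not be accepted by every pandas version.

```python
import pandas as pd

# Illustrative data containing one missing value.
s = pd.Series(["b", None, "a", "c", "b"])

# Default: uniques keep their order of first appearance.
codes_unsorted, uniques_unsorted = s.factorize()

# sort=True: uniques are sorted, and the codes are remapped to match.
codes_sorted, uniques_sorted = s.factorize(sort=True)

print(codes_unsorted, list(uniques_unsorted))
print(codes_sorted, list(uniques_sorted))
```

In both calls the missing value is coded as -1 and is left out of the uniques. Only the numbering of the remaining values changes when sort=True.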


  • labels - ndarray
    An integer ndarray that’s an indexer into uniques. uniques.take(labels) will have the same values as values.
  • uniques - ndarray, Index, or Categorical
    The unique valid values. When values is Categorical, uniques is a Categorical. When values is some other pandas object, an Index is returned. Otherwise, a 1-D ndarray is returned.
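The return values described above can be checked with a small sketch. The inputs are illustrative assumptions; the example shows the uniques.take(labels) round trip and the type of uniques for three kinds of input.

```python
import numpy as np
import pandas as pd

# Series input: uniques is an Index, and take() rebuilds the original values.
s = pd.Series(["b", "b", "a", "c", "b"])
labels, uniques = s.factorize()
rebuilt = uniques.take(labels)

# Categorical input: uniques is a Categorical that keeps all categories.
cat = pd.Categorical(["a", "a", "c"], categories=["a", "b", "c"])
cat_labels, cat_uniques = pd.factorize(cat)

# Plain NumPy input: uniques is a 1-D ndarray.
arr = np.array(["b", "a", "b"], dtype=object)
arr_labels, arr_uniques = pd.factorize(arr)

print(type(uniques).__name__, type(cat_uniques).__name__, type(arr_uniques).__name__)
```

Note that the round trip through take() only reproduces the input exactly when there are no missing values, because a code of -1 does not point at a real entry in uniques.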


