w3resource

NumPy String: numpy.char.encode() function

numpy.char.encode() function

The numpy.char.encode() function encodes each element of a string array using a specified encoding by calling str.encode element-wise. It takes an input array of strings, an optional encoding parameter (UTF-8 is used when it is omitted), and an optional errors parameter ('strict' is used when it is omitted) that controls how encoding errors are handled.

This function is useful in -

  • Data preprocessing: Data stored as string arrays often has to be converted to byte strings before it can be written to a binary file, sent over a network, or passed to a library that expects bytes. The numpy.char.encode() function performs this conversion for a whole array at once, using the encoding you specify.
  • Text analysis: In natural language processing or text analysis tasks, some algorithms and processing steps operate on bytes rather than on characters. The numpy.char.encode() function converts the string arrays into arrays of byte strings in the encoding those steps require.
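As a minimal sketch of the preprocessing case, the snippet below encodes a small array to UTF-8 and compares character counts with byte counts. The sample strings and variable names are made up for illustration.

```python
import numpy as np

# Illustrative sample data (not from the original article).
# 'na\u00efve' is "naive" with a diaeresis on the i; '\u65e5\u672c' is two CJK characters.
texts = np.array(['na\u00efve', '\u65e5\u672c', 'abc'])

# Encode every element to UTF-8 byte strings.
utf8 = np.char.encode(texts, encoding='utf-8')

# str_len counts characters for str arrays and bytes for byte-string arrays.
char_counts = np.char.str_len(texts)
byte_counts = np.char.str_len(utf8)

print(char_counts)  # characters per element
print(byte_counts)  # UTF-8 bytes per element
```

Non-ASCII characters take more than one byte in UTF-8, so the two counts differ for the first two elements and match for 'abc'.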

Syntax:

numpy.char.encode(a, encoding=None, errors=None)

Parameters:

Name       Description                                     Required / Optional
a          Input array (array_like of str or unicode).     Required
encoding   The name of an encoding (str).                  Optional
errors     Specifies how to handle encoding errors (str).  Optional
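The effect of the errors parameter can be sketched as follows. The sample words are an assumption chosen so that one element contains a character that ASCII cannot represent; 'strict', 'replace' and 'ignore' are standard Python error handlers.

```python
import numpy as np

# Illustrative data: '\u00e9' (e with acute accent) has no ASCII representation.
words = np.array(['cafe', 'caf\u00e9'])

# errors='strict' raises as soon as an element cannot be encoded.
try:
    np.char.encode(words, encoding='ascii', errors='strict')
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# 'replace' substitutes '?' for the offending character; 'ignore' drops it.
replaced = np.char.encode(words, encoding='ascii', errors='replace')
ignored = np.char.encode(words, encoding='ascii', errors='ignore')

print(strict_failed)
print(replaced)
print(ignored)
```

Both 'replace' and 'ignore' lose information, so they are best reserved for cases where an approximate result is acceptable.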

Return value:

It returns a new array with the same shape as the input array, where each element is the encoded byte string of the corresponding input element.
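A short sketch of the shape-preserving behavior, using a made-up 2x2 input:

```python
import numpy as np

# Illustrative 2x2 array of strings.
grid = np.array([['a', 'bc'], ['def', 'ghij']])

encoded = np.char.encode(grid, encoding='utf-8')

print(grid.shape, encoded.shape)  # the shapes match
print(encoded.dtype.kind)         # 'S' marks a byte-string dtype
print(encoded[1, 1])              # element-wise result for 'ghij'
```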

Example: Encoding strings in a NumPy array using a custom encoding


>>> import numpy as np
>>> c = np.array(['aAaAaA', '  aA  ', 'abBABba'])
>>> c
array(['aAaAaA', '  aA  ', 'abBABba'], dtype='<U7')
>>> np.char.encode(c, encoding='cp037')
array([b'\x81\xc1\x81\xc1\x81\xc1', b'@@\x81\xc1@@',
       b'\x81\x82\xc2\xc1\xc2\x82\x81'], dtype='|S7')

In the above code, the numpy.char.encode() function is called with the input array c and the encoding 'cp037', an EBCDIC code page. The function encodes each element of the array using that encoding and returns a new array of the same shape containing the encoded byte strings.

Example: Decoding and encoding a byte array using default encoding in NumPy

>>> import numpy as np
>>> a = np.char.decode(b'\xa6\xf3\x99\x85\xa2\x96\xa4\x99\x83\x85', 'cp500')
>>> b = np.char.encode(a, encoding=None, errors=None)
>>> b
array(b'w3resource', dtype='|S10')

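In the above code, the numpy.char.decode() function is called with the byte string b'\xa6\xf3\x99\x85\xa2\x96\xa4\x99\x83\x85' and the encoding 'cp500'. It decodes the input using that encoding and returns a 0-dimensional array containing the string 'w3resource'. The numpy.char.encode() function is then called on that result with encoding=None and errors=None, so the default encoding (UTF-8) and the default error handling ('strict') apply, and the result is a 0-dimensional array holding the byte string b'w3resource'.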
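The relationship between the two functions can also be sketched as a round trip. The array contents below are illustrative; both calls rely on the default encoding.

```python
import numpy as np

# Illustrative input strings.
original = np.array(['w3resource', 'NumPy', 'encode'])

# With no encoding given, both functions fall back to UTF-8.
as_bytes = np.char.encode(original)
restored = np.char.decode(as_bytes)

print(as_bytes)
print(np.array_equal(original, restored))
```

Decoding with the same encoding that was used for encoding recovers the original strings. Using a different encoding for the second step may raise an error or produce different text.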
