NumPy String: numpy.char.encode() function
numpy.char.encode() function
The numpy.char.encode() function encodes each element of a string array using a specified encoding. It takes an input array of strings, an optional encoding parameter, and an optional errors parameter that controls how encoding errors are handled. It calls Python's str.encode() element-wise, so when both parameters are left as None the str.encode() defaults apply: 'utf-8' for encoding and 'strict' for errors.
This function is useful in the following situations:
- Data preprocessing: Data held in string arrays often has to be converted to byte strings before it can be written to a binary format or passed to code that expects raw bytes. The numpy.char.encode() function performs this conversion for a whole array at once.
- Text analysis: Some natural language processing or text analysis steps operate on bytes rather than on Unicode strings. The numpy.char.encode() function converts the string array to byte strings in the encoding those steps expect.
Syntax:
numpy.char.encode(a, encoding=None, errors=None)
Parameters:
Name | Description | Required / Optional |
---|---|---|
a | Input array of strings (array_like of str or unicode). | Required |
encoding | The name of an encoding (str). | Optional |
errors | Specifies how to handle encoding errors (str). | Optional |
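The effect of the errors parameter is easiest to see with text that the target encoding cannot represent. The following is a minimal sketch, assuming the standard Python error handlers 'strict', 'ignore', and 'replace'; the sample words and variable names are illustrative only.

```python
import numpy as np

# Illustrative input: two of the three words contain non-ASCII characters.
words = np.array(['cafe', 'caf\u00e9', 'na\u00efve'])

# 'strict' raises as soon as one element cannot be encoded.
try:
    np.char.encode(words, encoding='ascii', errors='strict')
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# 'ignore' drops the characters that cannot be encoded.
ignored = np.char.encode(words, encoding='ascii', errors='ignore')

# 'replace' substitutes a question mark for each such character.
replaced = np.char.encode(words, encoding='ascii', errors='replace')

print(strict_failed)
print(ignored)
print(replaced)
```

Note that 'ignore' silently loses data, so 'replace' or 'strict' is usually the safer choice when the result will be inspected later.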
Return value:
It returns a new array with the same shape as the input array, where each element is the encoded byte string of the corresponding input element.
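As a quick check of the return value, the sketch below encodes a small two-dimensional array and inspects the shape and dtype of the result. The array contents and names are made up for illustration.

```python
import numpy as np

# Illustrative 2x2 array of strings.
grid = np.array([['alpha', 'be'], ['gamma', 'd']])

encoded = np.char.encode(grid, encoding='utf-8')

print(encoded.shape)   # same shape as the input array
print(encoded.dtype)   # a byte-string dtype (kind 'S')
print(encoded[1, 0])   # the encoded form of 'gamma'
```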
Example: Encoding strings in a NumPy array using a specified encoding
>>> import numpy as np
>>> c = np.array(['aAaAaA', '  aA  ', 'abBABba'])
>>> c
array(['aAaAaA', '  aA  ', 'abBABba'], dtype='<U7')
>>> np.char.encode(c, encoding='cp037')
array([b'\x81\xc1\x81\xc1\x81\xc1', b'@@\x81\xc1@@',
       b'\x81\x82\xc2\xc1\xc2\x82\x81'], dtype='|S7')
In the above code the numpy.char.encode() function is called with the input array c and the encoding 'cp037', an EBCDIC code page. The function encodes each element of the array with that encoding and returns a new array of the same shape containing the encoded byte strings. Each space in the input becomes the byte 0x40, which is displayed as '@'.
Example: Decoding and encoding a byte array using default encoding in NumPy
>>> import numpy as np
>>> a = np.char.decode(b'\xa6\xf3\x99\x85\xa2\x96\xa4\x99\x83\x85', 'cp500')
>>> b = np.char.encode(a, encoding=None, errors=None)
>>> b
array(b'w3resource', dtype='|S10')
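In the above code the numpy.char.decode() function is first called with the byte string b'\xa6\xf3\x99\x85\xa2\x96\xa4\x99\x83\x85' and the encoding 'cp500'. It decodes the bytes and returns a 0-dimensional array containing the string 'w3resource'. The numpy.char.encode() function is then called on that array with encoding=None and errors=None, so the default 'utf-8' encoding and 'strict' error handling are used, and the result is a 0-dimensional array containing the byte string b'w3resource'.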
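Encoding and decoding with the same codec should give back the original strings. The sketch below illustrates this round trip on a small array; the second element and the variable names are assumptions added for illustration.

```python
import numpy as np

# Illustrative input array of strings.
original = np.array(['w3resource', 'NumPy'])

# Encode to EBCDIC (cp500) bytes, then decode with the same codec.
as_bytes = np.char.encode(original, encoding='cp500')
restored = np.char.decode(as_bytes, encoding='cp500')

print(as_bytes[0])
print(restored)
```

Decoding with a different codec than the one used for encoding would generally produce different text or raise an error, so the codec name has to be tracked alongside the bytes.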