Python 2 vs Python 3: Byte, Str and Unicode

January 12, 2019
Convert Str/Bytes to Unicode; Convert Unicode to Str/Bytes

In Python 2.x:

  • <type 'str'> or 'Hello' is a byte string.
  • <type 'unicode'> or u'Hello' is a Unicode string.
# convert byte string to unicode string
'Hello'.decode('utf-8')
# u'Hello'

# convert unicode string to byte string
u'Hello'.encode('utf-8')
# 'Hello'
u'你好'.encode('utf-8')
# '\xe4\xbd\xa0\xe5\xa5\xbd'
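
To make the difference between the two types concrete, here is a minimal round-trip sketch. The variable names `text` and `data` are only illustrative, and the string is written with `\u` escapes (U+4F60 U+597D, i.e. 你好) so the snippet does not depend on the source file encoding. It should run unchanged on Python 2.7 and Python 3.3+.

```python
# A Unicode string holds code points; a byte string holds encoded bytes.
text = u'\u4f60\u597d'        # the same two characters as u'你好'
data = text.encode('utf-8')   # Unicode -> bytes

# Two characters become six bytes in UTF-8 (three bytes per character here).
assert len(text) == 2
assert len(data) == 6

# Decoding with the same codec gives back the original Unicode string.
assert data.decode('utf-8') == text
```

The lengths differ because `len()` counts code points for the Unicode string and raw bytes for the encoded one.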

In Python 3.x:

  • <class 'str'> or 'Hello' is a Unicode string.
  • <class 'bytes'> or b'Hello' is a byte string.
# convert bytes to unicode
b'Hello'.decode('utf-8')
# 'Hello'

# convert unicode to bytes
'Hello'.encode('utf-8')
# b'Hello'
'你好'.encode('utf-8')
# b'\xe4\xbd\xa0\xe5\xa5\xbd'
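
When a value may arrive as either type, the two conversions above can be wrapped in small helpers. This is only a sketch for Python 3: `to_text` and `to_bytes` are hypothetical names, not standard library functions, and UTF-8 as the default encoding is an assumption.

```python
def to_text(value, encoding='utf-8'):
    """Return value as str, decoding it first if it is bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def to_bytes(value, encoding='utf-8'):
    """Return value as bytes, encoding it first if it is str."""
    if isinstance(value, str):
        return value.encode(encoding)
    return value


# Both helpers leave values of the target type untouched.
assert to_text(b'Hello') == 'Hello'
assert to_text('Hello') == 'Hello'
assert to_bytes('Hello') == b'Hello'
assert to_bytes(b'Hello') == b'Hello'
```

Normalizing at the edges of a program (decode on input, encode on output) keeps the rest of the code working with `str` only.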
This work is licensed under a
Creative Commons Attribution-NonCommercial 4.0 International License.