Python type conversion cheatsheet

I keep forgetting the built-in functions for converting between types in Python and have to look them up each time, so I decided to make a cheatsheet for reference. Tested on Python 3.11.7.

int ↔ hex

There are two built-in functions for this, hex() and int().

hex() converts an integer to its hexadecimal (base 16) representation. Hexadecimal numbers aren't a separate type in Python, so the resulting output is of type str.

>>> hex(255)
'0xff'

For the other way round, int() is called with the base set to 16. The function accepts input with and without the 0x prefix.

>>> int('0xff', 16)
255
>>> int('ff', 16)
255

Typing the hex value with the 0x prefix but without the quotes makes it an integer literal, so the interpreter prints its decimal value.

>>> 0xff
255
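Two related conversions are worth keeping next to these. As a rough sketch: format() with the 'x' format code gives the hex digits without the 0x prefix, and passing base 0 to int() lets the prefix in the string decide the base. The variable names are only for illustration.

```python
n = 255

# hex() always includes the 0x prefix
with_prefix = hex(n)               # '0xff'

# format() with 'x' leaves the prefix out; '#x' keeps it
without_prefix = format(n, 'x')    # 'ff'
padded = format(n, '#06x')         # '0x00ff' (the width of 6 includes the prefix)

# Base 0 tells int() to infer the base from the prefix
inferred = int('0xff', 0)          # 255

# Round trip back to the original integer
round_trip = int(with_prefix, 16)  # 255
```

With base 0 the prefix is mandatory, so int('ff', 0) raises a ValueError.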

hex ↔ bytes

Both functions are methods of the built-in bytes type.

bytes.hex() converts bytes to a hexadecimal string.

>>> bytes.hex(b'hello')
'68656c6c6f'

bytes.fromhex() converts a hex string to its byte form. Unlike int(), this function only accepts input without the 0x prefix.

>>> bytes.fromhex('68656c6c6f')
b'hello'
>>> bytes.fromhex('0x68656c6c6f')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: non-hexadecimal number found in fromhex() arg at position 1
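If the input might come from hex() or otherwise carry the prefix, one option is to strip it first. This is a minimal sketch; hex_to_bytes is a made-up helper name, and str.removeprefix() assumes Python 3.9 or later.

```python
def hex_to_bytes(hex_string: str) -> bytes:
    """Convert a hex string to bytes, tolerating an optional 0x prefix."""
    # removeprefix() returns the string unchanged when the prefix is absent
    return bytes.fromhex(hex_string.removeprefix('0x'))


print(hex_to_bytes('0x68656c6c6f'))  # b'hello'
print(hex_to_bytes('68656c6c6f'))    # b'hello'
```

bytes.fromhex() still needs an even number of hex digits, so the output of hex(5), '0x5', fails even after stripping the prefix.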

int ↔ bytes

The int data type has functions for converting to and from bytes.

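int.to_bytes() takes two parameters apart from the number itself. (As of Python 3.11 both have defaults, length=1 and byteorder='big', but the examples here pass them explicitly.)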

The first is the length of the resulting output in bytes. I can either specify a number, or use another function, int.bit_length(), to get the number of bits needed and round that up to whole bytes.

>>> num = 5
>>> bin(num)
'0b101'
>>> int.bit_length(num)
3
>>> nbytes = (int.bit_length(num) + 7) // 8
>>> nbytes
1

The next parameter is the byte order, either 'big' (most significant byte first) or 'little' (least significant byte first). When in doubt, default to big-endian.

>>> int.to_bytes(num, (int.bit_length(num) + 7) // 8, 'big')
b'\x05'
>>> int.to_bytes(num, 2, 'big')
b'\x00\x05'
>>> int.to_bytes(num, 2, 'little')
b'\x05\x00'
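The length calculation can be wrapped up in a small helper. This is only a sketch for non-negative integers, and int_to_min_bytes is a hypothetical name. It also covers one edge case: (0).bit_length() is 0, so the plain formula would ask for zero bytes.

```python
def int_to_min_bytes(num: int, byteorder: str = 'big') -> bytes:
    """Encode a non-negative int using the fewest bytes that fit it."""
    # (0).bit_length() is 0, which would give an empty result,
    # so always use at least one byte.
    nbytes = max(1, (num.bit_length() + 7) // 8)
    return num.to_bytes(nbytes, byteorder)


print(int_to_min_bytes(5))              # b'\x05'
print(int_to_min_bytes(0))              # b'\x00'
print(int_to_min_bytes(256))            # b'\x01\x00'
print(int_to_min_bytes(256, 'little'))  # b'\x00\x01'
```

Passing a length that is too small, for example (256).to_bytes(1, 'big'), raises an OverflowError.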

The function can also be written as num.to_bytes(2, 'big'), but I prefer the earlier notation as it's more consistent with int.from_bytes(), the reverse function.

>>> int.from_bytes(b'\x00\x05', 'big')
5
>>> int.from_bytes(b'\x05\x00', 'little')
5
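The byte order has to match in both directions. As a quick illustration (variable names are arbitrary), reading big-endian bytes back as little-endian silently gives a different number instead of an error:

```python
data = (5).to_bytes(2, 'big')                 # b'\x00\x05'

# Same byte order on both sides: the value survives the round trip
same_order = int.from_bytes(data, 'big')      # 5

# Mismatched byte order: the bytes are read as 0x0500
wrong_order = int.from_bytes(data, 'little')  # 1280
```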

str ↔ bytes

The process of translating human-readable characters (a string) into something computers can store and transmit (bytes) is called encoding. Strings can be encoded using str.encode(). utf-8 is the default encoding, but a different one can be passed as an argument.

>>> str.encode('hello', 'utf-8')
b'hello'
>>> str.encode('hello', 'utf-16')
b'\xff\xfeh\x00e\x00l\x00l\x00o\x00'

The reverse process, i.e., decoding (converting bytes to string) is done using bytes.decode().

>>> bytes.decode(b'hello', 'utf-8')
'hello'
>>> bytes.decode(b'\xff\xfeh\x00e\x00l\x00l\x00o\x00', 'utf-16')
'hello'
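The choice of encoding matters most once non-ASCII characters show up. A small sketch with an example string of my choosing: the same text produces different bytes under utf-8 and latin-1, and decoding with the wrong one can fail.

```python
text = 'café'

utf8_bytes = text.encode('utf-8')      # b'caf\xc3\xa9' (é takes two bytes)
latin1_bytes = text.encode('latin-1')  # b'caf\xe9'     (é takes one byte)

# Decoding with the encoding used to encode gets the text back
assert utf8_bytes.decode('utf-8') == text
assert latin1_bytes.decode('latin-1') == text

# Decoding latin-1 bytes as utf-8 fails, because a lone \xe9 is not valid utf-8
try:
    latin1_bytes.decode('utf-8')
    decode_error = None
except UnicodeDecodeError as err:
    decode_error = err
```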

int ↔ str

This is the one I find most confusing, as there are two different cases:

For single characters, there are two built-in functions: chr() for number to character, and ord() for the opposite. They work on Unicode code points, which match ASCII for values below 128.

>>> chr(65)
'A'
>>> ord('A')
65
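chr() and ord() operate on Unicode code points, so they aren't limited to the ASCII range. A short sketch, with arbitrary example characters:

```python
# Works beyond ASCII: the euro sign is code point U+20AC
euro = chr(8364)       # '€'
code_point = ord('€')  # 8364

# ord() takes exactly one character, so longer strings are mapped
# one character at a time
code_points = [ord(ch) for ch in 'hi']            # [104, 105]
rebuilt = ''.join(chr(cp) for cp in code_points)  # 'hi'
```

Calling ord() on a string of any other length raises a TypeError.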

For longer ints, the process is the same as int ↔ bytes, with the additional step of encoding or decoding where needed.

In the case of int to str, the resulting bytes are decoded.

>>> num = 448378203247
>>> int.to_bytes(num, 5, 'big').decode('utf-8')
'hello'

For the other way round, the string input is encoded before converting.

>>> text = 'hello'
>>> int.from_bytes(text.encode('utf-8'), 'big')
448378203247
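Both directions can be wrapped in a pair of helpers so the byte length doesn't have to be worked out by hand. This is a sketch under my own assumptions: text_to_int and int_to_text are hypothetical names, the byte order is fixed to big-endian, and the integer is assumed to be non-negative.

```python
def text_to_int(text: str, encoding: str = 'utf-8') -> int:
    """Encode the text, then read the bytes as a big-endian integer."""
    return int.from_bytes(text.encode(encoding), 'big')


def int_to_text(num: int, encoding: str = 'utf-8') -> str:
    """Write the integer as big-endian bytes, then decode them."""
    # Smallest number of whole bytes that can hold the integer
    nbytes = (num.bit_length() + 7) // 8
    return num.to_bytes(nbytes, 'big').decode(encoding)


print(text_to_int('hello'))       # 448378203247
print(int_to_text(448378203247))  # hello
```

One caveat: leading NUL characters don't survive the round trip, since leading zero bytes don't change the integer's value.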

The same holds true for hex ↔ str as well: convert through bytes, encoding or decoding where needed.
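Going between a hex string and text follows the same pattern, with bytes as the middle step. A minimal sketch, assuming utf-8 as the encoding:

```python
# str to hex: encode to bytes, then hex-encode the bytes
hex_string = 'hello'.encode('utf-8').hex()          # '68656c6c6f'

# hex to str: parse the hex into bytes, then decode the bytes
text = bytes.fromhex('68656c6c6f').decode('utf-8')  # 'hello'
```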