The built-in functions for converting between different types in Python are something I keep forgetting and have to look up each time, so I decided to make a cheatsheet for reference. Tested on Python 3.11.7.
int ↔ hex
There are two built-in functions for this, hex() and int().
hex() converts an integer to its hexadecimal (base 16) representation. Hexadecimal numbers aren't a separate type in Python, so the resulting output is of type str.
>>> hex(255)
'0xff'
For the other way round, int() is called with the base set to 16. The function accepts input with and without the 0x prefix.
>>> int('0xff', 16)
255
>>> int('ff', 16)
255
Typing the hex value with the 0x prefix but without the quotes makes it an integer literal, so the interpreter echoes its decimal representation.
>>> 0xff
255
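A few related variations on the conversions above, as a minimal sketch. The variable names are only for illustration. format() and f-strings give control over the prefix and padding, and passing a base of 0 to int() lets it infer the base from the prefix.

```python
n = 255

# Format specifiers: 'x' is lowercase hex, 'X' uppercase, '#' adds the 0x prefix.
plain = format(n, 'x')     # 'ff'
upper = format(n, 'X')     # 'FF'
padded = f'{n:#06x}'       # '0x00ff' (the width of 6 includes the '0x')

# Base 0 tells int() to infer the base from the prefix (0x, 0o or 0b).
auto = int('0xff', 0)      # 255
```

Note that with base 0 the prefix is mandatory: the string has to look like a valid integer literal.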
hex ↔ bytes
Both functions are methods of the built-in bytes type.
bytes.hex() converts bytes to a hexadecimal string.
>>> bytes.hex(b'hello')
'68656c6c6f'
bytes.fromhex() converts a hex string to its byte form. Unlike int(), this function only accepts input without the 0x prefix.
>>> bytes.fromhex('68656c6c6f')
b'hello'
>>> bytes.fromhex('0x68656c6c6f')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: non-hexadecimal number found in fromhex() arg at position 1
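One way around the prefix limitation is to strip it before calling bytes.fromhex(). The from_hex() helper below is a hypothetical name of my own, not something from the standard library, and it assumes Python 3.9+ for str.removeprefix(). The sketch also shows that bytes.hex() accepts a separator and that bytes.fromhex() skips whitespace.

```python
def from_hex(text: str) -> bytes:
    """Hypothetical helper: like bytes.fromhex(), but tolerates a leading 0x/0X."""
    return bytes.fromhex(text.removeprefix('0x').removeprefix('0X'))

raw = from_hex('0x68656c6c6f')    # b'hello'

# bytes.hex() takes an optional one-character separator (Python 3.8+).
spaced = b'hello'.hex(' ')        # '68 65 6c 6c 6f'

# bytes.fromhex() ignores ASCII whitespace between byte pairs.
back = bytes.fromhex(spaced)      # b'hello'
```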
int ↔ bytes
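The int data type has methods for converting to and from bytes.
int.to_bytes() takes two parameters apart from the number itself. Both were required before Python 3.11; from 3.11 onwards they default to a length of 1 and big-endian byte order.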
The first is the length of the resulting output, i.e. the number of bytes. I can either specify a number directly, or derive it from int.bit_length(), which returns the number of bits needed to represent the number.
>>> num = 5
>>> bin(num)
'0b101'
>>> int.bit_length(num)
3
>>> nbytes = (int.bit_length(num) + 7) // 8
>>> nbytes
1
The next parameter is the byte order. When in doubt, default to big-endian.
>>> int.to_bytes(num, (int.bit_length(num) + 7) // 8, 'big')
b'\x05'
>>> int.to_bytes(num, 2, 'big')
b'\x00\x05'
>>> int.to_bytes(num, 2, 'little')
b'\x05\x00'
The function can also be written as num.to_bytes(2, 'big'), however I prefer the earlier notation as it's more consistent with int.from_bytes(), the reverse function.
>>> int.from_bytes(b'\x00\x05', 'big')
5
>>> int.from_bytes(b'\x05\x00', 'little')
5
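A few edge cases are worth keeping in mind here, sketched below. int_to_min_bytes() is a hypothetical helper name, and it assumes non-negative input. The bit_length() formula gives 0 bytes for the number 0, negative numbers need signed=True, and a length that is too small raises OverflowError.

```python
def int_to_min_bytes(num: int, byteorder: str = 'big') -> bytes:
    """Hypothetical helper: encode a non-negative int in as few bytes as possible."""
    # bit_length() is 0 for num == 0, so force at least one byte.
    nbytes = max(1, (num.bit_length() + 7) // 8)
    return num.to_bytes(nbytes, byteorder)

zero = int_to_min_bytes(0)       # b'\x00'
big = int_to_min_bytes(256)      # b'\x01\x00'

# Negative numbers need signed=True on both sides (two's complement).
neg = (-1).to_bytes(1, 'big', signed=True)                # b'\xff'
neg_back = int.from_bytes(b'\xff', 'big', signed=True)    # -1

# A length that is too small for the number raises OverflowError.
try:
    (256).to_bytes(1, 'big')
    overflowed = False
except OverflowError:
    overflowed = True
```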
str ↔ bytes
The process of translating human-readable characters (a string) to something computers can understand (bytes) is called encoding. Strings can be encoded using str.encode(). utf-8 is the default encoding, but a different one can be passed in.
>>> str.encode('hello', 'utf-8')
b'hello'
>>> str.encode('hello', 'utf-16')
b'\xff\xfeh\x00e\x00l\x00l\x00o\x00'
The reverse process, i.e. decoding (converting bytes to a string), is done using bytes.decode().
>>> bytes.decode(b'hello', 'utf-8')
'hello'
>>> bytes.decode(b'\xff\xfeh\x00e\x00l\x00l\x00o\x00', 'utf-16')
'hello'
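The examples above only use ASCII text, where the utf-8 bytes look the same as the string. As a rough sketch of what changes otherwise (the sample strings are my own), non-ASCII characters take more than one byte in utf-8, the explicit utf-16-le codec skips the byte order mark, and bytes that are not valid in the chosen encoding raise an error unless an error handler is passed.

```python
text = 'héllo'
utf8 = text.encode('utf-8')            # b'h\xc3\xa9llo' (é takes two bytes)
utf16le = 'hi'.encode('utf-16-le')     # b'h\x00i\x00' (no byte order mark)

# Invalid input raises UnicodeDecodeError unless an error handler is given.
bad = b'h\xffi'
try:
    bad.decode('utf-8')
    failed = False
except UnicodeDecodeError:
    failed = True

# errors='replace' swaps each undecodable byte for U+FFFD.
replaced = bad.decode('utf-8', errors='replace')   # 'h\ufffdi'
```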
int ↔ str
This is the one I find most confusing, as the int can mean one of two things:
- The code of a single ASCII character, either on its own or with several appended together
- All the characters of a string packed together into one single large integer
For ASCII characters (and any other Unicode code point), there are two built-in functions: chr() for number to character, and ord() for the opposite.
>>> chr(65)
'A'
>>> ord('A')
65
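For the "appended together" case, chr() and ord() can be applied character by character. A minimal sketch, with variable names that are only illustrative:

```python
text = 'hello'

# One int per character, and back again.
codes = [ord(ch) for ch in text]             # [104, 101, 108, 108, 111]
rebuilt = ''.join(chr(n) for n in codes)     # 'hello'

# For pure ASCII text, iterating over the encoded bytes yields the same ints.
same = list(text.encode('ascii')) == codes   # True

# ord() is not limited to ASCII; it returns the Unicode code point.
euro = ord('€')                              # 8364
```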
For longer ints, the process is the same as int ↔ bytes, with the additional step of encoding or decoding where needed.
In the case of int to str, the resulting bytes are decoded.
>>> num = 448378203247
>>> int.to_bytes(num, 5, 'big').decode('utf-8')
'hello'
For the other way round, the string input is encoded before converting.
>>> text = 'hello'
>>> int.from_bytes(text.encode('utf-8'), 'big')
448378203247
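The two directions can be wrapped into a pair of helpers so that the byte length doesn't have to be hardcoded. str_to_int() and int_to_str() are hypothetical names, and the sketch assumes big-endian byte order and utf-8 by default.

```python
def str_to_int(text: str, encoding: str = 'utf-8') -> int:
    """Hypothetical helper: pack the encoded bytes of text into one big-endian int."""
    return int.from_bytes(text.encode(encoding), 'big')

def int_to_str(num: int, encoding: str = 'utf-8') -> str:
    """Hypothetical helper: reverse of str_to_int() for non-negative ints."""
    # Work out the minimum number of bytes from the bit length.
    nbytes = (num.bit_length() + 7) // 8
    return num.to_bytes(nbytes, 'big').decode(encoding)

packed = str_to_int('hello')       # 448378203247
unpacked = int_to_str(packed)      # 'hello'
```

One caveat of computing the length this way: leading NUL characters in the original string contribute only zero bits, so they are lost in the round trip.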
The same holds true for hex ↔ bytes as well.
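Reading that remark as "text and hex strings also convert via bytes" (my interpretation, not something spelled out above), the round trip might look like this sketch:

```python
text = 'hello'

# str -> bytes -> hex string
hex_string = text.encode('utf-8').hex()                # '68656c6c6f'

# hex string -> bytes -> str
decoded = bytes.fromhex(hex_string).decode('utf-8')    # 'hello'

# The same hex string read as a base 16 number gives the int from earlier.
as_int = int(hex_string, 16)                           # 448378203247
```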