# Two’s Complement Binary in Python?

It works best if you provide a mask. That way you specify how far to sign extend.

>>> bin(-27 & 0b1111111111111111)
'0b1111111111100101'


Or perhaps more generally:

def bindigits(n, bits):
s = bin(n & int("1"*bits, 2))[2:]
return ("{0:0>%s}" % (bits)).format(s)

>>> print bindigits(-31337, 24)
111111111000010110010111


In basic theory, the actual width of the number is a function of the size of the storage. If it’s a 32-bit number, then a negative number has a 1 in the MSB of a set of 32. If it’s a 64-bit value, then there are 64 bits to display.

But in Python, integer precision is limited only to the constraints of your hardware. On my computer, this actually works, but it consumes 9GB of RAM just to store the value of x. Anything higher and I get a MemoryError. If I had more RAM, I could store larger numbers.

>>> x = 1 << (1 << 36)


So with that in mind, what binary number represents -1? Python is well-capable of interpreting literally millions (and even billions) of bits of precision, as the previous example shows. In 2’s complement, the sign bit extends all the way to the left, but in Python there is no pre-defined number of bits; there are as many as you need.

But then you run into ambiguity: does binary 1 represent 1, or -1? Well, it could be either. Does 111 represent 7 or -1? Again, it could be either. So does 111111111 represent 511, or -1… well, both, depending on your precision.

Python needs a way to represent these numbers in binary so that there’s no ambiguity of their meaning. The 0b prefix just says “this number is in binary”. Just like 0x means “this number is in hex”. So if I say 0b1111, how do I know if the user wants -1 or 15? There are two options:

Option A: The sign bit
You could declare that all numbers are signed, and the left-most bit is the sign bit. That means 0b1 is -1, while 0b01 is 1. That also means that 0b111 is also -1, while 0b0111 is 7. In the end, this is probably more confusing than helpful particularly because most binary arithmetic is going to be unsigned anyway, and people are more likely to run into mistakes by accidentally marking a number as negative because they didn’t include an explicit sign bit.