Jesper Noehr

Pythonista, RESTafarian, Binary Poet & Proud Bucketeer

Archive for August, 2009

Bitwise permissions in Python (and Django)

with 4 comments

So I’ve been wanting to write about this for a while. Finally getting around to it.

We’re going to be talking about how to construct and use a flexible permission scheme with Python, and how you can use it in your Django project.

If you’re already a binary ninja, you can skip this section.

The “trick” we’re going to be using, is what’s called “bitwise operators”, namely &, | and <<.

It’s important you know what these do, so lets do a quick tour. We’ll start with the “left shift” operator (<<).

So you know binary, right? Those 1’s and 0’s? OK, good. Lets say that we have the number 2. That’s 10 in binary. Well, actually it’s not 10, it’s 00000010, since there are 8 bits for one byte. The << operator, also called “left shift”, will *move* the bits to the left. So 00000010 << 1 is 00000100, or in decimal, 4. 00000010 << 2 is 00001000, or decimal 8. There’s also the “right shift” operator, so 8>>1 is 4, but we won’t be using that.

There’s an interesting pattern here. All the numbers we get, starting from 1, are in the power of 2, or 2*(2^x), actually. So 2<<4 is the same as 2*(2^4), which is decimal 32.

Now, lets talk about “binary or” (|). This operator, at first sight, may seem like it’s the same as decimal plus (+), since if say 2 | 4, we get 6. But, if we say 6 | 4, we get 6 again. Strange, eh? Not really. The trick here is that “or” will *set* the bit in the originating number, *if it’s not already set*. If it *is* set, it does nothing. Lets drop down to binary again to demonstrate this.

So, using our example of 2 | 4, 2 is 00000010 and 4 is 00000100. 2 | 4 is basically saying “look at the bits in both, and switch all the ones you find on”. So we end up with 00000110, which is…. 6. But wait! 6 is not in the power of 2. You can’t do 2<<x and end up with 6. Thus, we can conclude that 6 is the *combination* of 2<<0 and 2<<1.

In fact, we can construct very large numbers using only “or” and numbers generated by 2<<x, and we can trace it back to the originating numbers. And your computer knows this. We’ll exploit that fact in just a minute.

Finally, lets talk about “binary and” (&). This operator will look at two numbers, and in its result, only return the bits that are set in *both* numbers. This is the operator we’re going to use to “deconstruct” the numbers you or’d together earlier. For example, if we again have 2, which is 00000010, and 6, which is 00000110, it will return 00000010, since only the 7th bit is set in both. Since 6 is constructed from 2 | 4, it will also return 00000100, since 4 is also part of 6. For anything else, it will return 0.

So in summary, 6 & 2 is 2, 6 & 4 is 4, and 6 & 8 is 0.

Application

Lets try it:

CAN_READ = 1<<2
CAN_WRITE = 1<<3
CAN_ADMIN = 1<<4

READER = CAN_READ
WRITER = READER | CAN_WRITE
ADMIN = WRITER | CAN_ADMIN

bob = ADMIN
alice = READER

print "Is Bob an admin?"

if bob & CAN_ADMIN:
    print "Yes!"
else:
    print "No."

print "is Bob a reader?"

if bob & CAN_READ:
    print "Yes!"
else:
    print "No."

print "Is Alice a writer?"

if alice & CAN_WRITE:
    print "Yes!"
else:
    print "No."

print "Is Alice a reader?"

if alice & CAN_READ:
    print "Yes!"
else:
    print "No."

alice |= ADMIN

print "Can Alice write now?"

if alice & CAN_WRITE:
    print "Yes!"
else:
    print "No."

And the output:

$ python bit.py
Is Bob an admin?
Yes!
is Bob a reader?
Yes!
Is Alice a writer?
No.
Is Alice a reader?
Yes!
Can Alice write now?
Yes!

So we’ve defined a very simple permission scheme here, reading, writing and administrating. We’ve defined 3 “flags”, indicating what you can do, and we’ve defined 3 “roles”, defining what each role has access to.

The way this works, comes from what we discussed above. CAN_READ is 4, CAN_WRITE is 8, and CAN_ADMIN is 16. As we saw, we can piece these together using the “or” operator, to get a new number that has that flag “set.” READER is 4, WRITER is 12 (CAN_READ | CAN_WRITE), and ADMIN is 28 (CAN_READ | CAN_WRITE | CAN_ADMIN, or simply WRITER | CAN_ADMIN to add that flag).

Now, with an unsigned integer, we can go up to (2**16)-1 (65535, does that look familiar?), so we can actually fit quite a few more flags in there. How many? You guessed it–16.

In Python, and most other “newer” languages, you don’t really have to worry about unsigned-ness and 8 bit integers, as the language just adjusts the internal representation when you go above the limit. This means that Python won’t really complain if you give it something like 2<<1 | 2<<100, it will just give you back 2535301200456458802993406410756L, indicating that you’re no longer dealing with integers, you’re now dealing with longs. Most database backends support this too–MySQL and PostgreSQL both gives you BIGINT, which will let you go up to 18446744073709551615 (which is (2<<63)-1, hence it’s a 64 bit integer.)

So now you have 64 flags you can mess around with, and you can define these on a *per row/object level basis in a single column/number*! So theoretically, you could eliminate 64 database columns in favor of one number, and you can even use SQL to SELECT it, as SQL *also* supports bitwise operators.

Oh, right, how can you use this in your Django application? Well, we use it heavily on Bitbucket to define permissions to repositories, issues, whatever.

We have a lot of statements that look like this:

if repo.access_for(request.user) & RP.WRITER:
   # allow the write...

And of course, you can construct various other complex comparisons this way.

I hope this has helped you understand basic bitwise operators, and I urge you to dive in further. There’s cool stuff like “not” (~) and “xor” (^), who may be more powerful than what we’ve already demonstrated.

Read more about Bitwise operators on Wikipedia, and have fun.

Written by jespern

August 27th, 2009 at 12:16 pm

Posted in Uncategorized