Jesper Noehr

Pythonista, RESTafarian, Binary Poet & Proud Bucketeer

Off to Eurodjangocon

without comments

Tomorrow morning (Sunday) I’ll be off to Eurodjangocon. I’ll be in Prague for one week, staying at the Iris congress hotel, so if anyone wants to meet up and discuss, let me know. My contact information can be found on the About page.

I’m bringing ~200 Bitbucket stickers as well, first come, first served.

If you’re going to the conference and want to grab a beer, that’s cool too.

Written by jespern

May 2nd, 2009 at 3:34 pm

Posted in django

Debugging Django cache

with one comment

Easy way to debug what’s going on with your cache:

from django.conf import settings
from django.core.cache import cache as django_cache

class debug_cache(object):
    ignore = ('to_slug', 'repodownloadsize')  # ignore keys starting with these

    def __getattr__(self, attr):
        # Look up the real attribute, then wrap it so every call is printed.
        def wraps(f):
            def i(*args, **kwargs):
                # The cache key is normally the first positional argument.
                key = args[0] if args else ''
                if not str(key).startswith(debug_cache.ignore):
                    print("CACHE %s: args=%s, kwargs=%s" % (attr, args, kwargs))
                return f(*args, **kwargs)
            return i
        return wraps(getattr(django_cache, attr))

# Swap in the debugging proxy only when DEBUG_CACHE is set in settings.py.
if getattr(settings, 'DEBUG_CACHE', False):
    cache = debug_cache()
else:
    cache = django_cache

Put a setting in your settings.py called DEBUG_CACHE = True, and you’ll see what’s going on.
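To try the idea without a Django project, here is a minimal sketch of the same proxy technique. `FakeCache` is a hypothetical stand-in for Django's cache object, and `DebugProxy` and its `log` list are names invented for this example. It records calls instead of printing them, so the behaviour is easy to check.

```python
class FakeCache(object):
    """Hypothetical stand-in for django.core.cache.cache (not Django's API)."""
    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value, timeout=None):
        self._data[key] = value


class DebugProxy(object):
    """Forward every method call to `target`, recording calls on the way."""
    ignore = ('to_slug', 'repodownloadsize')  # skip keys starting with these

    def __init__(self, target):
        self._target = target
        self.log = []

    def __getattr__(self, attr):
        # Only called for names that are not found on the proxy itself.
        real = getattr(self._target, attr)

        def wrapper(*args, **kwargs):
            key = args[0] if args else ''
            if not str(key).startswith(self.ignore):
                self.log.append((attr, args, kwargs))
            return real(*args, **kwargs)
        return wrapper


cache = DebugProxy(FakeCache())
cache.set('user:1', 'jespern')
cache.set('to_slug:hello world', 'hello-world')  # ignored by the logger
value = cache.get('user:1')
print(cache.log)
```

Only the two `user:1` calls end up in the log; the `to_slug` call still reaches the cache, it just isn't recorded.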

Written by jespern

April 16th, 2009 at 9:46 am

Posted in Uncategorized

Mercurial powertip: Move changesets out of the way momentarily

with 2 comments

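Sometimes you may be working in a repository and want to momentarily move changesets out of the way. The MQ (Mercurial Queues) extension, which ships with Mercurial but has to be enabled, lets you do that. From what I can gather, you can get the same results as you get with “git stash”, but MQ offers much more.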

Say that you have been working on an experimental feature, but need to fix a bug. You don’t want to sit and be careful only to commit the files modified by the bugfix, especially if the bugfix touches files you’ve already modified.

Your log could look like this:

$ hg log
changeset:   2:41009a6aa783
tag:         tip
summary:     adding B

changeset:   1:419ab519b195
summary:     adding C

changeset:   0:8f276b14c116
summary:     adding A



Now, changeset 1 & 2 are the experimental changes. You need to get rid of these before you can fix the bug.

It’s important that you have a “patch queue” repository inside your repository first; this is what “qinit” is for. Afterwards, we’ll import the changesets into the patch queue, using qimport:

$ hg qinit -c # tell hg to create a versioned patch queue in .hg/patches/
$ hg qimport -r 2:1


Now let’s take a look at the log:

$ hg log
changeset:   2:41009a6aa783
tag:         qtip
tag:         2.diff
tag:         tip
summary:     adding B

changeset:   1:419ab519b195
tag:         1.diff
tag:         qbase
summary:     adding C

changeset:   0:8f276b14c116
tag:         qparent
summary:     adding A



The changesets are still there, but they’re a little different: they’ve been tagged with a couple of things. First is the filename the changeset was saved as. In this case, your changes are in ‘.hg/patches/1.diff’ and ‘.hg/patches/2.diff’. Go on, have a look. There are also some new tags, namely ‘qbase’, ‘qtip’ and ‘qparent’. This is how MQ keeps track of the queue base, the queue tip and the parent of the queue.

But, you may notice that the changesets are still present. This is because they are “applied” to the repository. To get rid of them, we use qpop:

$ hg qpop -a # pop all patches from the stack
patch queue is now empty
$ hg log
changeset:   0:8f276b14c116
tag:         tip
summary:     adding A



Lovely. You can now see which patches are available via qseries:

$ hg qseries
1.diff
2.diff



To push them back onto the stack, you can use ‘qpush -a’. But first, we have a bug to fix:

$ echo 'D' > D
$ hg add D
$ hg ci -m "adding D (which is a bugfix)"



And the log:

$ hg log
changeset:   1:5d41625a80b5
tag:         tip
summary:     adding D (which is a bugfix)

changeset:   0:8f276b14c116
summary:     adding A



Now push that fix out, or whatever you want to. Time to get the experimental changesets back. We’ll use ‘qpush -a’ for that:

$ hg qpush -a
applying 1.diff
applying 2.diff
now at: 2.diff



You can run log to see what happened. Needless to say, your patches are there. Let’s turn them back into normal changesets:

$ hg qfinish 3:2 # they're not 2:1 anymore, we have another changeset
                 # in before them now, consult 'hg log' for details
$ hg log
changeset:   3:1a07541824d3
tag:         tip
summary:     adding B

changeset:   2:b4a1402f9b50
summary:     adding C

changeset:   1:5d41625a80b5
summary:     adding D (which is a bugfix)

changeset:   0:8f276b14c116
summary:     adding A



Et voilà.

There’s much more you can do with MQ. If you’re only importing a single changeset, you can name the patch via ‘qimport -n’. You can give your patches to other people, and you can even push your patch queue around. ‘qimport’ will even import patches from outside your repository. You can reorder patches, you can use guards, and so on. MQ is really a wonderful addition to Mercurial.

Written by jespern

April 10th, 2009 at 12:04 pm

Posted in hg

{l,r}strip considered harmful

with one comment

If you’re using lstrip() or rstrip() in your code, chances are you might have a problem.

This is because those functions probably don’t do what you think they do.

So go ack --python '[lr]strip' your codebase now.

What you think it does

If you haven’t been bitten by this before, and you haven’t thoroughly read help(str.rstrip), you probably think rstrip will strip a sequence of bytes off the end of a string.

For example, it could be used to get rid of a file extension, like


>>> filename = "fumble.exe"
>>> basefn = filename.rstrip(".exe")

Bzzzt. Wrong.

What it actually does

As per the docstring:

rstrip(…)
S.rstrip([chars]) -> string or unicode

Return a copy of the string S with trailing whitespace removed.
If chars is given and not None, remove characters in chars instead.
If chars is unicode, S will be converted to unicode before stripping

Pay attention here: characterS. Plural. Not a sequence. More like a list.

Now, have a look at our previous example, removing the extension.


>>> filename = "fumble.exe"
>>> basefn = filename.rstrip(".exe")
>>> basefn
'fumbl'

Not what you expected, eh? Problem here is that it treats ‘.exe’ as a list of characters, so it’s basically this:


>>> remove_chars = ['.', 'e', 'x', 'e']
>>> result = filename
>>> for char in reversed(filename):
...     if char in remove_chars:
...         result = result[:-1]  # remove the char we're looking at
...     else:
...         break
...
>>> result
'fumbl'

  1. Start at the end and go backwards, byte by byte.
  2. If the character we’re seeing is in the aforementioned list, remove it.
  3. If not, we’ve reached a stop point, so process no further.

The opposite is of course true for lstrip.
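The steps above can be written as a small function. This is only a sketch of the behaviour (the name `rstrip_chars` is made up here, and the real `str.rstrip` is not implemented this way), but it agrees with the built-in on the examples below:

```python
def rstrip_chars(s, chars):
    """Mimic s.rstrip(chars): drop trailing characters that appear in chars."""
    end = len(s)
    # Walk backwards while the character just before `end` is in the set.
    while end > 0 and s[end - 1] in chars:
        end -= 1
    return s[:end]


for name in ("fumble.exe", "archive.tar.gz", "exe.exe", "readme"):
    assert rstrip_chars(name, ".exe") == name.rstrip(".exe")

print(rstrip_chars("fumble.exe", ".exe"))  # 'fumbl'
print("example.com".lstrip("e.x"))         # 'ample.com'
```

The last line shows the same surprise from the other end: lstrip eats the leading ‘e’ and ‘x’ because both are in the character set.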

What it is useful for

Once you get over the misleading behavior and come to terms with what it actually does, you can start discovering what it is useful for.

For example, it’s immensely useful for stripping leading or trailing whitespace. In fact, this is such a common use-case that this is what it does if you don’t specify any arguments.

Since it’s a list of characters, in cases where you need to remove both Unix-style line endings and win32 ones, you can simply do:


block_of_text.rstrip("\r\n")

This will remove both. They don’t necessarily have to be in that order.
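A quick check of that claim, as a sketch with made-up sample strings. Note that because the argument is a set of characters, every trailing ‘\r’ and ‘\n’ is removed, not just one line ending:

```python
unix_line = "some text\n"
win_line = "some text\r\n"
blank_lines = "some text\n\n\r\n"

results = [
    unix_line.rstrip("\r\n"),
    win_line.rstrip("\n\r"),      # the order of the characters does not matter
    blank_lines.rstrip("\r\n"),   # all trailing newlines go, not just the last
]
print(results)
```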

What you probably wanted instead

OK, so with that out of the way, what would you use to get rid of a file extension? replace(). replace() is a good fit for this, because it takes an optional third argument:

replace(…)
S.replace (old, new[, count]) -> string

Return a copy of string S with all occurrences of substring
old replaced by new. If the optional argument count is
given, only the first count occurrences are replaced.

So let’s try it again:


>>> filename = "fumble.exe"
>>> basefn = filename.replace(".exe", "", 1)
>>> basefn
'fumble'

Much better.
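One caveat worth knowing: replace() removes the first occurrence anywhere in the string, not specifically the one at the end. If a name can contain the extension text earlier on, something anchored to the end is safer. Here is a sketch of two alternatives; the helper name `strip_suffix` and the sample filename are made up for this example, while `os.path.splitext` is from the standard library:

```python
import os.path


def strip_suffix(name, suffix):
    """Remove suffix only when name actually ends with it."""
    if suffix and name.endswith(suffix):
        return name[:-len(suffix)]
    return name


filename = "my.executable.exe"

print(filename.replace(".exe", "", 1))   # 'mycutable.exe' (wrong occurrence)
print(strip_suffix(filename, ".exe"))    # 'my.executable'
print(os.path.splitext(filename)[0])     # 'my.executable'
```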

Written by jespern

March 8th, 2009 at 12:57 pm

Posted in python

Mercurial powertip: Un-add a file

without comments

Something that isn’t entirely clear when using Mercurial is how to un-add a file you accidentally added, before you commit.


$ hg add data/
adding data/index.txt
adding data/README
adding data/hugefile.db

$ hg status
A data/index.txt
A data/README
A data/hugefile.db

Oops. Didn’t want to add ‘hugefile.db’. How to undo that add?


$ hg revert data/hugefile.db

Did that do the right thing?


$ ls data/hugefile.db # still there?
data/hugefile.db

$ hg status
A data/index.txt
A data/README
? data/hugefile.db

Yep!

Written by jespern

March 2nd, 2009 at 10:04 am

Posted in hg

Python tricks: functools.partial and wraps

with 2 comments

Since Python 2.5, Python has had the ‘functools’ module for working with higher-order functions.

For example:

from functools import partial

def adder(first, second):
	return first + second

adder10 = partial(adder, 10)

print(adder10(32))  # -> 42

Partial application, eh? That’s kinda cool.
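‘partial’ can also freeze keyword arguments. A small sketch using only built-ins (the name `parse_binary` is just for illustration):

```python
from functools import partial

# Freeze the keyword argument base=2 of the built-in int().
parse_binary = partial(int, base=2)
print(parse_binary("101010"))  # -> 42

# The partial object remembers what it was built from.
print(parse_binary.func is int)   # True
print(parse_binary.keywords)      # {'base': 2}
```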

On to ‘wraps’, which is the one I’ve found the most practical use for. I like decorators, and I use them where applicable. What I don’t like about decorators is that when you inspect or debug a decorated function, it’ll actually show up as the wrapper function, and not the function you decorated.

‘wraps’ to the rescue:

from functools import wraps

def some_decorator(f):
	def wrap(*args, **kwargs):
		return f(*args, **kwargs)
	return wraps(f)(wrap)

@some_decorator
def some_function():
	...

Now the function name, docstring, etc. will be those of ‘f’, no longer ‘wrap’! Immensely useful.
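To see the difference side by side, here is a minimal sketch comparing a decorator with and without ‘wraps’. All of the function names are made up for the example:

```python
from functools import wraps


def plain_decorator(f):
    def wrap(*args, **kwargs):
        return f(*args, **kwargs)
    return wrap


def wrapping_decorator(f):
    @wraps(f)  # copy __name__, __doc__ and friends from f onto wrap
    def wrap(*args, **kwargs):
        return f(*args, **kwargs)
    return wrap


@plain_decorator
def first():
    """Docstring of first."""


@wrapping_decorator
def second():
    """Docstring of second."""


print("%s: %r" % (first.__name__, first.__doc__))    # wrap: None
print("%s: %r" % (second.__name__, second.__doc__))  # second: 'Docstring of second.'
```

Using `@wraps(f)` on the inner function is equivalent to the `wraps(f)(wrap)` call in the example above; it is just the decorator spelling of the same thing.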

Written by jespern

January 25th, 2009 at 7:23 pm

Posted in python

Piston and Oberon

with 2 comments

I just wanted to do a quick write-up on a couple of things, because:

  1. I wanted to announce two upcoming projects of mine, and
  2. I wanted to get a new post out there

Piston

Piston’s a django-app I’m writing for Bitbucket. It serves as sort of a “mini-framework” on top of Django for creating RESTful APIs. Well, actually it doesn’t force you to be RESTful at all, as its URL mapping facility hooks directly into Django.

A while back, jacobian wrote an article, “REST worst practices”, outlining some of the things a good implementation would need. I’m happy to say that Piston’s elegantly waltzing its way through the list, checking off his points one by one.

We don’t tie a resource to a model (although you easily can), we have pluggable authentication (with new handlers being a breeze to add), configurable output formats (in the form of “emitters”, a simple dict-to-x facility that comes with emitters for JSON, YAML and XML), proper use of HTTP (status codes, headers) and CRUD semantics, and best of all, it ties right in to your Django application.

Anyway, I wrote it for Bitbucket, but it definitely merits an open source release and its own project. It’s behind closed doors right now, but nearing completion. Once we feel it’s good to release, we’ll do a release together with David Larlet, the author of Semantic Django (who else?)

Oberon

Oberon’s also something we use on Bitbucket. It’s a queue-based “application platform” based on Twisted. Vague, huh?

No, we use it for the service integration facility of Bitbucket. Oberon itself is just a daemon, serving as a message-passing facility between the client and what I call “brokers”. A broker is a piece of Python code that must satisfy two things:

  1. It must contain a class that subclasses “BaseBroker”, and
  2. That class must have a “handle” method receiving a single argument, “payload”

What this allows you to do is pretty nifty. You can load up a few of these brokers, and then using the client API, you can send messages to Oberon, and it’ll take it from there.

For example, we have a couple of brokers, like Twitter, which extracts the information it wants from the payload and uses a Twitter client library to post messages. There’s a Basecamp broker, and the most popular one thus far is the “Issue” broker, which parses commit messages and acts on them. Stuff like “great, all done, fixes #42” will close issue 42, and “hm, needs more work, references #37” will add a comment to issue 37.
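To give a feel for the broker contract, here is a rough sketch of what an issue-style broker could look like. It is not Oberon’s actual code: the `BaseBroker` below is a stand-in, the payload shape (a dict with a “message” key), the regular expression and the returned action tuples are all assumptions made for illustration.

```python
import re


class BaseBroker(object):
    """Stand-in for Oberon's base class; the real one is not shown here."""
    def handle(self, payload):
        raise NotImplementedError


class IssueBroker(BaseBroker):
    # Assumed grammar: a verb followed by '#<number>'.
    pattern = re.compile(r"\b(fixes|references)\s+#(\d+)", re.IGNORECASE)

    def handle(self, payload):
        # Assumption: payload is a dict carrying the commit message.
        actions = []
        for verb, number in self.pattern.findall(payload["message"]):
            action = "close" if verb.lower() == "fixes" else "comment"
            actions.append((action, int(number)))
        return actions


broker = IssueBroker()
print(broker.handle({"message": "great, all done, fixes #42"}))
print(broker.handle({"message": "hm, needs more work, references #37"}))
```

A real broker would act on the issue tracker instead of returning tuples; returning them here just keeps the sketch self-contained.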

My favorite feature is ‘oberonc’, the command line client. It’s pretty basic, but it has useful commands like ‘stats’, ‘brokers’ and, best of all, ‘reload’. Yep, that’s right: you can reload brokers on the fly without disrupting service. It works really well too, due to the way we’ve designed the application. It also means you can load up new brokers that have never been loaded before, which makes it really interesting for upgrading running systems.

None of this stuff is tied into Bitbucket, so it has a wide variety of uses. It runs on top of ‘twistd’ as well, so it should be pretty stable and scalable (it uses stuff like epoll).

Anyway, Oberon’s also getting its own open source release, together with all the brokers we’ve written for the service integration we’re using on the live system. Those should serve as good examples.

I’ll post about both here, when they’re out.

Written by jespern

January 23rd, 2009 at 10:51 am

Posted in django, python

Conditional middleware execution in Django

without comments

On Bitbucket, we need to handle streaming data through Django. This lowers the memory footprint of the application and makes execution faster.

The problem with this is that several stock middleware in Django “look” at the content before sending it. This is a problem for streaming content, since you’d generally use a generator, and you can’t consume it until the very last minute.

The middleware in Django that do this are CommonMiddleware, which attempts to create an ‘ETag’ header (when USE_ETAGS is on), and ConditionalGetMiddleware, which attempts to create a ‘Content-Length’ header.

Here’s an easy way of not executing certain middleware in such cases:

def wsgi_compat_middleware_factory(klass):
    class compatwrapper(klass):
        def process_response(self, req, resp):
            if not whatever_condition:
                return klass.process_response(self, req, resp)
            return resp
    return compatwrapper

This is a “factory”, returning a class that you can use instead of the normal middleware. On Bitbucket, the condition is ‘if not req.is_mercurial():’. Replace it with whatever makes sense for you.

You use it by doing something like this:

from django.middleware.http import ConditionalGetMiddleware
from django.middleware.common import CommonMiddleware

StreamingConditionalGetMiddleware = wsgi_compat_middleware_factory(ConditionalGetMiddleware)
StreamingCommonMiddleware = wsgi_compat_middleware_factory(CommonMiddleware)

Now you have two new classes. Just install those in place of the stock middleware, and voilà.
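If you want to play with the pattern outside of Django, here is a self-contained sketch. It passes the condition in as a function instead of hard-coding it. `UppercaseMiddleware`, the dict-based “request” and its ‘streaming’ flag are all hypothetical stand-ins, not Django APIs.

```python
def conditional_middleware_factory(klass, should_skip):
    """Return a subclass of klass whose process_response is bypassed
    whenever should_skip(request) is true."""
    class Wrapper(klass):
        def process_response(self, request, response):
            if should_skip(request):
                return response
            return klass.process_response(self, request, response)
    Wrapper.__name__ = "Conditional" + klass.__name__
    return Wrapper


class UppercaseMiddleware(object):
    """Hypothetical middleware that must read the whole response body."""
    def process_response(self, request, response):
        return response.upper()


# Assumption: requests are plain dicts with a 'streaming' flag.
Streaming = conditional_middleware_factory(
    UppercaseMiddleware, lambda request: request.get("streaming", False))

mw = Streaming()
print(mw.process_response({"streaming": False}, "hello"))  # HELLO
print(mw.process_response({"streaming": True}, "hello"))   # hello
```

Taking the predicate as an argument means one factory can serve several conditions, at the cost of a slightly longer call.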

Written by jespern

January 20th, 2009 at 8:32 am

Posted in django, python

Consumer Psychology

with one comment

A few days ago on reddit, there was a link to a book outline called “Predictably Irrational.” I’ve been reading through the entire thing, and there are some real gems in there, many of which I’m sure you can apply to a wide variety of businesses and consumers.

Check it out - Predictably Irrational

Some examples:

Simonsohn and Loewenstein found that people who move to a new city remain anchored to the prices they paid in their previous city. People who move from Lubbock to Pittsburgh squeeze their families into smaller houses to pay the same amount. People who move from LA to Pittsburgh don’t save money, they just move into mansions.

and:

“If companies want to benefit from the advantages of social norms, they need to do a better job of cultivating those norms….It’s remarkable how much work companies (particularly start-ups) can get out of people when social norms (such as the excitement of building something together) are stronger than market norms (such as salaries stepping up with each promotion). If corporations started thinking in terms of social norms, they would realize that these norms build loyalty and–more important–make people want to extend themselves to the degree that corporations need today: to be flexible, concerned, and willing to pitch in.  That’s what a social relationship delivers.”

Written by jespern

December 31st, 2008 at 12:02 pm

What this blog will be about

with 3 comments

Right, so I’ve decided not to import my old blog entries, at the risk of breaking some links from other sites. I wrote my own blog software for the last attempt, it doesn’t have any kind of export functionality, and I don’t have time to write one, so I’m just going to leave it at that.

What will this blog be about? Well, hopefully it will have news about Bitbucket, DVCS in general, Python, maybe some scaling, stuff like that. I’m going away for Xmas vacation on Monday and will be away for 2 weeks, so there won’t be anything on here until I’m back.

Stay tuned.

Written by jespern

December 18th, 2008 at 4:39 pm

Posted in Uncategorized