Wednesday, January 7, 2009

Python VM simplicity

While studying for the Virtual Machines course with Lars Bak (the head designer of Google's V8) at Aarhus University, and working on the course project, it has been quite fun to dig through the python-dev archives for Python's history and the choices behind it.

For a VM developer, as for the authors of optimizing compilers, the aim is to relieve developers of concerns about low-level optimizations, allowing them to concentrate on their programs and on their high-level optimizations.

It looks like Guido van Rossum is instead concerned with keeping the Python interpreter (the main implementation, called CPython; other implementations exist) simple. So simple that a couple of groups in the course could prototype interpreters faster than the Python one. Obviously, support for some advanced features is missing, but my group at least tried to include everything that has an impact on the fast path.
I'm not saying that no optimization work is done on CPython, obviously; I'm just complaining about the amount of it.

And in particular, I'm complaining about how braindead it is to tell developers who care about performance to write their performance-critical code in C:
  • That leads to more complexity for the users, because they have to give up either simplicity or performance (see the sketch after this list), or switch to Jython (which is about 2x slower than CPython but supports threads) or to Java (which is anyway simpler than Python+C).
  • That leads to higher maintenance costs for the Python VM itself, because the amount of extension code written in C becomes much bigger.
  • Also, having lots of independent developers writing extensions in C makes the transition costs for API changes even higher.
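
To make the first point concrete, here is roughly what writing a trivial function as a C extension looks like, compared with the one-line Python def it replaces. This is just a minimal sketch against the Python 2 C API; the module and function names (fastmath, add) are made up for illustration.

#include <Python.h>

/* Equivalent of the one-line Python:  def add(a, b): return a + b
 * restricted to C longs for speed. */
static PyObject *
fastmath_add(PyObject *self, PyObject *args)
{
    long a, b;
    /* Unbox the two Python arguments into C longs; on failure an
       exception is already set, so just bail out with NULL. */
    if (!PyArg_ParseTuple(args, "ll", &a, &b))
        return NULL;
    /* Box the C result back into a Python integer object. */
    return PyLong_FromLong(a + b);
}

static PyMethodDef fastmath_methods[] = {
    {"add", fastmath_add, METH_VARARGS, "Add two integers."},
    {NULL, NULL, 0, NULL}  /* sentinel */
};

/* Module initialization entry point, as Python 2.x expects it. */
PyMODINIT_FUNC
initfastmath(void)
{
    Py_InitModule("fastmath", fastmath_methods);
}

All of that argument-parsing and boxing boilerplate (plus a separate build step) is what an extension author signs up for, for every single function.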
I want to elaborate on the maintenance costs. One might think the tradeoff is just between the complexity of the Python VM core and that of the extensions. Actually, it's far worse: it's a tradeoff of manpower.
There will likely be only a few core developers working on the interpreter, and they are likely to be the most experienced ones; besides, the core of an interpreter tends to stay small anyway.
Authors of extensions (especially independent ones), on the other hand, will have a much more varied skill set, and there can be many more extensions, i.e. much more code.

It's like this with the Linux kernel too: in the core, exceptionally complex optimizations are accepted (when they do improve performance) and people still tend to get them right, while the driver code is much larger and much simpler, but often of poor quality, and it causes most of the bugs.

To give some numbers: in a Python 2.7 SVN checkout, I count 9.8 MB in the Modules/ folder (C module code) and 1.3 MB in the Python/ folder (C interpreter code).
Google V8, instead, writes as much as possible of its standard library in JavaScript, and saves the JIT-ed code so it can be loaded at startup.

Finally, the implementation costs have been overestimated. Implementing a JIT, for instance, is not that complex. There are also ways to get a JIT out of the interpreter, by basically pasting together the pieces of the interpreter that handle the various opcodes (see for instance the "context threaded JIT" mentioned in this post about Apple Safari's new JavaScript VM: http://webkit.org/blog/214/introducing-squirrelfish-extreme/).
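
To illustrate what "pasting the pieces together" means, here is a toy sketch in C (all names invented). Instead of re-dispatching on each opcode through a switch at run time, the bytecode is translated once into a flat sequence of direct calls to the same handler routines an interpreter would already have; a real context-threaded JIT goes one step further and emits native call instructions instead of filling an array of function pointers, but the principle is the same.

#include <stdio.h>

#define MAX_CODE 64

enum { OP_PUSH, OP_ADD, OP_PRINT, OP_HALT };

typedef struct {
    long stack[32];
    int  sp;
} VM;

/* The opcode handlers: exactly what the interpreter's switch cases would do. */
static void do_push(VM *vm, long arg)  { vm->stack[vm->sp++] = arg; }
static void do_add(VM *vm, long arg)   { (void)arg; vm->sp--; vm->stack[vm->sp - 1] += vm->stack[vm->sp]; }
static void do_print(VM *vm, long arg) { (void)arg; printf("%ld\n", vm->stack[vm->sp - 1]); }

typedef void (*handler_t)(VM *, long);
typedef struct { handler_t fn; long arg; } Step;

/* "Compile" a bytecode sequence into direct calls, done once per code object. */
static int compile(const long *code, Step *out)
{
    int n = 0;
    for (int pc = 0; code[pc] != OP_HALT; ) {
        switch (code[pc]) {
        case OP_PUSH:  out[n++] = (Step){do_push,  code[pc + 1]}; pc += 2; break;
        case OP_ADD:   out[n++] = (Step){do_add,   0};            pc += 1; break;
        case OP_PRINT: out[n++] = (Step){do_print, 0};            pc += 1; break;
        }
    }
    return n;
}

int main(void)
{
    /* Bytecode for: print(2 + 3) */
    long code[] = { OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PRINT, OP_HALT };
    Step steps[MAX_CODE];
    VM vm = { .sp = 0 };

    int n = compile(code, steps);
    for (int i = 0; i < n; i++)   /* straight-line execution, no per-opcode dispatch */
        steps[i].fn(&vm, steps[i].arg);
    return 0;
}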

A technique called "indirect threading" (which has nothing to do with multithreading) gets at least some of that performance, and yet it still hasn't been implemented (though this may be about to change).
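
For the curious, this is roughly what threaded dispatch looks like. The exact variant differs (direct, indirect, token threading), but the shared idea is that every opcode handler jumps straight to the next opcode's handler instead of going back through one central switch statement, which removes a jump and lets the CPU predict each bytecode transition separately. Here is a minimal sketch using GCC's computed-goto ("labels as values") extension; it is illustrative code, not CPython's actual loop.

#include <stdio.h>

enum { OP_PUSH, OP_ADD, OP_PRINT, OP_HALT };

int main(void)
{
    /* Bytecode for: print(2 + 3) */
    long code[] = { OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PRINT, OP_HALT };
    long stack[32];
    int sp = 0, pc = 0;

    /* Table mapping each opcode to the address of its handler label. */
    void *dispatch[] = { &&op_push, &&op_add, &&op_print, &&op_halt };

    /* Jump straight to the handler of the next opcode. */
    #define DISPATCH() goto *dispatch[code[pc++]]

    DISPATCH();

op_push:
    stack[sp++] = code[pc++];
    DISPATCH();
op_add:
    sp--;
    stack[sp - 1] += stack[sp];
    DISPATCH();
op_print:
    printf("%ld\n", stack[sp - 1]);
    DISPATCH();
op_halt:
    return 0;
}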

And finally, even a complete rewrite would not be as horrible as the Python people think. Other big open-source projects do that all the time, or undergo extensive refactoring and changes to their core APIs quite often, like the Linux kernel; the Samba code, too, went through a core rewrite some time ago.
