Tue 03 February 2015
I'm looking for the perfect Python web framework. Looking at the leading options in the Python community makes me feel a bit like Goldilocks, with nothing being quite right. There are pieces I like from most of the frameworks I've tried, as well as things I'd really rather avoid. In some cases there are things I liked at the time that I don't like so much anymore.
My first exposure to Python web programming was when Google App Engine was brand new, using their webapp framework. Things I like from the webapp framework:
- Class-based handlers with a Python method per HTTP method (GET, PUT, POST, etc.)
- All the routes in one place, explicitly fed into an Application object.
(I now know that the webapp framework took these patterns from web.py, but I didn't know that at the time.)
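The pattern looks roughly like this. This is a toy sketch of the style, not actual webapp or web.py API; the `App`, `View`, and `dispatch` names here are my own stand-ins:

```python
# A toy sketch of the web.py/webapp style: one class per URL, one method
# per HTTP verb, and every route listed explicitly in one place.
# None of this is real webapp/web.py code; the names are illustrative.

class View(object):
    def dispatch(self, method, *args):
        handler = getattr(self, method, None)
        if handler is None:
            return '405 Method Not Allowed'
        return handler(*args)


class Index(View):
    def GET(self):
        return 'Hello from the index page'

    def POST(self):
        return 'Got your POST'


class App(object):
    def __init__(self, routes):
        # routes is a list of (path, view class) pairs -- the whole URL
        # map is visible at a glance, in one place.
        self.routes = dict(routes)

    def handle(self, method, path):
        view_cls = self.routes.get(path)
        if view_cls is None:
            return '404 Not Found'
        return view_cls().dispatch(method)


app = App([
    ('/', Index),
])

print(app.handle('GET', '/'))  # Hello from the index page
```

The appeal is that both questions you ask when reading unfamiliar code ("what URLs does this app serve?" and "what happens on a POST to this URL?") have one obvious place to look.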
I liked how easy it was to use App Engine's built-in datastore, but I wouldn't write an application that used it today. I want more control over the persistence layer. I want Postgres, and the ability to throw in Redis or RabbitMQ when needed. And I don't want my app to be held hostage to Google's whims.
Next came Django, which was my main development environment for a couple of years. Much like App Engine, its routing and request/response handling were very straightforward. And also much like App Engine, data persistence was kind of magical. You just import your models, and off you go. No worrying about database connections.
As my Django projects started to grow, I became a bit disenchanted with the program structure that Django encourages. Django's concept of a "project" composed of pluggable "apps" sounds like a great idea on paper, but in practice the apps really aren't that pluggable, and it's very easy to make a mess trying to use that abstraction to organize your static files, templates, etc. These days I'm more likely to just do all that at the project level.
I also found that Django's magical management of database connections totally abandons you when you want to add a more exotic backend service like Mongo, Redis, or RabbitMQ. There's no obvious place to do the connection setup, no obvious place to stick the connection objects, and no obvious patterns for managing connection lifetimes (Do I use a middleware and make a connection per request? Or just make new connections when I need them inside my views? Do I need to make a connection pool? How does one even do that?)
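For what it's worth, the pattern I usually end up hand-rolling is a module-level pool with explicit checkout/checkin. A minimal sketch using only the stdlib; the `ConnectionPool` class and `make_conn` factory are stand-ins for whatever a real client library would give you, not anything Django provides:

```python
import queue


class ConnectionPool(object):
    """A minimal, thread-safe checkout/checkin pool. A stand-in for what
    a real client library (psycopg2, redis-py, etc.) would provide; this
    is illustrative, not production code."""

    def __init__(self, make_conn, size=5):
        self._pool = queue.Queue()
        for _ in range(size):
            self._pool.put(make_conn())

    def get(self):
        # Blocks if all connections are currently checked out.
        return self._pool.get()

    def put(self, conn):
        self._pool.put(conn)


# Hypothetical usage inside a view: check out, use, check in.
pool = ConnectionPool(make_conn=lambda: object(), size=2)
conn = pool.get()
# ... talk to the backend with conn ...
pool.put(conn)
```

But the fact that I have to reinvent this, and then guess where the pool object should live, is exactly the gap I'm complaining about.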
When I started with Django its templates, admin, and forms components saved me tons of time. But in the years since, UIs have moved from doing everything on the server to doing a lot more in the browser. We have websockets now to make things real-time. And Django hasn't kept up.
I've also found that a lot of my projects have command line components in the same codebase as the web components. While I know it's possible to write subcommand plugins for ./manage.py, it feels more elegant to have Python console entry points. And it feels really ugly for them to have to set a DJANGO_SETTINGS_MODULE environment variable at the top of every script before we can even finish all the 'import' statements.
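For comparison, a console entry point is just a declaration in setup.py. A sketch of the setuptools `console_scripts` mechanism; the project and function names here are made up:

```python
# setup.py (fragment) -- a hypothetical project declaring console scripts.
# "pip install" produces real executables on the PATH, with no
# DJANGO_SETTINGS_MODULE dance needed before the imports at the top of
# each script.
from setuptools import setup

setup(
    name='myproject',
    entry_points={
        'console_scripts': [
            'myproject-worker = myproject.worker:main',
            'myproject-import = myproject.importer:main',
        ],
    },
)
```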
Flask and CherryPy
So my next projects were in Flask and CherryPy. I found Flask more pleasant, but both have a pattern of making you import a magical context-local object to access the current request (see flask.request or cherrypy.request). I greatly preferred the Django pattern of having the request explicitly passed to my handler. I'll never get used to the look of the Flask/CherryPy pattern of importing a global and then grabbing request-specific data off it, or even worse, setting request-specific parameters on cherrypy.response. To someone new to Python, that's teaching a very dangerous pattern.
CherryPy also taught me that I really, really don't want object-based URL dispatching, where the URL is incrementally consumed by each handler in the chain. That pattern has made some projects a huge PITA when trying to figure out where specific pieces of data are coming from.
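For the unfamiliar, object dispatch means the URL is walked segment by segment down a tree of objects. A toy sketch of the idea, not real CherryPy code:

```python
# A toy object-dispatch tree: /posts/comments/ walks Root -> posts ->
# comments, each path segment consumed by the next object in the chain.
# Not real CherryPy code; just the shape of the pattern.

class Comments(object):
    def index(self):
        return 'comments -- but for which post? scroll up the tree...'


class Posts(object):
    comments = Comments()

    def index(self):
        return 'all posts'


class Root(object):
    posts = Posts()


def dispatch(root, path):
    node = root
    for segment in path.strip('/').split('/'):
        if segment:
            node = getattr(node, segment)
    return node.index()


print(dispatch(Root(), '/posts/'))  # all posts
```

The trouble is that the answer to "what data does this handler actually receive?" is smeared across every object on the path, instead of being visible at the final handler.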
Flask's routing pattern, using decorators, seemed elegant at first. But as soon as my application grew to the point that I wanted handlers in multiple files, I had circular imports between my handler modules and my main application module. Though the Flask docs say "here it is actually fine", I could never get over my uneasiness.
While using both Flask and CherryPy, I found that I really missed the ability to look at a single file and see all the routes that my application handled. I want to be able to put all that in one place.
One thing I really did like about Flask was the way you could write URL patterns like "/people/<name>/", with human-readable parameters, instead of having to write regexes as Django forces you to do.
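The two styles are easy to compare side by side: under the hood, a readable rule compiles down to roughly the regex Django makes you write by hand. A crude stand-in converter using the stdlib, not Werkzeug's actual routing machinery:

```python
import re

# Django makes you write the regex yourself:
django_style = r'^/people/(?P<name>[^/]+)/$'


def compile_rule(rule):
    # A crude stand-in for what Flask/Werkzeug does: turn '/people/<name>/'
    # into a named-group regex. The real converter system also handles
    # types like <int:person_id>; this toy version does not.
    pattern = re.sub(r'<(\w+)>', r'(?P<\1>[^/]+)', rule)
    return re.compile('^' + pattern + '$')


matcher = compile_rule('/people/<name>/')
match = matcher.match('/people/brent/')
print(match.group('name'))  # brent
```

Same matching power for the common case, but the rule reads like the URL it matches.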
A Very Brief Tornado Flirtation
Having read somewhere that Tornado apps are structured more or less like the webapp and web.py frameworks, and also having heard that it had nice built-in support for websockets, I decided to give Tornado a try. I did a little "Hello world" fiddling, and things were looking good. But when I saw the hoops I'd have to jump through to get Tornado to play nice with Postgres, the mind revolted. I really don't want to turn my project into callback/try/except spaghetti.
Rolling My Own
With all those experiences in mind, here's what I decided I really want from a Python web framework:
- Explicit passing of request/response objects (no magic context locals).
- No magical database connections. I'll make connections in my handlers/middlewares, or connection pools on startup. I just want a place to hang them.
- Explicit passing in of config (no DJANGO_SETTINGS_MODULE).
- Support for lots of concurrent open connections, as you'd have using websockets. But without Tornado's callback style. To me, this means using Gevent.
- A central place for mapping URL patterns to request handlers. No CherryPy-style object dispatch, no Flask-style decorator routing, and no circular imports.
- Pretty, Flask-style URL patterns instead of regexes ("/persons/<int:person_id>/" is way better than "/persons/(?P<person_id>[0-9]+)/").
- View classes that have GET/PUT/POST/DELETE/HEAD methods.
- Middlewares that are plain old WSGI wrappers.
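That last point just means middlewares that look like this: a function that takes a WSGI app and returns a WSGI app, per the WSGI spec (PEP 3333). The timing header here is a made-up example:

```python
import time


def timing_middleware(app):
    """A plain WSGI wrapper: takes an app, returns an app. No framework
    base classes, no registration API -- just a function. The X-Elapsed
    header is a made-up example."""
    def wrapped(environ, start_response):
        start = time.time()

        def timed_start_response(status, headers, exc_info=None):
            headers = headers + [('X-Elapsed', '%.6f' % (time.time() - start))]
            return start_response(status, headers, exc_info)

        return app(environ, timed_start_response)
    return wrapped


def hello_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello World']


app = timing_middleware(hello_app)
```

Because it's plain WSGI, the same wrapper works unchanged on any compliant framework or server.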
That framework does not exist, but I have a hunch that I could coax it into existence if I start with Werkzeug (which provides most of the power behind Flask, without the syntactic sugar), wrap its routing to let me have web.py-style view classes, and sprinkle in some Gevent.
I found that Armin Ronacher has already written a little webpyalike wrapper. I've adapted that into a little module I call webzeug, and though it's still only a couple hours old, I'm optimistic that it might be able to tick all my boxes. So far, it's running this application really nicely:
```python
from gevent.monkey import patch_all; patch_all()
from gevent import pywsgi
from webzeug import App, Request, Response, View
import time


class Index(View):
    def GET(self):
        return Response('Hello World')


class Hello(View):
    def GET(self, name):
        return Response('Hey there ' + name + '!')


class CountIter(object):
    """
    An iterator that increments an integer once per second and yields a
    newline-terminated string
    """
    def __init__(self):
        self.num = 0

    def __iter__(self):
        return self

    def next(self):
        self.num += 1
        time.sleep(1)
        return '%s\n' % self.num


class Counter(View):
    """
    A long-lived stream of incrementing integers, one line per second.

    Best viewed with "curl -N localhost:8000/count/"

    Open up lots of terminals with that command to test how many
    simultaneous connections you can handle.
    """
    def GET(self):
        return Response(CountIter())


app = App([
    ('/', 'index', Index),
    ('/people/<name>/', 'hello', Hello),
    ('/count/', 'count', Counter),
])

if __name__ == "__main__":
    # In production I'd probably use Gunicorn's Gevent worker instead, which
    # provides nicer logging out of the box and will automatically use a
    # worker process per core. Gevent's pywsgi server still demonstrates the
    # ability to have a lot of simultaneous connections though.
    server = pywsgi.WSGIServer(('', 8000), app)
    server.serve_forever()
```