PyPy
PyPy[webapps_with_pypy]

Developing web applications with PyPy

What is this (and why)?

PyPy is a platform that is very versatile, and provides almost endless possibilities. One of the features that is currently already available is that of translating RPython (the 'restricted Python' subset) to JavaScript. This specific feature can make the life of a developer of web applications that use client-side logic a lot easier, although there are certain hurdles to take.

Currently the state of the JavaScript backend support for PyPy is usable, but somewhat complex to use out-of-the-box and not very well documented. I'm writing this document while investigating the possibilities myself, and hope that it can serve others to relatively quickly dive in, without having to know too much of anything in the (rather daunting) PyPy codebase besides the useful bits for web application development.

Note that this document describes APIs that are not mature and will change in the near future.

PyPy features for web developers

Of course the 'killer feature' of PyPy for web application developers is the ability to convert somewhat restricted Python code (aka RPython) into JavaScript code. Unlike other libraries that perform similar functionality, PyPy really interprets the code, and produces 'lower level' JavaScript code, so it implements Python core language features like list comprehensions, and really behaves like Python (it's not Python syntax with JS semantics). This particulary means that when a program is in RPython, you can run it on top of Python interpreter with the same results as translated to JS version.

However, there is some other interesting code available in the PyPy package that may help developing web apps. The most interesting of these I think is a library that transparently makes server-side functionality available to client-side code, using XMLHttpRequest. This code lives in a module in pypy/translator/js/examples that's called 'server.py', and uses a library in pypy/translator/js called 'commproxy.py'.

Note that the 'server.py' library currently is usable, but not complete nor stable: I assume there will be quite some development and refinement later. This may mean that certain things in this document are outdated. There might be even a strong API rewrite at some point.

Layers

As seems to be common in PyPy, web applications will be relatively 'layered': there are a large number of different 'levels' of code execution. This makes that writing and debugging web applications written in this manner can be relatively complicated; however, the gains (such as having client-side code written in Python, and being able to test it on a normal Python interpreter) hopefully outweigh those complications.

A quick overview of the (main) layers of code in the application we're going to write:

  • HTTP server implementation - 'normal' Python code

    the web server code, the code that handles dealing with the HTTP API and dispatching to application code is written in 'normal' Python code, and is executed on the server in a 'normal' Python interpreter (note that this does _not_ mean that you can not execute this in a PyPy interpreter, as long as that interpreter has support for the functionality we need (things like sockets) that should work)

  • Server-side application code - 'normal' and 'described' Python code

    the application code on the server consists of 'internal' code, code that is called directly from other server-side code ('normal' Python functions), and 'exposed' code, code that is made available to the client (and has to be described)

    exposed functions only support common datatypes (ints, strings, lists of ints, etc.) for arguments and return values, and have to have those arguments and return values 'described' so that the annotator system can discover how they should be exposed, and so that the (transparently used) XMLHttpRequest handler and JSON serializer know how they should be treated

  • Client-side application code - RPython code

    the application code on the client lives in a seperate module, and must be RPython code

    this code can call the 'described' Python code from the server, and also use the normal client-side (browser) APIs

Writing a simple application

To explain what needs to be done to write an application using all this, I decided to add the (almost mandatory) classic guestbook example. To show off the transparent XMLHttpRequest stuff, I chose to do one that stays on-screen when adding a message, so there's no screen refresh, but apart from that it's a very basic, simple guestbook. Of course the script is still too much code to fit in this document nicely, so I placed it in some nice files: on for the server-side code and one for the client-side. Let's examine the server side first.

guestbook.py

If you examine the code, you will see 2 classes, one function and some code in the usual "if __name__ == '__main__'" block. The latter bit starts the webserver, passing one of the classes (Handler) as a handler to a BaseHTTPServer, which in turn uses the function (guestbook_client) to produce JavaScript, and the class (ExportedMethods) to serve 'AJAX' methods from.

This will be what a general PyPy web application (using commproxy) will look like: a set of HTTP methods (web pages and scripts) on a Handler class, a set of 'RPC' methods on an ExportedMethods class, a main loop and perhaps some helper functionality (e.g. to convert RPython to JS). Let's examine the two classes in more detail.

Handler

The Handler class is a normal BaseHTTPRequestHandler subclass, but with some pre-defined HTTP handlers (do_GET et al) that dispatch actual content generation to user-defined methods. If a '/' path is requested from the server, a method called 'index' is called, else the path is split on slashes, and the first bit, with dots replaced by underscores, is used to find a method, the other parts passed to that method as arguments. So calling '/foo.bar/baz' will result in the method 'foo_bar' getting called with argument 'baz'.

Note that a method needs to have an attribute 'exposed' set to True to actually expose the method, if this attribute is not set requesting the path will result in a 404 error. Methods that aren't exposed can be used for internal purposes.

There is some helper functionality, such as the Static class to serve static files and directories, in the pypy/translator/js/lib/server.py.

(Note that even though currently BaseHTTPServer is used to deal with HTTP, this will likely change in the future, in favour of something like CherryPy.)

ExportedMethods

The ExportedMethods class contains methods that are (through some magic) exposed to the client side, in an RPC-like fashion. This is implemented using XMLHttpRequest and JSON: when from the client-side a method is called here, the arguments to the method will be serialized and sent to the server, the method is called, the return value is serialized again and sent to the client, where a callback (registered on calling the method) is called with the return value as argument. Methods on ExportedMethods contain normal, unrestricted Python code, but do need to have the arguments and return value described in order for the proxy mechanism to deal with them.

Let's take a look at the 'get_messages' method. The 'callback' decorator that wraps it is used to tell the proxy mechanism what arguments and return value is used (arguments can also be determined by examining the function signature, so only have to be provided when the function signature doesn't describe default values), using that decorator will mark the method for exposure. This particular method does not have arguments passed to it from the client, but does have a return value: a list of strings.

The second message on this class, 'add_message', does have a set of arguments, which both have a string value, both of which are described in the function arguments (as default values).

guestbook_client

This function imports a Python (or RPython, actually) module and commands PyPy to convert a set of functions in it to JavaScript. This is all you need to do to get a conversion done, the function returns a string of directly executable JavaScript with the functions made available under their original name.

guestbook_client.py

This contains the client-side code. It contains a couple of functions to display messages and add messages to the guestbook, all relatively simple, but there are a couple of things that need explanation.

First thing to notice is of course that the code is RPython. This allows PyPy to translate it to JavaScript (and a lot of other things :), but does mean there are some restrictions: RPython is less dynamic than normal Python. For instance, you need to make sure that PyPy knows in advance of what types variables are, so changing a variable type, or writing functions that allow different types of variables (Python's 'duck typing') is not allowed.

Another odd thing are the imports: these are ignored on the client. They are only there to satisfy the RPython to JavaScript translator, and to allow running the code as 'normal' Python, for instance when testing. Both the 'dom' object, and 'exported_methods' are made available on the client by some commproxy magic.

In the functions, there are some other things to note. The 'exported_methods' variable is a reference to the exported_methods instance in guestbook.py, but because there's the commproxy between the client and server, the methods are wrapped, and the signatures are changed. The methods all need an additional argument, 'callback', when called from the client side, and rather than returning the return value directly, this callback is called with the return value as argument.

Finishing up

The previously described components together already make up a working web application. If you run the 'guestbook.py' script, the main loop bit on the bottom will be triggered, passing the Handler class to some BaseHTTPRequest server, which will start waiting for a request.

If a browser is pointed to http://localhost:8008/ (the root of the application), the 'index' method will be called, which presents the HTML page with the form. A script tag will make that the JavaScript is retrieved from the server, which is built on-the-fly (if required) from the RPython 'guestbook_client.py' file. An 'onload' handler on the <body> tag will trigger the 'init_guestbook()' function in this JS, which in turn calls the server's 'exported_methods.get_messages()' function that provides the initial page content.

If the form is filled in and the 'submit' button pressed, the 'add_message()' function is called, which gets the data from the form and calls the server's 'exported_methods.add_message()' function. This then returns a string version of the message back to the client (to the '_add_message_callback' function) where it is presented.

All this code is written without having to care about XMLHttpRequest and serialization, with a simple server-side interface, and in a testable manner.

Conclusion

Even though RPython is somewhat hard to debug sometimes, and the tools are still a little rough around the edges sometimes, PyPy can already be a valuable tool for web developers, with RPython to make code easier to write and test, and the commproxy to help dealing with asynchronous server calls and serialization. Do not expect that PyPy does not try to 'magically' make all web development problems go away: you will still need to manually write client-side code, and still need to take care for dealing with browsers and such, except not in JavaScript but in RPython, and with the 'AJAX' stuff nicely tucked away. There may be future developments that will provide higher-level layers, and client-side widgets and such, but currently it's still 'work as usual', but with somewhat nicer (we hope ;) tools.