Developing Web Applications with Quixote

PyCon 2004
March 24, 2004

A.M. Kuchling
www.amk.ca
amk @ amk.ca

Overview

Quixote is a Web development framework written in Python.

Some of Quixote's goals:

Related tools:

Sites/Applications using Quixote

Sites:

Applications:

Quixote was originally written for the MEMS Exchange, a project that aims to implement a network for distributed semiconductor fabrication, coordinated over the web. For more information about the architecture we used on that project, see "The MEMS Exchange Architecture", a paper presented at PyCon 2003.

Linux Weekly News is the highest-traffic Quixote site, and demonstrates that Quixote can be pretty scalable. Using Quixote and mod_python, LWN survived a Slashdotting while running on a relatively small machine, a 1 GHz Pentium with 512 MB of RAM.

Most Quixote projects are for internal use. One publicly available project is Cartwheel, which performs genomic sequence analysis. I'm working on a Slashdot clone named Solidus, and hope to have an alpha version available before PyCon.

The Quixote Toolkit

How Quixote Works: The basic idea

Quixote applications are Python packages, so they can be installed using the Distutils and similar tools. Incoming HTTP requests are mapped to a chunk of Python code, which is executed and passed an object representing the contents of the request; the code returns a string containing the contents that will be returned to the client.

The code to be run is determined like this.

This is something like Zope's traversal, but the rules are simpler; applications can't change this algorithm or override it. There are still some special names, though, that we'll look at after providing a simple example.

Simple example

From quixote/demo/__init__.py:

# Every publicly accessible attribute has to be listed in _q_exports.
_q_exports = ["simple"]  

def _q_index (request):
    return """
<html>
<body>
..."""

def simple (request):
    # This function returns a plain text document, not HTML.
    request.response.set_content_type("text/plain")
    return "This is the Python function 'quixote.demo.simple'.\n"

Because Quixote publishes the contents of Python modules, there has to be a way of declaring which functions should be considered public and can be called through HTTP requests. This is done by listing the public names in a _q_exports module variable or object attribute; Quixote will not traverse into an object or module that lacks a _q_exports attribute.
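The rule can be pictured with a small stand-alone sketch. This is not Quixote's real traversal code: resolve(), NotFound and the demo namespace are names invented for illustration, and the sketch is written in present-day Python. It only shows the idea that a URL component is followed when it is listed in _q_exports, and that _q_index is used when traversal ends on something that isn't callable.

```python
# Simplified sketch of Quixote-style traversal; not Quixote's real code.
from types import SimpleNamespace


class NotFound(Exception):
    """Stand-in for Quixote's TraversalError."""


def resolve(root, path, request=None):
    """Walk `path` (e.g. '/simple') from `root` and return the response body."""
    obj = root
    for component in [c for c in path.split("/") if c]:
        # Only names listed in _q_exports may be reached from a URL.
        if component not in getattr(obj, "_q_exports", []):
            raise NotFound(component)
        obj = getattr(obj, component)
    if not callable(obj):
        # Traversal ended on a namespace: fall back to its _q_index.
        obj = getattr(obj, "_q_index", None)
        if obj is None:
            raise NotFound(path)
    return obj(request)


def _index(request):
    return "<html><body>index</body></html>"


def _simple(request):
    return "This is the Python function 'simple'.\n"


def _private(request):
    return "never served"


# A stand-in for a package such as quixote.demo.
demo = SimpleNamespace(
    _q_exports=["simple"],
    _q_index=_index,
    simple=_simple,
    private=_private,   # present, but not exported
)
```

Here resolve(demo, "/simple") calls the exported function, resolve(demo, "/") falls back to _q_index, and resolve(demo, "/private") raises NotFound because the name is not listed in _q_exports.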

How Quixote Works: Special names

_q_index: If traversal ends up at an object that isn't callable, this name is checked for and called.

_q_lookup: if an attribute isn't found, this name is checked for and called with the attribute.

_q_access: at every step, this name is checked for and called to perform access checks.

_q_resolve: like a memoized version of _q_lookup (rarely used)

All the names special to Quixote begin with _q_.

How Quixote Works: _q_lookup example

This example handles URLs such as /whatever/1/, .../2/, etc.

from quixote.errors import TraversalError

def _q_lookup (request, component):
    try:
        key = int(component)
    except ValueError: 
        raise TraversalError("URL component is not an integer")

    obj = ... database lookup (key) ...
    if obj is None:
        raise TraversalError("No such object.")

    # Traversal will continue with the ObjectUI instance
    return ObjectUI(obj)

How Quixote works: _q_access example

_q_access is always called before traversing any further.

This example requires that all users must be logged in.

from quixote.errors import AccessError, TraversalError

def _q_access (request):
    if request.session.user is None:
        raise AccessError("You must be logged in.")

    # exits quietly if nothing is wrong

def _q_index [html] (request):
    """Here is some security-critical material ..."""

_q_access is used to impose an access control condition on an entire object; this saves the user from having to add access control checks to each attribute and running the risk of forgetting one. At every step of traversal, _q_access is checked for and called if present. The function can raise an exception to abort further traversal; if no exception is raised, any return value is ignored.
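A rough sketch of that behaviour, written as plain modern Python rather than taken from Quixote: traverse(), check_access() and AccessDenied are hypothetical names, and the request is reduced to an object with a user attribute. It shows one access hook protecting everything beneath an object.

```python
# Sketch only: mimics the _q_access hook, not Quixote's implementation.
from types import SimpleNamespace


class AccessDenied(Exception):
    """Stand-in for quixote.errors.AccessError."""


def check_access(obj, request):
    hook = getattr(obj, "_q_access", None)
    if hook is not None:
        hook(request)  # return value is ignored; only an exception matters


def traverse(root, components, request):
    """Follow `components` from `root`, running _q_access at every step."""
    obj = root
    check_access(obj, request)
    for name in components:
        obj = getattr(obj, name)
        check_access(obj, request)
    return obj


def require_login(request):
    if request.user is None:
        raise AccessDenied("You must be logged in.")


def secret_page(request):
    return "security-critical material"


# One hook on `admin` covers every page underneath it.
admin = SimpleNamespace(_q_access=require_login, secret=secret_page)
site = SimpleNamespace(admin=admin)

anonymous = SimpleNamespace(user=None)
member = SimpleNamespace(user="amk")
```

With these definitions, traverse(site, ["admin", "secret"], member) returns the page function, while the same call with anonymous raises AccessDenied before the page is ever reached.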

How Quixote works: The HTTPRequest class

How Quixote works: The HTTPResponse class

How Quixote works: Enabling sessions

Instead of Publisher, use SessionPublisher:

from quixote.publish import SessionPublisher
app = SessionPublisher('quixote.demo')

The request will then have a .session attribute containing a Session instance.

Two other classes:

SessionManager is a dictionary-like object responsible for storing sessions. The default implementation stores sessions in-memory, but you can provide your own session manager that stores them using a persistence mechanism such as ZODB or a relational database.
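As an illustration of the "dictionary-like object with persistence" idea, here is a hedged sketch of a mapping that keeps pickled sessions in SQLite. SQLiteSessionStore is a hypothetical class, not part of Quixote, and a real session manager has further responsibilities (cookies, expiry) that are not shown.

```python
# Hypothetical sketch: a dictionary-like session store backed by SQLite.
# It is not a Quixote class; it only illustrates the mapping interface.
import pickle
import sqlite3
from collections.abc import MutableMapping


class SQLiteSessionStore(MutableMapping):
    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS sessions (id TEXT PRIMARY KEY, data BLOB)"
        )

    def __getitem__(self, session_id):
        row = self.db.execute(
            "SELECT data FROM sessions WHERE id = ?", (session_id,)
        ).fetchone()
        if row is None:
            raise KeyError(session_id)
        return pickle.loads(row[0])

    def __setitem__(self, session_id, session):
        self.db.execute(
            "INSERT OR REPLACE INTO sessions (id, data) VALUES (?, ?)",
            (session_id, pickle.dumps(session)),
        )
        self.db.commit()

    def __delitem__(self, session_id):
        if session_id not in self:
            raise KeyError(session_id)
        self.db.execute("DELETE FROM sessions WHERE id = ?", (session_id,))
        self.db.commit()

    def __iter__(self):
        rows = self.db.execute("SELECT id FROM sessions").fetchall()
        return iter([row[0] for row in rows])

    def __len__(self):
        return self.db.execute("SELECT COUNT(*) FROM sessions").fetchone()[0]
```

Usage looks like an ordinary dictionary: store = SQLiteSessionStore(); store["abc"] = {"user": "amk"}. Passing a file path instead of the default ":memory:" keeps sessions across restarts.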

The only interesting attribute of Session is a .user attribute, whose value is undefined by Quixote and left up to the application.

Running Quixote applications

Several options:

Running Quixote applications: CGI/FastCGI

demo.cgi:

#!/www/python/bin/python
# Example driver script for the Quixote demo: 
# publishes the quixote.demo package.

from quixote import Publisher

# Create a Publisher instance, giving it the root package name
app = Publisher('quixote.demo')

# Open the configured log files
app.setup_logs()

# Enter the publishing main loop
app.publish_cgi()

The above code will also handle FastCGI. CGI scripts will run through publish_cgi() once and exit; under FastCGI it will loop and service multiple requests.

Running Quixote Applications: Stand-alone

Running a server on localhost is really easy:

import os, time 
from quixote.server import medusa_http 
 
if __name__ == '__main__': 
    s = medusa_http.Server('quixote.demo', port=8000) 
    s.run() 

This can even be used for writing desktop applications: run a Quixote server locally and use the webbrowser.open() function from Python's webbrowser module to open a browser pointing at it.

PTL: Overview

PTL = Python Templating Language

example.ptl:

# To callers, templates behave like regular Python functions 
def cell [html] (content):
    '<td>'         # Literal expressions are appended to the output
    content        # Expressions are evaluated, too.
    '</td>'

def row [html] (L):
    # L: list of strings containing cell content
    '<tr>'
    for s in L:
        cell(s)
    '</tr>\n'

def loop (n=9):  # No [html], so this is a regular Python function
    output = ""
    for i in range(1, n + 1):
        output += row([str(i), i*'a', i*'b'])
    return output

PTL: Using templates

Templates live in .ptl files, which can be imported. To enable this:

import quixote ; quixote.enable_ptl()    # Enable import hook

Templates behave just like Python functions:

>>> import example
>>> example.cell('abc')
<htmltext '<td>abc</td>'>
>>> example.loop()
<htmltext '<tr><td>1</td><td>a</td><td>b</td>...</tr>\n'>

In .ptl files, methods of classes can be PTL templates, too.

PTL: Comparison with other syntaxes

System        Syntax
Apache SSI    <!--#include virtual="/script/"-->
PHP           <?php func()?>
ASP           <% func() %>
ZPT           <span tal:replace="content">...</span>
PTL           def f [html] (): content

PTL's advantages over other syntaxes:

PTL: Automatic escaping

def no_quote [plain] (arg):
    '<title>'
    arg              # Converted to string
    '</title>'

def quote [html] (arg):
    '<title>'
    arg              # Converted to string and HTML-escaped
    '</title>'

>>> no_quote('A history of the < symbol')
'<title>A history of the < symbol</title>'
>>> quote('A history of the < symbol')
<htmltext '<title>A history of the &lt; symbol</title>'>

By using '[html]' instead of '[plain]', string literals are compiled as htmltext instances. When combined with regular strings using a + b or '%s' % b, htmltext HTML-escapes the regular string.

This mechanism is both a convenience for the application writer and a security feature. Cross-site scripting (XSS) attacks are a class of security hole caused by forgetting to escape HTML tags in untrusted data; you might forget to escape the title of a mail message, for example. An attacker could insert JavaScript that opened pop-up windows or redirected the user to another site.

With manual escaping, it's easy to forget the required function call, and forgetting to escape a single snippet is all it takes. PTL's automatic escaping trusts only the string literals supplied in the program text, and it also fails securely. When you mess up, the usual result is double-escaping a string, so that web site users see '&lt;p&gt;blah blah blah...' on the page. This is embarrassing, but doesn't open up any vulnerabilities.
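The idea behind htmltext can be imitated in a few lines of plain Python. The sketch below is an illustration only: SafeHTML is a hypothetical class, much simpler than Quixote's htmltext, and quote() mirrors the [html] template shown earlier. Trusted markup is wrapped in the class, and anything combined with it through + or % is escaped first.

```python
# Illustrative sketch of the idea behind htmltext; not Quixote's class.
from html import escape


class SafeHTML:
    """Markup that is trusted; anything combined with it gets escaped."""

    def __init__(self, markup):
        self.markup = markup

    @staticmethod
    def _quote(value):
        # Already-trusted markup passes through; everything else is escaped.
        if isinstance(value, SafeHTML):
            return value.markup
        return escape(str(value), quote=True)

    def __add__(self, other):
        return SafeHTML(self.markup + self._quote(other))

    def __radd__(self, other):
        return SafeHTML(self._quote(other) + self.markup)

    def __mod__(self, args):
        if not isinstance(args, tuple):
            args = (args,)
        return SafeHTML(self.markup % tuple(self._quote(a) for a in args))

    def __str__(self):
        return self.markup


def quote(arg):
    # Literals are trusted, the argument is escaped when it is added.
    return SafeHTML("<title>") + arg + SafeHTML("</title>")
```

Passing a string that was already escaped by hand, such as quote(escape("<p>")), produces the double-escaped '&amp;lt;p&amp;gt;' in this sketch: ugly, but harmless, which is the fail-secure behaviour described above.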

It should be remembered that while we think PTL is really neat, it's still optional. Using alternative templating isn't hard, and there are Quixote users who never use PTL.

Generating HTML: Nevow

Graham Fawcett wrote a small Nevow implementation.

from nevow import *

_q_exports = ['template']

def template(doctitle, docbody):
    """
    A page template. The stylesheet is there as a visual check
    that class and id attributes are set properly.
    """
    return html [
        head [
            title[doctitle],
            style(type='text/css')[
                'body { background-color: lightblue; } ',
                '.section { border: blue 3px solid; padding: 6px; } ',
                '#mainbody { background-color: white; } '
            ],
        ],
        body [
            h1 [doctitle],
            div({'class':'section'}, id='mainbody')[docbody],
            hr
        ]
    ]

Toolkit: Form processing

Quixote contains a set of classes for implementing forms. Example:

from quixote.form import Form

class UserForm (Form):
    def __init__ (self):
        Form.__init__(self)
        user = get_session().user
        self.add_widget("string", "name", title="Your name",
                        value=user.name)
        self.add_widget("password", "password1", title="Password",
                        value="")
        self.add_widget("password", "password2", 
                        title="Password, again", value="")
        self.add_widget("single_select", "vote",
                        title = "Vote on proposal",
                        allowed_values=[None] + range(4),
                        descriptions=['No vote', '+1', '+0', 
                                      '-0', '-1'],
                        hint = "Your vote on this proposal")
        self.add_widget("submit_button", "submit", 
                        value="Update information")

The basic idea is that you subclass the Form class to create a single new form. A form contains a number of widgets. Widgets represent a form element such as a text field or checkbox, or multiple form elements; multiple form elements could be used to enter a date, for example. Widgets can also perform additional checks such as requiring that a text field contain an integer.

The framework handles processing of a form. The Form instance creates widgets in its __init__ method. The render() method is called to generate HTML to display the form. On submitting the form, the process() method is called to read the values of fields and perform any error checking, and if no errors are reported, the action() method is called to perform the actual work of the form (e.g. inserting data into a database, sending an e-mail, etc.).
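The cycle described above can be sketched without Quixote at all. In this hypothetical example MiniForm is not quixote.form.Form, the handle() method is an invented driver, and a plain dictionary stands in for the submitted request; only the order of render(), process() and action() is meant to match the description.

```python
# Hypothetical sketch of the render/process/action cycle; MiniForm is not
# quixote.form.Form, and the dict `submitted` stands in for the request.

class MiniForm:
    def __init__(self):
        self.fields = []   # field names, in display order
        self.error = {}    # field name -> error message

    def add_field(self, name):
        self.fields.append(name)

    def render(self):
        inputs = "".join('<input name="%s">' % n for n in self.fields)
        return "<form method='post'>%s</form>" % inputs

    def process(self, submitted):
        # Read every field; missing fields come back as None.
        return {name: submitted.get(name) for name in self.fields}

    def action(self, values):
        raise NotImplementedError

    def handle(self, submitted=None):
        """Show the form, or process a submission and act on it."""
        if submitted is None:
            return self.render()
        self.error = {}
        values = self.process(submitted)
        if self.error:
            return self.render()   # redisplay so the user can fix errors
        return self.action(values)


class PasswordForm(MiniForm):
    def __init__(self):
        super().__init__()
        self.add_field("password1")
        self.add_field("password2")

    def process(self, submitted):
        values = super().process(submitted)
        if values["password1"] != values["password2"]:
            self.error["password1"] = "The two passwords must match."
        return values

    def action(self, values):
        return "password updated"
```

Calling PasswordForm().handle() renders the form, a submission with matching passwords reaches action(), and a mismatch records an error and renders the form again.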

For a more detailed explanation of the form framework, see part 2 of the Quixote tutorial at http://www.quixote.ca/learn/2.

Toolkit: Form processing (cont'd)

                   
class UserForm (Form):
    ...
    def render [html] (self, request, action_url):
        standard.header("Edit Your User Information")
        Form.render(self, request, action_url)
        standard.footer()
        
    def process (self, request):
        values = Form.process(self, request)
        if not (values['password1'] == values['password2']):
            self.error['password1'] = 'The two passwords must match.'
        return values
   
    def action (self, request, submit, values):
        user = request.session.user
        user.name = values['name']
        if values['password1'] is not None:
            user.password = values['password1']
        return request.response.redirect(request.get_url(1))

This render() implementation uses the default rendering of the form, but wraps our own header/footer around that rendering.

process() gets the values and performs error checks.

action() does the work of the form, and can assume the input data is all correct.

Toolkit: Serving Static Files

For Quixote-only apps, you often need to return static files such as PNGs, PDFs, etc.

from quixote.util import StaticFile, StaticDirectory

_q_exports = ['images', 'report_pdf']

report_pdf = StaticFile('/www/sites/qx/docroot/report.pdf',
                        mime_type='application/pdf')
images = StaticDirectory('/www/sites/qx/docroot/images/')

The quixote.util module also contains helpers for XML-RPC, for streaming files back to the client, etc.

These classes include a number of conveniences. If you don't provide a MIME media type, Python's mimetypes module will be used to guess the correct MIME type. Files can optionally be cached in memory to save on I/O.
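The guessing step relies on the standard library's mimetypes module. The helper below is a small sketch of that step, not Quixote's own code; guess_mime_type() and its fallback to application/octet-stream are assumptions made for the example.

```python
# Sketch of MIME type guessing with the standard library.
# guess_mime_type() and its default are illustrative, not Quixote code.
import mimetypes


def guess_mime_type(path, default="application/octet-stream"):
    # guess_type() returns a (type, encoding) pair; type is None if unknown.
    mime_type, _encoding = mimetypes.guess_type(path)
    return mime_type or default
```

For example, guess_mime_type('/www/sites/qx/docroot/report.pdf') gives 'application/pdf', so the explicit mime_type argument is only needed when the file extension is missing or misleading.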

StaticDirectory is secure by default: it doesn't follow symlinks or allow listing the directory unless you explicitly enable this.

Design patterns: Good URL design matters

The canonical bad URL:

http://gandalf.example.com/cgi-bin/catalog.py
  ?item=9876543&display=complete&tag=nfse_lfde

A better set of URLs:

http://www.example.com/catalog/9876543/complete
                                   .../brief 
                                   .../features  

Quixote features such as _q_lookup make it easy to support sensible URLs.
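As a sketch of how the better URLs could be served, assume a hypothetical in-memory CATALOG and two invented classes, CatalogUI and ItemUI. The resolve() function is a toy stand-in for Quixote's traversal, written in modern Python, and LookupError stands in for TraversalError.

```python
# Sketch under assumptions: CATALOG, CatalogUI and ItemUI are hypothetical,
# and resolve() is a toy stand-in for Quixote's traversal.

CATALOG = {9876543: {"name": "Widget", "features": ["small", "blue"]}}


class ItemUI:
    _q_exports = ["complete", "brief", "features"]

    def __init__(self, item):
        self.item = item

    def brief(self, request):
        return self.item["name"]

    def complete(self, request):
        return "%s (%s)" % (self.item["name"], ", ".join(self.item["features"]))

    def features(self, request):
        return ", ".join(self.item["features"])


class CatalogUI:
    _q_exports = []

    def _q_lookup(self, request, component):
        # The item number is part of the path, not a query parameter.
        try:
            item = CATALOG[int(component)]
        except (ValueError, KeyError):
            raise LookupError("No such item: %r" % component)
        return ItemUI(item)


def resolve(root, path, request=None):
    obj = root
    for component in path.strip("/").split("/"):
        if component in getattr(obj, "_q_exports", []):
            obj = getattr(obj, component)
        elif hasattr(obj, "_q_lookup"):
            obj = obj._q_lookup(request, component)
        else:
            raise LookupError(component)
    return obj(request)
```

With CatalogUI mounted at /catalog/, the path 9876543/complete selects the item through _q_lookup and then the exported complete view; 9876543/brief and 9876543/features work the same way.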

Design patterns: Separate problem objects and UI classes

Don't mix the basic objects for your problem with the HTML for the user interface.

For each object, represent it by a class and put the user interface in a *UI class elsewhere.
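A minimal sketch of the split, using the Proposal and ProposalUI names from the proposal-submission application; the attributes and methods shown here are invented for illustration, and the code is plain Python rather than PTL.

```python
# Illustrative sketch: the fields and methods are assumptions, not the
# actual PyCon proposal code.
from html import escape


class Proposal:
    """Problem object: knows nothing about HTML or HTTP."""

    def __init__(self, title, author):
        self.title = title
        self.author = author

    def is_complete(self):
        return bool(self.title and self.author)


class ProposalUI:
    """User interface for one Proposal; all markup lives here."""

    _q_exports = []

    def __init__(self, proposal):
        self.proposal = proposal

    def _q_index(self, request):
        p = self.proposal
        return "<h1>%s</h1><p>by %s</p>" % (escape(p.title), escape(p.author))
```

Scripts and tests can then use Proposal directly, while only ProposalUI needs to change when the page layout does.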

Advantages:

Structure of an application: PyCon proposal submission

Directory organization:

qx/bin/                       # Various scripts 
qx/conference/__init__.py     # Marker
qx/conference/objects.py      # Basic objects: Proposal, Author, Review
qx/ui/conference/__init__.py
qx/ui/conference/email.ptl    # Text of e-mail messages
qx/ui/conference/standard.ptl # Header, footer, display_proposal()
qx/ui/conference/pages.ptl    # Login form, base CSS
qx/ui/conference/proposal.ptl # ProposalUI class

Design patterns: common filenames

Naming conventions for common modules:

Questions, comments?

These slides: www.amk.ca/talks/quixote

Quixote BoF session: Tonight at 8PM, in Room 1

Quixote home page: www.mems-exchange.org/software/quixote

Quixote resources: