Python WSGI

This is a description of how to use Python WSGI as your main method of web scripting. The original source is a very good description, but in case it vanishes, here is a short version.

WSGI

What is WSGI, the Web Server Gateway Interface?

WSGI applications are callable Python objects (functions, or classes with a __call__ method) that are passed two arguments: a WSGI environment as the first argument and a function that starts the response.
The application has to start a response using the function provided and return an iterable, where each yielded item means writing and flushing.
The WSGI environment is like a CGI environment, with some additional keys that are provided either by the server or by a middleware.
You can add middlewares to your application by wrapping it.
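
As a rough sketch of the last point, assuming Python 3 (where the response body must be bytes): a middleware is itself a WSGI callable that wraps another one. The names 'hello_app' and 'add_header_middleware' and the 'X-Wrapped' header are made up for illustration and are not part of any library.

```python
def hello_app(environ, start_response):
    """Innermost application (hypothetical example)."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello World']


def add_header_middleware(app):
    """Wrap `app` so that every response carries one extra header."""
    def wrapped(environ, start_response):
        def custom_start_response(status, headers, exc_info=None):
            # Append an illustrative header, then delegate to the real one.
            headers = list(headers) + [('X-Wrapped', 'yes')]
            return start_response(status, headers, exc_info)
        return app(environ, custom_start_response)
    return wrapped


# Wrapping returns a new WSGI application with the same calling convention.
application = add_header_middleware(hello_app)
```

Because the wrapped object is again a plain WSGI application, several middlewares can be stacked by wrapping repeatedly.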

Of course an example will do much more ...

the application

application
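from urllib.parse import parse_qs  # Python 3 home of parse_qs
from html import escape            # Python 3 home of escape

def hello_world(environ, start_response):
    # parse_qs maps each query parameter to a list of values
    parameters = parse_qs(environ.get('QUERY_STRING', ''))
    if 'subject' in parameters:
        subject = escape(parameters['subject'][0])
    else:
        subject = 'World'
    start_response('200 OK', [('Content-Type', 'text/html')])
    body = '''Hello %(subject)s
    Hello %(subject)s!

''' % {'subject': subject}
    return [body.encode('utf-8')]  # the body must be bytes under Python 3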

This WSGI application will attempt to print Hello World. Its main function 'hello_world' is called by the interface with the two standard parameters:

the environment 'environ', which among other things holds QUERY_STRING; the first part of the function extracts the subject from it.
the handler 'start_response'
start_response is the handler, and it in turn is called with 2 parameters:
a status string, '200 OK' in this case
a list of tuples, representing the response headers
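
To make the calling convention above concrete, here is a small sketch that plays the server's role by hand, assuming Python 3. The helper 'call_app' and the stand-in 'demo_app' are hypothetical names; 'setup_testing_defaults' comes from the standard wsgiref.util module.

```python
from wsgiref.util import setup_testing_defaults


def call_app(app, query_string=''):
    """Play the server's role: build an environ, call the app, collect output."""
    environ = {}
    setup_testing_defaults(environ)        # fills in the required CGI/WSGI keys
    environ['QUERY_STRING'] = query_string

    seen = {}

    def start_response(status, headers, exc_info=None):
        # The application calls this before returning its body.
        seen['status'] = status
        seen['headers'] = headers

    body = b''.join(app(environ, start_response))
    return seen['status'], seen['headers'], body


def demo_app(environ, start_response):
    """Stand-in application (hypothetical) that echoes the query string."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [environ['QUERY_STRING'].encode('utf-8')]


status, headers, body = call_app(demo_app, 'subject=kees')
```

This is handy for quick checks, but it is no substitute for a real server.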

However, nothing calls this function yet: to reach it from a browser you need an enclosing web server. Installing under Apache is a step too early, so let's use the standalone server that comes with wsgiref. Extend the file like this (add the last part):

apache
from urllib.parse import parse_qs  # Python 3 home of parse_qs
from html import escape            # Python 3 home of escape

def hello_world(environ, start_response):
    parameters = parse_qs(environ.get('QUERY_STRING', ''))
    if 'subject' in parameters:
        subject = escape(parameters['subject'][0])
    else:
        subject = 'World'
    start_response('200 OK', [('Content-Type', 'text/html')])
    body = '''
    Hello %(subject)s!

    ''' % {'subject': subject}
    return [body.encode('utf-8')]  # the body must be bytes under Python 3

if __name__ == '__main__':
    # standalone development server from the standard library
    from wsgiref.simple_server import make_server
    srv = make_server('localhost', 8081, hello_world)
    srv.serve_forever()

and browse to the server with a subject parameter, for example:

http://localhost:8081/?subject=kees

It will get the subject from the GET request and print:

output
Hello kees!

Path dispatching

Now, in PHP we were used to having many scripts and calling each of them directly. Not so in WSGI: there is only one script running, which dispatches the calls based on its input, for instance on the path (PATH_INFO).

So if you call a URL with an extra path, for example http://localhost:8081/otherpath?subject=kees

you get the same result with the current application; try it. So to make a dispatcher you can use these variables:

PATH_INFO
SCRIPT_NAME

In the last example you can print these, and they will be '/otherpath' and ''. SCRIPT_NAME is not filled in by the development server, but it is under Apache, as we will see later.
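
To illustrate how the two variables relate, here is a small sketch using 'shift_path_info' from the standard wsgiref.util module, which moves one path segment from PATH_INFO over to SCRIPT_NAME. The environ below is built by hand purely for illustration; a real server fills it in.

```python
from wsgiref.util import shift_path_info

# Hand-built environ for illustration only.
environ = {'SCRIPT_NAME': '', 'PATH_INFO': '/hello/world'}

# Move the first path segment from PATH_INFO over to SCRIPT_NAME.
segment = shift_path_info(environ)

print(segment)                 # the segment that was consumed
print(environ['SCRIPT_NAME'])  # the part already "used up" by dispatching
print(environ['PATH_INFO'])    # what is left for the application to handle
```

The idea is that SCRIPT_NAME names where the application is mounted and PATH_INFO is the remainder that the application itself should interpret.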

So what if we split the application into a main index, a part that handles hello, and an error otherwise? Regular expressions can be used for this:

regexp
#!/usr/bin/env python3
import re  # regular expressions
from urllib.parse import parse_qs  # Python 3 home of parse_qs
from html import escape            # Python 3 home of escape

# function to be called on plain index
def index(environ, start_response):
    """This function will be mounted on "/" and display a link
    to the hello world page."""
    start_response('200 OK', [('Content-Type', 'text/html')])
    # just display what it is
    return [b'''Hello World Application
               This is the Hello World application:

    <a href="hello/">continue</a>

    ''']

# the main function, printing hello 'subject'
def hello(environ, start_response):
    parameters = parse_qs(environ.get('QUERY_STRING', ''))
    if 'subject' in parameters:
        subject = escape(parameters['subject'][0])
    else:
        subject = 'World'

    start_response('200 OK', [('Content-Type', 'text/html')])
    body = '''
    Hello %(subject)s!

    ''' % {'subject': subject}
    return [body.encode('utf-8')]  # the body must be bytes under Python 3

# error function
def not_found(environ, start_response):
    """Called if no URL matches."""
    start_response('404 NOT FOUND', [('Content-Type', 'text/plain')])
    return [b'Not Found']

# map urls to functions
urls = [
    (r'^$', index),          # [empty] (start nothing end)
    (r'hello/?$', hello),    # hello with 0 or 1 '/'
    (r'hello/(.+)$', hello)  # hello/[anything]
]

def application(environ, start_response):
    """
    The main WSGI application. Dispatch the current request to
    the functions from above and store the regular expression
    captures in the WSGI environment as  `myapp.url_args` so that
    the functions from above can access the url placeholders.

    If nothing matches call the `not_found` function.
    """
    path = environ.get('PATH_INFO', '').lstrip('/')
    for regex, callback in urls:  # walk through the urls
        match = re.search(regex, path)
        if match is not None:
            environ['myapp.url_args'] = match.groups()
            # run the callback if found
            return callback(environ, start_response)
    return not_found(environ, start_response)  # or error if not

if __name__ == '__main__':
    from wsgiref.simple_server import make_server
    # replace the address with one of your own machine
    srv = make_server('192.168.2.5', 8081, application)
    srv.serve_forever()
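
The dispatcher stores the regular expression captures in 'myapp.url_args', but the handlers above never read them. As a sketch of how a handler could use them (assuming Python 3, and anchoring the patterns with '^', which is a choice made for this example), the variant below greets the name taken from the path, so /hello/kees works without a query string:

```python
import re
from html import escape


def hello(environ, start_response):
    """Greet the name captured from the URL, or 'World' if none was captured."""
    args = environ.get('myapp.url_args', ())
    subject = escape(args[0]) if args else 'World'
    start_response('200 OK', [('Content-Type', 'text/html')])
    return [('Hello %s!' % subject).encode('utf-8')]


def not_found(environ, start_response):
    """Called if no URL matches."""
    start_response('404 NOT FOUND', [('Content-Type', 'text/plain')])
    return [b'Not Found']


urls = [
    (r'^hello/?$', hello),      # no capture group: url_args is ()
    (r'^hello/(.+)$', hello),   # one capture group: url_args is ('name',)
]


def application(environ, start_response):
    path = environ.get('PATH_INFO', '').lstrip('/')
    for regex, callback in urls:
        match = re.search(regex, path)
        if match is not None:
            # Hand the captured groups to the handler via the environ.
            environ['myapp.url_args'] = match.groups()
            return callback(environ, start_response)
    return not_found(environ, start_response)
```

Passing data through a namespaced environ key such as 'myapp.url_args' keeps the handler signature identical to that of any other WSGI application.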

Now this is already half of a REST service, so it will not be that difficult to build on that. More interesting, however, is how to get this working under Apache.

apache

Short guide to running a WSGI script under apache2 with mod_wsgi.

Test application : myapp.wsgi

apache app
def application(environ, start_response):
    status = '200 OK'
    output = b'Hello World!'  # bytes: required for the body under Python 3

    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', str(len(output)))]
    start_response(status, response_headers)

    return [output]

Note that 'application' is a significant name in mod_wsgi: mod_wsgi expects your application to be called that way.
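
If the callable in your script already has another name, one simple option is to bind it to 'application' at module level. A minimal sketch, assuming Python 3; 'hello_world' here is just a stand-in for your own function:

```python
def hello_world(environ, start_response):
    """Existing WSGI callable with a name mod_wsgi would not look for."""
    body = b'Hello World!'
    start_response('200 OK', [('Content-Type', 'text/plain'),
                              ('Content-Length', str(len(body)))])
    return [body]


# Bind the callable to the name mod_wsgi looks for by default.
application = hello_world
```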

mounting the wsgi application

That is, mounting against a specific URL. The main directive for that is:

WSGIScriptAlias
WSGIScriptAlias /myapp /usr/local/www/wsgi-scripts/myapp.wsgi

This can appear in a virtualhost section for instance, but be aware that you have to set access control for the directory containing the script :

access control
<Directory /usr/local/www/wsgi-scripts>
Order allow,deny
Allow from all
</Directory>

So the simplest setup would be :

setup

mkdir -p /usr/local/www/wsgi-scripts
cp myapp.wsgi /usr/local/www/wsgi-scripts
vim /etc/apache2/apache2.conf 
... put above WSGIScriptAlias and Directory snippets in apache2.conf
service apache2 reload
browse to http://localhost/myapp

Done!!

You could put all your scripts in wsgi-scripts and keep adding new WSGIScriptAlias lines :

wsgi-scripts
WSGIScriptAlias /myapp /usr/local/www/wsgi-scripts/myapp.wsgi
WSGIScriptAlias /myapp2 /usr/local/www/wsgi-scripts/myapp2.wsgi
<Directory /usr/local/www/wsgi-scripts>
    Order allow,deny
    Allow from all
</Directory>

But the next virtualhost example will be more useful probably.

virtualhost example

An actual example for my finance site, it is the virtualhost finance.

NameVirtualHost
NameVirtualHost finance.klopt.org

<VirtualHost finance.klopt.org>
    ServerAdmin kees@klopt.org
    ServerName  finance.klopt.org
    ServerAlias klopt.org

    # Indexes + Directory Root.
    DirectoryIndex index.html
    DocumentRoot /var/www/finance

    # wsgi section ...
    WSGIScriptAlias /rest /var/www/finance/rest.wsgi

    <Directory /var/www/finance>
    Order allow,deny
    Allow from all
    </Directory>
</VirtualHost>

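Note that if you want to run multiple scripts on the same address and port, you need to put them in one VirtualHost section: Apache handles each request with a single VirtualHost, so a script mounted in a second section for the same address and port may never be reached. For example, this will only run /doc: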

different port
<VirtualHost *:1901>
    ServerName doc.klopt.org

    WSGIScriptAlias /doc /var/www/web/doc/doc.wsgi

    <Directory /var/www/web/doc>
        Order deny,allow
        Allow from all
    </Directory>
</VirtualHost>

<VirtualHost *:1901>

    WSGIScriptAlias /ws /var/www/web/ws.wsgi

    <Directory /var/www/web>
        Order deny,allow
        Allow from all
    </Directory>

    Alias   /static /var/www/web

</VirtualHost>

But this will run both:

run both
<VirtualHost *:1901>
    ServerName doc.klopt.org

    WSGIScriptAlias /doc /var/www/web/doc/doc.wsgi

    <Directory /var/www/web/doc>
        Order deny,allow
        Allow from all
    </Directory>

    WSGIScriptAlias /ws /var/www/web/ws.wsgi

    <Directory /var/www/web>
        Order deny,allow
        Allow from all
    </Directory>

    Alias   /static /var/www/web

</VirtualHost>

Since I just have an index.html for the outline of the site and only want data from the Python code, I just run /rest as an application for AJAX requests. /rest will dispatch all calls further based on PATH_INFO. To complete this example, and to display the SCRIPT_NAME as promised earlier, here is a running example:

/rest
def application(environ, start_response):
    status = '200 OK'
    output = 'Hello World!'

    output += "\nPATH_INFO = "
    output += environ.get('PATH_INFO', 'empty')
    output += "\nSCRIPT_NAME = "
    output += environ.get('SCRIPT_NAME', 'empty')
    output = output.encode('utf-8')  # bytes: required for the body under Python 3

    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', str(len(output)))]
    start_response(status, response_headers)

    return [output]

For a URL like http://finance.klopt.org/rest/custompath it will print

output
Hello World!
PATH_INFO = /custompath
SCRIPT_NAME = /rest
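
As a sketch of how /rest could dispatch further on PATH_INFO and answer AJAX requests with JSON (assuming Python 3): the handlers 'list_accounts' and 'show_status', the route paths and the data are all invented for illustration.

```python
import json


def list_accounts(environ):
    """Hypothetical handler; the data is made up for illustration."""
    return {'accounts': ['checking', 'savings']}


def show_status(environ):
    """Hypothetical handler reporting where the script is mounted."""
    return {'mounted_at': environ.get('SCRIPT_NAME', '')}


# Map PATH_INFO values (the part after the mount point) to handlers.
routes = {
    '/accounts': list_accounts,
    '/status': show_status,
}


def application(environ, start_response):
    handler = routes.get(environ.get('PATH_INFO', ''))
    if handler is None:
        status, payload = '404 Not Found', {'error': 'unknown path'}
    else:
        status, payload = '200 OK', handler(environ)

    output = json.dumps(payload).encode('utf-8')
    start_response(status, [('Content-Type', 'application/json'),
                            ('Content-Length', str(len(output)))])
    return [output]
```

A plain dictionary is enough while the paths are fixed strings; the regular expression table shown earlier is the better fit once paths carry parameters.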

Most likely you want to put the handling code in a different Python script than the main one given here. To do so you need to import that script, but also tell the Python interpreter running under apache2 where to find it: the script's own directory is not added to the module search path by default.

import
import sys, os
# to include submodule in this directory:
sys.path.append(os.path.dirname(__file__))
import myapp

def application(environ, start_response):
    status = '200 OK'

    output = myapp.run(environ).encode('utf-8')  # run() returns text; the body must be bytes under Python 3

    response_headers = [('Content-type', 'text/plain'),
                        ('Content-Length', str(len(output)))]
    start_response(status, response_headers)

    return [output]

myapp.py :

run
def run(environ):
    return "just saying i did it"
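
To see the effect of the sys.path line in isolation, here is a small self-contained sketch: a temporary directory stands in for the script directory, and the module name 'myapp_demo' is made up so that it cannot clash with a real module.

```python
import importlib
import os
import sys
import tempfile

# A temporary directory stands in for the directory holding the .wsgi script.
script_dir = tempfile.mkdtemp()
with open(os.path.join(script_dir, 'myapp_demo.py'), 'w') as f:
    f.write('def run(environ):\n    return "just saying i did it"\n')

# Without this step the import below would fail with ModuleNotFoundError.
if script_dir not in sys.path:
    sys.path.append(script_dir)

importlib.invalidate_caches()            # make sure the new file is noticed
myapp_demo = importlib.import_module('myapp_demo')
result = myapp_demo.run({})
```

Checking for the directory before appending avoids adding the same entry repeatedly if the script is loaded more than once.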

If you have problems: 'tail' the apache log for clues.

log
1	`tail -f /var/log/apache2/error.log`

debugging

WSGI tends to just present an error page without further clues about what happened. This page explains how best to get some logging into your WSGI application:

visit

Also, this page presents some techniques you can use for debugging :

visit

The result will appear in /var/log/apache2/error.log
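
One common way to get tracebacks into that log is to write them to the 'wsgi.errors' stream that every WSGI server puts in the environment; under mod_wsgi this normally ends up in the Apache error log. The middleware below is only a sketch under that assumption (Python 3); 'log_errors' and 'broken_app' are hypothetical names, and an in-memory stream stands in for the real log.

```python
import io
import traceback


def log_errors(app):
    """Middleware sketch: log uncaught exceptions to wsgi.errors, return a 500.

    Only catches errors raised while the app is called, not errors raised
    later while its result is being iterated.
    """
    def wrapped(environ, start_response):
        try:
            return app(environ, start_response)
        except Exception:
            errors = environ.get('wsgi.errors')  # server-provided error stream
            if errors is not None:
                errors.write(traceback.format_exc())
                errors.flush()
            start_response('500 Internal Server Error',
                           [('Content-Type', 'text/plain')])
            return [b'Internal Server Error']
    return wrapped


def broken_app(environ, start_response):
    """Hypothetical application that always fails."""
    raise ValueError('something went wrong')


application = log_errors(broken_app)

# Local try-out: an in-memory stream stands in for the server's error log.
fake_log = io.StringIO()
statuses = []


def fake_start_response(status, headers, exc_info=None):
    statuses.append(status)


result = application({'wsgi.errors': fake_log}, fake_start_response)
```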