olsner's blog: Perversions in Computer Science

2008-01-20

mod_rewrite Turing Complete

While setting up my blosxom blog installation to rewrite URL:s to pass them by the blosxom CGI script, I just set up a redirection loop with apache's mod_rewrite module - by accident - and realized that mod_rewrite really amounts to an implementation of a String rewriting system.

If one felt like it, this could be used to produce a turing-complete set of rewrite rules such that each accessed URL is a program , and the final URL is the out-data (or a link to a CGI-script that gives back the final output to the user, with suitable decoding). It's possible (trivial, even) to add a first rewrite rule that bootstraps the rewriting with an interpreter string. It's theoretically possible to have this be something like a Haskell interpreter - imagine the possibilities! - a dog-slow haskell interpreter implemented in an apache config file!

Here's a proof-of-concept implementation of a rewrite system framework in mod_rewrite. All that's missing is a set of rewrite rules to implement something interesting, like a 99-bottles-of-beer program or a Universal Turing Machine.


# This file should be included in an Apache2 config file for a VirtualHost.

# Enable rewriting
RewriteEngine on

# First off, set a sensible log level so that you can see what goes wrong
RewriteLogLevel 3
RewriteLog "/var/log/apache2/rewrite/rewrite.log"

# Hack to make actual files available if you know their names
RewriteCond /var/www/rewrite%{REQUEST_FILENAME} -f
RewriteRule ^.*$ - [L]

# This should only run on the first inputed string to add the interpreter.
# After each run of substitutions, we return to the first rule, so this must
# be safe and idempotent. So, we add the interpeter prefixed by '-' if the
# string does not start with '-'. (This could also be used for input programs
# to replace the interpreter if they like. We could also simply require input
# programs to come with their own interpreters.)
RewriteRule ^([^-].*)$ -INTERPRETER$1

# Add a comma to distinguish unchanged strings. This would be the termination
# condition for our rewrite system. Of course there are other ways to do this,
# for example a regexp which evaluates into the terminate-rule at the bottom.
# Doesn't need to be a comma, of course. If this character is part of the input
# alphabet, it could confuse some things though.
RewriteRule ^(.*)$ ,$1

# This is where the implementation of the entire string rewriting system would
# be. The general format is ,xyz -> xy'z - remove the first comma to mark the
# string as matched and changed (if there was a comma to start with), then
# change the string y to y' (with x and z left unchanged).
RewriteRule ^,?(.*)foo(.*)$ $1bar$2
# ... etc until your rewrite rules implement some turing-complete language, or
# a program that does something interesting.

# If the string has changed, we are not yet done. Loop to the beginning and
# remove the changed-marker. This doesn't change the string at all ('-' is a
# special substitution string that simple uses the original string), just has
# the [N] flag to trigger apache to restart rewriting.
RewriteRule ^[^,].*$ - [N]

# If the string is still unchanged, terminate and apply output-rewriting.

# First, remove the comma
RewriteRule ^,(.*)$ $1 
# Since the example string INTERPRETER is not actually interpreted in this
# example, remove it after applying the rewrite rules.
RewriteRule ^-INTERPRETER(.*)$ $1
# Finally, pass the resulting string to a simple CGI or PHP script to print it
# to the web browser. [L] means to abort rewriting here.
RewriteRule ^(.*)$ /print.php?$1 [L]

/home :: :: permalink :: rss

Testing Blosxom

Is this how easy it is? That's incredible!

Wonder how I get Haskell code in here though.

/home :: :: permalink :: rss