Back Home to Wombat

For a long time, Wombat was my preferred Vim color scheme, but about a year and a half ago, I had become bored with it. I wandered in the desert, looking for new color schemes. I turned to Obsidian, Jellybean, Twilight, Tomorrow and its variants, Solarized and its variants, and even Monokai—and they’re all good, except perhaps for Monokai—but at last, I’ve returned to Wombat.

Here’s an example of Wombat:

The Wombat theme in Vim.

The constituent colors work well together (basically variants of RGB, plus white and shades of gray), and it’s dark, so it’s easy on the eyes.

Maybe I’ll leave Wombat for good one day, but it’s just such a good color scheme. It also exists for other editors, although the color assignments are not exactly the same as with Vim.

Comments

What I’ve Been Up To

Haskell and Math

With a new child and new house, I’ve been quite busy these days. In the spare moments I have I’ve been learning Haskell for fun and profit. A few days ago I started toying with using Haskell to do Kleene logic. My inner math geek has been yearning to get out, and I’ll see where it takes me.

Typography

Also, as a typography geek, I’ve been fiddling around with TeX and friends. I’m amazed by the beautiful output that can be achieved by these technologies. I’m also considering trying my hand at creating an open-source font.

Comments

What if Your Factory Method Can’t Create Anything?

The Problem

I’ve come across a situation that had code that looked like this:

if (context.Request["cmd"] == "show-standard-stats")
    // show standard stats
else if (context.Request["cmd"] == "show-other-stats")
    // show the other stats
else if (context.Request["cmd"] == "show-some-other-stats")
    // show some other stats
else
    // don't show any stats but do something else altogether

Each of the commands does something similar with regard to displaying stats, while the code in the else block does something else entirely. I saw this code as a candidate for refactoring using the command pattern and the factory method pattern. The trouble is in the final else clause, because what’s executed there is arguably not a “show stats” command at all. My first idea was to do this:

IStatsCommand command = StatsCommandFactory.CreateCommand(context);

if (command != null)
    command.Execute();
else
    // don't show any stats but do something else altogether

A Proposed Solution

However, returning null from a factory method sounded weird to me, so I asked the StackOverflow community whether it was a good idea. Jon Skeet replied with,

I think it’s potentially reasonable for a factory method to return null in some situations, but not if it’s a method called CreateCommand. If it were GetCommand or FetchCommand, that might be okay… but a Create method should throw an exception on failure, I would suggest.

Fair enough. So I started going with that:

try
{
    IStatsCommand command = StatsCommandFactory.CreateCommand(context);
    command.Execute();
}
catch (CreationFailedException)
{
    // don't show any stats but do something else altogether
}

Another Solution

This is better, but adding in a trycatch made me uneasy because it makes the caller have to know about CreationFailedException. Maybe that’s OK, though. Another responder at StackOverflow responded that I could try this:

IStatsCommand command;
if (StatsCommandFactory.TryCreateCommand(context, out command))
{
    // creation succeeded ...
}

Now, “out” parameters make me skittish, but this looks to me like it might be a little better than surrounding the factory method call with trycatch. I’m not sure which solution I like better.

Comments

Never Mind, Pygments Is Better

Recently I wrote that I preferred Google Code Prettify for code highlighting. Well, after using it, I have decided to return to Pygments. It doesn’t rely on JavaScript and thereby get in the way of page speed. Plus, changing themes is not really that hard.

Comments

A Better Way to Handle Routes to Static Files

Yesterday at work I needed to figure out how to map an ASP.NET MVC route to a directory of static XML files. I came across this blog post about how to do it. As I implemented it, I realized that their example is wrong.

In their example, the work of writing to the ResponseStream is done in the GetHttpHandler method of their implementation of IRouteHandler. But notice what GetHttpHandler returns: an IHttpHandler. In their example, they do the writing to the stream, and then return null.

Is that right? They are essentially saying, “To handle this route, instantiate this class and get an IHttpHandler (which handles the HTTP request made with this route)—and we’ll always give you null.” That doesn’t make sense. So, the proper way to do it is to return an implementation of IHttpHandler which does the actual work of locating the static file and writing it to the ResponseStream, as follows:

public class StaticXmlFileRouteHandler : IRouteHandler
{
    private const string FilenameRouteElementName = "filename";

    public IHttpHandler GetHttpHandler(RequestContext requestContext)
    {
        var filename = requestContext.RouteData.Values[FilenameRouteElementName] as string;

        return new StaticXmlFileHttpHandler(filename);
    }
}

And the object returned is of this class:

public class StaticXmlFileHttpHandler : IHttpHandler
{
    private const string StaticXmlFilePhysicalDirectory = "~/dir-of-static-xml-files/";

    private const string XmlContentType = "application/xml";

    private readonly string filename;

    public StaticXmlFileHttpHandler(string filename)
    {
        this.filename = filename;
    }

    public void ProcessRequest(HttpContext context)
    {
        context.Response.Clear();

        string filepath = context.Server.MapPath(StaticXmlFilePhysicalDirectory + this.filename);

        if (!File.Exists(filepath))
            context.Response.StatusCode = (int)HttpStatusCode.NotFound;
        else
        {
            context.Response.ContentType = XmlContentType;
            context.Response.WriteFile(filepath);
        }

        context.Response.End();
    }

    public bool IsReusable
    {
        get { return false; }
    }
}

(The call to File.Exists should really come from an adapter so as to facilitate testing, but you already knew that.)

Comments

Safely Base-36-Encoding Integers

I’m working on a project at Ancestry.com which involves URLs with fairly long integers—that is, integers up to eight digits or more in length. To shorten these IDs, we’ve decided to base-36-encode them. So, for example, 17,012,00010 = A4MJK36. Now, the number of IDs that will be encoded is on the order of 108. What is the likelihood that at least one of those IDs encodes to something vulgar in English? See, for example, what 1,329,077 encodes to.

I cannot take the risk that this may happen. What, then, is to be done? Note that the digits in “normal” base-36 encoding are 0, 1, 2, …, 9, a, b, c, …, z. But suppose we used a different alphabet? What if we just removed the vowels, say? That would leave us with an alphabet of 31, not 36 characters, so we would then be base-31-encoding the integers instead of base-36-encoding them. Is that OK? Well, for my purposes, it is. Note that any base-31 representation is a valid base-36 one, too (for a different number), just as 10 is both a valid base-2 and base-10 representation—meaning 2 in base-2, and 10 in base-10.

In Python, the encoding and decoding algorithms are as follows:

ALPHABET = "0123456789bcdfghjklmnpqrstvwxyz"

def encode(n):
    if n == 0:
        return ALPHABET[0]

    # We're only dealing with nonnegative integers.
    if n < 0:
        raise Exception() # Raise a better exception than this in real life.

    result = ""

    while (n > 0):
        result = ALPHABET[n % len(ALPHABET)] + result
        n /= len(ALPHABET)

    return result

def decode(encoded):
    result = 0

    for i in range(len(encoded)):
        place_value = ALPHABET.index(encoded[i])
        result += place_value * (len(ALPHABET) ** (len(encoded) - i - 1))

    return result

This can never give vulgar English words, since all English words have vowels in them. At least, it’s good enough; I suppose dirty acronyms could still be made, but that’s obscure enough for me.

Note that the code above will encode with any alphabet of two or more symbols, as long as the symbols are distinct. So, an alphabet of "01" will give you the binary encoding, "012" the ternary, "0123456789abcdef" the hexadecimal, and so on. Of course, "-+" gives a binary encoding, just with different symbols than those normally used. Since capital letters are distinct from lower-case letters in ASCII, you can base-62-encode a number with "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ" if you wanted to.

Comments

My First Stab at a Useful Liquid Tag

I’ve experimented with a number of ways to display code on this site. I have come to prefer using Google Code Prettify because of its simplicity and elegance. Until recently, I would “drop down” into HTML from writing my note in Markdown, something like this:

Some markdown text...
<pre class="prettify"><code>IEnumerable&gt;string&lt; GetStrings()
{
    foreach (string str in this.strings)
        yield return str;
}
</code></pre>
Some more markdown text...

Note that I had to put the code right after the opening <pre><code>—it would preserve the line break if I didn’t. Also, if the code contained any characters special to HTML, like angle brackets, as my example above does, and as many languages do, I would have to escape those characters myself. But last night, dabbling in Liquid, I produced the {% code %} tag:

require 'cgi'

class CodeBlock < Liquid::Block
  def initialize(name, markup, tokens)
    super
  end

  def render(context)
    output = super
    source = "<pre class=\"prettyprint highlight\"><code>"
    source += CGI.escapeHTML(output.strip)
    source += "</code></pre>"
    source
  end
end

Liquid::Template.register_tag('code', CodeBlock)

This is my first stab at a Liquid tag. I want to add the ability to pass certain parameters to the tag, mostly to support the parameters that Google Code Prettify blocks can support.

Comments

The Iterable Stream

Sometimes I find myself reusing certain paradigms in code in different projects. Many of these paradigms are the standard design patterns. There is one paradigm I’m using in a project right now that I’ve noticed I’ve used in other projects before. I call this pattern the Iterable Stream.

The Pattern

Say you have a data file or even just a stream that contains multiple records. A common example of this would be a character-delimited file which has records separated by newlines.

Python makes this embarrassingly easy with its ability to iterate over the lines of a file:

with open("path/to/some/file", "rb") as record_file:
    data_records = map(lambda line: DataRecord(line), record_file)
    for record in data_records:
        # Do something interesting with each record here.

In cases such as that described above, when using C#, I often expose the iterable records through a property:

public class SomeRecordSet
{
    private readonly StreamReader reader;

    public SomeRecordSet(StreamReader reader)
    {
        this.reader = reader;
    }

    public IEnumerable<DataRecord> Records
    {
        get
        {
            while (!this.reader.EndOfStream)
                yield return new DataRecord(this.reader.ReadLine());
        }
    }
}

This encapsulates the dirty work of producing records from the stream and allows you to do this:

using (var reader = new StreamReader(File.OpenRead("path/to/some/file")))
{
    var recordSet = new SomeRecordSet(reader);

    foreach (DataRecord record in recordSet.Records)
    {
        // Do something interesting with the record here.
    }
}

Though not as fancy as Python, using C#’s yield return syntax lets you easily make a stream iterable.

An Alternative

Alternatively, instead of exposing the iterable records through a property on SomeRecordSet, you could make SomeRecordSet itself implement IEnumerable<DataRecord>. Admittedly, this is a little more complicated, but not too much:

public class SomeRecordSet : IEnumerable<DataRecord>
{
    private readonly SomeRecordSetEnumerator enumerator;

    public SomeRecordSet(StreamReader reader)
    {
        this.enumerator = new SomeRecordSetEnumerator(reader);
    }

    public IEnumerator<DataRecord> GetEnumerator()
    {
        return this.enumerator;
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return this.GetEnumerator();
    }

    private class SomeRecordSetEnumerator : IEnumerator<DataRecord>
    {
        private readonly StreamReader reader;

        public SomeRecordSetEnumerator(StreamReader reader)
        {
            this.reader = reader;
        }

        public void Dispose()
        {
        }

        public bool MoveNext()
        {
            string line = this.reader.ReadLine();

            if (line == null)
                return false;

            this.Current = new DataRecord(line);

            return true;
        }

        public void Reset()
        {
            throw new NotImplementedException();
        }

        public DataRecord Current { get; private set; }

        object IEnumerator.Current
        {
            get { return this.Current; }
        }
    }
}

Then you can just iterate over the records in the object, like this:

using (var reader = new StreamReader(File.OpenRead("path/to/some/file")))
{
    var recordSet = new SomeRecordSet(reader);

    foreach (DataRecord record in recordSet)
    {
        // Do something interesting with the record here.
    }
}

Conclusion

We’ve been using file streams here, but it should be clear that this pattern could be used for most any stream. Making use of C#’s yield return syntax or implementing the IEnumerable<T> interface can make a data stream iterable and hide the ugly details of doing so.

Comments

FeedBurner’s Missing API

I use FeedBurner for this site’s feed. I like what it tries to do, though I am a little concerned about possible neglect on Google’s part. Normally, FeedBurner checks each feed every thirty minutes to see whether there is new content. This is, of course, much too slow for those of us who want real-time publishing of items to Twitter or Facebook or some other social network.

There is a form at the FeedBurner site that you can use to ping FeedBurner manually. Incidentally, that page refers to the “Ping and Extended Ping XML-RPC API” which is apparently defunct. Using the form makes a GET request to /fb/a/pingSubmit, whereupon the result of the request is displayed—that is, whether it succeeded or was throttled or failed.

That’s nice for human use, but not so much for automation. So, I’ve wrapped that service in a REST endpoint to which you can make a POST (not a GET), like this (in Ruby):

endpoint = URI('http://feedburner-pinger.herokuapp.com/')
feed_url = 'http://feeds.feedburner.com/MyFeed'
response = Net::HTTP.post_form endpoint, {:url => feed_url}

The response, if successful, will look like this:

{
  "status": "SUCCEEDED",
  "message": "Successfully pinged"
}

To publish this site, I have a Rake task that uses this API to ping FeedBurner. More details of the API are at the service endpoint.

Comments

I’ve Changed My Mind (Somewhat)

You will notice social sharing buttons and a comment section at the end of this note. I have gone back somewhat on what I earlier said about the purpose of this site and these notes. I believe that the original model I had—that is, notes without the possibility of comment or easy sharing—is a little too detached for my tastes. Therefore, I have added these elements. I still do intend for these notes to be, on average, more substantive than in my previous blog, and, being a fan of simplicity, I do intend to keep this site rather simple. I am in the middle of an experiment of how I want this web site to be.

Comments

Introducing Mvc.Jsonp

I’ve recently been doing some JSONP-related stuff with ASP.NET MVC 3, and I decided to make it into an open-source library. It provides a JsonpResult which is a subclass of MVC’s JsonResult, as well as an abstract JsonpControllerBase, from which you can subclass your controllers, which provides a Jsonp method (and various overloads) which return a JsonpResult.

So, let’s say you have a controller which you want to return JSONP. You could use the library like this:

public class StarshipController : JsonpControllerBase
{
    public JsonpResult Index(string callback = "myCallbackFunction")
    {
        var starships = new List<string>
        {
            "Enterprise", "Excalibur", "Yorktown", "Oberth", "Hood", "Reliant"
        };

        return this.Jsonp(starships, callback, JsonRequestBehavior.AllowGet);
    }
}

This will give you a response of application/javascript with this content:

myCallbackFunction(["Enterprise","Excalibur","Yorktown","Oberth","Hood","Reliant"]);

You can grab it at GitHub or get NuGet binaries.

Comments

The Spatial Theory of Voting

Introduction

I have always been fascinated by the idea of simulating social phenomena. In my youth I would spend hours playing President Elect, Balance of Power, SuperPower, and other games that purported to simulate political and economic phenomena. I have recently come across a mathematical model of voting, called the spatial theory of voting, which I shall discuss here.

Ideological Spaces

The One-Dimensional “Political Spectrum”

We often speak in terms of the political spectrum, which we think of as a continuum along which we may identify a person’s ideology. On the right, we have right-wing ideology, and on the left we have left-wing ideology, and in the middle, we have centrist ideology. This continuum has certain properties. For example, we hear things like, “he’s to the left of Ted Kennedy,” or, “she’s more conservative than Ronald Reagan.” This implies that on this continuum we may rank ideologies as being “greater” or “lesser” than others, or more “rightward” or “leftward” than others—that is, given two ideologies \( a \) and \( b \), \( a \neq b \), we can say that \( a > b \) or \( a < b \)—without making any normative judgment, though people often use such rankings to make normative judgments.

Two-Dimensional Nolan Space

Careful thought, however, reveals that the arrangement of ideologies might be more complicated than placement on a one-dimensional continuum. This is the motivation of the Nolan chart, which places ideological positions in a two-dimensional space which I call Nolan space. One axis represents a person’s economic freedom “score,” while the other represents the person’s personal freedom “score.” Regions of the Nolan space correspond to general ideologies.

The General Case: \( n \)-Dimensional Space

There is no reason we cannot measure a person’s ideology with respect to more than two measures. We can therefore work with higher-dimensional spaces, though visualization may be more difficult for four or more dimensions.

A person’s ideology is therefore an \( n \)-vector \( \mathbf{p} \) in the ideological space represented by \( \mathbb{R}^n \). The Euclidean distance between two vectors \( d(\mathbf{p}_1, \mathbf{p}_2) = \sqrt{(\mathbf{p}_1 - \mathbf{p}_2)^\top (\mathbf{p}_1 - \mathbf{p}_2)} \) is the “distance” between two ideologies and can be a measure of the similarity of two ideologies. A weighting matrix \( \mathsf{W} \) can be added to produce a weighted distance \[ d(\mathbf{p}_1, \mathbf{p}_2, \mathsf{W}) = \sqrt{(\mathbf{p}_1 - \mathbf{p}_2)^\top \mathsf{W} (\mathbf{p}_1 - \mathbf{p}_2)}, \] which allows us to specify the importance to which an elector gives each element of the elector’s ideological preference. When \( \mathsf{W} = \mathsf{I} \), then the distance reduces to the un-weighted Euclidean distance.

For the purposes of this discussion, we will restrict ourselves without loss of generality to Nolan space.

Elections

Electors and Candidates

An elector (or voter) is a rational agent that has a “bliss point” \( \mathbf{p} \) in the ideological space that represents that elector’s ideological preferences. A candidate is a person who is up for election by the electors. A candidate has an ideology \( \mathbf{c} \) in the ideological space. Let \( C = \{ \mathbf{c}_1, \mathbf{c}_2, \dots, \mathbf{c}_i \} \) be the set of candidates, represented by their ideologies, in a particular election. An elector with ideological preferences \( \mathbf{p}_i \) has a utility \( u_{\mathbf{p}_i}(\mathbf{c}_j) \) of voting for a particular candidate with ideology \( \mathbf{c}_j \). This utility function \( u : \mathbb{R}^n \to \mathbb{R} \) is monotonically increasing as the Euclidean distance \( d \) decreases. A rational, ideological elector will vote for that candidate whose ideological position will give that elector maximum utility: \[ \mathbf{c}_{\mathbf{p}_i} = \arg \max_{\mathbf{c} \in C} \, u_{\mathbf{p}_i}(\mathbf{c}). \] Note that if \( d = 0 \), then \( \mathbf{p}_i = \mathbf{c}_j \), and \( u_{\mathbf{p}_i}(\mathbf{c}_j) \) is at a maximum. Thus there would be no other candidate than that represented by \( \mathbf{c}_j \) that gives the elector more utility. This implies that, in elections in which candidates are also electors, rational candidates should always vote for themselves.

The Electoral Zone

With the foregoing utility function, we see that each candidate will command an electoral zone in which each elector whose ideological preference is in the zone will have maximum utility of voting for that candidate. These zones are, in fact, Voronoi tessellations of the ideological space. For first-past-the-post elections, that candidate whose electoral zone encompasses the most ideological positions of the electors is the winner of the election.


Figure 1. A three-candidate election in Nolan space and each candidate's electoral zone. In this election, candidate A might be a center-right candidate, candidate B a center-left candidate, and candidate C a classical liberal candidate.

Distribution of Electors

The distribution of electors in the ideological space is probably not uniform, especially when the number of electors is large. In American political parlance, there are some “red states” and some “blue states,” and within states, there are “red” and “blue” counties and townships. This implies that we could treat the distribution of electors as a probability distribution. In a particular election, the electorate could be distributed multivariate-normally or some other way.

Economists Antonio Merlo and Aureo de Paula at the University of Pennsylvania, in their paper ”Identification and Estimation of Preference Distributions When Voters are Ideological,” assert that by studying election results, one can estimate the ideological distribution of voters in an election. But suppose one already knows (or suspects) the distribution of the electorate. This spatial model would make it possible to predict—or at least simulate—electoral outcomes. This would be a matter of creating a Voronoi tessellation of the ideological space and “counting” the number of electors in each electoral zone, according to the distribution of electors.

Campaigning

Viewed in the light of the spatial model, campaigning and electioneering can be seen as the efforts of candidates to maximize the number of voters in their electoral zones. This can be achieved either through convincing the electorate to move their ideological bliss points into the candidate’s electoral zone, or by moving the candidate’s ideological position so as to acquire a greater number of electors, which has echoes of the median voter theorem.

Is This Worth Anything?

This model of voting is all nice and neat, but is it actually useful in describing reality? Let’s revisit some of its assumptions.

  • Electors are ideological. Do electors always vote ideologically? Might they not vote for other reasons? Is it realistic to have a utility function that numerically describes an elector’s utility of choosing a particular candidate? Are electors’ utilities really cardinal and not just ordinal? Isn’t value subjective? Is it valid to compare two electors’ subjective utilities? Is it realistic to assign numerical values to ideologies and place them in a metric space?

  • Electors have good information about their own ideological preferences and those of the candidates. Can each elector accurately “calculate” the elector’s distance in the ideological space to each candidate, or at least to the nearest candidates?

  • All electors vote. This is clearly false in the real world. Could we modify this model to capture the fact that many electors do not vote? Would it matter if we could?

Conclusion

This model may be predictively good in certain electoral situations, and it may be helpful in determining voter distributions, so long as the assumptions hold. But voting is primarily a matter of human action, which is notoriously difficult to predict accurately. But for less-serious, simulative purposes, this model might be useful for quasi-realistic prediction and simulation, such as for political simulation games.

Comments

Transition to New Site

I have for some time wanted to have my own web presence—a place on the Internet which could serve as the center of my “geek life,” or even the center of my larger life. I have had a blog for some years now, and I have moved between WordPress and Blogger during that time, in addition to flirting with Posterous and Tumblr. None of those services gave me the total control I wanted, and besides, a blog is only a part of one’s web presence anyway. I have experimented with GitHub Pages and Jekyll (and Markdown) and found those technologies to be better suited to what I want to accomplish.

I won’t bother migrating my previous blog content to this site; I intend for these notes to be, on the whole, larger and more substantive than my previous blog content. In fact, the reader will notice that these notes are, for one thing, called “notes,” not “blog posts,” and for another, there is no place to comment on these notes. This is entirely by design, since what I intend for these notes is not entirely to serve the purpose of blogging, but rather as a place for essays. For shorter, more immediate posting, there are Twitter (which is, after all, a microblogging service) and Google+.

The last post on my previous blog contains a brief message and a link to this new site. Because I intend for the notes on this site to be generally larger and more substantive than my previous blog content, they will likely be a bit more infrequent.

Comments