'Optimal Movie Seating' Puzzle

June 24, 2007 at 06:18 PM | Facebook | View Comments

Well, it is about time that I start posting, so here goes. The other day, Dave, a few friends, and I (Ted) decided to try to solve the Facebook Optimal Movie Seating Puzzle, which is based on an XKCD comic.

The gist of the problem was that you are seating 16 people in a movie theater under a series of rules that assign points for different arrangements, such as seating friends beside each other. You want to find an optimal arrangement that maximizes the 'social enjoyment'. Now, with 16 people there are 16! = 20 922 789 888 000 permutations -- far too many possibilities to test each one on a normal consumer-grade computer.

To tackle the problem, Dave started writing a client-server scheduler to split up the work among multiple machines for a cluster-computing effect; I'd guess he will write up a post on it soon. Meanwhile, I began looking at ways to quickly evaluate the 'social enjoyment' score and at how to narrow down the solution space so we wouldn't have to actually test all twenty-one trillion possibilities.

  • To handle the issue of finding scores, we first agreed on an interpretation of the rules in which people had to be adjacent for a rule to take effect. Some of the rules involved pairs of people and others involved triples, so we generated a lookup table at the start that assigned a score to each sequence of three people. Using the lookup table, we could build up the partial score of a partial seating as we permuted the possible seatings. This still involves checking every possible seating, albeit much faster than the first version.
  • To cut down the solution space, we looked at the maximum value that appeared in the lookup table, which was 4 for us. If only n people are left to be seated, the partial score can improve by at most 4n, so we can often know, after seating only some of the people, that no matter how we arrange the unseated people we will never beat our current best, and skip all of those combinations. This simple heuristic made a huge difference in speed (I think the factor was 10x, but I'm just making that up off the top of my head).
  • Determined to get it even faster, we looked into our lookup table some more. We realized that all the cases worth 4 points were disjoint, so only one of them could occur in a given seating; once a 4-point arrangement was used, the highest remaining value was 3 points. We updated our bound from 4n to 3(n-1)+4, which gave the code a much better early estimate of when a partial seating would not be able to outperform the current best score (a rough sketch of the scoring and pruning follows this list). This sped up the code by another factor of 10x or so (again, I'm making up numbers since I don't remember and our measurements were hardly scientific).
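
For the curious, here is a rough Python sketch of the lookup-table scoring and the 3(n-1)+4 pruning bound described above. It is not our actual code -- the names and the recursive structure are just for illustration -- but it shows the idea:

def best_seating(people, triple_score, max_triple=4, second_max=3):
    """Exhaustive search with pruning.

    triple_score maps each ordered triple of adjacent people to the points it
    earns; max_triple is the largest value in the table (4 for us), and
    second_max is the largest value outside the disjoint 4-point cases (3).
    """
    best_score, best_order = float('-inf'), None

    def extend(seating, score):
        nonlocal best_score, best_order
        n = len(people) - len(seating)  # people still to be seated
        if n == 0:
            if score > best_score:
                best_score, best_order = score, list(seating)
            return
        # At most one of the remaining windows can score max_triple and the
        # rest at most second_max, so give up if even that can't beat the best.
        if score + max_triple + second_max * (n - 1) <= best_score:
            return
        for p in people:
            if p in seating:
                continue
            seating.append(p)
            gain = triple_score.get(tuple(seating[-3:]), 0) if len(seating) >= 3 else 0
            extend(seating, score + gain)
            seating.pop()

    extend([], 0)
    return best_score, best_order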

We decided we should let it run through the full set and turned on the compiler optimizer for another 2x speedup. In the end, our runtime estimates went from on the order of 1000 CPU-hours with the first version to under 5 CPU-hours. This really stresses the benefit of optimizing your algorithm versus optimizing the code itself.

I'm sure this problem could be solved much much quicker and we may make another attempt at improving our method.

EDIT: It seems this wasn't clear (which was my fault), but as pointless as it may be, we were aiming for a solution that was guaranteed to cover all options so it couldn't get stuck at a local optimum (which for this problem was pretty much a non-issue anyway). There are certainly far faster and more practical probabilistic approaches, as people have been pointing out.

Permalink + Comments

Sorting out Sorting

June 20, 2007 at 10:34 AM | Play | View Comments

Back in grade 10, my high school computer science teacher showed us a video called Sorting out Sorting. Produced at U of T in 1981,1 it shows how different sorting algorithms work and compares their runtimes.

It was very exciting to see it again this morning! Check out Sorting out Sorting on YouTube.

1: http://www.cs.princeton.edu/~ah/alg_anim/animation/bibliography3_8.html#SOS

Permalink + Comments

Fixing Files with Vim Macros

June 15, 2007 at 10:51 PM | Vim | View Comments

I got an email a couple days ago asking me to add a small header to all the pages of a website I work on from time to time. The site was built with static HTML a couple of years ago, and this was the first site-wide modification I've had to make since then. So I fired up Vim, loaded the session I use to work on the page and got to work. It didn't take too long to make the changes to one page -- add a couple of JavaScript functions, add a call to the body's onLoad property, and finally a little div just below that.

But then came the problem. How do I apply this change to all the pages? Normally, to do multi-file search-and-replace, I use perl -pi -e 's/search/replace/g' `find . -name '*.html'`... But, in this case, I would have to run that three times (one for each block of text)... And more importantly I want to see the file before I edit it (because some of the HTML files don't use the standard template, so they would get horribly mangled).1

So how did I do it? By recording a Vim macro! I used gvim *.html to open up all the HTML files, then qa to record all my keystrokes into register a.

As I made the changes, I used general steps so that they could be repeated on each file. For instance, I used /onLoad to search for the onLoad property of the body, then f=la to find the =, move one character to the right then insert my text. When that was done, I opened up the line right below the body tag with o and put in the div tag that I needed.

Finally, after saving the file, I used :bn to move to the next buffer and hit q to stop the recording. I could now look at the new file to see if it needed to be changed, then change it with @a or move on to the next buffer with :bn.
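
For reference, the keystrokes I recorded went roughly like this (the function name and the div contents are placeholders here, not the actual header I added):

qa                              start recording into register a
/onLoad<Enter>                  jump to the body's onLoad attribute
f=l                             find the =, move one character right
ashowHeader();<Esc>             append the call (placeholder name)
o<div id="header"></div><Esc>   open a line below the body tag and add the div
:w<Enter>                       save the file
:bn<Enter>                      switch to the next buffer
q                               stop recording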

Now, this makes me curious... Faced with a similar situation, how would you solve a problem like this?

1: Now, in retrospect, I realize I could have used:

for FILE in `find . -name '*.html'`; do
  read -p "Edit $FILE? " R
  if [[ $R == "y" ]]; then
     perl -pi -e 's/old/new/' "$FILE"
  fi
done

But that would assume that I was putting some serious level of thought into this. Also, unless I also had the presence of mind to add cp $FILE{,.backup}, I would not have any way to undo my changes.

Permalink + Comments

Amazon S3 (or, How I Solved My Image Hosting Woes)

June 12, 2007 at 12:19 PM | Python, Play | View Comments

A couple months ago, I read Jeff Atwood's article on Using Amazon S3 as an Image Hosting Service, but because my favicon does not generate 27 GBs of traffic a month, I didn't think much of it.

(If you're scratching your head wondering what Amazon's S3 is, I'll do my best to explain. It stands for Simple Storage Service, and it is just that: a simple service that allows you to store data. That data (or "those files", if you wish) can be world-readable (from a web browser) or private (many people are using it for backup). It's targeted at developers (Amazon only provides a set of APIs -- all GUIs are 3rd party) and it is dirt cheap: only $0.15 a gigabyte.)

Then, a few weeks ago, I was taking some pictures of my sister's soccer practice and realized that I needed some place to put them. After looking at a couple of cheap hosting plans, I decided that I didn't like them; $5 a month isn't much, but that's still $60 a year for a whole lot of space I won't be using. Then I remembered S3.

After a night of playing around, I figured it should be pretty simple: write a script that goes through an HTML file looking for images, uploads them one at a time, and replaces the links as it goes. html2s3 was born.
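
The heart of it ends up being something like this simplified sketch (not the real script -- the regex is naive and upload_image is a stand-in for the actual S3 upload call):

import re

IMG_SRC = re.compile(r'(<img[^>]*\bsrc=")([^"]+)(")', re.IGNORECASE)

def rewrite_images(html, upload_image):
    """Replace each local <img> src with the URL returned by upload_image.

    upload_image(path) should put the file in the bucket and return the
    public URL of the uploaded object.
    """
    def replace(match):
        path = match.group(2)
        if path.startswith(('http://', 'https://')):
            return match.group(0)  # already remote; leave it alone
        return match.group(1) + upload_image(path) + match.group(3)

    return IMG_SRC.sub(replace, html)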

It turns out that the S3 library provided by Amazon is very easy to use, with most of my time spent correcting file paths. A couple of neat bits:


from optparse import OptionParser

parser = OptionParser(usage="usage: %prog [options] [FILES]\nWill process ...")
parser.add_option("-b", "--bucket", action="store", dest="bucket", help="...")
parser.add_option("-k", "--key-prefix", dest="remote_base_path", help="Prefix key ...")
parser.add_option("-n", "--no-backup", dest="no_backup", action="store_true", help="...")
(options, args) = parser.parse_args()
 

I had seen OptionParser before, but never used it. It's so simple, though, that I think I'm going to use it quite a bit more frequently.


source = args or sys.stdin.xreadlines()  # fall back to stdin when no files are given
for line in source:
    process_entry(line)
 

One thing I loved about C was the ability to use assignment in loops (while (fgets(buf, sizeof(buf), stdin) != NULL)), but stdin.xreadlines() does the same sort of thing.

Update: When I moved the script over to my server, it started telling me that I didn't have permission to list the bucket. So I started digging into it a bit and found out that I was getting the error RequestTimeTooSkewed. It turns out that I forgot to turn my server's clock back last daylight savings... D'oh!

Permalink + Comments