Well, it is about time that I start posting so here it goes. The other day, Dave, me (Ted) and a few friends decided to try to solve the Facebook Optimal Movie Seating Puzzle based off of an XKCD comic.
The gist of problem was that you are seating 16 people in a movie theater with a series of rules that assign points for different arrangements such as seating friends beside each other. You want to find an optimal arrangement that maximizes the 'social enjoyment'. Now with 16 people there are 20 922 789 888 000 permutations. This is a fairly large number number of possibilities to test each one on a normal consumer grade computer.
To tackle the problem Dave started writing a client-server scheduler to split up the work among multiple machines for a cluster-computing effect. I'd guess he will write up a post on it soon. Meanwhile, I began to tackle looking at ways to quickly evaluate the 'social enjoyment score' and also figure out how to narrow down the solution space so we didn't have to actually test all twenty-one trillion possibilities.
- To handle the issue of finding scores we first agreed on an interpretation of the rules in which people had to be adjacent for the rules to take effect. Some of the rules involved pairs of people and other rules involved triples. We addressed this by generating a lookup table at the start that assigned a score for each sequence of three people. Using the lookup table we would find the partial scores of partial seatings as we permuted the possible seatings. Now this still involves checking every possible seating, albeit much faster than the first version.
- To cut down the solution space we looked at the maximum value that appeared in the lookup table which was 4 for us. If only n people are left to be seating, we know at best our partial score can only improve by 4n and thus can know after seating only some of the people that no mater how we arange the unseated people, this will never beat our current best so we don't need to even try all those combinations. This simple heuristic made a huge difference in speeds (I think the factor was 10x but I'm just making that up off the top of my head).
- Determined to get it even faster we looked into our lookup table some more. We realized that all the cases that had 4 points were disjoint and that only one of the could be true in a given seating. Once one of the 4 point arrangements was used the highest value was now 3 points. We updated our heuristic from 4n to 3(n-1)+4 which gave the code a much better early estimate on when a partial seating arrangement would not be able to outperform the current best score. This sped up code by another factor of 10x or so (again I'm making up numbers since i don't remember and our measurements were hardly scientific).
We decided we should let it run through the full set and turned on the compiler optimizer for another 2x faster. In the end our our runtime estimates went from the order of 1000 cpu-hours with the first version to under 5 cpu-hours in the end. This really stresses the benefit of optimizing your algorithm versus optimizing the code itself.
I'm sure this problem could be solved much much quicker and we may make another attempt at improving our method.
EDIT: It seems this wasn't clear (which was my fault), but as pointless as it may be, we were aiming for a solution that ensured it covered all options in case of local minima (which for this problem was pretty much a non-issue anyways). There are certainly far faster and more practical probabilistic ways as people have been pointing out.