Argh! My brain!

July 30, 2007 at 11:12 PM | Fixed-it | View Comments

For the summer I'm working on adding features to, cleaning up, and making algorithms more generic in a academic research tool. Overall the algorithms and data structures are well planned out and have been generally quite friendly to improvement. As part of my work, I'm integrating derivative versions created by people in their research so it isn't surprising when certain things are hacked together just to get them working, but sometimes it is a little overkill.

The fun example I've found that convinced me to write up this post was using a FIVE dimensional dynamic array to store some information. While I can see some reasons for it, there was quite a shock when i noticed the typecast to (short*****). Here is the code if you dare! (I replaced the variable name with a much shorter one)

  1. foo = (short *****) my_malloc ((nx+1)*sizeof(short ****));
  2. for (i = 0; i < nx+1; i++){
  3.   foo[i] = (short ****) my_malloc ((ny+1)*sizeof(short ***));
  4.   for (j = 0; j < ny+1; j++){
  5.     sblock_pattern_init_mux_lookup[i][j] = (short ***) my_malloc (4*sizeof(short **));
  6.     for (from_side = 0; from_side < 4; from_side++){
  7.       foo[i][j][from_side] = (short **) my_malloc (4*sizeof(short *));
  8.       for (to_side = 0; to_side < 4; to_side++){
  9.         foo[i][j][from_side][to_side] = (short *) my_malloc (nodes_per_chan*sizeof(short));
  10.         for (itrack = 0; itrack < nodes_per_chan; itrack++){
  11.           foo[i][j][from_side][to_side][itrack] = UN_SET;
  12.         }
  13.       }
  14.     }
  15.   }
  16. }

I really don't believe this was the best answer and I can hopefully cut this out.

EDIT: Apparently there were so many *'s it broke the blog software...

Permalink + Comments

Problem E - Enhancing IPSC Rules: Rock Paper Scissors with PNGs

July 21, 2007 at 11:33 PM | Play | View Comments
Rock paper scissors in PNG form
Rock paper scissors in PNG form

At this point, it seems like the Facebook problems have started to go down hill. Even when Ted and I combine our attention spans, we just can't quite make it through the descriptions. To make matters worse, the only short problem left, Easy Puzzle Mountain, isn't actually all that easy...

So what ever will we do on a lovely Saturday night? Well, why not revisit an older problem. This time, it is courtesy of the IPSC1: Problem E - Enhancing IPSC Rules.

The problem is simple: the input is a JPEG with many pairs of rocks, papers and scissors. Each pair represents one round of Rock Paper Scissors. You must figure out who won each round (and ultimately the game).

As normal for the IPSC, it was possible to solve the simple data set by hand -- each image had about 10 rounds each. The hard data set, on the other hand... Well, you can see for yourself.

10% of one hard dataset. There are 22 sets in total.

We quickly decided that it would be silly to try and examine the shape of each object, so we decided to focus on the color.

Looking at the color, it's easy to tell the difference between the rocks and everything else: they are really dark, everything else is really bright. We called this the luminosity2, and it is simple to obtain: simply add the red, green and blue channels of each pixel3.

The next problem we had was telling the difference between paper and scissors. Their "luminosity" is similar, and probably not enough to make a sure match. Fortunately, the paper was fairly bland and the scissors were fairly colorful. We came up with a metric called "color delta" to measure this. The formula we used simple: color_delta = abs(red - green) + abs(red - blue) + abs(blue - green) (where abs(x) is the absolute value of x).

Alright, great. So now we have a method of distinguishing between rocks, papers and scissors... But how do I read the file? I have never worked with graphics before. The Python graphics libraries seemed intimidating, and I'll be darned if I'm also going to decode the PNGs.

Fortunately, I had the presence of mind to remember the PNM format. The Wikipedia page has more information, but all that's important for now is that the P stands for Portable, and this portability is due to the fact that it uses flat ASCII to describe an image. Nothing could be easier to parse.


for FILE in *.png
do
    convert $FILE{,.pnm}
done
 

After this, the code came surprisingly quickly... And, short of an incredibly stupid bug (I used something along the lines of side = count / width % 2 to figure out which side of the image was being processed), it even worked! I was able to plot the output (which was the luminosity, color delta pairs) in gnuplot and see a strong correlation.

To figure out what each point was, I simply opened up one of the files in Photoshop and cut out all the items but the scissors, then all the items but papers. After adding these to the plot, it was simple to tell which group was which.

The final plot
The final plot. Red points are data points, green points are papers from the training set and blue points are scissors from the training set.

And that is it :-)

Code and sample data can be found at /code/e/.

1: On a side note, the IPSC is a great programming/problem solving contest. I have written it for the past three of four years, and every time it's a lot of fun.

2: Yes, I realize that this is completely incorrect. Brightness would be much more appropriate.

3: In actuality, I ended up using (255*3)-(r+g+b) so that white (empty) pixels would have a value of 0.

Permalink + Comments

Moving to S9

July 07, 2007 at 10:59 PM | Uncategorized | View Comments

As you have probably noticed, the blog looks a little different now. I've switched engines from BBlog to S9. BBlog was decent, but not under active development, missing a few crucial features (for instance, comment/post preview) and had a few minor bugs. Since I don't have time to fix it1, my friend Pete suggested Serendipity.

Serendipity has a large user base, lots of plugins and is generally fairly friendly. My only complaint so far is that it runs a little slowly... But that's what I get for running my server on a P2 233.

Please let me know if anything breaks.

1: And by "don't have time to fix it" I actually mean "don't want to hack any more PHP"

Permalink + Comments

Solving Gridflip

July 05, 2007 at 05:45 PM | Play, Facebook | View Comments

Ted and I had so much fun solving the last Facebook problem that we decided to do another this weekend: gridflip.

The problem is simple: you are given a grid of numbers, some positive and some negative. The goal is to minimize the sum of the grid, while keeping the sums of the individual rows and columns positive. The only operation that can be performed is flipping the sign of the individual rows of columns. For instance, given the grid:

 [  2  -3 ] -1
 [ -3   5 ]  2
   -1   2    2

A flip to the first row and the first column would result in the grid:

 [ -2   3 ]  1
 [ -3   5 ]  2 # After flipping the first row
   -5   8    6

 [  2   3 ]  5
 [  3   5 ]  8 # After flipping the first column
    5   8   26

This is the best solution for this grid, because it is the only solution where the sums of all the rows and columns are positive.

The solution we came up with for this problem was almost as simple: for every possible combination of flipped/not flipped columns1, make sure that every row is positive. If, after making the sum of every row positive, the sum of every column is positive, the grid is a contender for the smallest sum.

Now, there is one small problem: rows which have a sum of zero. If a row's sum is zero, a flip won't change the value of the row, but it will change the value of the columns. The obvious solution is simply to keep track of which rows have a sum of zero, then try flipping them... And that's exactly what we did.

Now, the biggest downside to this algorithm is that it runs in O(2n) where n is the shortest side of the grid... But in practice, even my less-than-optimal Python code could solve the "large" data set in less than three seconds (Ted's C# code was almost instant).

It makes me wonder, though, is there any way to speed this up?

See /code/facebook/gridflip/ for source code and test grids.

1: Columns for simplicity -- the actual code makes sure there are fewer columns than rows by transposing the grid.

Edit: So, it appears that I can't add. I've fixed up the example grids so that they actually make sense.

Edit by Ted: I'd like to try to clarify our algorithm a little since the explanation is crappy. Let's say we have a 10 x 20 grid. First we chose the smallest dimension which in this example is 10. Then we enumerate all possible ways of the 10 rows which is 2^10 = 1024 ways. For each of those 1024 ways, we look at the 20 columns and update their sums. If any sum is < 0 we flip the column and now its sum is positive. This allows us to directly solve the pattern the columns need to be flipped in that can give a valid solution.
The only issue though is when the sum is zero. In this case we don't know whether to flip or not. Maybe out of the 20 columns, 3 will sum to zero (very few sum to zero and it does vary based on permutations but stays quite low) so that is 2^3 which is 8 possibilities. In the end the sign of the sums is checked and it is determined if we are the max sum. Only a few thousand possibilities and the operations during the iteration are quite simple (adding a few numbers and so if-else's). This results in a very fast runtime.

Permalink + Comments

Breaking News: Nuclear Bombs are Dangerous!

July 03, 2007 at 08:19 PM | Uncategorized | View Comments

What is the breaking news in at least three of Toronto's newspapers today?

Dirty bomb would cause panic, cost billions: Study

Because there are those nice, clean bombs that generate revenue and increase public calm?

Well, this news made the front page of The Globe and Mail, The Metro and maybe even The Toronto Star.

Now, to be fair, they do go on to mention that there have been a number of lost or stolen "radioactive items"... But is this REALLY worthy of front-page news?

Permalink + Comments