Python Brain-Teaser

May 29, 2008 at 11:14 AM | Python, Play | View Comments

I'm working on pulling some functionality out of one object and putting it in another, and I came across this interesting problem:

class Foo:
    me = "foo"
 
class Bar:
    me = "bar"
    def get_me(self):
        return self.me
 
Foo.get_me = Bar.get_me
x = Foo()
 
print x.get_me()


What does this print?

And, next question, why is that?

After lunch I'll post my thoughts :-)


Until now, I had assumed that the semi-magic self variable was set on method calls (ie, when x.get_me() is called)... But apparently it's set when the object is instantiated (which makes perfect sense, otherwise getattr(Bar(), 'get_me')() would not work).

So I can only presume that something equivalent to this happens when an object is instantiated:

from functools import partial
new_obj = Class()
for (key, val) in new_obj.__dict__.items():
    if not callable(val): continue
    setattr(new_obj, key, partial(val, new_obj))
Permalink + Comments

SSH Connection Sharing

May 17, 2008 at 09:35 AM | Play | View Comments

This tip was originally posted to the python-dev mailing list, so I can't take one scrap of credit for it.

Basically, since OpenSSH4, there has been an option to share connections -- that is, once you've opened one connection to a host, every subsequent connection is tunneled through the same channel, completely removing the overhead of authentication!

It's quite simple, just add this to ~/.ssh/config:

ControlMaster auto
ControlPath ~/.ssh/.%r@%h:%p

There can be problems if your machine crashes and that file is left lying around... So adding this to your crontab will fix that:

@reboot rm -f .ssh/controls/*

Next time: tab completion with scp!

Permalink + Comments

SSH and HTTPS on the same port?!

April 11, 2008 at 05:38 PM | Python, Play | View Comments

A friend of mine was over the other night, and we were talking about how much we like running SSH on port 443. It gets you around all but the most insane firewalls, and to anyone but the keenest of observers it just looks like HTTPS traffic.

But running SSH over port 443 makes it particularly difficult to run a secure web server as well.

So what can be done?

Well, it turns out that when an SSH client initiates a connection, it waits for the server to send a banner (SSH-1.99-OpenSSH_4.7, for example) before beginning the secure negotiation. Browsers, on the other hand, just go right ahead and begin the secure negotiation.

Knowing this, it didn't take long to formulate a plan: write a little program which will listen on 443. When it gets a connection, it figures out if the remote end is trying to speak SSH or HTTPS, makes the appropriate connection on the local end, the passes the data between the two.

It only took a couple of hours to write a small proof-of-concept... And, believe it or not, it even works! Disclaimer: this particular implementation is not quite suitable for any real use.

Anyway, this may not be the most efficient way of doing things... But that's no matter. It's still neat :-)

Code: sstp.py

Permalink + Comments

Problem E - Enhancing IPSC Rules: Rock Paper Scissors with PNGs

July 21, 2007 at 11:33 PM | Play | View Comments
Rock paper scissors in PNG form
Rock paper scissors in PNG form

At this point, it seems like the Facebook problems have started to go down hill. Even when Ted and I combine our attention spans, we just can't quite make it through the descriptions. To make matters worse, the only short problem left, Easy Puzzle Mountain, isn't actually all that easy...

So what ever will we do on a lovely Saturday night? Well, why not revisit an older problem. This time, it is courtesy of the IPSC1: Problem E - Enhancing IPSC Rules.

The problem is simple: the input is a JPEG with many pairs of rocks, papers and scissors. Each pair represents one round of Rock Paper Scissors. You must figure out who won each round (and ultimately the game).

As normal for the IPSC, it was possible to solve the simple data set by hand -- each image had about 10 rounds each. The hard data set, on the other hand... Well, you can see for yourself.

10% of one hard dataset. There are 22 sets in total.

We quickly decided that it would be silly to try and examine the shape of each object, so we decided to focus on the color.

Looking at the color, it's easy to tell the difference between the rocks and everything else: they are really dark, everything else is really bright. We called this the luminosity2, and it is simple to obtain: simply add the red, green and blue channels of each pixel3.

The next problem we had was telling the difference between paper and scissors. Their "luminosity" is similar, and probably not enough to make a sure match. Fortunately, the paper was fairly bland and the scissors were fairly colorful. We came up with a metric called "color delta" to measure this. The formula we used simple: color_delta = abs(red - green) + abs(red - blue) + abs(blue - green) (where abs(x) is the absolute value of x).

Alright, great. So now we have a method of distinguishing between rocks, papers and scissors... But how do I read the file? I have never worked with graphics before. The Python graphics libraries seemed intimidating, and I'll be darned if I'm also going to decode the PNGs.

Fortunately, I had the presence of mind to remember the PNM format. The Wikipedia page has more information, but all that's important for now is that the P stands for Portable, and this portability is due to the fact that it uses flat ASCII to describe an image. Nothing could be easier to parse.


for FILE in *.png
do
    convert $FILE{,.pnm}
done
 

After this, the code came surprisingly quickly... And, short of an incredibly stupid bug (I used something along the lines of side = count / width % 2 to figure out which side of the image was being processed), it even worked! I was able to plot the output (which was the luminosity, color delta pairs) in gnuplot and see a strong correlation.

To figure out what each point was, I simply opened up one of the files in Photoshop and cut out all the items but the scissors, then all the items but papers. After adding these to the plot, it was simple to tell which group was which.

The final plot
The final plot. Red points are data points, green points are papers from the training set and blue points are scissors from the training set.

And that is it :-)

Code and sample data can be found at /code/e/.

1: On a side note, the IPSC is a great programming/problem solving contest. I have written it for the past three of four years, and every time it's a lot of fun.

2: Yes, I realize that this is completely incorrect. Brightness would be much more appropriate.

3: In actuality, I ended up using (255*3)-(r+g+b) so that white (empty) pixels would have a value of 0.

Permalink + Comments

Solving Gridflip

July 05, 2007 at 05:45 PM | Play, Facebook | View Comments

Ted and I had so much fun solving the last Facebook problem that we decided to do another this weekend: gridflip.

The problem is simple: you are given a grid of numbers, some positive and some negative. The goal is to minimize the sum of the grid, while keeping the sums of the individual rows and columns positive. The only operation that can be performed is flipping the sign of the individual rows of columns. For instance, given the grid:

 [  2  -3 ] -1
 [ -3   5 ]  2
   -1   2    2

A flip to the first row and the first column would result in the grid:

 [ -2   3 ]  1
 [ -3   5 ]  2 # After flipping the first row
   -5   8    6

 [  2   3 ]  5
 [  3   5 ]  8 # After flipping the first column
    5   8   26

This is the best solution for this grid, because it is the only solution where the sums of all the rows and columns are positive.

The solution we came up with for this problem was almost as simple: for every possible combination of flipped/not flipped columns1, make sure that every row is positive. If, after making the sum of every row positive, the sum of every column is positive, the grid is a contender for the smallest sum.

Now, there is one small problem: rows which have a sum of zero. If a row's sum is zero, a flip won't change the value of the row, but it will change the value of the columns. The obvious solution is simply to keep track of which rows have a sum of zero, then try flipping them... And that's exactly what we did.

Now, the biggest downside to this algorithm is that it runs in O(2n) where n is the shortest side of the grid... But in practice, even my less-than-optimal Python code could solve the "large" data set in less than three seconds (Ted's C# code was almost instant).

It makes me wonder, though, is there any way to speed this up?

See /code/facebook/gridflip/ for source code and test grids.

1: Columns for simplicity -- the actual code makes sure there are fewer columns than rows by transposing the grid.

Edit: So, it appears that I can't add. I've fixed up the example grids so that they actually make sense.

Edit by Ted: I'd like to try to clarify our algorithm a little since the explanation is crappy. Let's say we have a 10 x 20 grid. First we chose the smallest dimension which in this example is 10. Then we enumerate all possible ways of the 10 rows which is 2^10 = 1024 ways. For each of those 1024 ways, we look at the 20 columns and update their sums. If any sum is < 0 we flip the column and now its sum is positive. This allows us to directly solve the pattern the columns need to be flipped in that can give a valid solution.
The only issue though is when the sum is zero. In this case we don't know whether to flip or not. Maybe out of the 20 columns, 3 will sum to zero (very few sum to zero and it does vary based on permutations but stays quite low) so that is 2^3 which is 8 possibilities. In the end the sign of the sums is checked and it is determined if we are the max sum. Only a few thousand possibilities and the operations during the iteration are quite simple (adding a few numbers and so if-else's). This results in a very fast runtime.

Permalink + Comments