Python security tip: urllib/urllib2 will read `file://` URLs
June 05, 2011 at 11:03 PM | Python
I discovered, entirely by accident, that urllib2.urlopen, urllib.urlretrieve, and probably others, will happily read file:// URLs and filesystem paths. For example:
>>> import urllib, urllib2
>>> urllib.urlretrieve("database_connection_settings.txt", "/tmp/temp_file")
('/tmp/temp_file', <mimetools.Message instance at 0x…>)
>>> urllib2.urlopen("file:///dev/urandom").read(10)
'\xf1r?\x0fC\x86p\x05\xa4\xdd'
This means that applications which blindly urlopen untrusted URLs (for example, from RSS feeds) are potentially vulnerable to information disclosure and denial of service attacks.
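One way to reduce the exposure is to check the URL's scheme before opening it. Here is a minimal sketch, written for Python 3 (where the parsing helpers live in urllib.parse). The names `ALLOWED_SCHEMES` and `is_safe_url` are made up for this example, and a real application would probably need more than this (for instance, handling redirects):

```python
from urllib.parse import urlsplit

# Hypothetical allowlist for this example: only fetch over HTTP(S).
ALLOWED_SCHEMES = {"http", "https"}

def is_safe_url(url):
    """Return True only if the URL has an allowed scheme and a host part."""
    parts = urlsplit(url)
    return parts.scheme.lower() in ALLOWED_SCHEMES and bool(parts.netloc)

print(is_safe_url("http://example.com/feed.rss"))       # True
print(is_safe_url("file:///dev/urandom"))               # False
print(is_safe_url("database_connection_settings.txt"))  # False
```

Checking against an allowlist (rather than rejecting `file` specifically) also covers bare filesystem paths, which have no scheme at all.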
I encountered my first bug (in recent memory) that was caused by Python's significant whitespace today.
As I was editing a class, I accidentally left-shifted an entire method, effectively removing the rest of the methods from the class:
class Foo:
    def foo(self):
        …

def accidentally_shifted_left(self):
    …

    def bar(self):
        …
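The file still parses after a shift like this, which is what makes the bug easy to miss. A small sketch (Python 3, with `pass` standing in for the elided method bodies) that checks which names actually ended up on the class:

```python
class Foo:
    def foo(self):
        pass

# Shifted left by accident: this ends the class body...
def accidentally_shifted_left(self):
    pass

    # ...so this is now a function nested inside
    # accidentally_shifted_left, not a method of Foo.
    def bar(self):
        pass

print(hasattr(Foo, "foo"))  # True
print(hasattr(Foo, "bar"))  # False
```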
As always, the best parts of PyCon 2011 happened outside of scheduled talks (one particular highlight was a long conversation with Tavis Rudd, author of the Cheetah template engine, about why HTML templates are bad)… But since those parts weren't recorded, I can't very well recommend watching them. The talks were recorded, though, and here are the ones I'd recommend watching.
(NB: this is not a complete list of good talks, just the ones I found/will find interesting. Check the PyCon 2011 schedule for a complete list of talks, then search Google for {talk title} site:blip.tv to find the videos.)
Talks I Attended
Ordered roughly by how much I enjoyed them.
- Using Python 3 to Build a Cloud Computing Service for my Superboard II by David Beazley (video link) — very amusing
- How to write obfuscated python by Rev. Johnny Healey (video link) — also very amusing
- Everything You Wanted To Know About Pickling, But Were Afraid To Ask! by Richard T. Saunders (video link) — enjoyable, informative
- "Dude, Where's My RAM?" - A deep dive into how Python uses memory by Dave Malcolm (video link) — it would be hard to get much lower-level than this talk
- Reverse-engineering Ian Bicking's brain: inside pip and virtualenv by Carl Meyer (video link) — Carl builds a virtualenv from scratch to show how virtualenv works
- API Design: Lessons Learned by Raymond Hettinger (video link) — Raymond Hettinger wrote many of Python's core data structures, and he presents some good advice on API design with examples to back it up
- Genetic Programming in Python by Eric Floehr (video link) — enjoyable, good examples, interesting stuff
- The Python That Wasn't by Larry Hastings (video link) — rejected PEPs and features which didn't make it into Python (also, I enjoy Larry's talks)
- Testing with mock by Michael Foord (video link) — the title says it all (Michael Foord is the author of Mock)
- Advanced Network Architectures With ZeroMQ by Zed Shaw (video link) — ZeroMQ looks really neat, talk is a nice overview
- Fun with Python's Newer Tools by Raymond Hettinger (video link) — another talk by Hettinger, shows some interesting tidbits
- Best Practices for Impossible Deadlines by Christopher Groskopf (video link) — makes me glad I don't work for a newspaper
Talks I want to watch
In alphabetical order.
- Get new contributors (and diversity) through outreach by Asheesh Laroia (video link) — Asheesh is an awesome guy, I'm looking forward to hearing this talk
- Handling ridiculous amounts of data with probabilistic data structures by C. Titus Brown (video link) — Titus is a good presenter, and I heard a lot of good things about this talk
- How to kill a patent with Python by Van Lindberg (video link) — Van is also a good presenter, heard good things about this talk
- Linguistics of Twitter by Michael D. Healy (video link) — looks interesting
- Python and Robots: Teaching Programming in High School by Vern Ceder (video link) — heard lots of good things about this talk
- Python-Aware Python by Ned Batchelder (video link) — looks interesting
- Status of Unicode in Python 3 by Victor Stinner (video link) — looks interesting (but, then, I'm abnormally interested in Unicode)
- Swarming the Web: Evolving the Perfect Config File by Kurt Grandis (video link) — looks interesting
- Through the Side Channel: Timing and Implementation Attacks in Python by Geremy Condra (video link) — looks interesting
- Useful Namespaces: Context Managers and Decorators by Jack Diederich (video link) — looks interesting
- Using Blender's new BPY Python API by Christopher Allan Webber (video link) — heard good things about it, I'd like to learn Blender's API
- Using Python to debug C and C++ code (using gdb) by Dave Malcolm (video link) — looks interesting
- Why is Python slow and how PyPy can help? by Maciej Fijałkowski and Alex Gaynor (video link) — PyPy is freaking awesome, I want to learn more about it
Python quickie: using `itertools.product` to generate binary strings
March 18, 2011 at 04:08 PM | Python
Problem: you need to enumerate the binary strings between 000 and 111 (i.e., 000, 001, 010, …).
Solution: itertools.product:
>>> from itertools import product
>>> for bits in product([0, 1], repeat=3):
...     print "".join(str(bit) for bit in bits)
...
000
001
010
011
…
>>>
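The same idea can be wrapped in a reusable generator. This is a sketch for Python 3; the function name `binary_strings` is invented for this example. Iterating over the string "01" instead of the list [0, 1] avoids the str() conversion:

```python
from itertools import product

def binary_strings(width):
    """Yield every binary string of the given width, in ascending order."""
    for bits in product("01", repeat=width):
        yield "".join(bits)

print(list(binary_strings(3)))
# ['000', '001', '010', '011', '100', '101', '110', '111']
```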
For more like this, check out Raymond Hettinger's (really good) talk Fun with Python's Newer Tools, from PyCon 2011.
Four reasons nose rocks:
Test discovery
After I've written my test_*.py files, I just run nosetests:
$ ls
foo.py bar.py test_foo.py test_bar.py
$ nosetests
.........
--------
Ran 9 tests in 0.03s
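To give a rough feel for what discovery by filename means, here is a toy sketch in Python 3. It only matches the test_*.py pattern mentioned above; nose's real matching rules are broader and configurable, and `find_test_files` is a name invented for this example:

```python
from fnmatch import fnmatch

def find_test_files(filenames, pattern="test_*.py"):
    """Return the names that look like test modules, sorted."""
    return sorted(name for name in filenames if fnmatch(name, pattern))

print(find_test_files(["foo.py", "bar.py", "test_foo.py", "test_bar.py"]))
# ['test_bar.py', 'test_foo.py']
```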
Nose is more Pythonic
The xUnit-esque frameworks fit well in languages like Java, where methods can be called without the this prefix, but they don't work as well in Python, which forces an explicit self. They also assume that tests will always live inside a class and that camelCase is pretty. Compare:
from unittest import TestCase

class UnittestTestCase(TestCase):
    def testStuff(self):
        self.assertEqual(foo, bar)

from nose.tools import assert_equal

def test_stuff():
    assert_equal(foo, bar)

class NoseTestCase(object):
    def test_stuff(self):
        assert_equal(foo, bar)
Test generators
The name basically says it all:
$ cat test_add.py
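from nose.tools import assert_equal

add_tests = [ (1, 2, 3), (1, 3, 4), (0, 0, 0) ]

def test_add():
    def test_add_helper(a, b, expected):
        assert_equal(a + b, expected)
    for a, b, expected in add_tests:
        # nose calls the first item with the remaining items as arguments
        yield test_add_helper, a, b, expected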
$ nosetests
...
------
Ran 3 tests in 0.03s
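To illustrate the convention, here is a toy runner in Python 3 that consumes a test generator by calling each yielded function with the rest of the tuple as its arguments. This is an illustration only, not how nose is implemented; `run_generator_test` and `check_add` are invented names, and the last case is wrong on purpose to show a failure:

```python
def run_generator_test(generator_func):
    """Call each (func, arg, ...) tuple yielded by a test generator.

    Returns a list of (args, error) pairs; error is None on success.
    """
    results = []
    for case in generator_func():
        func, args = case[0], case[1:]
        try:
            func(*args)
            results.append((args, None))
        except AssertionError as exc:
            results.append((args, exc))
    return results

def check_add(a, b, expected):
    assert a + b == expected

def test_add():
    # The last case is deliberately wrong.
    for a, b, expected in [(1, 2, 3), (1, 3, 4), (0, 0, 1)]:
        yield check_add, a, b, expected

for args, error in run_generator_test(test_add):
    print(args, "ok" if error is None else "FAILED")
```

Each yielded tuple is reported as its own test, so one bad case doesn't hide the results of the others.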
Logging/stdout capturing
Nose automatically captures all text sent to stdout during tests, discarding it if the test passes or printing it if the test fails. For example:
$ cat test_stuff.py
from nose.tools import assert_equals

def passing_test():
    print "everything is happy!"
    assert_equals(1+2, 3)

def failing_test():
    print "oh no..."
    assert_equals(1+2, 0)
$ nosetests
.F
======================================================================
FAIL: test_stuff.failing_test
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Library/Python/2.6/site-packages/nose/case.py", line 186, in runTest
    self.test(*self.arg)
  File "/private/tmp/test_stuff.py", line 9, in failing_test
    assert_equals(1+2, 0)
AssertionError: 3 != 0
-------------------- >> begin captured stdout << ---------------------
oh no...
--------------------- >> end captured stdout << ----------------------
----------------------------------------------------------------------
Ran 2 tests in 0.001s
FAILED (failures=1)
NOTE: Michael Foord has been doing some crazy stuff with unittest2, so it's possible that it may be as awesome as nose.