.ssh/config every day

August 18, 2014 at 01:52 PM | Shell, OpenSSL

I'd like to take a moment to share a few ways I use my ~/.ssh/config file to make my life happier every day.

With these options I never need to remember host names, usernames, or port numbers, and the vast majority of my SSH commands look like:

$ ssh myapp
$ ssh myclient-prod-db
$ rsync -a app-backup:backups/jan01 .

Every time I get ssh access to a server I add an entry to my config file giving the host a name that's meaningful to me (for example, "someclient-server" or "myproj-backup") and setting the default username and port:

Host someclient-dev
    Hostname 11.22.33.44
    User dev

Host someclient-prod-app
    Hostname redbunny.myclient.com
    Port 4242
    User prod

Host someclient-prod-db
    Hostname bluefish.myclient.com
    Port 4242
    User db

These host aliases can be used just about everywhere a hostname is passed to SSH, including:

  • SSH from the command line:

    $ ssh someclient-dev
    ...
    dev@11.22.33.44 $
    
  • git, mercurial, or other version control systems:

    $ git remote add dev someclient-dev:repo/
    
  • rsync:

    $ rsync -a media/ someclient-dev:media/
    

Not only does this mean I never need to remember weird hostnames or arbitrary usernames, but I can also open the file to see a list of all the machines I've ever had access to (which can be very useful when an old machine needs work done).

The bash-completion package is even .ssh/config aware, so tab completion will work as expected:

$ ssh someclient-<tab>
someclient-dev someclient-prod-app someclient-prod-db
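
The same list of aliases can be pulled out with a few lines of Python. This is only a rough sketch, not something ssh or bash-completion provides: the function name list_host_aliases and the sample config are made up for illustration, and it only understands the whitespace-separated "Host name ..." form, skipping wildcard and negated patterns.

```python
def list_host_aliases(config_text):
    """Return the names given on ``Host`` lines, skipping patterns."""
    aliases = []
    for line in config_text.splitlines():
        parts = line.strip().split()
        # "Hostname ..." lines do not match, because the keyword must be
        # exactly "host" (ssh_config keywords are case-insensitive).
        if len(parts) >= 2 and parts[0].lower() == "host":
            for name in parts[1:]:
                if "*" in name or "?" in name or name.startswith("!"):
                    continue  # a pattern, not a concrete alias
                aliases.append(name)
    return aliases


# Hypothetical sample; in practice the text would come from ~/.ssh/config.
SAMPLE_CONFIG = """
Host someclient-dev
    Hostname 11.22.33.44
    User dev

Host *.amazonaws.com
    User ec2-user

Host myapp
    Hostname ec2-1-2-3-4.compute-1.amazonaws.com
"""

print(list_host_aliases(SAMPLE_CONFIG))
```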

Amazon EC2 key management is also a huge convenience. Each time I get access to an Amazon EC2 instance I add the IdentityFile to the Host definition:

Host *.amazonaws.com
    User ec2-user

Host myapp
    Hostname ec2-1-2-3-4.compute-1.amazonaws.com
    IdentityFile ~/.ssh/aws-myapp.pem

As above, this will create the host alias myapp, and the identity file ~/.ssh/aws-myapp.pem will be used to connect (no more -i flag on the command line).

Finally, there are a few options that are useful to set for all hosts:

Host *
    # Instead of just printing the host key fingerprint as an opaque hex
    # string, also print it as ASCII art. Ostensibly this is for security,
    # but mostly it's pretty:
    #     +--[ RSA 2048]----+
    #     | oE    ..        |
    #     |  ..   ...       |
    #     |   .  ooo        |
    #     |   oooooo        |
    #     |  . =+.+S+       |
    #     |   o.+o.o..      |
    #     |    o..          |
    #     +-----------------+
    VisualHostKey yes

    # Send explicit keepalive packets. This isn't often a problem, but I've
    # run into a few combinations of network and machine that will drop
    # inactive connections.
    TCPKeepAlive yes
    ServerAliveInterval 60

    # SSH Agent Forwarding is described here:
    # http://www.unixwiz.net/techtips/ssh-agent-forwarding.html
    ForwardAgent yes

    # SSH Control Channels allow multiple SSH sessions to share one
    # connection. For example, the first time I run "ssh myapp", ssh will
    # create a new connection to the server (creating a TCP connection,
    # authenticating, etc). As long as that connection
    # is active, though, running "ssh myapp" from another terminal will
    # re-use the same TCP connection, authentication, etc, making the
    # command virtually instant.
    # Note that the ControlPersist option is important, otherwise all the
    # sessions will be disconnected when the master session closes.
    ControlPath ~/.ssh/control/master-%l-%r@%h:%p
    ControlMaster auto
    ControlPersist 60

The Sadness of Python's super()

April 02, 2014 at 07:26 PM | Python

The dangers of Python's super have been documented... but, in my humble opinion, not well enough.

A major problem with Python's super() is that it is not straightforward to figure out which methods need to call it: a method may need to call super() even if it doesn't seem like its parent class should require it.

Consider this example, where mixins are used to update a dictionary with some context (similar to, but less correct than, for example, Django's TemplateView):

class FirstMixin(object):
    def get_context(self):
        return {"first": True}

class BaseClass(FirstMixin):
    def get_context(self):
        ctx = super(BaseClass, self).get_context()
        ctx.update({"base": True})
        return ctx

class SecondMixin(object):
    def get_context(self):
        ctx = super(SecondMixin, self).get_context()
        ctx.update({"second": True})
        return ctx

class ConcreteClass(BaseClass, SecondMixin):
    pass

This looks correct... but it isn't! Because FirstMixin doesn't call super(), SecondMixin.get_context is never called:

>>> c = ConcreteClass()
>>> c.get_context()
{"base": True, "first": True} # Note that ``"second": True`` is missing!
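
Printing the method resolution order makes the cause visible. The sketch below repeats the classes from the example above; the mro_names variable is introduced only for illustration. SecondMixin sits after FirstMixin in the MRO, so a FirstMixin that never calls super() stops the chain before SecondMixin is reached.

```python
class FirstMixin(object):
    def get_context(self):
        return {"first": True}

class BaseClass(FirstMixin):
    def get_context(self):
        ctx = super(BaseClass, self).get_context()
        ctx.update({"base": True})
        return ctx

class SecondMixin(object):
    def get_context(self):
        ctx = super(SecondMixin, self).get_context()
        ctx.update({"second": True})
        return ctx

class ConcreteClass(BaseClass, SecondMixin):
    pass

# The order in which super() walks the classes for a ConcreteClass instance.
mro_names = [cls.__name__ for cls in ConcreteClass.__mro__]
print(mro_names)
# ['ConcreteClass', 'BaseClass', 'FirstMixin', 'SecondMixin', 'object']
```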

Alternatively, imagine that FirstMixin.get_context() does call super():

class FirstMixin(object):
    def get_context(self):
        ctx = super(FirstMixin, self).get_context()
        ctx.update({"first": True})
        return ctx

This is also incorrect: the call to super() in SecondMixin now triggers an error, because the final base class, object, does not have a get_context() method:

>>> c = ConcreteClass()
>>> c.get_context()
...
AttributeError: 'super' object has no attribute 'get_context'

What is a poor Pythonista to do?

There are three reasonably simple rules to follow when dealing with this kind of multiple inheritance:

  1. Mixins should always call super().
  2. The base class should not call super().
  3. The base class (or one of its superclasses) needs to be at the far right of the subclass's list of base classes.

Note that this will often mean introducing an otherwise unnecessary *Base class.

To correct the example above:

# Following rule (1), every mixin calls `super()`
class FirstMixin(object):
    def get_context(self):
        ctx = super(FirstMixin, self).get_context()
        ctx.update({"first": True})
        return ctx

# Following rule (2), the base class does *not* call super.
class BaseClassBase(object):
    def get_context(self):
        return {"base": True}

# Notice that, to follow rule (3), an otherwise unnecessary base class has
# been introduced to make sure that the "real" base class (the one without
# the call to super) can be at the very right of the list of base classes.
class BaseClass(FirstMixin, BaseClassBase):
    pass

# Following rule (3), the base class comes at the right end of the list
# of base classes.
class ConcreteClass(SecondMixin, BaseClass):
    pass

This will guarantee that the mixins are always called before the base class, which doesn't call super() in get_context().
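
As a quick check, here is the corrected hierarchy in one runnable piece, reusing the SecondMixin from the first example. It is a sketch for experimenting with, and the result variable exists only for the demonstration.

```python
class FirstMixin(object):
    def get_context(self):
        ctx = super(FirstMixin, self).get_context()
        ctx.update({"first": True})
        return ctx

class SecondMixin(object):
    def get_context(self):
        ctx = super(SecondMixin, self).get_context()
        ctx.update({"second": True})
        return ctx

# The only class that does not call super(), so it must come last.
class BaseClassBase(object):
    def get_context(self):
        return {"base": True}

class BaseClass(FirstMixin, BaseClassBase):
    pass

class ConcreteClass(SecondMixin, BaseClass):
    pass

# Call order: SecondMixin -> FirstMixin -> BaseClassBase, then each
# mixin adds its key as the calls return.
result = ConcreteClass().get_context()
print(result)
```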

Note that this will still cause problems in the event that multiple base classes are used (ie, "true" multiple inheritance)... and there isn't much which can be done about that, at least in the general case.

It is also worth noting that in many cases the best solution is to avoid inheritance altogether, opting instead for a pattern better suited to the requirements of the specific problem at hand.

For example, in the situation from the example above - where many different "things" (in the above example: mixins and the base class) need to contribute to the "context" dictionary - one option which might be more appropriate is an explicit set of "context providers":

class FirstContextProvider(object):
    def __call__(self):
        return {"first": True}

class BaseClass(object):
    context_providers = [
        FirstContextProvider(),
        lambda: {"base": True},
    ]

    def get_context(self):
        ctx = {}
        for provider in self.context_providers:
            ctx.update(provider())
        return ctx

class SecondContextProvider(object):
    def __call__(self):
        return {"second": True}

class ConcreteClass(BaseClass):
    context_providers = BaseClass.context_providers + [
        SecondContextProvider(),
    ]

(Recall that the __call__ method is used to make instances of a class callable.)
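
For anyone who wants to try the provider approach, the following is a self-contained sketch. It assumes the classes inherit only from object and BaseClass, since the providers take over the job the mixins were doing, and the result variable is there only to show the outcome.

```python
class FirstContextProvider(object):
    def __call__(self):
        return {"first": True}

class SecondContextProvider(object):
    def __call__(self):
        return {"second": True}

class BaseClass(object):
    # Any zero-argument callable returning a dict can be a provider.
    context_providers = [
        FirstContextProvider(),
        lambda: {"base": True},
    ]

    def get_context(self):
        ctx = {}
        for provider in self.context_providers:
            ctx.update(provider())
        return ctx

class ConcreteClass(BaseClass):
    # Build a new list rather than appending, so BaseClass is unaffected.
    context_providers = BaseClass.context_providers + [
        SecondContextProvider(),
    ]

result = ConcreteClass().get_context()
print(result)
```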

Edit: I was corrected by @lambacck, who pointed out the "base class on the right" rule: https://twitter.com/lambacck/status/451528854507905024


Atomic Bank Balance Transfer with CouchDB

March 13, 2014 at 10:03 PM | Uncategorized

Googling around the other day I was disappointed to find that the internet has a few incorrect examples of how atomic bank account transfers can be implemented with CouchDB... but I wasn't able to find any correct examples.

So here it is: the internet's first 100% complete and correct implementation of the classic "atomic bank balance transfer problem" in CouchDB.

First, a brief recap of the problem: how can a banking system which allows money to be transferred between accounts be designed so that there are no race conditions which might leave invalid or nonsensical balances?

There are a few parts to this problem:

First: the transaction log. Instead of storing an account's balance in a single record or document — {"account": "Dave", "balance": 100} — the account's balance is calculated by summing up all the credits and debits to that account. These credits and debits are stored in a transaction log, which might look something like this:

{"from": "Dave", "to": "Alex", "amount": 50}
{"from": "Alex", "to": "Jane", "amount": 25}

And the CouchDB map-reduce functions to calculate the balance could look something like this:

POST /transactions/balances
{
    "map": function(txn) {
        emit(txn.from, txn.amount * -1);
        emit(txn.to, txn.amount);
    },
    "reduce": function(keys, values) {
        return sum(values);
    }
}

For completeness, here is the list of balances:

GET /transactions/balances
{
    "rows": [
        {
            "key" : "Alex",
            "value" : 25
        },
        {
            "key" : "Dave",
            "value" : -50
        },
        {
            "key" : "Jane",
            "value" : 25
        }
    ],
    ...
}
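
To see where those numbers come from, the map and reduce steps can be imitated in plain Python over the two-entry log above. This is only an in-memory sketch of the arithmetic, not CouchDB code, and the function name balances is made up for the illustration.

```python
from collections import defaultdict

TRANSACTIONS = [
    {"from": "Dave", "to": "Alex", "amount": 50},
    {"from": "Alex", "to": "Jane", "amount": 25},
]

def balances(transactions):
    totals = defaultdict(int)
    for txn in transactions:
        # "map": emit a debit for the sender and a credit for the recipient
        # "reduce": sum the emitted values per account
        totals[txn["from"]] -= txn["amount"]
        totals[txn["to"]] += txn["amount"]
    return dict(totals)

print(balances(TRANSACTIONS))
```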

But this leaves the obvious question: how are errors handled? What happens if someone tries to make a transfer larger than their balance?

With CouchDB (and similar databases) this sort of business logic and error handling must be implemented at the application level. Naively, such a function might look like this:

def transfer(from_acct, to_acct, amount):
    txn_id = db.post("transactions", {"from": from_acct, "to": to_acct, "amount": amount})
    if db.get("transactions/balances", id=from_acct) < 0:
        db.delete("transactions/" + txn_id)
        raise InsufficientFunds()

But notice that if the application crashes between inserting the transaction and checking the updated balances the database will be left in an inconsistent state: the sender may be left with a negative balance, and the recipient with money that didn't previously exist:

# Initial balances: Alex: 25, Jane: 25
db.post("transactions", {"from": "Alex", "to": "Jane", "amount": 50})
# Current balances: Alex: -25, Jane: 75

How can this be fixed?

To make sure the system is never in an inconsistent state, two pieces of information need to be added to each transaction:

  1. The time the transaction was created (to ensure that there is a strict total ordering of transactions), and
  2. A status — whether or not the transaction was successful.

There will also need to be two views — one which returns an account's available balance (ie, the sum of all the "successful" transactions), and another which returns the oldest "pending" transaction:

POST /transactions/balance-available
{
    "map": function(txn) {
        if (txn.status == "successful") {
            emit(txn.from, txn.amount * -1);
            emit(txn.to, txn.amount);
        }
    },
    "reduce": function(keys, values) {
        return sum(values);
    }
}

POST /transactions/oldest-pending
{
    "map": function(txn) {
        if (txn.status == "pending") {
            emit(txn._id, txn);
        }
    },
    "reduce": function(keys, values) {
        var oldest = values[0];
        values.forEach(function(txn) {
            if (txn.timestamp < oldest.timestamp) {
                oldest = txn;
            }
        });
        return oldest;
    }
}

The list of transfers might now look something like this:

{"from": "Alex", "to": "Dave", "amount": 100, "timestamp": 50, "status": "successful"}
{"from": "Dave", "to": "Jane", "amount": 200, "timestamp": 60, "status": "pending"}

Next, the application will need to have a function which can resolve transactions by checking each pending transaction in order to verify that it is valid, then updating its status from "pending" to either "successful" or "rejected":

def resolve_transactions(target_timestamp):
    """ Resolves all transactions up to and including the transaction
        with timestamp ``target_timestamp``. """
    while True:
        # Get the oldest transaction which is still pending (assuming
        # that db.get returns None when nothing is pending)
        txn = db.get("transactions/oldest-pending")
        if txn is None or txn["timestamp"] > target_timestamp:
            # Stop once all of the transactions up until the one we're
            # interested in have been resolved.
            break

        # Then check to see if that transaction is valid. ("from" is a
        # Python keyword, so the fields are read with dict-style access.)
        if db.get("transactions/balance-available", id=txn["from"]) >= txn["amount"]:
            status = "successful"
        else:
            status = "rejected"

        # Then update the status of that transaction. Note that CouchDB
        # will check the "_rev" field, only performing the update if the
        # transaction hasn't already been updated.
        txn["status"] = status
        db.put(txn)

Finally, the application code for correctly performing a transfer:

import time

def transfer(from_acct, to_acct, amount):
    timestamp = time.time()
    txn = db.post("transactions", {
        "from": from_acct,
        "to": to_acct,
        "amount": amount,
        "status": "pending",
        "timestamp": timestamp,
    })
    resolve_transactions(timestamp)
    # Re-read the transaction to see how it was resolved
    txn = db.get("transactions/" + txn["_id"])
    if txn["status"] == "rejected":
        raise InsufficientFunds()
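
The pending/resolve flow can be tried out without a database. The sketch below swaps CouchDB for an in-memory list, so it says nothing about concurrency or "_rev" checks. The Ledger class, its opening balances, and the integer counter used as a timestamp are all assumptions made for the illustration.

```python
import itertools

class InsufficientFunds(Exception):
    pass

class Ledger(object):
    """ In-memory stand-in for the transaction log and its two views. """

    def __init__(self, opening_balances):
        self.opening = dict(opening_balances)  # hypothetical starting funds
        self.txns = []
        self._clock = itertools.count(1)       # strictly increasing "time"

    def available_balance(self, account):
        # Mirrors the "balance-available" view: only successful transfers.
        balance = self.opening.get(account, 0)
        for txn in self.txns:
            if txn["status"] != "successful":
                continue
            if txn["from"] == account:
                balance -= txn["amount"]
            if txn["to"] == account:
                balance += txn["amount"]
        return balance

    def oldest_pending(self):
        # Mirrors the "oldest-pending" view; None when nothing is pending.
        pending = [t for t in self.txns if t["status"] == "pending"]
        if not pending:
            return None
        return min(pending, key=lambda t: t["timestamp"])

    def resolve_transactions(self, target_timestamp):
        while True:
            txn = self.oldest_pending()
            if txn is None or txn["timestamp"] > target_timestamp:
                break
            if self.available_balance(txn["from"]) >= txn["amount"]:
                txn["status"] = "successful"
            else:
                txn["status"] = "rejected"

    def transfer(self, from_acct, to_acct, amount):
        txn = {
            "from": from_acct,
            "to": to_acct,
            "amount": amount,
            "status": "pending",
            "timestamp": next(self._clock),
        }
        self.txns.append(txn)
        self.resolve_transactions(txn["timestamp"])
        if txn["status"] == "rejected":
            raise InsufficientFunds()
```

A rejected transfer stays in the log with status "rejected", so it never affects the available balance of either account.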

A couple of notes:

  • For the sake of brevity, this specific implementation assumes some amount of atomicity in CouchDB's map-reduce. Updating the code so it does not rely on that assumption is left as an exercise to the reader.
  • Master/master replication or CouchDB's document sync have not been taken into consideration. Master/master replication and sync make this problem significantly more difficult.
  • In a real system, using time() might result in collisions, so using something with a bit more entropy might be a good idea; maybe "%s-%s" %(time(), uuid()), or using the document's _id in the ordering. Including the time is not strictly necessary, but it helps maintain a logical ordering if multiple requests come in at about the same time.
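
The last note could be sketched as follows. The ordering_key function is a made-up helper, and the zero-padded format is one possible choice: it keeps string comparison in line with numeric time order, with the random suffix breaking ties.

```python
import time
import uuid

def ordering_key(now=None):
    """ Return a sortable, collision-resistant key for a transaction. """
    if now is None:
        now = time.time()
    # Zero-pad the time so that comparing keys as strings matches
    # comparing the timestamps as numbers.
    return "%017.6f-%s" % (now, uuid.uuid4().hex)

print(ordering_key())
```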

Why aren't composable test helpers a thing?

November 12, 2013 at 02:20 PM | Python

Test cases often require a few stateful "things" to happen during each run, and these are usually configured in the setUp and tearDown methods of the test case.

For example, some common "things" in applications I've worked on are:

  • Capturing email messages sent during the test case
  • Capturing log messages emitted during the test case
  • Mocking Redis connections
  • Mocking the application's job queue

And the implementation often looks something like this:

class MyRegularTestCase(TestCase):
    def setUp(self):
        self.mail = setup_mock_mail()
        self.logs = setup_mock_logger()
        self.redis = setup_mock_redis()
        self.jobs = setup_mock_job_queue()

    def tearDown(self):
        self.mail.teardown()
        self.logs.teardown()
        self.redis.teardown()
        self.jobs.teardown()

    def test_foo(self):
        foo()
        self.mail.assert_message_sent("hello, world")
        self.logs.assert_logged("hello, world")
        self.redis.assert_list_contains("foo", "bar")
        self.jobs.assert_job_queued("some_job")

And that is a best-case example; most of the time test code doesn't use high-level mocking objects, and setup code is copy+pasted between test classes.

This makes me wonder: why aren't composable test helpers a thing?

For example, a log capturing test helper might look something like this:

import logging

# Subclassing logging.Handler means the helper can be passed directly to
# addHandler(); the logger calls emit() for each record it handles.
class LogCapture(logging.Handler):
    def __init__(self, logger_name=''):
        logging.Handler.__init__(self)
        self.logger = logging.getLogger(logger_name)

    def setup(self):
        self.records = []
        self.logger.addHandler(self)

    def teardown(self):
        self.logger.removeHandler(self)

    def emit(self, record):
        self.records.append(record)

    def assert_logged(self, message):
        for record in self.records:
            if message in record.getMessage():
                return
        raise AssertionError("No log message containing %r was emitted" %(message, ))

And this sort of composable helper could be used with a TestCase like this:

class MyBetterTestCase(ComposableTestCase):
    mail = MailCapture()
    logs = LogCapture()
    redis = MockRedis()
    jobs = MockJobQueue()

    def test_foo(self):
        foo()
        self.mail.assert_message_sent("hello, world")
        self.logs.assert_logged("hello, world")
        self.redis.assert_list_contains("foo", "bar")
        self.jobs.assert_job_queued("some_job")

I'll be building this kind of composable test helper into my application, and if it works well, I'll release a library.
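
One way the ComposableTestCase base class might work is sketched below. This is a guess at a design, not the eventual library: it treats any class attribute with both a setup() and a teardown() method as a helper, and the Recorder helper and ExampleTest case are invented for the demonstration.

```python
import unittest

class ComposableTestCase(unittest.TestCase):
    def _helpers(self):
        # Collect class attributes that look like helpers, in name order.
        cls = type(self)
        helpers = []
        for name in dir(cls):
            value = getattr(cls, name, None)
            if (callable(getattr(value, "setup", None)) and
                    callable(getattr(value, "teardown", None))):
                helpers.append(value)
        return helpers

    def setUp(self):
        for helper in self._helpers():
            helper.setup()
            # addCleanup runs even if a later setup() or the test fails.
            self.addCleanup(helper.teardown)

class Recorder(object):
    """ Hypothetical helper which just records when it is used. """
    def __init__(self):
        self.events = []

    def setup(self):
        self.events.append("setup")

    def teardown(self):
        self.events.append("teardown")

class ExampleTest(ComposableTestCase):
    recorder = Recorder()

    def test_helper_is_active(self):
        self.assertEqual(self.recorder.events[-1], "setup")
```

Note that the helpers are class attributes, so one instance is shared by every test in the class; each helper's setup() needs to reset its own state, as LogCapture does with its list of records.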


strftime: table of locale-aware formatters in different locales

April 13, 2013 at 10:44 PM | Python

I got curious about what the different locale-specific strftime and strptime formatting directives produced in different locales… So I built a little script which would generate a table showing each of the different locale-aware formatters in all of the different locales which I've got installed!

The code used to generate this table can be found at: bitbucket.org/wolever/locale-table.
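
For a rough idea of how one row of such a table can be produced, here is a minimal sketch using only the standard library. It is not the linked script: the format_row function and the WHEN constant are names made up here, and locale.setlocale raises locale.Error for any locale which isn't installed.

```python
import locale
import time

DIRECTIVES = ["%a", "%A", "%b", "%B", "%x", "%X", "%p", "%c"]

# The moment shown in the table: Tuesday, August 16, 1988 at 21:30:05.
WHEN = time.strptime("1988-08-16 21:30:05", "%Y-%m-%d %H:%M:%S")

def format_row(locale_name, when=WHEN):
    """ Format ``when`` with each directive under ``locale_name``. """
    saved = locale.setlocale(locale.LC_TIME)  # query the current setting
    try:
        locale.setlocale(locale.LC_TIME, locale_name)
        return [time.strftime(directive, when) for directive in DIRECTIVES]
    finally:
        # Always restore the previous locale, even if formatting fails.
        locale.setlocale(locale.LC_TIME, saved)

# The "C" locale is always available; others depend on the system.
print(format_row("C"))
```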

Locale %a %A %b %B %x %X %p %c
af_ZA Tue Tuesday Aug August 08/16/1988 21:30:05 PM Tue Aug 16 21:30:05 1988
am_ET ማክሰ ማክሰኞ ኦገስ ኦገስት 16/08/1988 21:30:05 PM ማክሰ ኦገስ 16 21:30:05 1988
be_BY аў аўторак жні жніўня 16.08.88 21:30:05 pm аў 16 жні 21:30:05 1988
bg_BG Вт Вторник Авг Август 16.08.88 21:30:05 pm Вт 16 Авг 21:30:05 1988
ca_ES dim dimarts ago agost 16/08/1988 21:30:05 PM dim 16 ago 21:30:05 1988
cs_CZ út úterý srp srpna 1988/08/16 21:30:05 od út 16 srp 21:30:05 1988
da_DK Tir Tirsdag Aug August 16.08.1988 21:30:05 pm Tir 16 Aug 21:30:05 1988
de_AT Di Dienstag Aug August 16.08.1988 21:30:05 pm Di 16 Aug 21:30:05 1988
de_CH Di Dienstag Aug August 16.08.1988 21:30:05 pm Di 16 Aug 21:30:05 1988
de_DE Di Dienstag Aug August 16.08.1988 21:30:05 pm Di 16 Aug 21:30:05 1988
el_GR Τρι Τρίτη Αυγ Αυγούστου 16/08/1988 21:30:05 μμ Τρι 16 Αυγ 21:30:05 1988
en_AU Tue Tuesday Aug August 16/08/1988 21:30:05 pm Tue 16 Aug 21:30:05 1988
en_CA Tue Tuesday Aug August 16/08/1988 21:30:05 pm Tue 16 Aug 21:30:05 1988
en_GB Tue Tuesday Aug August 16/08/1988 21:30:05 pm Tue 16 Aug 21:30:05 1988
en_IE Tue Tuesday Aug August 16/08/1988 21:30:05 pm Tue 16 Aug 21:30:05 1988
en_NZ Tue Tuesday Aug August 16/08/1988 21:30:05 pm Tue 16 Aug 21:30:05 1988
en_US Tue Tuesday Aug August 08/16/1988 21:30:05 PM Tue Aug 16 21:30:05 1988
es_ES mar martes ago agosto 16/08/1988 21:30:05 PM mar 16 ago 21:30:05 1988
et_EE T teisipäev aug august 16.08.1988 21:30:05 T, 16. aug 1988. 21:30:05
eu_ES as. asteartea Abu abuztua 1988/08/16 21:30:05 p.m. 1988 - Abu - 16 as. 21:30:05
fi_FI Ti Tiistai Elo Elokuu 16.08.1988 21:30:05 pm Ti 16 Elo 21:30:05 1988
fr_BE Mar Mardi aoû août 16.08.1988 21:30:05 Mar 16 aoû 21:30:05 1988
fr_CA Mar Mardi aoû août 16.08.1988 21:30:05 Mar 16 aoû 21:30:05 1988
fr_CH Mar Mardi aoû août 16.08.1988 21:30:05 Mar 16 aoû 21:30:05 1988
fr_FR Mar Mardi aoû août 16.08.1988 21:30:05 Mar 16 aoû 21:30:05 1988
he_IL ג' שלישי אוג אוגוסט 16/08/88 21:30:05 PM 21:30:05 1988 אוג 16 ג'
hr_HR Ut Utorak Kol Kolovoz 16.08.1988 21:30:05 pm Ut 16 Kol 21:30:05 1988
hu_HU Ked Kedd Aug Augusztus 1988/08/16 21:30:05 du Ked Aug 16 21:30:05 1988
hy_AM Երք Երեքշաբթի Օգս Օգոստոս 16.08.1988 21:30:05 Երեքշաբթի, 16 Օգոստոս 1988 ի. 21:30:05
is_IS þri þriðjudagur ágú ágúst 16.08.1988 21:30:05 eh þri 16 ágú 21:30:05 1988
it_CH Mar Martedì Ago Agosto 16.08.1988 21:30:05 pm Mar 16 Ago 21:30:05 1988
it_IT Mar Martedì Ago Agosto 16.08.1988 21:30:05 pm Mar 16 Ago 21:30:05 1988
ja_JP 火曜日 8 8月 1988/08/16 21時30分05秒 PM 火 8/16 21:30:05 1988
kk_KZ сс сейсенбі там тамыз 16.08.1988 21:30:05 сейсенбі, 16 тамыз 1988 ж. 21:30:05
ko_KR 화요일 8 8월 1988/08/16 21시 30분 05초 PM 화 8/16 21:30:05 1988
lt_LT An Antradienis Rgp rugpjūčio 1988.08.16 21:30:05 An Rgp 16 21:30:05 1988
nl_BE di dinsdag aug augustus 16-08-1988 21:30:05 pm di 16 aug 21:30:05 1988
nl_NL di dinsdag aug augustus 16-08-1988 21:30:05 pm di 16 aug 21:30:05 1988
no_NO tir tirsdag aug august 16.08.1988 21:30:05 pm tir 16 aug 21:30:05 1988
pl_PL wto wtorek sie sierpnia 1988.08.16 21:30:05 wto 16 sie 21:30:05 1988
pt_BR Ter Terça Feira Ago Agosto 16/08/1988 21:30:05 Ter 16 Ago 21:30:05 1988
pt_PT Ter Terça Feira Ago Agosto 16.08.1988 21:30:05 Ter 16 Ago 21:30:05 1988
ro_RO Mar Marţi Aug August 16.08.1988 21:30:05 pm Mar 16 Aug 1988 21:30:05
ru_RU вт вторник авг августа 16.08.1988 21:30:05 вторник, 16 августа 1988 г. 21:30:05
sk_SK ut utorok aug august 16.08.1988 21:30:05 ut 16 aug 21:30:05 1988
sl_SI tor torek avg avgust 16.08.1988 21:30:05 pm tor 16 avg 21:30:05 1988
sr_YU уто уторак авг август 16.08.1988 21:30:05 уто 16 авг 21:30:05 1988
sv_SE Tis Tisdag Aug Augusti 16.08.1988 21:30:05 pm Tis 16 Aug 21:30:05 1988
tr_TR Sal Salı Ağu Ağustos 16/08/1988 21:30:05 PM Sal 16 Ağu 21:30:05 1988
uk_UA вт вівторок сер серпня 16.08.1988 21:30:05 вт 16 сер 21:30:05 1988
zh_CN 星期二 8 八月 1988/08/16 21时30分05秒 下午 二 8/16 21:30:05 1988
zh_HK 周二 8 8月 1988/08/16 21時30分05秒 下午 二 8/16 21:30:05 1988
zh_TW 周二 8 8月 1988/08/16 21時30分05秒 下午 二 8/16 21:30:05 1988