A Reasonable Approximation

Latest posts

The Mathematics of Matched Betting

I've been doing a little bit of matched betting lately1. The idea is that you place two opposite bets on the same event, calibrated so that your profit will be the same no matter which bet wins. If you do this entirely with your own money, your profit will (under reasonable assumptions) be negative. But bookmakers often offer free bets; you can use matched betting to extract most of the amount of that free bet as actual money.

This post isn't advice about how to get into matched betting. That market is probably saturated; if you want to learn, I used this guide and it served me well. (However, if anyone is inspired to try it by this post, I have a referral code for smarkets: safto14. It looks like if you sign up with that and bet $£20$, both you and I will receive $£10$ in risk-free bets. I swear that's not why I'm posting this. I might not even get around to using it.)

(Um, but after that I feel obliged to give at least some safety information. So here goes: if you're not a UK citizen currently in the UK, this may be a bad idea. Don't use credit cards to deposit funds; it seems they interpret that as a cash transaction and charge fees. Start small; that way there's less at risk in case you do something silly like use a credit card to deposit funds. Probably don't expect to earn lots of money this way, either in total or per-hour.)

Instead, I want to go into the math behind it, in more depth than I've seen in other places. None of this math is complicated, but some of it is useful, and I haven't found it anywhere else. (I even deliberately went looking.)

A simple example

(If you've seen one example of matched betting, you've seen them all, and you can skip this particular one.)

You have a $£10$ free bet at a bookmaker. You find a football game, say Manchester Utd versus Liverpool, that you want to bet on. The bookmaker offers odds of $4$ on Liverpool, and you bet your $£10$ on them.

(A note on odds: the usual convention in gambling seems to be to use decimal odds. Odds of $x$ mean that your potential winnings are $x-1$ times your stake. Thus, odds of $4$ mean a bet of $£10$ has the potential to pay out $£30$. If you're used to odds notated $a:b$ or (equivalently) $a/b$, then the decimal odds are given by $a/b + 1$.)

So if Liverpool wins, you'll earn $£30$; if they lose or draw, you lose nothing. You then look up the same match at a betting exchange. An exchange allows you to take both sides of a bet, which a bookmaker won't. The exchange offers odds of $4.3$ to lay Liverpool; this means that you win your bet in the exchange only if Liverpool doesn't win. You accept a stake of $£6.98$, which means your own stake is $£23.03$.

Now if Liverpool wins the match, the bookmaker pays you $£30$ and you lose $£23.03$ in the exchange, for a net profit of $£6.97$. And if Liverpool loses, you earn $£6.98$ in the exchange and lose nothing at the bookmaker, for a net profit of $£6.98$. You've turned a $£10$ free bet into almost $£7$ of actual money.

(I'm ignoring for now the commission that the exchange will usually collect when you win a bet on them. With $2\%$ commission, you would instead accept stakes of $£7.01$, wagering your own $£23.13$; if Liverpool doesn't win, you would earn $£7.01 · 0.98 = £6.87$, which is also what you'd earn if Liverpool does win.)

Before bookmakers will give you a free bet, you'll usually have to place a bet with them using your own money. You lose a small amount of money on this bet, but you can use the same principles to ensure that you lose the same amount no matter who wins. You might lose around $£0.50$ on a $£10$ qualifying bet, in which case you end up with around $£6.50$ profit when all's said and done.

This has been a very brief introduction to matched betting. Now, into the math. I'm going to be focusing on two kinds of bet: qualifying bets, which are usually known as just bets, and free bets, where you don't lose anything if your back bet loses. I'm also going to ignore rounding; let's just pretend that the sterling is infinitely divisible.

Some definitions

We can think of a paired bet as having six parameters, $(O_b, O_l, S_b, S_l, C_b, C_l)$. These are three parameters for each of a pair of back and lay bets.

$O_b, O_l ≥ 1$ are the odds on the back and lay bets. It's typically safe to assume $O_b < O_l$; otherwise, modulo commission, you could make a profit even on your qualifying bets2. They can't be less than $1$ because we're using decimal odds; that would correspond to a probability below $0$.

$S_b, S_l ≥ 0$ are the stakes on the back and lay bets. Note that $S_l$ is the stake offered by the other party to your lay bet; it's (roughly) the amount you stand to win on that bet, not the amount you stand to lose. This may seem strange, but it's the convention used.

And $C_b, C_l ∈ [0, 1]$ are the commission charged on your winnings on each side. Usually $C_b = 0$: bookmakers don't charge commissions, they make money by offering low odds. The two exchanges I've used have $C_l = 2\% = 0.02$ (Smarkets) and $C_l = 5\% = 0.05$ (Betfair).

I'm also going to introduce the symbol $C^+ = (1 - C_l)(1 - C_b)$. If you passed $£1$ through your bookmaker and exchange, and they each charged commission and nothing else, you would have $£C^+$ left at the end. $C^+$ isn't enough for us to fully understand a matched bet, we need the individual back and lay commissions as well, but it'll be convenient shorthand.

Now let $R_{xy}$ (where $x,y ∈ \{b,l\}$) be your return on side $y$ if your bet on side $x$ wins. So for a qualifying bet (which is just a regular bet, outside the context of matched betting), we have:

$$ \begin{align*}
     R_{bb} &= S_b (O_b - 1) (1 - C_b)     \\
     R_{bl} &= - S_l (O_l - 1)             \\
     R_{lb} &= - S_b                       \\
     R_{ll} &= S_l (1 - C_l).
\end{align*} $$

For a free bet, the only change is

$$ R_{lb} = 0. $$

So your profit is $R_{bb} + R_{bl}$ if your back bet wins; and $R_{lb} + R_{ll}$ if your lay bet wins.

And now we can say that a matched bet is simply a paired bet, where your profit is the same in either case. I won't need to talk about simple paired bets from now on; all bets are matched. When I talk about a "free bet" or "qualifying bet", those are matched bets too.3

Incidentally, the six prameters are over-determined. Most commonly we want to learn $S_l$ given the other five; but knowing any five of them will fix the value of the sixth.

Optimal lay stake

The first question we'll ask is, given $O_*$, $C_*$ and $S_b$, what must $S_l$ be to make our bet a matched bet? Or in other words, what $S_l$ should we choose to eliminate all risk?

We need

$$ R_{bb} + R_{bl} = R_{lb} + R_{ll} $$

which after some (perhaps silly-seeming) substitution and rearrangement gives

$$ S_l = { R_{bb} - R_{lb} \over R_{ll}/S_l - R_{bl}/S_l }. $$

This looks circular, but when we substitute for the values of $R_{**}$, $S_l$ disappears from the right hand side. For a qualifying bet, this gives

$$ S_l = S_b { (O_b - 1)(1 - C_b) + 1 \over O_l - C_l }, $$

and for a free bet,

$$ S_l = S_b { (O_b - 1)(1 - C_b) \over O_l - C_l }. $$

A thing to note here is that $O_l$ and $C_l$ only appear in the term $O_l - C_l$. In other words, the effect of lay commission is to decrease the effective lay odds in the most natural way. It would be nice if this happened in other contexts too, but unfortunately I haven't come across it. The $O_l - C_l$ term is common, but it's usually accompanied by another $O_l$ and/or $C_l$ somewhere else in the expression.

Profit

Next, we'll want to know how much profit we make. This is given by $R_{lb} + R_{ll}$, where we calculate $R_{ll}$ using the lay stake we just found. But since both of these terms are proportional to $S_b$, we'll find it more convenient to think in terms of profit per unit of back stake,

$$ P = { R_{lb} + R_{ll} \over S_b }. $$

Under a qualifying bet, this is

$$ P_q = C^+ { O_b + C_b/(1 - C_b) \over O_l - C_l } - 1, $$

and for a free bet, it's

$$ P_f = C^+ { O_b - 1 \over O_l - C_l }. $$

We can look at these functions graphically:

(Graph of P_q(O_b, O_l)) (Graph of P_f(O_b, O_l))

(all images link to larger versions)

each line represents a contour of the function, a set of points that all have the same profit. The sets of contours look superficially similar, but they're generally steeper for a free bet, and they get cut off on the bottom edge instead of the left edge. In both cases, profit increases with $O_b$ and decreases with $O_l$.

We can reparameterise in terms of $O_b$ and $σ = O_l - O_b$, the spread between the back and lay odds. Since $O_l ≥ O_b$, we only need to consider $σ ≥ 0$. This gives us

$$ \begin{align*}
    P_q &= C^+ { O_b + C_b/(1 - C_b) \over O_b + σ - C_l } - 1  \\
    P_f &= C^+ { O_b - 1 \over O_b + σ - C_l }.
\end{align*} $$

(Graph of P_q(O_b, σ)) (Graph of P_f(O_b, σ))

These are slightly more distinct. Looking at these graphs, it seems that for a qualifying bet, having low $σ$ is more significant than having high $O_b$; but for a free bet, having high $O_b$ is more significant than having low $σ$. If so, that suggests you might want to be looking for different sorts of bets at each stage. (It is so, and we'll make it more precise later.)

We can also look at $P_f - P_q$, the difference in profit between a qualifying bet and a free bet. This isn't particularly useful to compare bets: you place qualifying bets to get free bets, and you place free bets to get money, and if you're doing pure matched betting, I don't think you'll ever be asking yourself should I place this bet free or as a qualifier? Still, the difference is

$$ P_f - P_q = 1 - { (1 - C_l)(1 - 2C_b) \over O_l - C_l }. $$

The more $O_l$ grows, the worse a qualifier becomes relative to a free bet. This is another suggestion that you should be looking at different sorts of bets for your qualifiers and your free bets.

Liability

One more thing is important when making a matched bet: lay liability. This is how much you stand to lose on the exchange where you make your lay bet. (It's only important for boring real-world reasons like liquidity and the possibility of doing something stupid, but those are still important.) You need to have this much money in your account at the exchange, which means you need to be able to spare it from your bank account for a week or so. Low-liability bets are also safer if something goes wrong, which makes them a good choice for early dabblers in matched betting.

Liability is simply given by $-R_{bl} = S_l (O_l - 1)$, which is

$$ S_b (O_l - 1) { (O_b - 1)(1 - C_b) + 1 \over O_l - C_l } $$

for a qualifying bet and

$$ S_b (O_l - 1) { (O_b - 1)(1 - C_b) \over O_l - C_l }  $$

for a free bet.

(Graph of liability(O_b, O_l), qualifying bet) (Graph of liability(O_b, σ), free bet)

(I made the graphs in $σ$ as well, but they're too boring to include inline)

Unlike profit, liability increases with both $O_b$ and $O_l$. But it increases arbitrarily with $O_b$, and asymtotically with $O_l$; it's bounded above by roughly $S_b O_b$ for a qualifying bet and $S_b (O_b - 1)$ for a free bet. (If the graphs were extended further, as they stretch upwards the lines would become ever more vertical, but they'd always stay separate. To the right, the lines would become ever more horizontal, all of them converging on $O_l = 1$.)

Improving on a free bet

Matched bet calculators aren't hard to find, and what I've given so far is nothing that they can't do for you. But they don't tell you everything you might want to know. Let's look at a bet, and see how we might find a better bet. Since the two types have different behaviours, we'll treat them separately.

To maximise profit, we usually need to consider that $S_b, C_b$ and $C_l$ are fixed, and find the dependence of $P$ on $O_b$ and $O_l$. For a free bet, that means we want to maximise the term

$$ P_f ∝ {O_b - 1 \over O_l - C_l}. $$

This tells us a few things. The first is that we want high back odds and low lay odds. We already knew that, and it's not very helpful; we expect back and lay odds to more-or-less rise and fall together. It also tells us that adding a constant to both odds will increase profit; odds of 5 and 6 will be better than odds of 4 and 5. (This, too, we could have deduced before; or we could have seen it on the graph of $P_f(O_b, σ)$.)

But consider what happens when $σ = 0$. Then the term in question is

$$ { O_b - 1 \over O_b - C_l } $$

which, as $O_b$ ranges from $1$ to $∞$, takes all values in $[0, 1)$. But when $σ > 0$, the possible values are exactly the same; high $σ$ changes the $O_b$ that gives you any particular profit, but it doesn't make any profit value available or unavailable.

What that means is: given any free bet, we can construct another free bet with equal profit but $σ = 0$, not changing $S_b$ or $C_*$.

Or: given odds $O_b, O_l$, we can calulate the odds $O'$ that would give you the same profit, if you could find these odds for both a back and a lay bet.

In turn, that tells you that if you want to improve your profits, you can ignore bets with $O_b < O'$. (Because for those bets, $P_f(O_b, σ) < P_f(O', σ) ≤ P_f(O', 0)$. The first inequality comes from adding a constant to both odds, and the second comes from reducing $O_l$.) This is a useful thing to know, that matched bet calculators don't tell you.

To find $O'$, we set

$$ { O_b - 1 \over O_l - C_l } = { O' - 1 \over O' - C_l } $$

and deduce

$$ \begin{align*}
    O' &= { O_l - O_bC_l \over 1 + O_l - O_b - C_l } \\
       &= O_b { 1 - C_l + σ/O_b \over 1 - C_l + σ }.
\end{align*} $$

The expression with $σ$ isn't exactly simpler, but I think it's more aesthetically pleasing. (Consider that $1-C_l$ is approximately as fundamental as $C_l$ itself.) Graphically:

(Graph of P_f(O_b, Ol)) (Graph of P_f(O_b, σ))

We can also calculate $O'$ simply as a function of profit, and vice versa:

(Graph of P_f(O'))

$$ P_f = C^+ { O' - 1 \over O' - C_l }                 \\
   O' = { C_lP_f - C^+ \over P_f - C^+ } $$

$P_f$ approaches an asymtote at $C^+$, but slowly. With $C_b = 0, C_l = 0.02$, extracting $80\%$ of a free bet is only possible if $O_b ≥ 5.36$. For $90\%$, you need $O_b ≥ 12.03$. Such bets are somewhat rare in my experience, and typically have high spread.

We can go more general. Given a profit, we can calculate the level curve of all bets which generate that profit; the case $σ=0$ gives us only a single point on that curve. The curve divides bet-space into two regions, so that it's easy to see whether a bet gives more or less than this amount of profit.

(Earlier we saw this level curve graphically, for certain specific profits. Now we find the explicit formula for the curve, which I secretly already used to draw the graphs.)

We already have

$$ \begin{align*}
    P_f &= C^+ { O_b - 1 \over O_l - C_l }    \\
        &= C^+ { O_b - 1 \over O_b + σ - C_l },
\end{align*} $$

and it's just a matter of rearranging these:

$$ O_b C^+ = P_f (O_l - C_l) + C^+\\
   O_b (C^+ - P_f) = P_f (σ - C_l) + C^+. $$

These two equations can be used to find $O_b$ in terms of $O_l$ or $σ$, and vice-versa. Both are very simple at heart: they're linear relationships, that could be rearranged to the form $y = mx + c$.

Looking more closely at the second one, notice that $C^+$ is the upper bound on profit. So the term $C^+ - P_f$ can be thought of as how much profit is being left on the table, compared to what you could hypothetically get if odds of $∞$ were a thing. The less profit you leave behind, the less $σ$ has to change to compensate for a given change in $O_b$. In other words, when profit is high, the level curve on the graph of $P_f(O_b, σ)$ becomes shallower, as we saw above.

Improving on a qualifying bet

For a qualifying bet, we can't quite do the same thing. If we temporarily assume $C_b = 0$, then the term we want to maximise is

$$ P_q + 1 ∝ {O_b \over O_l - C_l}. $$

This doesn't work the same as the equivalent term for a free bet. If you keep $σ$ fixed and consider profit as a function of $O_b$, then this function acts differently depending on the sign of $σ - C_l$. If $σ ≤ C_l$, then regardless of $O_b$ you get more profit than is ever possible with $σ > C_l$.

This isn't immediately practically important, because $σ > C_l$ is a pretty safe assumption. But it's mathematically significant. For a free bet, setting $σ$ to $0$ doesn't rule out any profit levels, so we could ask "how would we get this particular profit with $σ = 0$?" If we try to ask that for a qualifying bet, the answer is typically that we can't. So the approach we used for a free bet doesn't work on a qualifying bet.

We also can't set $O_b$ to its best possible value, because it can go arbitrarily high. But we can try setting it to its limiting worst value ($O_b = 1$). We find $σ'$ such that

$$ { O_b + C_b/(1 - C_b) \over O_b + σ - C_l }
   = { 1 + C_b/(1 - C_b) \over 1 + σ' - C_l }, $$

which gives us

(Graph of σ'(O_b, σ))

$$ σ' = { σ + (O_b - 1)(1 - C^+) \over 1 + (O_b - 1)(1 - C_b) }. $$

Now we know that any bet with a spread less than $σ'$ will give better profit than the bet we started with. Unfortunately, I think this still isn't as good as what we got for a free bet, for three reasons.

  1. For a free bet, we had an easy negative test: some bets (those with $O_b < O'$) could be ruled out on a glance, but verifying them took more work. Here, the test is positive: some bets (those with $σ < σ'$) can be accepted on a glance, but verifying the others takes more work.4

    In practice, I expect the positive test will almost alway be inconclusive, meaning you still need to do the more difficult check on every bet. (I haven't done enough betting myself, while writing this, to say from experience.)

  2. My workflow is to find a plausible-looking back bet and then see how it would be matched. With a free bet, I can run the easy test without looking for the match. For a qualifying bet, I need to find both sides of the bet before I can run the easy test.

  3. Qualifying bets often must be placed at a minimum odds (on the back side) in order to count. That typically rules out the lowest-spread bets (see below digression).

Still, this is what we have. Following a similar theme as before, we can calculate $σ'$ and $P_q$ as functions of each other:

(Graph of P_q(σ'))

$$ P_q = { 1 - C_l \over 1 - C_l + σ' } - 1 \\
   σ' = { 1 - C_l \over P_q + 1 } + C_l - 1. $$

(Note that these equations don't contain $C_b$. That's not because we're assuming it's $0$: when you set $O_b = 1$, $C_b$ completely disappears from the equation for $P_q$.)

Interestingly, the bounds of $P_q$ don't depend on commission at all. As $σ'$ grows, $P_q$ always approaches an asymtote at $-1$, which isn't surprising: you can't quite lose your entire back stake, but you can get arbitrarily close to that, even with no commission.

On the other edge of the graph, we always have $P_q(O_b=1, σ'=0) = 0$. (That may not be clear on this image, but it's easy to see algrebraically.) That's because at $O_b = O_l = 1$, both bets are completely one-sided. On the back side you have a chance of losing money, but no way to win it; on the lay side you have a chance of winning money, but no way to lose it. In particular, if the back bet wins, you make no profit or loss on either bet, so commission is irrelevant. And so the lay stake is calibrated for your lay winnings, after commission, to cancel out your back loss. (But if someone is willing to give you free maybe-money, you might as well ask for as much maybe-money as they're willing to give you.)

And again, given profit, we can calculate the level curve of bets which return that profit. Unsurprisingly, we find another linear relationship; it comes to

$$ O_bC^+ + C_b(1 - C_l) = (Pq + 1)(Ol - Cl) \\
   O_b(1 - C_b - Λ) + C_b = Λ(σ - C_l), $$

where

$$ Λ = { P_q + 1 \over 1 - C_l }. $$

I'm afraid I can offer no particular interpretation of what $Λ$ means, though I observe that we can substitute it into a previous equation, $σ' = 1/Λ + C_l - 1$. Note that if $Λ ≥ 1 - C_b$, equivalently if $P_q + 1 ≥ C^+$, then $σ$ and $O_b$ start to move in opposite directions: for fixed profit, $σ$ goes up as $O_b$ goes down. At this point, you get more profit with low $O_b$ as well as with low $σ$, which would be convenient if it was ever realistically going to happen.

(It turns out that $P_q + 1 ≥ C^+$ precisely when $σ ≤ C_l + C_b/(1 - C_b)$. I noted above that if $C_b = 0$, the possible values of $P_q + 1$ depend on the sign of $σ-C_l$. This is the same result, generalized to all values of $C_b$.)

A digression on odds

Note that in general, you can expect spread to be lower at lower odds. That's because odds are more sensitive to evidence when they're high than when they're low.

There's a technical interpretation of that, but I'm just going to illustrate by example. Consider the probabilities $1/5$ and $1/6$. These are complementary to $4/5$ and $5/6$ - the two pairs of probabilities encode the same information. Converting to decimal, probabilities $1/5$ and $1/6$ are decimal odds $1.25$ and $1.2$; and probabilities $4/5$ and $5/6$ are decimal odds $5$ and $6$.

So the difference between the odds $1.25$ and $1.2$ is, in a very important sense, the same as the difference between $5$ and $6$. But when it comes to betting, the spreads of $0.05$ and $1$ are very different.

The takeaway from this is that for qualifying bets, you should be looking at bets with low odds. High odds have better returns, but the effect of low spread is much more significant, and low spread comes with low odds.

The effects of commission

I want to explore one more question: how does profit depend on commission? For this, we'll keep $C_b$ and $O_b$ fixed, and explore how $O_l$ and $C_l$ affect profit.

Conveniently, the term we want to maximise is the same,

$$ \begin{align*}
    P_q + 1 ∝ { 1 - C_l \over O_l - C_l }  \\
    P_f ∝ { 1 - C_l \over O_l - C_l }.
\end{align*} $$

So if we find the same lay bet on two different exchanges, we can compare them without regard for the back bet we'd be matching.

The two exchanges I've used have $C_l$ of $0.02$ and $0.05$, so they give equal profits when

$$ { 0.98 \over O_S - 0.02 } = { 0.95 \over O_B - 0.95 } $$

where $O_S$ is the odds offered on Smarkets and $O_B$ is the odds offered on Betfair. This rearranges to

$$ 98·O_b = 95·O_S. $$

Since $98/95 ≈ 1.03$, it's better to use Betfair than Smarkets if the offered odds are roughly $3%$ lower, which happens to be the difference in commission. So for example, odds of $6$ on Betfair correspond to roughly $6.19$ on Smarkets.

It should be easy to take a bunch of equivalent bets on the two sites, and compare to see which seems likely to give better profit in general. I started doing that, but then I looked at three football games and got bored and stopped.

They all had exactly the same odds on all three positions (win/draw/win), even when they fluctuated slightly pre-game. (I did look at one game as it began, and the two sites didn't quite stay in sync then. But betting while odds are fluctuating a lot is a bad idea.) This suggests that Smarkets typically offers better profits. But Betfair is a more popular site, which probably has advantages; in particular, it's plausible that a large bet would be harder to fully match on Smarkets.


And that's it. There are plenty more interesting questions you could ask5, but I'm going to stop there.

Something that would be nice would be a calculator that can make use of this. The online calculators all seem pretty crap: they only tell you profit, lay stake and lay liability, and only for one bet at a time. Being able to compare bets seems like it would be a pretty important feature, but I haven't seen it anywhere. (Some of them have features to work with more complicated types of bets than I've looked at, but I don't care about that. Here's one that's no worse than any other.) I've also seen an excel calculator, which had the neat feature of automatically adding bets to a spreadsheet. But again, only one bet at a time; plus, I don't have excel, and don't keep track of my bets in a spreadsheet. (I use ledger, though it's not a great fit.)

I've written a command-line tool that can show you multiple bets at a time for comparison purposes. It also shows you, for a free bet, the lowest possible back odds to improve on your highest-profit bet ($O'$); or, for a qualifying bet, the highest possible spread that won't improve on it ($σ'$).

But the interface isn't very good, I think partly because of the limits of the command line and partly because of a questionable design decision (see the README). And it can't show graphs, which I think would be nice. If you want to use it anyway, it's on github.

If I were to dedicate more time to the project, I currently think I'd start again in Javascript. I think I have a vague idea of how a decent one could work. But right now, as far as I can tell there are no good calculators.

  1. Well, I haven't done any for a few months now. This post took a long time to write. 

  2. I think maybe sometimes bookmakers will offer improved odds as a special offer, and maybe then sometimes you'll find $O_b > O_l$. But it's not a case I'll particularly consider in this post. 

  3. I'm also assuming that all free bets are "stake not returned". In a standard bet, if you win, you get your stake back and also collect your winnings. If you win a stake not returned free bet, you collect your winnings; but you don't also get to keep the stake, because that wasn't yours in the first place. If you have a "stake returned" free bet, that effectively increases the odds on the back side by $1$. I've not yet encountered one of these, myself.

    Another type is the "risk-free" bet, which I won't go into here partly because I'm not 100% sure what it means. But I think that "£10 in risk-free bets" allows you to make a bet of more than £10, and if you lose, you get £10 back. I think the way to treat it is as putting £10 into a free bet (stake not returned) and the remainder of your stake into a qualifying bet, and so by default you should put in no more than the risk-free amount. 

  4. Shape of the "more profitable" space

    Another way to look at this is by the shape of the "more profitable" space on the graphs of $P_q(O_b, σ)$ and $P_f(O_b, σ)$, as seen here. On the $P_f$ graph, $O'$ carves up the "less profitable" space without entering the "more profitable" space. On the $P_q$ graph, $σ'$ carves up the "more profitable" space without entering the "less profitable" space. There's no equivalent of $O'$ for a qualifying bet, or of $σ'$ for a free bet. 

  5. A thing I'd like to explore at some point, is qualifying bets where you only win a free bet under certain circumstances. For example, you bet on a team to win a football match, and you win a free bet iff your team is winning at the end of both halves. I think you can still guarantee a profit with the right combination of bets, but there are more variables involved. (On the exchange, you lay your team to win, like normal. That way you turn a profit if they win both halves, and lose a small amount otherwise. So you also lay them to win both halves - that's not a standard win/lose/draw bet, but I think it's generally available. If you pick the right stakes, you can turn a profit in both cases. Though probably not exactly the same profit in both cases, because you can't easily predict exactly how much the free bet will be worth.) 

Posted on 02 June 2018 | Comments

Pareto improvements are rarer than they seem

this is surely not an original insight, but I haven't seen it before

A Pareto improvement is where you make one party better off and no parties worse off.

Suppose Adam has a rare baseball card. He assigns no intrinsic value to baseball cards. Adam likes Beth, and somewhat values her happiness. Beth collects baseball cards, and would happily pay $100 for Adam's card.

If Adam just gives Beth his baseball card, is that a Pareto improvement? Naively, yes: he loses the card that he doesn't care about, and gains her happiness; she gains the card. Both are better off.

But I claim not, because if Adam has the card, he can sell it to Beth for $100. He would much prefer doing that over just giving her the card. But if Beth has the card, he can't do that. He assigns no intrinsic value to the card, but he can still value it as a trading chip.

Now suppose Adam has the baseball card but Beth also has a copy of that card. Then Beth has less desire for Adam's card, so this situation also isn't a Pareto improvement over the original. By giving something to Beth, we've made Adam's situation worse, even though Adam likes Beth and values her happiness.

And I think situations like this are common. The ability to give someone something they want, is a form of power; and power is instrumentally useful. And the less someone wants, the less able you are to give them something they want1.

For a closer-to-reality example, the reddit comment that sparked this post said:

bringing Platform 3 back into use at Liverpool Street Underground Station was denied because the platform would not be accessible. Neither of the platforms currently in use for that line is accessible, so allowing Platform 3 to be used would be a Pareto improvement

The model here is that there are two parties, people who can access the platforms at Liverpool St and those who can't. If Platform 3 is brought back into use, the first group gains something and the second group loses nothing.

But I think that if Platform 3 is brought back into use, the second group loses some power. They lose the power to say "we'll let you bring back Platform 3 if you give us…". Maybe Platform 3 can be made accessible for $1 million. Then they can say "we'll let you bring it back if you make it accessible", but they can't do that if it's already back in use.

And they lose some power to say "if you ignore us, we'll make things difficult for you". Maybe it would take \$1 trillion to make Platform 3 accessible. If Platform 3 remains out of use, people are more likely to spend \$1 million to make their building projects accessible, because they've seen what happens if they don't. Conversely, if Platform 3 comes back, people are more likely to exaggerate future costs of accessibility. "If I say it costs \$1 million, I'll have to pay. If I say it costs \$10 million, maybe I won't."

I haven't researched the situation in question, and I expect that the actual power dynamics in play don't look quite like that. But I think the point stands.

(My original reply said: "If it's easier to turn an unused inaccessible platform into a used accessible platform, than to turn a used inaccessible platform into a used accessible platform - I don't know if that's the case, but it sounds plausible - then opening the platform isn't a Pareto improvement." That still seems true to me, but it's not what I'm talking about here. There are lots of reasons why something might not be a Pareto improvement.)

This doesn't mean Pareto improvements don't exist. But I think a lot of things that look like them are not.


Update 2018-02-02: some good comments on reddit and LessWrong. Following those, I have two things in particular to add.

First, that I like /u/AntiTwister's summary: "If you have the power to prevent another party from gaining utility, then you lose utility by giving up that power even if you are allies. There is opportunity cost in abstaining from using your power as a bargaining chip to increase your own utility."

Second, that there is a related (weaker) concept called Kaldor-Hicks efficiency. I think that a lot of the things that look-like-but-aren't Pareto improvements, are still Kaldor-Hicks improvements - meaning that the utility lost by the losing parties is still less than the utility gained by the winners. In theory, that means that the winners could compensate the losers by giving them some money, to reach a Pareto improvement over the original state. But various political and practical issues can (and often do) get in the way of that.

  1. This feels like it generalizes far beyond questions of Pareto efficiency, but I'm not sure how to frame it. Something like game theory is more competitive than it appears. Even when no two players value the same resource, even when all players genuinely want all other players to do well, players still have an incentive to sabotage each other. 

Posted on 27 January 2018 | Comments

Improving goimports

Epistemic status: Pure dilettantism. I have never used Go. This might make me unqualified to say what I'm about to say. But it's okay, because I use the word "seems" a lot.

In Go, if you have an unused import, your program fails to compile. This has made a lot of people mildly annoyed and been sometimes regarded as a bad idea, but not universally.

The devs decline to add a compiler flag to change this behaviour, "because compiler options should not affect the semantics of the language and because the Go compiler does not report warnings, only errors that prevent compilation". This strikes me as reasonable, if not the decision I personally would make.

Instead, they suggest this pattern:

import "unused"

// This declaration marks the import as used by referencing an
// item from the package.
var _ = unused.Item  // TODO: Delete before committing!

- that is, add a line of code to trick the compiler into thinking that you're using the import, even though you're not. (Unless you forget to delete the trick line, which the compiler won't verify for you.)

This does not strike me as a very good substitute. With a compiler flag, I could turn it on while debugging and turn it off for production to make sure I had no unused imports. I could use a commit hook to keep unused imports out of the repository. By tricking the compiler, there's no particularly easy way to do this. (I suppose I could put the string "FAKEIMPORT" is a comment on those lines, and grep for that string. This is still not a great solution.)

I also refer to the devs' opinion that "if it's worth complaining about, it's worth fixing in the code. (And if it's not worth fixing, it's not worth mentioning.)" I claim that there's no fundamental difference between

import unused

and

import unused
var _ = unused.Item

Neither affects the semantics of the program, if we ignore that the first one doesn't compile1. If one is worth complaining about, so is the other. But the devs are sending mixed signals. It seems the first is worth complaining about, because the compiler complains. But it seems the second is not, because the devs recommend it. This should be a sign that something about this whole situation is less than ideal2.

(For myself, I would say that both are worth complaining about in production, and neither is worth complaining about when debugging. But the trick works equally well in both instances.)

There is another solution, in the form of a tool called goimports. The idea is that you don't write imports at all. If you have a symbol that isn't imported, it searches your filesystem for an appropriate package and adds an import line. If you have an unused import, it deletes it.

But word on the street is that some names are ambiguous, supplied by multiple packages, and goimports has no way to choose which one you mean. So if you comment out a line, and then uncomment it, goimports might add back a different import than the one you originally used. This, too, seems less than ideal.


All of which is to say: although I don't use Go, it seems to me that Go has a problem, and that the existing solutions aren't perfect.

I propose a new solution, which could be implemented as two new modes for goimports to run in or as one or two completely new tools.

In the first mode, this tool acts like goimports, but more conservatively. Instead of removing unused imports, it merely comments them out. And instead of searching your filesystem for packages, it merely searches your comment lines, and uncomments them if there's a match.

So if you're debugging, and comment out the only use of a package, this tool will comment out the import for you. When you uncomment that use, the import will be added back in, but without the ambiguous naming problem. At no point do you have to trick the compiler, so you don't have to remember to stop tricking the compiler.

In the second mode, this tool checks for commented out import lines, and tells you whether it found any (or optionally deletes them). It can be called in commit hooks, to prevent such lines from cluttering up a repository.

This seems to me like it would be an improvement on the status quo.

  1. At least, they don't look like they do. Not knowing Go, it's conceivable that they could. If the second can have effects, then the devs' recommendation seems particularly bad. In that case, if a var _ = unused.Item line shows up in a codebase, it's probably a mistake - but you can't be sure without checking, which takes much longer. 

  2. I want to clarify three things. First, I do not think the devs are wrong to recommend using this trick. Second, I do not think the second version is just as bad as the first. The second version has a more obvious smell, which is valuable, because bad code that smells bad is less bad than bad code that smells good. Third, I said the situation is less than ideal, and that's what I meant. I do not think the situation is terrible. 

Posted on 13 January 2017 | Comments

The Parable of the Clock

The monk Dawa had a clock that had stopped, and he was content. When he wished to know the hour, he would glance at the clock, and discover that it was noon.

One day a visiting friend commented on the clock. "Why does your clock say that the hour is noon, when I am quite sure that it is six in the evening?"

Dawa found this unlikely, for the hour had always been noon in his experience. But he had been instilled with the virtues of curiosity and empiricism. If the hour is noon, I desire to believe it is noon. If the hour is six in the evening, I desire to believe it is six in the evening. Let me not become attached to beliefs I may not want. Thus fortified, he sought out other clocks.

The time was indeed six in the evening. In accordance with the virtue of relinquishment, and gently laughing inside at his past foolishness, Dawa serenely set his broken clock forwards by six hours.

Posted on 22 December 2016 | Comments

Data surgery is not rocket science

This is a story of me failing to do something, and some thoughts on how I might have succeeded.

A friend had a problem. He'd been recording some audio on his phone when the battery died, leaving him with a .m4a file that couldn't be played. He had a look at the contents and it contained a bunch of data, so he offered $40 if someone could recover the audio for him.

This seemed like a fun challenge that would exercise skills there isn't much call for. I didn't necessarily expect to succeed, but I decided to give it a go. (I only remember doing something like this once before, when I had a corrupt FAT partition that I managed to fix with some bit-flipping.)

To help, the friend provided two small files from the same app: one a successful recording, and one corrupt like the target file.

The simplest thing was to simply try playing the broken files with mplayer, just in case. It didn't work, and gave an error message saying (among other things) "moov atom not found".

The next thing was to look at all the files in a hex editor, which in this case was "hexdump -C piped into less" because I don't think I have a dedicated hex editor installed. I quickly noticed that the good recording had the bytes moov at location 0x1d, while the corrupt recordings both had the bytes free there.

I also noticed that all three files had the bytes mdat at location 0xc95, followed by some low-entropy data, and then some apparently high-entropy data that seemed to go on to the end of the file. I guessed that this was the actual audio data, while the start of the good recording was a valid audio header.

00000000  00 00 00 18 66 74 79 70  6d 70 34 32 00 00 00 00  |....ftypmp42....|
00000010  69 73 6f 6d 6d 70 34 32  00 00 0c 79 66 72 65 65  |isommp42...yfree|
00000020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000c90  00 3f 3f 3f 3f 6d 64 61  74 01 40 22 80 a3 7f f8  |.????mdat.@"....|
00000ca0  85 2d 2d 2d 2d 2d 2d 2d  2d 2d 2d 2d 2d 2d 2d 2d  |.---------------|
00000cb0  2d 2d 2d 2d 2d 2d 2d 2d  2d 2d 2d 2d 2d 2d 2d 2d  |----------------|
*
00000da0  2d 2d 2d 2d 2d 2d 2d 2d  2d 2d 2d 2d 2e ff f1 0a  |------------....|
00000db0  5a 5a 5a 5a 5a 5a 5a 5a  5a 5a 5a 5a 5a 5a 5a 5a  |ZZZZZZZZZZZZZZZZ|
*
00000eb0  5a 5a 5a 5a 5a 5a 5a 5a  5a 5a 5a 5d e0 e2 14 b4  |ZZZZZZZZZZZ]....|
00000ec0  b4 b4 b4 b4 b4 b4 b4 b4  b4 b4 b4 b4 b4 b4 b4 b4  |................|
00000ed0  b4 b4 bc 00 e2 36 2e 70  5b 0c 09 88 8b b1 2a ae  |.....6.p[.....*.|
00000ee0  9d 55 6c 14 4c 2a 2a 55  45 44 01 c5 35 93 14 a0  |.Ul.L**UED..5...|

The start of the target file. hexdump prints a * to indicate that several lines have been skipped which were identical to the one above. The audio data seems to start around position 0xed2.

So that gave me another simple thing to try. I took the first 0xc95 bytes from the good file, and byte 0xc95 onwards from the small corrupt file, and concatenated them.

head -c 3221 tinytest_notbroken.m4a > fixtest-1.m4a
tail -c +3222 tinytest_broken.m4a >> fixtest-1.m4a

After this, fixtest-1.m4a played in mplayer. It printed lots of warning messages while it did so, but whatever.

So I did the same thing with the target file, and recovered two seconds of the original audio. I could report to my friend that it started with him saying "um".

This is what I was expecting, since the good recording was only two seconds, and it would presumably have had the length in the header. But now I needed to work out how to lengthen it.

I played around with ffmpeg, but it didn't seem to have an option for "ignore the audio duration stated in the header", so it seemed I would have to fix the header myself.

ftyppm42 seemed like it might be a file type indicator, so I googled that. A few clicks led to the MP4 registration authority, which suggested that the relevant standard was called "ISO/IEC 14496-14". So I then googled "ISO/IEC 14496-14 track header", and found a pdf of the relevant standard. Unfortunately this is a high-context document, I didn't particularly understand it, and it didn't help me very much.

I also found wikipedia on MP4, which pointed me to ISO base media file format, ISO/IEC 14496-12. Google then gave me a pdf of that standard, which was much more detailed, helpful, and lengthy.

I didn't attempt to understand it all. But I searched for "mdat", and shortly after one of the hits, I found the following description of a data structure:

aligned(8) class MovieHeaderBox extends FullBox(‘mvhd’, version, 0) {
    if (version==1) {
        unsigned int(64) creation_time;
        unsigned int(64) modification_time;
        unsigned int(32) timescale;
        unsigned int(64) duration;
    } else { // version==0
        unsigned int(32) creation_time;
        unsigned int(32) modification_time;
        unsigned int(32) timescale;
        unsigned int(32) duration;
    }
    template int(32) rate = 0x00010000; // typically 1.0
    template int(16) volume = 0x0100; // typically, full volume
    const bit(16) reserved = 0;
    const unsigned int(32)[2] reserved = 0;
    template int(32)[9] matrix =
        { 0x00010000,0,0,0,0x00010000,0,0,0,0x40000000 };
        // Unity matrix
    bit(32)[6] pre_defined = 0;
    unsigned int(32) next_track_ID;
}

Promising! My header contained the following two lines:

00000020  00 00 00 6c 6d 76 68 64  00 00 00 00 d4 6b 4f 0d  |...lmvhd.....kO.|
00000030  d4 6b 4f 0d 00 00 03 e8  00 00 08 17 00 01 00 00  |.kO.............|

That's mvhd from the description, followed by four null bytes, followed by two identical 32-bit ints - the creation and modification times would naturally have been the same - and then two more ints.

timescale was 0x3e8, decimal 1000, which per spec means 1000 time units pass in a second. duration was 0x817, decimal 2071, indicating a track 2.071 seconds long. So presumably I need to edit this value.

What to set it to? As long as it's sufficiently long, it's not important. So I set it to 0x10000817, using emacs (in hexl-mode) as my editor, and tried again. No dice, it still cut off at two seconds.

So I searched "duration" in the same file, and found two more places to edit. One soon after the bytes tkhd, which used the same timescale as the mvhd; and one soon after the bytes mdhd, which had its own timescale. In this case the timescale was 0xac44, decimal 44100.

But adding 0x1000000 to both of these durations didn't help either.

(After one of these three edits - I don't remember which - mplayer started reporting the file as being 74 hours long. But it still cut out after two seconds.)

At this point I was up late and out of ideas. But I realized that I could just ask my friend to record another long track, and then use the header from that to do the job. So I left him instructions, plus some notes on where I'd got to, so someone else could pick up where I'd left off if that didn't work.

Then while I was asleep, someone else came along and fixed it for him before he tried that. I don't know how.

But here are some thoughts on how I might have proceeded, if my friend hadn't been able to record the longer audio.

For one thing, I might have been able to generate my own long file in ffmpeg. But attempting this now, I can't make it work. I concatenate several copies of the good recording, and get a file that starts with these three lines:

00000000  00 00 00 18 66 74 79 70  4d 34 41 20 00 00 02 00  |....ftypM4A ....|
00000010  69 73 6f 6d 69 73 6f 32  00 00 00 08 66 72 65 65  |isomiso2....free|
00000020  00 07 93 5e 6d 64 61 74  01 40 22 80 a3 7f f8 85  |...^mdat.@".....|

It plays fine. But when I try to mix it with the target file, it doesn't play. It complains about the missing moov atom, even though that's also missing in the working concatenated file. I'm not sure what's wrong.

Failing that, I could have recorded the long file myself in the same app my friend used, to spare his attention for things he actually wanted to be doing. (Which, after all, was the reason he was willing to pay someone else to fix the file.) I could also, for curiousity's sake, have recorded another short file, and attempted to find more durations by comparing the headers.

But perhaps the simplest thing would have been to take a completely different approach from the beginning. It turns out that other people have encountered this problem before, and paved the way for those like me. For example, someone on facebook posted this page, which teaches you to fix these files using a piece of software called "faad.exe". More research reveals that this is open source, and even available in portage.

(I also find references to FAAD if I google "fix corrupt m4a file".)

It looks like this was a case of "more haste, less speed". At least the haste was fun, but probably not worth the $40 I might have earned by being thorough.

Posted on 17 December 2016 | Comments

The Sally-Anne Fallacy

I'd like to coin a term. The Sally-Anne fallacy is the mistake of assuming that somone believes something, simply because that thing is true.1

The name comes from the Sally-Anne test, used in developmental psychology to detect theory of mind. Someone who lacks theory of mind will fail the Sally-Anne test, thinking that Sally knows where the marble is. The Sally-Anne fallacy is also a failure of theory of mind.

In internet arguments, this will often come up as part of a chain of reasoning, such as: you think X; X implies Y; therefore you think Y. Or: you support X; X leads to Y; therefore you support Y.2

So for example, we have this complaint about the words "African dialect" used in Age of Ultron. The argument goes: a dialect is a variation on a language, therefore Marvel thinks "African" is a language.

You think "African" has dialects; "has dialects" implies "is a language"; therefore you think "African" is a language.

Or maybe Marvel just doesn't know what a "dialect" is.

This is also a mistake I was pointing at in Fascists and Rakes. You think it's okay to eat tic-tacs; tic-tacs are sentient; therefore you think it's okay to eat sentient things. Versus: you think I should be forbidden from eating tic-tacs; tic-tacs are nonsentient; therefore you think I should be forbidden from eating nonsentient things. No, in both cases the defendant is just wrong about whether tic-tacs are sentient.

Many political conflicts include arguments that look like this. You fight our cause; our cause is the cause of [good thing]; therefore you oppose [good thing]. Sometimes people disagree about what's good, but sometimes they just disagree about how to get there, and think that a cause is harmful to its stated goals. Thus, liberals and libertarians symmetrically accuse each other of not caring about the poor.3

If you want to convince someone to change their mind, it's important to know what they're wrong about. The Sally-Anne fallacy causes us to mistarget our counterarguments, and to mistake potential allies for inevitable enemies.

  1. From the outside, this looks like "simply because you believe that thing". 

  2. Another possible misunderstanding here, is if you agree that X leads to Y and Y is bad, but still think X is worth it. 

  3. Of course, sometimes people will pretend not to believe the obvious truth so that they can further their dastardly ends. But sometimes they're just wrong. And sometimes they'll be right, and the obvious truth will be untrue. 

Posted on 09 April 2016 | Comments

Political Polarization in the US House of Representatives

I've created an interactive graph of historical levels of political polarization in the US House of Representatives. It would be tricky to embed in this blog, so I'm only linking it. Summary:

The x-axis on this graph is based on DW-NOMINATE left-right scores of each member of each U.S. House of Representatives from 1865 to 2015. This uses a member's voting record to measure the direction and extremity of their political views, regardless of party affiliation.

If a member's score on this axis is known, it's possible to predict their vote on any given issue with high confidence, given no other information about the member. Members whose votes are typically left-aligned receive negative scores, while members whose votes are typically right-aligned receive positive scores.

(However, see The Thin Blue Line That Stays Strangely Horizontal, which questions the validity of DW-NOMINATE.)

Background: I made this last year for a Udacity course, "Data Visualization and D3.js". I needed to submit a visualization for marking, and this was my submission. I'm grateful for feedback provided by Udacity and by some of my friends. Without that, the result would certainly have been worse.

The source is available on GitHub, including datasets and some python scripts I used to process them. The README also documents some of the design history.

I'm aware of one bug: in firefox (38.6.1 on linux), the legend appears to display the 5-95 and 25-75 percentile boxes identically. They're implemented as rects with fill-opacity: 0.3: the 25-75 boxes have two of these rects on top of each other. This is also how the paths on the graph itself are colored.

I assume there are other bugs.

Posted on 26 February 2016 | Comments

Officially faceblind

I realized a few years ago that I was at least somewhat faceblind/prosopagnosic. A while back I took an online test out of curiousity, and scored low. They said that if I was in London and interested in further tests, I should leave my email address. A few days ago I went in for those tests, and now I have a PhD student (Katie) also telling me I'm faceblind. Which makes it official, I guess.

Next she wants to run EEGs on me, which should be cool. That will help work out where my brain is going wrong, in the long chain between "photons stimulate nerve endings in my eyeballs" and "I recognize a face" (whatever that means). Also, apparently there's a phenomon which sounds to me like blindsight, where some prosopagnosics' brains are clearly reacting to faces on some level that doesn't reach their consciousness. She wants to learn more about that too.

What follows is discussion of the tests, my scores, and what they mean. I've been given a powerpoint with my scores reported as percentiles, along with absolute scores and average control scores. 2% or lower is counted as "impaired". Percentiles are only given as integers, or as "<1%". On the day, Katie also gave me some numbers in terms of standard deviations (σ). Under a normal distribution, 2.5% would be approximately -2σ, but I'm not sure any of these results will be normally distributed, so I don't know if σ scores really tell me anything.

A note: if you think you might be faceblind, and you'd be interested in getting more detailed tests, it might be a good idea not to read the below. I expect it wouldn't significantly bias the results if you did, except for one bit that I've rot13ed. But I don't trust myself to make that call. If you're in London, you can take the above test like me and see what happens. Otherwise I'm not sure how you'd go about getting more tests.


The object/face recognition tests were "memorise these things, then we show you a sequence of things and you have to say if each of these things was a thing in the first set". The things were houses, cars, horses, and bald women's faces. I was bad at all of these: 4% for cars, 2% for houses, and <1% for horses and women. (Average score was higher for women than horses, and my score was higher for horses than women, so I'm worse at women than horses. I think Katie told me I was somewhere between -5σ and -6σ for women. Under normality, -5σ is one in thirty million, but this is clearly not normal.) So it seems I have some level of general object agnosia, but more specific prosopagnosia on top of that.

I was 11% for reading emotions from eyes, which is a point for Team Phil Does Not Have Aspergers (some of my friends are divided about that). In fact, the average score is 26 and I scored 23, and there were a few cases where I said an answer, then thought "wait no it's this" and didn't say anything because I wasn't sure if I should. (I was speaking my answer and Katie was recording it. I had to choose from four emotions, so I'm not sure why this wasn't recorded by a computer like most of the other tests.) So plausibly I'm actually above 11%.

I was <1% at famous face recognition, recognising five out of fifty that I'd been exposed to, out of sixty in total. (I got Jvyy Fzvgu, Uneevfba Sbeq, Dhrra Ryvmnorgu, Ebova Jvyyvnzf, and surprisingly Ovyy Pyvagba.) It seems that controls tend to get above 40, so even counting that "exposed to" is vague, I did really badly at this. I think Katie said I was -9σ, which would be one in 10^19 under normality.

I'm <1% at the Cambridge Memory Test for Faces, which is the one I linked above. I actually scored worse in the lab than online. (47% versus 58%, IIRC, with a control average of 80%, and 60% indicating impairment. But the lab score I've been given is 34 against control of 58, so it's clearly been adjusted.) There could be any number of reasons for this, including "chance". But when I took it online, I often thought that one of the faces looked a little like Matt Damon, and chose that one. I like to think that "mistaking people for Matt Damon" is the way forward in face recognition.

I was somewhat okay at half of the Cambridge Face Perception Test. In this one, they showed me a face at an angle, and below it the same face face-on, six times, with varying degrees of modification. I had to order them according to how similar each was to the original face, within a minute. For half the tests, the faces were all upside down. For all of the tests, they all looked incredibly similar and my instinctive reaction was WTF.

On the upright test, I got <1%. On the inverted test, I got 7%. One strategy I used a few times was to focus on the lips, specifically on the size of the dip in the bow. I just ordered them according to that. I guess it helps, but I found it a lot easier for inverted faces.

Doing better at inverted would seem to suggest that I'm doing some kind of holistic face processing that goes wrong and blocks off later avenues for perception. Buuut, objectively I scored worse on the inverted faces, just not as much worse as average, so I'm not sure if this does suggest that. (And I'm not sure it is "objectively" - if all the faces had been assigned to the other condition, would my scores have changed?)

Hypothetically, high scores on both tests could indicate my problem is with memory, not initial perception. The low score here doesn't mean I don't have a problem with memory, but it does seem to hint that I do have a problem with initial perception. And I suspect the famous faces test points at me also having a memory problem.

Posted on 19 January 2016 | Comments

A Scream of Swifties

Inspired by Slate Star Codex

Explaining a joke is like dissecting a frog: it's one way to learn about frogs. If you want me to explain any of these, ask, and I will explain without making fun of you.


"I hear someone playing a triangle in the corridor," said Tom haltingly.

"We've got to overturn every last insect in this garden," said Tom flippantly.

"Goose feathers are annoyingly fragile," said Tom, breaking down.

"Anastasia gives me pins and needles," said Christian gratingly.

"I miss my submissive," René opined.

"I didn't do it, and nor did any of my siblings," Tom insisted.

"It's not much paint, it won't hurt to run your tongue up it," said Tom metallically.

"I'm so sick, even my flu has the flu," said Tom metallurgically.

"It's pitch black and I can hear a monster doing arithmetic," said Tom gruesomely.

"Man City don't hold a candle to the real heros of Manchester," said Tom manually.

"I just bought Manchester United," said Tom virtuously.

"Lancelot told me I was his favourite!" said Tom, surprised.

"I don't think this tube of semen is entirely straight," said the incumbent.

"I can fit inside my GameCube!" said Tom inconsolably.

"In a former life, I was a priest in pre-Columbian Peru," said Tom inconsolably.

"I need a name for my squid-and-flatfish restaurant," said Tom inconsolably.

"I make a living as the red tellytubby," said Tom, apropos.

"I'm doing crunches so I can get a six-pack," said Treebeard absently.

"I'm half-fish and made of lithium," said Treebeard limerently.

"Figure three plots counts of close-ups on male versus female genitals," said Tom pornographically.

"My breasts don't have enough room in this corset," said Victoria, double depressed.

"Bring me the head of my enemy," said Emacs vicariously.

"I have affirming the consequent, base rate neglect, and now also ad hominem," said Tom, getting cocky.

"We lost the treaty, so we had to ratify it again," said Tom, resigned.

"I'm in the group supporting Shiva's wife," said Tom with satisfaction.

Posted on 01 January 2016 | Comments

Unix pro tip: Print and pipe

Sometimes you (or at least, I) want to run a command for its output, but also want to pipe it through another command. For example, see the results of a find but also count how many hits it got. I've sometimes lamented that there's no easy way to do this. But the other day I had a flash of insight and figured out how:

find . | tee /dev/stderr | wc -l
proc1 | tee /dev/stderr | proc2    # general case

(I'm pretty proud of this. I don't know if it's original to me, but I discovered it independently even if not.)

tee will print the output of proc1 to both stdout and stderr. stderr goes to the terminal and stdout goes to proc2.

You can make it more convenient with an alias:

alias terr='tee /dev/stderr | '
find . | terr wc -l

(Putting a pipe in an alias seems to work in both zsh and bash.)

If you want to concatenate the streams, to pipe them to another process, you can use subshells:

proc1 | ( terr proc2 ) 2>&1 | proc3

but note that stderr output from proc2 will also get sent to proc3, unless you send it somewhere else. I haven't yet thought of a use for this.

There are potential issues with buffering here. I'm not aware that tee makes any promises1 about which order it writes the streams in. It's going to be interlacing them while it writes, so that it doesn't need to keep a whole copy in memory. So (if the input is large enough) proc2 will be receiving input before it's finished being written to stderr, and might start writing output, and then the output streams can get interlaced.

For some values of proc2, commands which start printing before they've finished reading, this is inevitable. But I think useful proc2s are likely to be aggregators2 - by which I mean, commands which can't do anything until they've finished reading all their input3. In my tests so far, those have been safe, but that doesn't prove much.

We can do a more reliable check with strace:

find . | strace tee /dev/stderr | wc -l

By the looks of things, tee will read into a buffer, then write it to stdout (the pipe), then write it to the specified target (stderr, which goes to the terminal), and repeat to exhaustion. But the important thing is, it doesn't close any file descriptors until it's finished writing everything, and then it closes the target before it closes stdout. If this is consistent amongst tee implementations - and it seems sensible4 - then aggregators almost definitely won't interlace their output with the output from proc1. I don't want to say "definitely", because there might be other stuff going on that I haven't accounted for. But at any rate, tee will finish writing before the aggregator starts.

Anyway, I see this as being the sort of thing that you're likely use manually, not in an automated process. So if the output does get interlaced a little, it's probably not that big a deal.

  1. Note how I didn't say guaranTEEs? You're welcome. 

  2. In fact, wc is the only proc2 I can currently think of, that I expect to be regularly useful. (And then just piping through cat -n does something similar, but it's not ideal.) numsum or perhaps some other tools from num-utils could conceivably be handy from time to time. 

  3. You can turn a non-aggregator into an aggregator with moreutils' sponge command, which is like cat but reads all its input before writing anything. 

  4. They could reasonably close their outputs in either order, which might make a difference. But I don't think it's usual for programs to explicitly close their std* streams, that gets taken care of by the surrounding infrastructure. So if tee explicitly closes its target, I'd expect that to happen before stdout gets closed. 

Posted on 07 October 2015 | Comments

London Cycle Hires and Weather

Preface: I wrote this report for Udacity's "Explore and Summarize Data" module. The structure is kind of strange for a blog post, but I'm submitting the finished report essentially unchanged.

One thing I will note. I find that the cycle hire usage doesn't change much throughout the year. Shortly after submitting, I read this article which finds that it does vary quite a lot. I'm inclined to trust that result more. It's intuitively sensible, and it looks directly at the number of rides taken, instead of looking at a proxy like I do.

Take this as evidence for how much to trust my other results.

My goal is to investigate usage of the London cycle hire scheme, and in particular how it varies with the weather. I'm running an analysis from July 2013 to June 2014.

I'm using two data sets here. Daily weather data comes from Weather Underground, using the weather station at London Heathrow airport.

(London City Airport is closer to the bike stations that I use, but the data from that airport reports 0 precipitation on every single day. The data from Heathrow seems to be more complete, and I expect it to be almost as relevant.)

I collected the cycle hire data myself, over the course of the year, by downloading CSV files from an unofficial API which now appears to be defunct. It has a granularity of about ten minutes. That's about 50,000 entries per docking station for the year, so for this analysis, I'm only using the data from four docking stations near my office.

All data and source code used for this project can be found in the git repository.

Exploring the weather data

Temperature

plot of chunk temp.1v plot of chunk temp.1v

These variables measure the minimum, average, and maximum daily temperatures. The graphs all look similar, and overlap a lot. The shape is a little surprising, as I didn't expect the density graphs to be bimodal. It could potentially be caused by significant differences between summer and winter, with an abrupt shift between the two.

Rainfall

plot of chunk rain.1v plot of chunk rain.1v

According to the rain column, There are over 225 rainy days and only about 125 non-rainy days. But by far the most common bin for precip.mm is the leftmost one. Table of values of precip.mm:

## 
##     0  0.25  0.51  0.76  1.02  2.03  3.05  4.06  5.08   6.1  7.11  7.87 
##   207    35    20     9    17    22    12     8    12     4     4     2 
##  8.89  9.91 10.92 11.94 13.97 
##     3     5     2     1     2

Although more than half of observations have rain == TRUE, more than half of them also have precip.mm == 0, which needs more investigation. Rainfall as measured by precip.mm versus as measured by rain:

plot of chunk rain.precip.2v plot of chunk rain.precip.2v

The two measures don't always agree. Sometimes rain is false but precip.mm is nonzero; and often rain is true but precip.mm is zero. Neither of those is surprising individually: if rain is only counted when the rainfall exceeds a certain threshold, then that threshold could be large (giving false/nonzero) or small (giving true/zero). But the combination suggests that that isn't what's going on, and I don't know what is.

This table counts the anomalies by turning precip.mm into a boolean zero/nonzero (false/true) and comparing it to rain:

##        
##         FALSE TRUE
##   FALSE   119    9
##   TRUE     88  149

There are 88 instances of true/zero, 9 instances of false/nonzero, but the cases where they agree are the most common.

I find precip.mm to me more plausible here. I feel like fewer than half of days are rainy. This website agrees with me, saying that on average, 164 days out of the year are rainy (rain - 237, precip.mm - 158).

Wind

plot of chunk wind.1v plot of chunk wind.1v

These three measures of wind speed are all averages. wind is simply the average wind speed over a day. wind.max is the daily maximum of the average wind speed over a short time period (I think one minute). gust is the same thing, but with a shorter time period (I think 14 seconds).

Unlike with temperature, the three measures look different. All are right-skewed, although gust looks less so. There are several outliers (the isolated points on the box plots), and the quartiles don't overlap. The minimum gust speed (about 24) is almost as high as the median wind.max.

Exploring the bike data

Time between updates

plot of chunk dt.1v

There are a few outliers here. Not all the lines are visible due to rendering artifacts, but above 5000, we only have five entries:

##                     name        prev.updated             updated
## 46779    Earnshaw Street 2013-10-03 08:50:23 2013-10-13 09:20:28
## 46899  Southampton Place 2013-10-03 08:50:22 2013-10-13 09:20:27
## 46918       High Holborn 2013-10-03 08:50:24 2013-10-13 09:20:30
## 47049         Bury Place 2013-10-03 08:50:26 2013-10-13 09:20:32
## 175705 Southampton Place 2014-06-20 17:36:06 2014-06-30 08:30:03

The first four of these happened when my collection script broke and I failed to realize it. The other occurred when Southampton Place was taken out of service temporarily.

Let's zoom in on the lower ones:

plot of chunk dt.1v.left

There are several instances where the time between updates is unusually large, on the order of hours or days. The times of entries with between 2000 and 5000 minutes between updates:

##                     name        prev.updated             updated
## 32650       High Holborn 2013-08-31 15:10:07 2013-09-02 12:30:05
## 32660         Bury Place 2013-08-31 15:10:08 2013-09-02 12:30:07
## 32672  Southampton Place 2013-08-31 15:10:05 2013-09-02 12:30:04
## 32674    Earnshaw Street 2013-08-31 15:10:06 2013-09-02 12:30:05
## 38546       High Holborn 2013-09-14 22:39:00 2013-09-16 08:24:22
## 38719         Bury Place 2013-09-14 22:39:02 2013-09-16 08:24:23
## 38734  Southampton Place 2013-09-14 22:38:58 2013-09-16 08:24:20
## 38735    Earnshaw Street 2013-09-14 22:38:59 2013-09-16 08:24:21
## 84066         Bury Place 2013-12-27 15:40:08 2013-12-29 23:10:14
## 84069       High Holborn 2013-12-27 15:40:06 2013-12-29 23:10:13
## 84073  Southampton Place 2013-12-27 15:40:05 2013-12-29 23:10:11
## 84078    Earnshaw Street 2013-12-27 15:40:05 2013-12-29 23:10:12
## 84186    Earnshaw Street 2013-12-30 00:10:05 2013-12-31 13:10:07
## 84202       High Holborn 2013-12-30 00:10:06 2013-12-31 13:10:09
## 84269  Southampton Place 2013-12-30 00:10:05 2013-12-31 13:10:06
## 84330         Bury Place 2013-12-30 00:10:07 2013-12-31 13:10:11
## 89443  Southampton Place 2014-01-12 20:20:10 2014-01-14 18:40:07
## 89459       High Holborn 2014-01-12 20:20:13 2014-01-14 18:40:11
## 89467         Bury Place 2014-01-12 20:20:14 2014-01-14 18:40:16
## 89524    Earnshaw Street 2014-01-12 20:20:11 2014-01-14 18:40:09
## 121381   Earnshaw Street 2014-03-15 14:50:06 2014-03-17 01:50:04
## 121398      High Holborn 2014-03-15 14:50:07 2014-03-17 01:50:05
## 121444        Bury Place 2014-03-15 14:50:10 2014-03-17 01:50:07
## 121591 Southampton Place 2014-03-15 14:50:05 2014-03-17 01:50:04
## 133765      High Holborn 2014-04-11 16:59:37 2014-04-14 01:29:07
## 133900   Earnshaw Street 2014-04-11 16:59:36 2014-04-14 01:29:05
## 133961        Bury Place 2014-04-11 16:59:38 2014-04-14 01:29:08
## 134027 Southampton Place 2014-04-11 16:59:35 2014-04-14 01:29:05

It looks like these happened to all stations simultaneously, suggesting problems with either my collection script or the API, rather than problems with individual locations.

Entries with less than 60 minutes between updates, no longer on a log scale:

plot of chunk dt.1v.60

In the vast majority of cases, updates are approximately ten minutes apart. This encourages me to take a subset of the data (bikes.all -> bikes), considering only entries with d.updated less than 15 minutes. This eliminates many outliers in future graphs.

Date and time of update

plot of chunk date.time.1v plot of chunk date.time.1v

All times of day are approximately equally represented to within ten minutes, which is good. There are five noticeable troughs preceeded by spikes, but they probably don't signify much. Dates are a lot less uniform, however. Even apart from the ten-day period where my script was broken, many days have significantly fewer updates than typical, and some have none at all.

Number of days spent with a given number of active docks

plot of chunk ndocks.time.2v

It was common for every station to report less than a full complement of docks. At least two had a full complement for less than half the time (High Holborn and Bury place are unclear in that respect). This isn't surprising, since a bike reported as defective will be locked in, using up a slot but not being available for hire.

Journeys taken throughout the year

plot of chunk date.journeys.2v

The time of year makes very little difference to the number of rides. There appears to be a slight sinusoidal relationship, but it's very weak. (I didn't do a PMCC test because that assumes that any relationship is linear, which we would naively expect not to be the case here, and also doesn't look true from the graph.)

Journeys by weekday

plot of chunk weekday.journeys.2v plot of chunk weekday.journeys.2v plot of chunk weekday.journeys.2v

Fewer journeys are taken on weekends. The median number of bikes available doesn't change much throughout the week (5 on monday and friday, 4 on other days), but the distribution does. Saturday and Sunday have noticeably different shapes to the others. They have a single peak, while weekdays are somewhat bimodal, with a small peak where the station is full (probably when people are arriving at work).

(Since the stations have different numbers of docks, I did a graph of fullness rather than of number of bikes. The density plot doesn't show peaks exactly at 0 and 1 because of how the density window works, but histograms of num.bikes and num.spaces show that that's where they are. It would be difficult to use a histogram for this graph because there's no sensible binwidth.)

Change in number of bikes between updates

plot of chunk bikes.prevbikes.name.mv

## 
##  Pearson's product-moment correlation
## 
## data:  bikes$num.bikes and bikes$prev.num.bikes
## t = 2466.8, df = 173250, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.9859301 0.9861908
## sample estimates:
##      cor 
## 0.986061

There's very strong correlation between the number of bikes in adjacent entries. This is as expected, especially given what we saw about d.num.bikes previously. The colors here don't show any particular station-dependent trends.

Number of bikes at any given time