(Content note: minor untagged spoilers for the solutions to certain logic puzzles. No major spoilers.)
(Epistemic note: it’s impossible to fairly judge how easy or hard a puzzle is when I discovered the answer years ago, sometimes without working it out for myself. I’m about to do that anyway.)
There seems to be some kind of taxonomy for logic puzzles and how hard they are and the sort of strategies that are needed to solve them.
This might be too subjective to be worth talking about, but I suspect not, and the way to find out is to talk about it and see if people understand me, so without further ado:
Let's vaguely pretend that logic puzzles have a sort of fundamental particle which is the insight. You collect insights, and then you use insights to collect other insights, and eventually you win.
Now I'm going to describe five types of puzzle.1
Deep: For example, the blue eyes puzzle. I consider this one to be actually pretty easy. You ask what happens with one blue-eyed person, then two, then three, and just proceed by induction. (Even if you don’t know the terms common knowledge and mutual knowledge and the difference between them, in the two-person case you can say “she tells Alice that Bob can see blue eyes” and in the three-person case you can say “she tells Alice that Bob knows that Carol can see blue eyes” and so on.) There's only one real insight, which you apply lots of times. Or just a couple of times until you get bored and assume/prove the pattern continues.
(I feel like there's a lot of similarity between solving a logic puzzle and proving a math theorem. Here, the math equivalent is proof by induction.)
Wide: Cheryl’s birthday is also quite easy, but what difficulty it has lies in breadth rather than depth. Every line tells you something new, and you need to extract a different insight from each of them.
(Math equivalent: a theorem with a complicated statement but a proof where every line just reads "because X, Y".)
Trivial puzzles are the limiting case of either of the above. I don't know any famous ones, but the blue-eyed puzzle with one blue-eyed person would count.
(Math equivalent: modus ponens.)
Verify: Then there are problems where as far as I know you kind of just have to intuit (or guess) the solution, and then demonstrate that it works. (The demonstration is a recursive problem that deserves its own classification, or possibly doesn't even count as a logic puzzle. Often it is just math.2) The key insight is hidden away, with no hints as to what it might be. The fork in the road is a good example of this, I think, where the demonstration has reasonable claim to being trivial.
It’s a very fuzzy and personal category, because there are patterns that you may be able to recognize which would move it towards wide.
(Math equivalent: a proof which introduces some new device that you don't know where it came from, but it turns out to be exactly what you need.)
Learn: And there are probably also problems where, if you happen to know the right thing and it comes to mind at the right time, the puzzle will be trivial (or wide or...). Otherwise you’ll have to deduce that thing for yourself; good luck with that. I can’t offhand think of any canonical examples.
This category may be even fuzzier and more personal, and probably overlaps a lot with verify, and is hard to detect except in hindsight. (If you can work out what you need to know, it moves towards wide or verify.)
Of course every puzzle has a thing which is the solution to that particular puzzle; and many puzzles rely on the ability to do basic arithmetic or other such skills. Those obviously "shouldn't count" towards classifying a puzzle as learn, but I don't have a principled way to distinguish them from things that should.
(Math equivalent: a proof in complex analysis that relies on an obscure theorem from number theory.)
I don't intend for these descriptions to be fully general. I think that they're pointing at real clusters in puzzlespace, but don't fully capture the essences of those clusters.
Even if they did, I expect these clusters are going to fail to capture a lot of variation, and attempting to shove things into them will sometimes be a silly thing to do. But let's make a start with them, and try to classify some problems that come to mind:
This transfinite epistemic logic puzzle challenge is mostly a combination of deep and wide I think, but also some learn because I wouldn't recommend attempting it if you don't know about order types. I didn't find it overly challenging. (Except that reading the solution, I no longer remember if I got the answer exactly correct. I might have made a fencepost error.)
I haven't solved the sum and product puzzle, but as far as I've got, it's been wide. I'm pretty sure I could finish it with brute force,3 but I suspect there's an elegant way to do it which is either a verify or a learn, possibly involving number theory.
The hardest logic puzzle ever is the fork in the road on steroids. Given the solution, proving that that solution works would be a moderately challenging wide puzzle. Generating it from scratch seems to require multiple levels of verify and wide.
The hundred prisoners problem is very verify. The verification step is a combinatorics problem. (I feel like there are a lot of problems based on having a hundred prisoners and an intellectually perverse gatekeeper, and I feel like most of them are verify. Here's another example.)
This card trick/puzzle is a verify. (I'm pretty sure I have a solution, but I've only verified it in my head, and I didn't fully work out the details. My solution is very different to the posted one. Arguably, the posted solution has a learn on the pigeonhole principle and mine doesn't.)
The pirate game is a little deep and not much else.
Logic grid puzzles are wide. They differ from Cheryl's birthday in that every clue needs to get used multiple times, but each clue-application is still straightforward. That doesn't mean they're easy.
Here are some other features that we might want to talk about.
Logic puzzles often (not always) have a number of characters. Sometimes you know more than/just as much as the characters, and need to work out how each of them will act. Sometimes you know less, and need to work out what they know from how they act.
Sometimes you need to come up with a strategy rather than a specific answer. Sometimes there are multiple strategies. Sometimes there's a foolproof strategy, other times you just need to find one as effective as possible. If that's the case, then it might be difficult to work out exactly how effective it is, even if it seems obviously more effective than the naive strategy; and proving that it's optimal might be harder again. It took three years to prove optimality on the hundred prisoners.
Different puzzles require different "levels" of theory of mind, and I suspect there's a lot of variation in how well people can keep track of those without accidentally collapsing them. "Alfred knows that Bernard doesn't know" might become just "Bernard doesn't know"; and "Alice doesn't know that Bob knows that Carol can see blue eyes" might become just "I have no idea what's going on".
Another future direction to explore might be a taxonomy of video game puzzles. I think there'd be some similar themes, but for example, video games can have progression in a sense that doesn't exist in most of the puzzles I've described so far.
Two senses, in fact. You can progress within a level, when you change the board state.4 Different games/levels have different attitudes to (a) how many ways you can do this at any given moment, and (b) whether it's possible to make a level unsolvable.
But also a level can teach you something by holding your hand, and in future levels you can use that without having your hand held. So level ten would have seemed impossible if you'd started with it, but after levels one through nine it's just challenging enough to be fun. Logic puzzles can have this: the fork in the road teaches you a tool to apply to THLPE. But the level order in video games gets to be designed deliberately. Whereas in real life, I mostly encounter logic puzzles by accident, with no guarantee that I've already encountered any precursors.
(When I was playtesting Sokobond, Alan said that I approach puzzles differently to most people. I think I was thinking along the lines of: "okay, the target molecule will only fit here, which means that I need to move into place from there which means...". This kind of thinking comes naturally to me, and I think it serves me well, most of the time - though I have known myself to occasionally prove that this particular level can't possibly have a solution. It seems well-suited to puzzles with predictable mechanics and clear unwinnable states.)
Posted on 20 May 2015
Suppose you and I are farmers, owning adjacent fields. One day you have a brilliant idea. If we dig a ditch from the nearby river, between our fields, then irrigating our fields becomes a lot less work. It would cost two utils to dig the ditch - one utilon each - and we'd get five utils each from its existence.
You come to me with this suggestion. "That sounds great," I say, "but I don't think I'm going to bother."
You object that I'm being dumb not to take those four utils.
"No," I say, "I just think that if I don't help you, you'll do all the work yourself. You still get three utils if you dig it alone, so you'll do it even if I don't help you. And by not helping, I get five utils instead of four. Why would I pay a utilon to help you?"
(Unfortunately for you, I am a member of H. economicus and have a natural immunity to arguments about "fairness" and "not being a douchebag".)
The farmer's dilemma is game-theoretically equivalent to chicken. Both of us choose to either cooperate by digging the ditch ("swerve" in chicken), or defect by sitting at home ("straight" in chicken). If both of us cooperate ("C/C"), we both get an okay result. If both of us defect ("D/D"), we both get a terrible result. If one of us cooperates while the other defects ("C/D"), then the defector gets the best possible result for themselves, and the cooperator gets a result between C/C and D/D.
If you're cooperating and I'm defecting (or vice versa), then neither of us have any incentive to change our strategies. I could start to cooperate, but then I'd just be giving you utility. You'd like that, but I wouldn't. And you could start to defect, but then you'd be throwing away utility. Neither of us would like that.
On the other hand, if we're both cooperating, then we both have an incentive to defect, as long as the other doesn't do the same; and if we're both defecting, we both have an incentive to cooperate, as long as the other doesn't do the same.
(Formally, there are two pure-strategy Nash equilibria, at C/D and at D/C. This distinguishes it from the prisoner's dilemma, whose only equilibrium is at D/D.)
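If you want to check that concretely, here's a minimal sketch in Python (my own illustration; the payoff numbers come from the ditch story above: C/C gives 4 each, digging alone gives the digger 3 and the free rider 5, D/D gives 0 each) that finds the pure-strategy equilibria by brute force:

from itertools import product

# payoff[(my_move, your_move)] = (my_utils, your_utils)
payoff = {
    ('C', 'C'): (4, 4),
    ('C', 'D'): (3, 5),
    ('D', 'C'): (5, 3),
    ('D', 'D'): (0, 0),
}

def is_nash(a, b):
    # A profile is a Nash equilibrium if neither player can gain by
    # unilaterally switching moves.
    a_stays = all(payoff[(x, b)][0] <= payoff[(a, b)][0] for x in 'CD')
    b_stays = all(payoff[(a, y)][1] <= payoff[(a, b)][1] for y in 'CD')
    return a_stays and b_stays

for a, b in product('CD', repeat=2):
    if is_nash(a, b):
        print('%s/%s is a Nash equilibrium' % (a, b))  # prints C/D and D/C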
There are lots of ways this story can continue.
In one of them, you dig the ditch yourself. Months later, after harvesting and selling your crop, you sit in the pub complaining that being a farmer is such hard work, you've only come out with three utils of profit this year. Nobody's very sympathetic, because they're comparing you to me, and I've made a cool five utils. Because this is a thought experiment, there's no difference between us other than how we act. So obviously you're doing something wrong.
In another possible continuation, you threaten to burn some of my crops if I don't help you dig. Maybe I help you, maybe not; if not, maybe you were bluffing, maybe not; if not, maybe I call the police on you and you go to jail; or maybe I do help you, but I secretly recorded the conversation and leak it to the press later on... a lot of things can happen. Even if this works out great for you, it's at least a little villainous.
Instead of being so blunt, you might choose to convince everyone else to threaten me. Perhaps you dig the ditch, and then talk to our local government and convince them that you should be allowed to extract rent on it.
Another option is to tell me that if I don't help you dig, you'll spend an extra utilon to build a brick wall on my side of the ditch, so that it doesn't do me any good. If I believe that you'll do it, I'm likely to go ahead and help.
You can also tell me that if I don't help, you're not going to dig at all. Or even that you're simply not going to dig: if I'm going to be an asshole about it, I can dig the whole thing myself or go without. Once again, I'm likely to dig if I think you're not bluffing.
(Precommitment is a powerful strategy in many situations. But two can play at that game...)
The real world is usually more complicated than game theory. Here are some variations on the farmer's dilemma:
Maybe I have a bad back, and digging is more costly for me than for you. This may or may not change the Nash equilibria, and it may or may not change the amount of sympathy we each get in the various continuations.
Sometimes there are lots of farmers. In this case, the ditch might cost more to dig than the value it provides to any individual, so that nobody will want to dig it alone; but the cost might be low enough that you don't need literally everyone digging before it becomes worthwhile, so that each farmer still wants to defect individually.
Sometimes the ditch might be an agent in its own right. For example, a company might perform R&D only if someone funds them to do so; and everyone wants them to do it, but would prefer that someone else pays.
(They might not have an explicit funding agreement with anyone, but think acausal trade, early adopters, and so on.)
(And having developed a super-awesome version of their product, they might also sell a cheaper version, where they've gone out of their way to disable some of the features. This is like building part of a brick wall against people who only contribute a little to the digging.)
Sometimes the ditch might become more valuable if more people help to dig.
Sometimes the ditch requires constant maintenance. We could model that as a sequence of games, where the payoff structure changes between iterations (and might depend on the results of previous games). The ditch might not become profitable until after several rounds.
Why am I talking about this? I think farmer's dilemma situations come up from time to time in online discussions, and I want to be able to say "let's not be too harsh on AcmeCorp here, they're cooperating in a farmer's dilemma and everyone else is benefiting from that". (I don't want to discuss the specific examples I have in mind because they're kind of mind-killey.)
Although the farmer's dilemma and chicken are game-theoretically equivalent, I think our intuitions about them are different. At any rate, mine are. I can think of two reasons for this. One is that game theory only considers utility up to positive affine transformations. The games "Global thermonuclear war", where every player loses a million utils, and "Global happiness project", where every player gains a hundred utils, are also equivalent. But in the real world, two people crashing their cars into each other is a worse outcome than two people failing to dig a ditch.
The other reason, which is kind of the same reason, is that game theory assumes you've decided to play. If nobody wants to play chicken, you both get a better outcome than C/C. If nobody notices how valuable a ditch would be, you get the same outcome as D/D.
Another equivalent game is the snowdrift dilemma: the road is covered with snow, and we both want it cleared, but we'd both rather not clear it ourselves. My intuitions about this feel different again. You can't decline to play (except by living somewhere less snowy), but if you could, that would be better than C/C.
So describing a situation as chicken, or snowdrift, or a farmer's dilemma, all seem different. I don't know of an existing name that feels like a good fit for the farmer's dilemma. (For a while I thought the farmer's dilemma was a standard name, but I can't find anything about it online. Wikipedia redirects it to the prisoner's dilemma, but that has a very different structure.)
So it seems like a useful concept with no name. Now it has a name. You're welcome.
Posted on 05 May 2015
Sometimes at LW meetups, I'll want to raise a topic for discussion. But we're currently already talking about something, so I'll wait for a lull in the current conversation. But it feels like the duration of lull needed before I can bring up something totally unrelated is longer than the duration of lull before someone else will bring up something marginally related. And so we can go for a long time, with the topic frequently changing incidentally, but without me ever having a chance to change it deliberately.
Which is fine. I shouldn't expect people to want to talk about something just because I want to talk about it, and it's not as if I find the actual conversation boring. But it's not necessarily optimal. People might in fact want to talk about the same thing as me, and following the path of least resistance in a conversation is unlikely to result in the best possible conversation.
At the last meetup I had two topics that I wanted to raise, and realized that I had no way of raising them, which was a third topic worth raising. So when an interruption occurred in the middle of someone's thought - a new person arrived, and we did the "hi, welcome, join us" thing - I jumped in. "Before you start again, I have three things I'd like to talk about at some point, but not now. Carry on." Then he started again, and when that topic was reasonably well-trodden, he prompted me to transition.
Then someone else said that he also had two things he wanted to talk about, and could I just list my topics and then he'd list his? (It turns out that no I couldn't. You can't dangle an interesting train of thought in front of the London LW group and expect them not to follow it. But we did manage to initially discuss them only briefly.)
This worked pretty well. Someone more conversationally assertive than me might have been able to take advantage of a less solid interruption than the one I used. Someone less assertive might not have been able to use that one.
What else could we do to solve this problem?
Someone suggested a hand signal: if you think of something that you'd like to raise for discussion later, make the signal. I don't think this is ideal, because it's not continuous. You make it once, and then it would be easy for people to forget, or just to not notice.
I think what I'm going to do is bring some poker chips to the next meetup. I'll put a bunch in the middle, and if you have a topic that you want to raise at some future point, you take one and put it in front of you. Then if a topic seems to be dying out, someone can say "<person>, what did you want to talk about?"
I guess this still needs at least one person assertive enough to do that. I imagine it would be difficult for me. But the person who wants to raise the topic doesn't need to be assertive, they just need to grab a poker chip. It's a fairly obvious gesture, so probably people will notice, and it's easy to just look and see for a reminder of whether anyone wants to raise anything. (Assuming the table isn't too messy, which might be a problem.)
I don't know how well this will work, but it seems worth experimenting.
(I'll also take a moment to advocate another conversation-signal that we adopted, via CFAR. If someone says something and you want to tell people that you agree with them, instead of saying that out loud, you can just raise your hands a little and wiggle your fingers. Reduces interruptions, gives positive feedback to the speaker, and it's kind of fun.)
Posted on 14 April 2015
Content note: politics, gender politics.
For a while I've been vaguely aware of a petition to "stop taxing periods. Period." I didn't pay it much attention until today, but now I've looked at it and done way more research than I expected to.
According to the petition,
A 5 per cent tax rate has been placed on sanitary products, while exotic meats walk tax-free. HM Revenue and Customs justified this tax by classifying sanitary products as "non-essential, luxury" items.
At least the first sentence of this is true. Sanitary products have VAT of 5% imposed on them. Exotic meats (including horse, ostrich, crocodile and kangaroo) do not.
Sanitary products are covered by VAT notice 701/18, which reduces the rate on them from the standard rate (currently 20%) to 5%. It applies only to "any sanitary protection product that is designed and marketed solely for the absorption or collection of menstrual flow or lochia (discharge from the womb following childbirth)". That is, this reduction was introduced specifically to reduce tax on sanitary products.
Exotic meats are covered by 701/14, which covers food in general. Most food is zero-rated. There are further exceptions for some things, including chocolate covered biscuits (but not chocolate chip biscuits), which are standard-rated; exotic meats are not one of those things. What seems to have happened here is that the government decided that most food should be zero-rated, and then made a list of foods that shouldn't, and exotic meats didn't happen to make it on to the second list for whatever reason.
I'm less sure about the second sentence, that HM Revenue and Customs justified this tax by classifying sanitary products as "non-essential, luxury" items. More from the petition:
After the UK joined the Common Market in 1973, a 17.5% sanitary tax was introduced. It was justified when Parliament classified sanitary products as "non-essential, luxury" items.
I don't think this is true, but I think I can see how the story could have been Chinese-whispered into existence:
- When the UK joined the Common Market in 1973, we introduced VAT. The standard rate of VAT used to be 17.5%.
- My current belief is that until 2000 or 2001, sanitary products were standard-rated. It's not that there was a specific "sanitary tax", but VAT was a tax that applied to sanitary products.
- It seems to be widely assumed that VAT exemptions are given to essential items, and not to others. Since sanitary products were standard-rated, they must not have been considered essential, so they must have been considered luxury.
But: In 1973, the standard rate of VAT was 10%. I'd be very surprised if sanitary products were taxed at 17.5% in 1973.
VAT actually replaced purchase tax, which did only apply to luxury goods. So if sanitary products were taxed for the first time in 1973, it's because they weren't considered luxury.
Putting "non-essential, luxury" in quotes would seem to imply that this is a direct quote from something, but the first page of google results for that page all seems to be inspired by the petition at hand. The second page has nothing helpful, the third is back to this petition. I haven't found a source for this phrase.
In fact, I haven't found any official source that suggests that the government thinks sanitary products are non-essential. The evidence for this seems to be purely the fact that they have 5% VAT imposed on them.
But this assumes that VAT is not applied to essential products, which as far as I can tell just isn't true. Here is the entirety of the official-seeming evidence I've found which connects essential-ness to VAT status: a page on gov.uk which says "You pay 20% VAT most of the time - but less on essential items."
One reading of this is that all essential items, and only essential items, are taxed at less than 20%. By this reading, sanitary products are essential, being taxed at 5%.
On the other hand, when you look at the list of VAT exemptions, it includes gambling and antiques, which seem pretty non-essential; and it doesn't include clothes for adults, which are pretty damn essential. (If I don't wear clothes, I will get locked up. I'm forced to pay tax on the clothes that I'm forced to wear. This is almost Kafkaesque, in a really really minor way.)
Just in the healthcare genre, it doesn't include toothpaste or dental floss, most glasses or hearing aids, sticking plasters, paracetamol, or the creams I use for acne and eczema. (I haven't specifically researched most of these. I think I've looked in the places where I would find them if they were covered, but it's possible I've made a mistake.)
It does seem that most of the things in that list are things that most people think should be easily accessible. But if I showed someone that list without telling them what it was, I don't think they'd say "this is a list of things which are essential".
Wikipedia also doesn't use the word "essential".
If we ignore that one linked page, it just doesn't seem like the government classifies things as essential or not, and then assigns them a VAT status based on that classification. It just assigns VAT status. To ask whether the government considers something "essential" seems to be asking a null question. That's not a word that the government, as a body, knows.
There is still that one linked page, but currently I'm dismissing it as an anomaly. Gov.uk is intended to be accessible and accurate, and in this case I suspect that accessibility triumphed over accuracy, and/or someone just made a mistake while writing it.
I worry that I'm simply ignoring contrary evidence here, but consider: the government publishes a list of things. It doesn't put things on the list just for being essential, or deny them from the list for being nonessential. It doesn't correlate very well with what most people would consider essential. I can only find one place where the government comes close to describing it as a list of essential things, and that's in a descriptive context, not a binding one.
If it doesn't look like a duck, quack like a duck, or fly like a duck...
So. In 1973, the government probably did not consider sanitary products to be "non-essential, luxury" items. It just considered them to be things, and it taxed them the same as it taxed most things.
Since 2001, the government almost certainly doesn't consider sanitary products to be non-essential. It taxes them at a reduced rate compared to almost everything else, but at a higher rate than some other things.
Under EU law, the government isn't allowed to zero-rate them. We have other zero-rated things, but they were zero-rated before we joined the EU. The VAT on sanitary products is as low as it is permitted to be.
We might (or, quite likely, might not) be able to exempt sanitary products instead of zero-rating them. But there's an important difference. If you buy raw materials, produce a product, sell it, and have to pay VAT on the sale, then you can claim back the VAT that your supplier paid on the materials. If you don't pay VAT on the sale, you can't claim back that VAT. If sanitary products were exempt rather than reduced-rated, the selling price might (or might not) go up because of this, depending on the cost of raw materials. It's almost certainly more complicated than this, but:
Suppose you buy £30 worth of cloth, of which £6 is VAT. You make a lot of tampons and sell them for £100. You have to pay £5 VAT on that sale, and can claim back the £6 VAT you paid on the cloth. Profit: £100 - £30 - £5 + £6 = £71. If tampons were VAT exempt, you'd need to sell them for £101 to make the same profit.
(The past two paragraphs were written from the perspective that the seller pays VAT. Mostly we seem to talk about the buyer paying VAT. It's all equivalent, but I hope I wasn't too confusing.)
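To make the arithmetic easy to check, here's the worked example as Python (the figures come from the example above, not from real rates or prices):

materials = 30   # cloth, of which 6 is VAT
input_vat = 6    # reclaimable only if you charge VAT on your own sales
sale = 100
output_vat = 5   # VAT due on the sale at the reduced rate

# Reduced-rated: pay VAT on the sale, reclaim the VAT on the cloth.
reduced_profit = sale - materials - output_vat + input_vat   # 71

# Exempt: no VAT due on the sale, but nothing to reclaim either.
exempt_profit = sale - materials                             # 70

print(reduced_profit, exempt_profit)  # to match 71, the exempt price must rise to 101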
Having established (what I believe to be) the facts, here are my thoughts on the petition itself:
Firstly, it doesn't strike me as a particularly big deal.
Financially, we're talking about £150 a lifetime lost to taxation on sanitary products, assuming £5/month for 50 years. I don't think anyone is claiming it's a particularly heavy burden; it's not about money.
From a gender-politics perspective... it also just doesn't seem like a big deal. If sanitary products were still standard-rated, I could maybe kind of get behind this, even though I don't actually think that standard-rated implies non-essential. But they're at the lowest VAT rate the UK government is allowed to put them at. I just don't see this as a sign that we value crocodile meat more than female participation; or of male-focused agenda setting; or any form of institutional misogyny.
It seems like most of the conversation is fueled by outrage. I don't think there's much here to be outraged about, and I would prefer that the energy being put into this outrage goes elsewhere.
Secondly, I don't really like the tactic of petitioning the UK government over this. The petition acknowledges that they can't do anything under current EU law. So the goal is to get the UK to get the EU to change its laws.
That seems... really unfair. I don't think I can argue clearly and succinctly for that position, which usually means I should spend more time clarifying my thoughts to myself. Instead of doing that, I'm going to argue by analogy. I don't think these analogies are the same in every way, or even in every relevant way; they just seem to me to have some structural similarities to the case at hand.
So, suppose I notice that I've been overcharged on my phone bill. I call up the company, and the customer service rep is very polite and helpful and refunds me. I tell her, "that's not good enough, I also want to be refunded for the price of this phone call to you, that I shouldn't have had to make". She tells me that she simply doesn't have the power to do that. So I tell her that she needs to complain to her managers until they give her that power, and I'm going to keep calling her every day until she can refund me for the cost of this phone call.
This isn't her battle. She's sympathetic to my plight, but she has other priorities. And I'm blackmailing her into fighting for me.
The petition feels like it has that kind of power dynamic.
Or: "hey man, I just need 20p for my bus fair, can you help?" "Sorry, I'd like to, but I have nothing smaller than a tenner." "So give me a tenner, don't be a tightass."
There's only so far that you can expect someone to go for you, and I feel like this petition is asking for more than that.
Posted on 22 February 2015
It's widely acknowledged that documentation is an important task that doesn't get as much attention as it deserves.
There are a lot of proposals for how to fix this, but I think they typically miss something. "Write the documentation before the code", for example. Sounds good, but a lot of the time, until I write the code, I don't know what it's going to do. Maybe I have a rough idea, but the details are going to evolve. And sometimes I'm just going to get things wrong. Oh, I can't frobnicate the spam here unless I also make the user pass in an egg - this is the sort of detail that would be easy to miss when I'm thinking about a function, but impossible to miss when I'm trying to frobnicate the spam without an egg.
And of course, having written a function and documented it (in whatever order) - you can subsequently rewrite the function without redocumenting it. This is the well-known problem where the code and the docs get out of sync.
You can't have a rule "don't edit the code unless you first edit the documentation", because a lot of code edits don't need doc changes. A more complicated rule like "don't edit the code unless you first edit the documentation, unless the documentation doesn't change" would probably just be forgotten. After a while, whenever I ask myself if the documentation is going to change, I'm just going to return the cached answer "no" without thinking. In any case it has the same problem: I don't know what I'm going to change until I change it.
So here's my half-baked proposal that might help to solve some of the problem of keeping code in sync with documentation:
If your code is in a git (or other VCS) repository, you can look at a function and see whether the code changed more or less recently than the docstring. It would be possible to write a tool which searches for such functions automatically.
Bam, done. The rest is just details. Edit code all you want, and if you forget to update the docs, this tool will remind you next time you run it.
Okay, some details are important. Here are some things to think about:
Not every code edit needs you to touch the docstring, so you want a database of some sort, which can be used to mark "the docstring is up to date as of commit A, even though it hasn't been touched since commit B". Keep this database inside your repository. This is probably quite a large detail, with many details of its own.
It might actually be better (certainly simpler) to have some way of marking this in-file. For example, in python:
def foo(bar):
    """Do a thing
    """ # 2014-01-07 11:25 (edit this comment to mark the docs up to date)
    pass
Tracking a function across revision changes is probably really hard, in general. In the specific case of "the sort of revision changes that actually occur in real codebases", it might be possible most of the time.
((I would start by working out which lines constitute a function's code and docstring, and running git blame to see which was updated most recently. This might be good enough, but for example it breaks if you swap the order of two functions in the source file. One of them is going to be blamed entirely on that commit. It also breaks if you delete lines from the function body.
A more complicated solution would be to track the function by its identifier. Look at past revisions of the file in its entirety, search for the function, and see whether it's changed. This could go wrong in so many ways, and even if it doesn't, it still won't catch you if you move a function from one file to another.))
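To make the simple version concrete, here's a minimal sketch (my own, not an existing tool), assuming Python 3.8+ source files and using only the standard library plus git blame --line-porcelain; it inherits all the failure modes just mentioned:

import ast
import subprocess
import sys

def line_times(path):
    # Map each line of the file to the committer time of its last change.
    out = subprocess.check_output(
        ['git', 'blame', '--line-porcelain', path], text=True)
    return [int(line.split()[1]) for line in out.splitlines()
            if line.startswith('committer-time ')]

def check(path):
    times = line_times(path)
    tree = ast.parse(open(path).read())
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and ast.get_docstring(node):
            doc = node.body[0]   # the docstring is the first statement
            doc_time = max(times[doc.lineno - 1 : doc.end_lineno])
            code_times = times[doc.end_lineno : node.end_lineno]
            if code_times and max(code_times) > doc_time:
                print('%s:%d: code of %s() changed after its docstring'
                      % (path, node.lineno, node.name))

if __name__ == '__main__':
    for path in sys.argv[1:]:
        check(path)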
This probably works best if you run it frequently. You could have it in a hook: if there are unsynced changes, you aren't allowed to push unless you sync them (either by editing the docs, or marking them clean).
This could be applied to classes as well as functions, but I'm not sure when you'd want to flag them. Most changes to code inside a method probably won't touch the class docstring, even if they touch the method docstring. Applying it to whole files would be even harder. In general, this will apply better to low-level documentation than high-level.
You need custom support for each language and documentation style used, as well as for each VCS. In many cases, supporting a language will mean writing a parser for it. (In python, you can import a module and discover every class and function that it exposes, and their docstrings, but I don't know if you can find their source locations like that.)
Would this work? How much effort would it be to make? Would it be helpful, assuming it did work? For that matter, does something like it already exist?
Posted on 16 January 2015
In which I inexpertly channel Randall Munroe
Soon the seas will turn red with the blood of the human race, as the unspeakable terrors come from beyond the gate, which is Yog Sothoth, to devour all in their path! Ia! Shub Niggurath! Ia! Ia!
— Who will be eaten first?
If you were to mix the blood of the human race into the oceans, how red would they turn?
A quick Fermi estimate: there are seven billion humans, with about eight pints of blood each, which is about five litres. So we have about 35 billion litres of blood to work with.
How about the oceans? They cover two thirds of the earth's surface. I know that the surface area of the earth is $4πr^2$ where $r$ is 6,400 km. $(64 × 10^5)^2 = 2^{12} × 10^{10} = 4096 × 10^{10}$. $4π$ is approximately ten, so the surface area of the earth is about $4×10^{14}$ m². Two thirds of that is $2.5 × 10^{14}$. I don't know how deep they are, but 100m seems like a reasonable guess at an average. That gives us about $2.5 × 10^{16}$ m³ of ocean.
A cubic metre is a thousand litres, so we have about 25 billion billion litres of ocean. For every drop of blood, we have almost a billion drops of ocean.
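Here's the same estimate in Python, with π left unrounded (the inputs are still my guesses from above); it lands at roughly a billion litres of ocean per litre of blood:

from math import pi

blood = 7e9 * 5                        # litres: seven billion people, five litres each
area = 4 * pi * 6.4e6 ** 2             # surface area of the earth in m^2
ocean = area * (2.0 / 3) * 100 * 1000  # two thirds covered, 100 m deep, 1000 L per m^3
print(ocean / blood)                   # about 1e9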
I don't know how blood and water mix, but I'm confident that the oceans will not be running red with blood any time soon, on average. Sorry cultists.
Fact checking: Wolfram Alpha tells me that there are actually about 10¹⁸ m³ of ocean, which is 40 times what I guessed, which is okay for a Fermi estimate. The oceans are still not running red with blood.
NOAA, whoever they are, tell me that the average depth of the ocean is 4km, which is also 40 times what I guessed. That's not great for a Fermi input.
(Initially, I forgot the $4π$ factor in the surface area of the earth, putting me off by 400 times. I'm not sure how to judge myself on Fermi estimates when I get somewhat-reasonable inputs and do the math wrong.)
However! The quote specifically mentioned seas. I've cheated by counting oceans as seas. What if we exclude them? I'm going to do this just by listing the five oceans (Pacific, Atlantic, Indian, Southern and Arctic) and subtracting their volumes from the total.
It's possible that this double-counts the Southern ocean, since some people divide it up between the Pacific, Atlantic and Indian oceans. But I'm going to allow that, to make the seas as small as possible.
In total, this makes $1.29 × 10^{21}$ L of ocean. That's more than the $10^{18}$ m³ I gave earlier, but that was rounded. The actual number given is $1.332 × 10^{21}$ L. So there are $0.042 × 10^{21} ≈ 4 × 10^{19}$ L of seas in the world. This is actually pretty close to my initial guess for the oceans. Still no red.
This does not mean you are safe from the Elder Gods.
Posted on 09 August 2014
I like Python, but one of the things I dislike is the assert statement. Its simplest form provides no help if it fails:
assert x == y
will raise an exception so you can see where it failed, but you don't get to see what x or y were.
There's a longer form,
assert x == y, "%r != %r" % (x,y)
but this is verbose, and evaluates x and y twice. And if x and y are dictionaries nested three deep, it might not be easy to tell what's different between them.
I'm aware of two approaches that improve the situation. nose has a --failure-detail plugin that tries to automatically give you more detail. When
assert x == y
fails, it:
- Finds the source location of the failed assert,
- Reads and parses this line,
- Substitutes variables with their values,
- Reports the substituted line.
This is an awesome hack, and I love that it's possible, but I don't find it all that useful. You still need to play spot-the-difference with deeply nested data structures, but that would be pretty easy to fix. The deeper problem is that it also doesn't help with
assert frobnicate(3) == frobnicate(4)
because there are no variables to replace. (frobnicate is a variable, but IIRC it doesn't substitute functions. I don't remember the exact algorithm it uses.) I had a look at the code, and I don't think it would be possible, in general, to report the values on the LHS and RHS. You'd have to re-evaluate the expressions, and there's no guarantee they'd return the same thing the second time.
The second approach is to get rid of assert statements completely. In a unittest test, you do
self.assertEqual(x, y)
and if x != y it tells you what x and y are, with a helpful diff format for dicts, lists and sets.
This is great, but I just don't like writing asserts like that. So here's a new approach:
from bsert import bsert
bsert | x == y
How it works is that bsert | x returns a new object, _Wrapped(x); and _Wrapped(x) == y calls assertEqual(x, y). Other comparison methods are overloaded as well. Now we can do things like:
$ python
Python 2.7.5 (default, Dec 1 2013, 00:22:45)
[GCC 4.7.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from bsert import bsert
>>> bsert | 3 == 3
True
>>> bsert | 3 == 4
Traceback (most recent call last):
...
AssertionError: 3 != 4
>>> bsert | [3] + [4] == [3, 4]
True
>>> bsert | [3] + [4] == [3, 4, 5]
Traceback (most recent call last):
...
AssertionError: Lists differ: [3, 4] != [3, 4, 5]
Second list contains 1 additional elements.
First extra element 2:
5
- [3, 4]
+ [3, 4, 5]
? +++
>>> bsert | {1: {2: 3, 4: 5}, 6: 7} == {1: {2: 4, 4: 5}, 6: 7}
Traceback (most recent call last):
...
AssertionError: {1: {2: 3, 4: 5}, 6: 7} != {1: {2: 4, 4: 5}, 6: 7}
- {1: {2: 3, 4: 5}, 6: 7}
? ^
+ {1: {2: 4, 4: 5}, 6: 7}
? ^
>>> bsert | 1 / 2 != 0
Traceback (most recent call last):
...
AssertionError: 0 == 0
>>> bsert | 1.0 / 2 != 0
True
>>> import time
>>> bsert | time.time() != time.time()
True
>>> bsert | time.time() == time.time()
Traceback (most recent call last):
...
AssertionError: 1399731667.416066 != 1399731667.416123
>>> bsert | [3] * 3 == [3,3,3]
True
>>> bsert | {1, 2, 3} <= { 1,2,3,4}
True
>>> bsert | {1, 2, 3} >= { 1,2,3,4}
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "bsert.py", line 28, in __ge__
self.assertGreaterEqual(self.wrapped, other)
File "/usr/lib64/python2.7/unittest/case.py", line 950, in assertGreaterEqual
self.fail(self._formatMessage(msg, standardMsg))
File "/usr/lib64/python2.7/unittest/case.py", line 412, in fail
raise self.failureException(msg)
AssertionError: set([1, 2, 3]) not greater than or equal to set([1, 2, 3, 4])
>>> bsert | 3|8 == 11
True
>>>
There are a few limitations. For one, you can't use chained comparisons, and you won't get any kind of error if you try. The reason is that
bsert | 3 <= 5 <= 4
cashes out as
(bsert | 3 <= 5) and (5 <= 4)
so there's no way for bsert to know that there's another comparison going on. For two, you can't do
bsert | 3 in [1, 2, 3]
because there's no way to overload the in operator from the left hand side. (In this case, you at least get an AssertionError: 3 != 1 telling you you did something wrong, because a in someList basically does any(a == x for x in someList), and so it fails at bsert | 3 == 1. If you had a dict, set, or empty list on the right hand side, it would just return False and not raise an exception.)
Similarly, bsert | x is y, bsert | x and y and bsert | x or y don't work, because those operators can't be overridden at all. (Even if they did work, they're low precedence, so it would need to be e.g. (bsert | x) and y, which is horrible.) You also can't do
bsert | (3 in [1, 2])
because that just returns _Wrapped(False).
I think all the other operators should work fine, if you're using them in ways that make sense. Most of them have higher precedence than |, so that for example
bsert | [3] + [4] == [3, 4]
cashes out to
(bsert | ([3] + [4])) == [3, 4]
The only exception is | itself, and I've added support so that _Wrapped(x) | y returns _Wrapped(x|y).
I don't necessarily recommend that you use bsert. I'm not sure that I will. But it's there.
I've put bsert on github, but it's also short enough that I might as well just post it inline:
import unittest

class _Bsert(object):
    def __or__(self, other):
        return _Wrapped(other)

class _Wrapped(unittest.TestCase):
    def __init__(self, obj):
        # TestCase needs to be passed the name of one of its methods. I'm not
        # really sure why.
        super(_Wrapped, self).__init__('__init__')
        self.wrapped = obj

    def __eq__(self, other):
        self.assertEqual(self.wrapped, other)
        return True

    def __ne__(self, other):
        self.assertNotEqual(self.wrapped, other)
        return True

    def __le__(self, other):
        self.assertLessEqual(self.wrapped, other)
        return True

    def __ge__(self, other):
        self.assertGreaterEqual(self.wrapped, other)
        return True

    def __lt__(self, other):
        self.assertLess(self.wrapped, other)
        return True

    def __gt__(self, other):
        self.assertGreater(self.wrapped, other)
        return True

    def __or__(self, other):
        return _Wrapped(self.wrapped | other)

bsert = _Bsert()
Belated update:
On reddit, Liorithiel informs me that py.test can extract useful failure messages from assert statements. Like what nose does, but implemented differently, so that it can show the values of intermediate expressions in more detail than bsert can. (It rewrites the AST at import time, which is an even more awesome hack than nose's.) As far as I'm concerned, this knowledge makes bsert obsolete.
Meanwhile, obeleh on github has provided a patch which allows bsert to be used with boolean expressions, with different syntax. So
bsert(3 in [1,2,3])
is like assert 3 in [1,2,3], but with a slightly nicer exception message. (The old syntax still applies for expressions using comparison operators.) Now you get
AssertionError: bsert(3 in [1,2,3])
instead of just
AssertionError
It comes at the cost of a slightly longer traceback - I wonder if that can be worked around. And it doesn't provide intermediate expressions at all, so it's kind of a neat trick, but I don't know if it's useful. (Especially since the old traceback had the failing expression near the bottom anyway - there are cases where you'll see the exception but not the traceback, but we're getting into niche territory.) But that's pretty much how I felt about bsert in the first place, so I decided to go ahead and include it.
Posted on 10 May 2014
I don't expect anyone cares about what I think about the movies I've seen lately, so I'm bundling those thoughts into one post to be less annoying.
The Lego Movie
This is the most pure fun I've had watching a movie in a long time. It's going for funny, and it is, the whole way through. As I write this, it's been almost a month since I saw the movie, and remembering some of the jokes still makes me smile.
The attention to detail is also impressive. I think that if I rewatched this, I'd catch a bunch of stuff that I missed the first time. Going frame-by-frame would help at times.
Highly recommended.
Divergent
I found Divergent decent while watching it, but no more than that; and thinking about it afterwards only makes it worse.
It's pretending to be a failed-utopia story: at some point in the past, someone came up with a really dumb idea ("let's end conflict by separating everyone into five factions based on personality traits!") and apparently society rolled with it. Now the heroine, along with the audience, learns that it's a really dumb idea ("oh hey, it turns out that some people have more than one personality trait, guess we'll have to kill them"). So far, so Gattaca.
But in this case, just in case you weren't convinced that the faction idea was dumb, you get to see what happens when the faction system meets mind-control drugs. The answer is "pretty much what happens when mind-control drugs enter any society", but because it's happening to the faction system, it's the faction system that looks bad.
There's also a sort of Mary-Sue thing going on, but a viewer insert rather than an author insert. In the film world, people supposedly have only one of the traits brave/honest/friendly/selfless/clever, except for the rare divergents. So the message is "if you have two or more of these traits, you are rare and special and immune to mind-control drugs, and people will hate you for it", which is probably a message that will appeal to angsty teenagers.
Captain America: The Winter Soldier
This is an action film. It has people shooting people, and punching people, and jumping through glass windows. I have it on good authority that a lot of people enjoy that sort of thing.
There's a political-thriller plot in there, as well, but not a particularly interesting one. Director Fury recommends delaying the project to create a doomsday device that will only be used to kill bad people. Director Fury can't be trusted, because he recently hired a merc to hijack a ship owned by SHIELD (I never quite worked out why he did this). Also, someone has just killed Director Fury. Full speed ahead on the doomsday project!
There's the standard trope that when things seem hopeless for the heroes, we find out that they set things up in advance to prepare for this very circumstance. And the same with the villains. This keeps the tension high, or something.
Brain uploads are possible, and have been possible since the seventies or so, but no one really cares, and only one person has been uploaded. The upload commits suicide to try to kill the heroes, where any normal villain would have an escape plan, or at least some lines about how he's giving his life in the service of Evil; we don't get that here, because people aren't people unless they're made out of meat. (Actually, I assume that the Marvel universe has many people who aren't made of meat, but are still given personhood-status. Maybe the rule is "unless they look like us, except that they're allowed to be a different color"?)
(There's a similar thing with cryonics, but it's not clear whether normal humans can be suspended and revived, or just superhumans.)
The Winter Soldier thing feels tacked on. He's an assassin who is approximately as good at fighting as the hero is. It also turns out he's the hero's brainwashed childhood friend, returning from the first movie. The hero gets kind of sad about this, but they fight anyway. In the end, the Soldier saves the hero's life, but he only gets to do it because the hero was trying to save his life, and they don't have a big moment of reconciliation, it just kind of happens. I guess the idea is to set something up for the third movie, but in this one, there was no reason that character couldn't have just been someone new, or even completely removed from the film.
Really though, I think the main thing I'd like to change about the film is: to a first approximation, nobody gets hurt. People fight, and everyone who isn't a named character dies, and the named characters don't have a scratch on them despite getting repeatedly punched in the face. I know some people enjoy watching this, but I find it kind of boring, and I think that if we saw the hero sustain a split lip and a bloody nose, I might find it a lot more engaging because suddenly it might feel like he's actually in danger. (I can forgive that people don't get cut when they crash through glass windows. If they did, they would be a lot less willing to keep doing it.)
Of course, this film is only a 12A. If you show people getting hurt in fights, kids might not be allowed to watch it any more.
I did like how there was no romantic subplot.
Also, faceblind world problems: a minor character suddenly unleashes some ass kicking. Everyone looks at her like wtf. She presses a button, her face shimmers as a hologram disappears, and suddenly she looks exactly the same. (She then proceeds to remove her wig, so it's not too confusing.)
Noah
Everybody in this film fails ethics and/or metaethics forever.
Spoilers ahead, even if you've read the Bible. I can't be bothered to mark them all.
Noah thinks the world would be better off if all the people were dead, including himself and his family. He thinks his task is to save the animals and then to die.
I can accept the hypothesis that the rest of the world is evil. It's a stretch, but I think we don't see anyone outside Noah's family do anything good, so let's roll with it, and say that sure, they deserve to die.
What about Noah and his family? He explains this to his wife, Naameh, pointing out their flaws to her. Their youngest son, Japheth, seeks only to please. Or something like that. I agree that that's a character flaw, but if it's the worst you can say about someone, they're a pretty good person. I don't remember what he said about Shem and Ham - I think maybe Shem's flaw was being sexually attracted to his girlfriend Ila (in Shem's defence, Emma Watson), and Ham wanted a girlfriend too much - but it definitely wasn't "deserves to die" grade stuff. And Noah and Naameh would both kill to protect the children, which apparently makes them just like everybody else.
I'm not being entirely fair. The world is beautiful, humans ruined it once, and however good Japheth may be, we can't trust that his kids will be as good as him, and their kids will be as good as them, and so on. We can turn this into a parable about AI safety and the Löbian monster, but I don't think that was the intent.
Okay, fine. Humans need to die off. That's not Noah's problem. Noah's problem isn't even necessarily that he thinks God is talking to him, because that does happen. I'm not sure I can pinpoint exactly what Noah's problem is. Maybe it's that he seems to think that he and he alone is vital to God's plans.
When Ham tells Noah that he wants a wife, Noah tells Ham that God will provide everything they need. Later Noah goes to look for a wife for Ham. When he fails, he decides that God doesn't want Ham to have a wife, and that's when he works out that his family isn't meant to repopulate the Earth. When Ham finds a wife for himself (well, a scared lonely little girl, but in those days, what was the difference?), Noah abandons her to die.
Noah's thought process seems to be: if God wants something to happen, He will do it himself or He will work through Noah. So when Methuselah blesses Ila, and Ila subsequently becomes pregnant, the idea that this might have been God's will doesn't cross Noah's mind, and he proclaims that if the child is a girl (capable of repopulating the Earth), he will kill her.
(Incidentally, Methuselah blesses Ila when Ila is looking for Ham, who needs to be found fairly urgently. Their conversation wastes minutes, and after he blesses her, she goes and finds Shem - also looking for Ham - to make out with, leading to an implied offscreen shag. Ham is forgotten.)
This brings us to the problem that most of the other characters share: they are far too passive. No one apart from Noah is cool with the human race dying off, but when Noah announces his plans, they don't do very much about it. Naameh tells him that if he kills the child, she will hate him. Japheth sends out one bird at a time to try to find land before Ila gives birth, maybe so that they can run away from Noah? Shem and Ila build a small raft, with food for a month, and plan to desert the Ark. But they neglect to make any secret of this, and when Noah burns it, they're surprised.
I am probably a little too quick to say that fictional characters should be killed, in circumstances where that wouldn't fly in the real world. But in fiction, the stakes tend to be a lot higher. In this case, the stakes are the survival of the human race. The obvious thing to do is to kill Noah.
Don't run away on a raft. The Ark is safer. You are the last hope for the human race. Kill Noah.
When Noah has attempted to kill your children, and discovered that he can't do it, kill him anyway. He hasn't changed his mind about what God wants, he just thinks he's failed God, and do you trust him not to try again? Kill him. (Eventually, Noah does change his mind, when Ila suggests that maybe God deliberately left the choice to Noah. Now Noah believes he's not only vital to God's plans, but is permitted to shape them.)
Well, Shem eventually makes a cursory attempt, at least. But it's not good enough. NPCs, the lot of them.
There are two more characters. Tubal-Cain is the villain. He thinks that man should have dominion over nature. He also decides that since God has forsaken them, so much for ethics, and he'll kill anyone he wants.
Ham is the only one of Noah's sons to express any independence. His motivation is that he wants a wife. When Noah fails to find him one, he goes to find one himself. He rescues a girl from the human settlement, they run away back to the Ark together, and when she gets caught in a bear trap with the bad guys in hot pursuit, he tries to save her. Noah arrives to save him, and he trusts Noah to save her as well, which he doesn't.
When Tubal-Cain breaks on to the Ark, Ham finds him and becomes his protégé. Sort of like Anakin Skywalker being turned against the Jedi Council by Senator Palpatine. But when it comes down to it, Tubal-Cain and Noah are fighting, and Ham decides to kill Tubal-Cain instead. (At this point, Tubal-Cain tells Ham, he has become a man. Tubal-Cain returns the magical glowing snakeskin that he took when he killed Noah's father. It's the skin shed by the snake that tempted Eve. It's all very significant, presumably, but I don't know what it signifies.)
Meanwhile, God is classic Old Testament God. He's not very communicative, which causes Noah no end of confusion; He only ever intervenes at the last minute, because He loves a fight scene as much as the rest of us; and apparently "flood the world" was the best plan He could come up with for making everything cool again.
Apart from all the ethics, the film is still pretty bad. The script is uninspired. Tubal-Cain gets some villainous monologues and a rousing pre-battle speech, but they pretty much consist of him talking about how much killing he's going to do. I did enjoy the visuals, and Russell Crowe and Emma Watson do the best they can.
I do not recommend Noah.
Ocho Apellidos Vascos (Eight Basque Surnames / Spanish Affairs)
This reminded me a lot of Intouchables. That was a French feelgood comedy-drama which ended up making a lot of money. This is a Spanish feelgood romcom which is making a lot of money. Both are heavily based on the cultural divide between the main characters. Both are well-made, but very paint-by-numbers.
Apellidos features the boy and the girl meeting. The boy thinking there's something Special between them. The girl disagreeing. The two of them getting up to crazy antics. The girl changing her mind. The boy rejecting her, and then changing his mind. The assumption that these people, who first met a week ago, and now think they're in love, will live happily ever after.
In this case, the boy is Andalusian, and the girl is Basque. I don't know much about Spanish politics, and the film doesn't go out of its way to explain (why would it?), but I get the impression that there's a great deal of tension between the Basque people and everyone else in Spain; much like Northern Ireland and the rest of the UK, back in the days when the IRA was bombing pubs and cars and stuff (okay, I don't know too much about local politics either). Accordingly, just about every character in the film is incredibly casually racist. Sometimes that's played for laughs, and probably exaggerated; other times it seems entirely plausible. It was kind of distracting: I kept thinking there is no way you could show this if you substituted Black for Basque. But then, maybe you could if you went for Irish instead.
I found the subtitles hard to follow at times. There were some words which weren't translated (I think maybe these were spoken in the Basque language), and they were easy enough to pick up but required slightly more effort than normal. And there were a number of references to things I'm not familiar with, which I had to piece together from context; again, not difficult to do, but not effortless.
Mostly, I thought this film was funny, sweet, and forgettable.
Posted on 19 April 2014
I'm a fan of the board/card game The Resistance, but I feel like the base game is significantly unbalanced in favour of the spies. I think that if the spies play well, they will usually win regardless of how well the resistance plays; and I think that playing well simply involves not acting like spies. (To be precise, I think that if the spies always act in public the way they would if they weren't spies, the game becomes very easy for them; they don't need to execute complicated plots to try to throw suspicion on an innocent, and they don't need to subtly communicate with each other about which of them is going to throw a failure card down on this mission. This isn't necessarily easy for them, but it's fairly simple in principle, and I expect it becomes easier with practice.)
To test this, I intend to analyse a similar, simpler game which I call Resistance-B. "B" for "brainwashed", because in this game the spies don't know they're spies, and they don't know who the other spies are either. However, any time they go on a mission, their programming takes over and they sabotage it without knowing what they're doing. If there are multiple spies on a mission, they all sabotage it.
This probably wouldn't be very fun to actually play as a group (and you'd need a GM, or some other way to find out how many spies were on a mission without the spies themselves knowing). But it seems clear that, provided all players are good at acting, this is a variant which is strictly harder for the spies to win. Their strategy in this game is available in plain Resistance (always act like a resistance member in public, but sabotage all missions), but some options have been removed, and the resistance members know that this is their strategy.
This game is also a lot easier to analyse than the original. Everyone has the same information, so there's no need to vote on teams. In fact, we can think of it as a one-player game against a random opponent; all the opponent does is pick the spies, and the player needs to pick teams to execute three successful missions.
My hypothesis is that for many game sizes, the spies will have considerably greater than even odds of winning. To be less vague, I expect the spies to have a disadvantage in five-player games at least1, but for six I'm not sure which way it would go, and by eight I expect the spies to be usually winning. (I also expect the game generally gets harder for the resistance as you add players, except that nine players is obviously easier than eight. Having one more exception wouldn't surprise me much, but two would. (If ten is easier than eight, that doesn't count as an exception, because ten is clearly harder than nine. However, it would also surprise me a little if ten was easier than eight.))
1. Although I don't remember specifically predicting this prior to beginning the analysis below. I got as far as "40% of the time, the resistance wins just based on the first round" before I started to write this post.
Note that if I'm wrong, that doesn't mean the original Resistance is unbalanced, although it's evidence against it (strength depending on just how badly wrong I am); but if I'm right, the Resistance certainly is unbalanced.
For people not familiar with the original game, here are the rules of Resistance-B. You have a specified number of players2, some of whom are "resistance members" (good guys) and some of whom are "government spies" (bad guys). A third of the players, rounded up, are spies, but you don't know which. The game progresses over up to five rounds. In each round, you select a specified number of your players, and find out how many of them were spies. If none of them were spies, you win that round. If any of them were spies, you lose that round. Your goal is to win three rounds. The number of players selected in each round depends on the total number of players, and the round number (the table is given on wikipedia). In addition, for seven players and up, round four is only lost if you select two or more spies, rather than one or more.
2. The way I'm presenting the game, these aren't real players. You are "the player", and the game contains fictional players whom you control.
Resistance-B is easy to analyse case-by-case for five players (three resistance, two spies). With no information, we select the starting team (two players) randomly. There are three equivalence classes of outcomes:
- No spies ($p = { 3 \choose 2 } / { 5 \choose 2 } = 0.3$). In this case, the resistance always wins. Mission three also only takes two members, so it's a guaranteed success. On each of missions 2, 4 and 5, take one of the remaining players until you find the one who isn't a spy.
- Two spies ($p = 1 / { 5 \choose 2 } = 0.1$). Now we know exactly who the spies are. Easy win.
- One spy ($p = 0.6$, by elimination). We have players A through E, and precisely one of A and B is a spy, and precisely one of C, D, E is a spy. For mission two, our choices are (wlog) ABC and ACD. (CDE is stupid, because we gain no information and are guaranteed the mission will fail).
- ABC is a guaranteed fail, but
- If we get two failure cards ($p = 1/3$), we know C is a spy. We can't afford to fail any more missions, and we can't distinguish between A and B. So we send AD on mission 3; if it succeeds ($p = 1/2$) we win, if it fails ($p = 1/2$) we lose.
- If we get one failure card ($p = 2/3$), we know C is not a spy. One of AB, and one of DE, is a spy; we can't afford to send any more spies on missions, so we have a $1/4$ chance of winning.
So ABC gives us a $1/3$ chance of winning.
- ACD has a ${1 \over 2} \cdot {1 \over 3} = 1/6$ chance of succeeding, but if it does we know who the spies are and win. It has a ${1 \over 2} \cdot {2 \over 3} = 1/3$ chance of getting two failure cards; now we know BE are good, A is bad, and have a 50/50 chance of selecting between CD to win. And there's a $1/2$ chance of getting one failure card.
In this case, $1/3$ of the time, AE are spies and BCD are not. $2/3$ of the time, the spies are B and one of CD. Again, we can't afford to fail any more missions. So we can choose either BCD or ACE to go on the remaining missions, and we'll have a $1/3$ chance of winning.
So ACD has a ${1 \over 6} + \left({1 \over 3} \cdot {1 \over 2}\right) + \left({1 \over 2} \cdot {1 \over 3}\right) = 1/2$ chance of winning.
So if there's one spy amongst AB, we select team ACD for mission two, and win half the time.
In total then, we win the five-player game 70% of the time. That's not surprising.
The six-player analysis could probably be done analogously without too much trouble, since there are still only two spies. But the mission sizes go 2-3-4-3-4, which means that if we select no spies in the first mission, it's not obvious that we win, and if we select one spy we also have the choice of selecting CDE for mission two... the decision tree gets a lot more complicated, and I'd rather try to solve this problem intelligently (to work for game sizes 7+ as well). I haven't done this yet, though; it's a job for future-me.
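In the meantime, the five-player game is small enough to check by brute force. Here's a sketch of a memoized expectimax in python (the names and structure are mine; it hard-codes the 2-3-2-3-3 mission sizes used above, and relies on the posterior over spy pairs staying uniform on whatever is consistent with the observed failure counts, which holds because the prior is uniform):

from itertools import combinations

PLAYERS = tuple(range(5))
MISSION_SIZES = (2, 3, 2, 3, 3)  # five-player team sizes, missions 1 to 5
SPY_SETS = tuple(frozenset(c) for c in combinations(PLAYERS, 2))

memo = {}

def win_prob(round_no, wins, possible):
    # Best achievable win probability, with a uniform posterior over
    # `possible`: the spy pairs consistent with everything seen so far.
    losses = round_no - wins
    if wins == 3: return 1.0
    if losses == 3 or round_no == 5: return 0.0
    key = (round_no, wins, possible)
    if key in memo: return memo[key]
    best = 0.0
    for team in combinations(PLAYERS, MISSION_SIZES[round_no]):
        team = frozenset(team)
        by_count = {}  # spies observed on this team -> consistent spy pairs
        for spies in possible:
            by_count.setdefault(len(spies & team), []).append(spies)
        ev = sum(float(len(g)) / len(possible) *
                 win_prob(round_no + 1, wins + (n == 0), frozenset(g))
                 for n, g in by_count.items())
        best = max(best, ev)
    memo[key] = best
    return best

print(win_prob(0, 0, frozenset(SPY_SETS)))  # 0.7

This agrees with the 70% figure above, and changing the two constants at the top handles six players as well.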
Posted on 29 March 2014
A photo from a (different) recent LW London meetup
Cross-posted to LessWrong
I wasn't going to bother writing this up, but then I remembered it's important to publish negative results.
LessWrong London played a few rounds of paranoid debating at our meetup on 02/02/14. I'm not sure we got too much from the experience, except that it was fun. (I enjoyed it, at any rate.)
There were nine of us, which was unwieldy, so we split into two groups. Our first questions probably weren't very good: we wanted the height of the third-highest mountain in the world, and the length of the third-longest river. (The groups had different questions so that someone on the other group could verify that the answer was well-defined and easy to discover. I had intended to ask "tallness of the third-tallest mountain", but the wikipedia page I found sorted by height, so I went with that.)
I was on the "river" question, and we did pretty badly. None of us really knew what ballpark we were searching in. I made the mistake of saying an actual number that was in my head but I didn't know where from and I didn't trust it, that the longest river was something like 1,800 miles long. Despite my unconfidence, we became anchored there. Someone else suggested that the thing to do would be to take a quarter of the circumference of earth (which comes to 6,200 miles) as a baseline and adjust for the fact that rivers wiggle. I thought, that's crazy, you must be the mole. I think I answered 1500 miles.
In reality, the longest river is 4,100 miles, the third longest is 3900 miles, and the mole decided that 1800 was dumb and he didn't need to do anything to sabotage us. (I don't remember what the circumference-over-four person submitted. I have a recollection that he was closer than I was, but I also had a recollection that circumference-over-four was actually pretty close, which it isn't especially.)
The other team did considerably better, getting answers in the 8,000s for a true answer of 8,600.
I'd devised a scoring system, where every player submits their own answer, and non-moles score proportional to $-\left|\log\left({\text{given answer} \over \text{true answer}}\right)\right|$; the mole scores the negative mean of everyone else's score. But after calculating it for a few people, we decided we didn't really care, and we probably wouldn't be playing enough rounds for it to become meaningful.
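For concreteness, here's that scoring rule as python (a minimal sketch; the function names are mine, and the example numbers are from the river round):

import math

def player_score(answer, truth):
    # Minus the absolute log-ratio: 0 for a perfect answer, and a factor of
    # k too high scores the same as a factor of k too low.
    return -abs(math.log(float(answer) / truth))

def mole_score(other_scores):
    # The mole scores the negative mean of everyone else's scores.
    return -sum(other_scores) / len(other_scores)

print(player_score(1500, 3900))  # my river answer: about -0.96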
Those questions weren't so great, because we felt there wasn't much you could do to approach them beyond having some idea of the correct answer. For round two we tried to pick questions more amenable to Fermi estimates: annual U.S. electricity consumption (sourced from Wolfram Alpha), and the number of pennies that could fit inside St. Paul's Cathedral. This round, we gave the correct answers to the moles.
I was on team Cathedral, and again did pretty badly. We started by measuring pennies using notepaper for scale, and calculating packing densities, to work out pennies per cubic metre. (I don't remember the answer, but we got pretty close.) But after that it was just a matter of knowing how large St. Paul's Cathedral was likely to be.
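The geometry half of that calculation goes something like this (the penny dimensions here are my assumptions, roughly right for a UK 1p, not the numbers we measured on the day):

import math

diameter_mm, thickness_mm = 20.0, 1.6  # assumed penny dimensions
volume_mm3 = math.pi * (diameter_mm / 2) ** 2 * thickness_mm

packing = 0.9  # assumed: coins in neat, hexagonally-packed stacks
print(int(packing * 1e9 / volume_mm3))  # roughly 1.8 million per cubic metre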
I had been stood outside St. Paul's Cathedral a few weeks back, but mostly facing in the wrong direction while tourists took photos of the group I was with. From that vague recollection I thought maybe it was about a thousand square metres at the base, and four storeys, so about twelve metres high? (Later I looked at a picture of the Cathedral, and realized that I was probably thinking of the dimensions of the entrance hall.) Someone else, who had actually been inside the cathedral, was giving much higher numbers, especially for the base, and someone else was agreeing with his general ballpark. And the other people weren't saying much, so I figured one of those two had to be the mole, and decided to not update very much in that direction. Those numbers were pretty reasonable, and mine were pretty bad. One of them was the mole (not the one I most suspected); I don't remember what he said his strategy was.
Again, I'm not too sure it was a great question; pennies-per-cubic-metre is a question of geometry rather than estimation, and the interior dimensions of St. Paul's Cathedral don't seem much more Fermi estimable than the river question.
The other team got very close to the answer I'd given the mole. Apparently they actually came up with a number off the top of someone's head that was pretty damn close. Embarrassingly, the answer I gave the mole was an order of magnitude too high... I'd sourced it from Wolfram Alpha ahead of time, but then someone asked me to clarify whether it was total energy usage, or just electricity usage. I looked again to check, and I think I used a different query, saw that it said "electricity usage", and didn't see that the number was different. The answer I actually gave was for energy usage.
The mole on that team reported that it wasn't really helpful to know the correct answer without any intermediate steps. That mechanic might be worth experimenting with further, but currently I think it doesn't add much, and it's a mild extra hassle when setting up.
I had fun, and hopefully in future I will put much less trust in my estimates of the dimensions of things, but I wouldn't say the session was a particular success. Not a failure either, just kind of "it happened".
Posted on 16 February 2014
A book sometimes cited on LessWrong as recommended reading is E.T. Jaynes' Probability Theory: The Logic of Science. I intend to write a series of posts as I read this book, summarizing the key points and solving the exercises. (There are no solutions in the book.)
The book has over 750 pages. This will take me a long time, if I finish at all. I'm not committing to finishing. For example, if this turns out not to be a thing worth doing, I hope that I will notice that and stop. I'm also not committing to any particular posting rate while I continue.
Preface
Jaynes starts by telling us the audience of his book: readers with some knowledge of applied math "at the advanced undergraduate level or preferably higher". Prior experience with statistics and probability is unnecessary and might be harmful.
Next we get a brief history of the book. The rules of probability theory have been known for centuries, but in the mid 20th century, these were discovered to be rules for statistical inference in general, unique and inescapable. You cannot violate them without producing absurdity. These discoveries interested Jaynes, and he started giving lectures on them, and eventually they turned into a book.
Jaynes contrasts his system of probability with that of Kolmogorov and de Finetti. Kolmogorov's system looks totally different, but Jaynes finds himself agreeing with it on all technical points, and considers it merely incomplete. De Finetti's system looks similar to Jaynes', but has very little agreement on technical matters. (I don't know what Kolmogorov's and de Finetti's systems actually look like.)
One point is touched on briefly, which I think will become a large theme of the book: infinite set paradoxes. Jaynes claims that these appear to come about because of failure to properly define the objects we're working with. In Jaynes' system, an infinite set can only be specified as a well-behaved limit of finite sets. Sometimes you can arrive at the same set in two different ways, and you can ask questions of it which depend on the limiting process used; but if the limiting process is specified, answers become unique and paradoxes disappear.
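A standard illustration of the sort of thing I expect he means (my example, not the book's, at this point): what fraction of positive integers are even? As a limit of the sets $\{1, \dots, n\}$, the answer is $1/2$. But list the integers as $1, 3, 2, 5, 7, 4, \dots$ (two odds, then an even, repeatedly) and take limits of initial segments of that list, and the answer is $1/3$. Same infinite set, different limiting processes, different answers; fix the limiting process and the ambiguity disappears.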
PTTLOS is sometimes hailed as a champion of Bayesian statistics over frequentist ones, and indeed we learn Jaynes' view on the dispute. He has always been Bayesian, but previously it was for philosophical reasons. Now he claims that theorems and worked-out examples demonstrate Bayesian superiority independent of philosophy. But he also says that neither is universally applicable, and that both methods fall out as special cases of Jaynes' approach.
Frequentist methods have certain problems; Bayesian methods correct these problems, but can only be applied with a certain amount of knowledge of the problem at hand. The "principle of maximum entropy" provides enough structure to use Bayesian methods when we lack knowledge.
Jaynes claims that probability theory also models certain aspects of human reasoning. For example: someone who tells the truth, to listeners who are reasoning consistently, but who is not believed; or a policy debate where discussion of the issues causes society to polarize into two camps with no middle ground, rather than bringing about a consensus as might naively be expected.
The work is not intended to be merely abstract, and Jaynes gives an example of practical applications in safety concerns. We might discover that a substance is harmful at certain doses, and it might be natural to assume a linear response curve: half the dose is half as harmful. But knowing a little about biology tells us that this will be untrue in many cases; there will be a threshold level, below which the substance is eliminated as fast as it enters the body and causes no ill effects. Using a model which ignores the possibility of threshold levels can lead to false conclusions, however good our data is.
Finally in the preface, we are told the style of presentation of the work. Most chapters in part 1 start with verbal discussion of the nature of a problem, before getting into the math (which Jaynes considers the easy part, for many students). Part 2 is more advanced, and chapters open directly with math. Jaynes places an emphasis on intuitive understanding over (but not replacing) mathematical rigor.
Chapter 1: Plausible Reasoning
1.1. Deductive and plausible reasoning.
Suppose a policeman hears a burglar alarm, sees a broken shop window, and a man in a mask climbing out of that window with a bag of jewelry. The policeman will decide that the masked man is dishonest. But this cannot come from purely deductive reasoning: it may be that the man is the owner, on his way home from a masquerade party, and that someone else broke the window and he is merely protecting his own property.
Deductive reasoning follows from two familiar syllogisms: "A implies B; A; therefore B" and "A implies B; not-B; therefore not-A". We often don't have the necessary information to apply these.
There are two weaker syllogisms which we use in inferential reasoning: "A implies B; B is true; therefore A becomes more plausible", and symmetrically, "A implies B; A is false; therefore B becomes less plausible".
(These are related to fallacies of deductive reasoning which say: "A implies B; B; therefore A" and "A implies B; not-A; therefore not-B". But deductive reasoning has only the states "certain", "impossible" and "unknown". Inferential reasoning is more subtle, and lets us say "A becomes more plausible" without saying "A becomes certain".)
The word "implies" is meant to indicate logical consequences, not physical ones. We might say: "if it will start to rain by 10 AM at the latest, then the sky will become cloudy before 10 AM". The rain at 10 AM can hardly be the cause of the clouds before 10 AM. Clouds cause rain, but uncertainly so, so we cannot say "clouds imply rain". "Rain implies clouds" is a true logical connection, but not (directly) a causal one.
And there is a still weaker syllogism which the policeman uses: "if A is true, then B becomes more plausible; B is true; therefore A becomes more plausible". Concretely, "if a man is a burglar, then it becomes more likely that he will wear a mask and climb out of a broken window carrying a bag of jewelry. He is wearing a mask and climbing out of a broken window carrying a bag of jewelry; therefore it becomes more likely that he is a burglar." The syllogism may seem weak abstractly, but we accept that the man is almost certainly a burglar.
(Jaynes does not here mention the natural sixth syllogism: "if A is true, then B becomes more plausible; A is false; therefore B becomes less plausible". But of course this is true too.)
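Jumping ahead a little (the rules themselves are only derived in chapter 2), it may help to see how the weak syllogisms fall out of the product rule, writing $p(A \mid B)$ for the eventual numerical plausibilities. If $A ⇒ B$ then $p(B \mid A) = 1$, so

$$ p(A \mid B) = {p(A) \, p(B \mid A) \over p(B)} = {p(A) \over p(B)} \ge p(A), $$

with equality only if $B$ was already certain. The policeman's syllogism is the same rearrangement: if $p(B \mid A) > p(B)$, then $p(A \mid B) > p(A)$. (I'm suppressing the background information in the notation.)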
Deductive reasoning is nice, because it never loses strength. An arbitrarily long chain of deductive inferences, if our premises are certain, will produce a certain result. Inferential reasoning does not permit this, but we will still be able to approach near-certainty in many cases, such as with the policeman and the probably-burglar.
1.2. Analogies with physical theories
Jaynes analogizes our study of common sense to that of physics. It's too complicated to learn everything all at once, but sometimes we produce a mathematical model that reproduces some features of our observations, and we consider this progress. We expect our models to be replaced by more complete ones in future. We don't expect that the most familiar aspects of mental activity will be the easiest to model.
1.3. The thinking computer
In principle, a machine can do anything that a human brain can do. Proof by existence: the human brain does it. If we don't know how to make a machine do something, it's because we don't know how to describe the thing in sufficient detail. This is also true of "thinking".
But as we study common sense, we're going to start to learn about thinking in more detail than we currently have. Whenever we make a mathematical model of some aspect of common sense, we can write a computer program to apply this model.
If we have two very different hypotheses, unaided common sense might be enough to choose one over the other. If we have a hundred similar ones, we need a computer, and we need to know how to program it. Our goal is to develop the theory that lets us program it correctly to solve such problems, in as much depth and generality as we can manage.
Talking about machines is also useful for keeping ourselves focused. The reasoning process actually used by human brains is complicated and obscure, and hard to talk about without becoming side tracked. But we're not trying to explain or emulate a human brain. We're speaking normatively, not descriptively.
Up until now we've been asking how to build a mathematical model of human common sense. A better question would be: how can we make a machine carry out plausible reasoning, following clear principles that express an idealized common sense?
1.4. Introducing the robot
To keep our attention focused, we're going to invent an imaginary robot. We design its brain as we see fit, and we see fit to make it reason according to definite rules. We choose the rules according to desiderata that seem like they would be desirable in human brains: if a rational person discovered that they were violating one of these desiderata, they would wish to revise their thinking.
Our robot reasons about propositions, and for now we restrict it to unambiguous true-or-false propositions. We don't require their truth or falsehood to be easy, or even possible, to establish with certainty, but it must be clear what truth and falsehood actually mean. For example, Jaynes considers both of these propositions to be true:
$$\begin{aligned}
A ≡ & \text{Beethoven and Berlioz never met.}\\
B ≡ & \text{Beethoven's music has a better sustained quality than that of}\\
& \text{Berlioz, although Berlioz at his best is the equal of anybody.}
\end{aligned}$$
Our robot can think about proposition $A$, although its truth or falsehood probably can't be established today. But for now, $B$ is off-limits. Later we'll see whether this restriction can be relaxed, and our robot can help with propositions like $B$. ("See chapter 18 on the $A_p$ distribution.")
1.5. Boolean algebra
At this point Jaynes briefly introduces Boolean algebra. I assume the reader is familiar with it, but I note that we denote conjunction (logical AND) by $AB$, disjunction (logical OR) by $A+B$, and denial (logical NOT) by $\bar A$. $≡$ means "equals by definition".
Also, a brief reminder of certain identities:
- Idempotence:
$AA = A+A = A$.
- Commutativity:
$AB = BA$; $A+B = B+A$.
- Associativity:
$A(BC) = (AB)C = ABC$; $A+(B+C) = (A+B)+C = A+B+C$.
- Distributivity:
$A(B+C) = AB + AC$; $A + (BC) = (A+B)(A+C)$.
- Duality:
$\overline{AB} = \bar A + \bar B$; $\overline{A+B} = \bar A \bar B$.
If $A=B$, by which we mean that $A$ and $B$ are logically equivalent, then $A$ and $B$ must also be equally plausible. This seems obvious, but Jaynes notes that Boole himself got it wrong.
As usual, $A ⇒ B$ means "$A$ implies $B$", i.e. $\bar A + B$. Remember that this is a much narrower statement than it would be in ordinary language; it doesn't mean that there is any actual connection between $A$ and $B$.
1.6. Adequate sets of operations
We have four operations (AND, OR, NOT and IMPLIES) that we can apply to propositions. We can use these to generate many new propositions, but two questions occur to us. Firstly, are there propositions defined from $A$ and $B$ that we can't represent using these operations? Secondly, can we reduce the number of operations without losing any propositions that we can currently generate?
We answer the first question no: any logical function on $n$ variables (of which there are $2^{2^n}$) can be written as a disjunction of conjunctions of arguments and their negations. Each conjunction includes every argument precisely once, either purely or negated, and is true at precisely one point of its domain. For example, $AB\bar C + \bar A \bar B C + \bar A B \bar C$. There is one semi-exception, where we have zero conjunctions (the function which is constantly false), and this can be written $A\bar A$.
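Here's a quick way to convince yourself of that construction (a sketch; the names are mine):

from itertools import product

def minterms(f, n):
    # One conjunction per input point where f is true; each conjunction
    # mentions every argument, so it's true at exactly that point.
    return [args for args in product([False, True], repeat=n) if f(*args)]

def dnf(terms):
    # The disjunction of those conjunctions reproduces f everywhere.
    return lambda *args: any(all(a == t for a, t in zip(args, term))
                             for term in terms)

# Example: rebuild f(A,B,C) = A AND (B OR NOT C) from its truth table.
f = lambda a, b, c: a and (b or not c)
g = dnf(minterms(f, 3))
assert all(f(*p) == g(*p) for p in product([False, True], repeat=3))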
We answer the second question yes: it is clear from the previous answer that IMPLIES is unnecessary, and from duality that we can do away with either (but not both) of AND and OR. We can't reduce either of the sets (AND, NOT) or (OR, NOT), but there are two operators which suffice by themselves: NAND, $A↑B = \overline{AB}$, and NOR, $A↓B = \overline{A+B}$.
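And the NAND claim is easy to check exhaustively (again a sketch):

def nand(a, b): return not (a and b)

def not_(a): return nand(a, a)                # A NAND A = NOT A
def and_(a, b): return not_(nand(a, b))       # NOT(A NAND B) = AB
def or_(a, b): return nand(not_(a), not_(b))  # A+B, by duality

for a in (False, True):
    for b in (False, True):
        assert and_(a, b) == (a and b)
        assert or_(a, b) == (a or b)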
1.7. The basic desiderata
We now move on to our extension of logic. This will follow from the conditions to be discussed in this section. We don't call them "axioms" because we're not asserting that they're true, just that they appear to be desirable goals. In chapter 2 we'll discuss whether the goals are contradictory, and whether they determine any unique extension of logic.
Our robot must assign a degree of plausibility to each proposition that it thinks about, based on the evidence it has available. When it collects new evidence, it must update these degrees of plausibility. To store these plausibility assignments in the brain, they have to be associated with some physical quantity, such as a voltage level. This means there must be some kind of association between degrees of plausibility and real numbers. Thus we have
- Desideratum (I). Degrees of plausibility are represented by real numbers.
This is motivated by physical necessity, but appendix (A) shows that it's also a theoretical necessity.
Being more specific, two other properties of this representation will be useful. Firstly, that a greater degree of plausibility is represented by a greater real number. And secondly, continuity; this is difficult to state precisely yet, so we say it intuitively: an infinitesimally greater plausibility should correspond to an infinitesimally greater number.
The plausibility assigned to a proposition will often depend on whether some other proposition is true. We use the symbol $A|B$ to denote "the conditional plausibility that $A$ is true, given that $B$ is true", or "$A$ given $B$". We also have, for example, $A | BC$, the plausibility of $A$ given that $B$ and $C$ are true; and $A+B \mid CD$, the plausibility that at least one of $A$ and $B$ is true, given that both of $C$ and $D$ are true; and so on.
(I mildly dislike this notation, myself, but we're stuck with it. I think the problem is spacing: the way it's usually typeset, $A + B | CD$ looks like it should be parenthesised $A + (B|(CD))$ rather than $(A+B)|(CD)$. I prefer to have more whitespace around operators with low precedence.)
We're not going to attempt to define constructions like $A|B$ when $B$ is impossible; for example, if we write $A|BC$, we are implicitly assuming that $B$ and $C$ are compatible.
We want our robot to think in a way which is qualitatively like the way humans try to reason, as described by the above weak syllogisms and similar ones. So suppose the robot has prior information $C$ which gets updated to $C'$ such that the plausibility for $A$ is increased, but the plausibility of $B$ given $A$ does not change:
$$ A|C' > A | C; \\
B|AC = B|AC'. $$
This can never decrease the plausibility that both $A$ and $B$ are true:
$$ AB|C' \ge AB|C. $$
This doesn't say anything about how much the plausibilities change, but it gives a sense of direction. These qualitative requirements will be stated explicitly in chapter 2; for now we sum them up as
- Desideratum (II). Qualitative correspondence with common sense.
Finally, we wish our robot to reason consistently. By this we mean
- Desideratum (IIIa). If a conclusion can be reasoned out in more than one way, then every possible way must lead to the same result.
- Desideratum (IIIb). The robot always takes into account all of the evidence it has relevant to a question. It does not arbitrarily ignore some of the information, basing its conclusions only on what remains. In other words, the robot is completely nonideological.
- Desideratum (IIIc). The robot always represents equivalent states of knowledge by equivalent plausibility assignments. That is, if in two problems the robot's state of knowledge is the same (except perhaps for the labeling of the propositions), then it must assign the same plausibilities in both.
Desiderata (I), (II) and (IIIa) are structural requirements on the robot's brain, while (IIIb) and (IIIc) show how the robot should relate to the outside world.
It turns out that these are all the desiderata we need. There is only one set of mathematical operations for manipulating plausibilities which satisfies all of them; they uniquely determine the rules by which our robot must reason. These rules will be deduced in chapter 2.
1.8. Comments
Our robot's mental state about any proposition is going to be represented by a real number. A human's mental state is much more complicated: we judge propositions as being plausible, desirable, amusing, etc. A human's mental state might be better represented as a many-dimensioned vector of real numbers.
Unemotional propositions like "the refractive index of water is less than 1.3" can be represented with fewer dimensions than propositions like "your mother-in-law just wrecked your new car". The situations we encounter in real life are often the ones requiring many coordinates. This may help explain why human reasoning about such situations is hard to model; and why math and science (dealing with propositions that generate simple mental states) are so successful.
Many of these coordinates are not useful to us. We don't want our robot to get bored with a lengthy problem, or to get confused by emotional factors. But there is a large unexplored area of possible generalizations of the theory that we'll be developing, which could more accurately model actual human brains.
1.8.1. Common language vs. formal logic
Ordinary language, if used carefully, does not need to be less precise than formal logic. But it's more complicated, giving it richer possibilities of expression. In particular, it has many ways to imply something without saying it, which are lost on formal logic. The claim "I believe what I see" and the retort "he doesn't see what he doesn't believe" would have the same meaning in formal logic, but convey opposite meanings in common language.
Another example is that the word "is" is not commutative in common language. This example is taken from a math textbook: consider a straight line in the plane, and an infinite set of points in the plane, and the projections of the points onto the line. Then the statements
- The projection of the limit is the limit of the projections
- The limit of the projections is the projection of the limit
are not considered equivalent. The projections may have a limit while the points themselves do not (but not vice versa). The first statement implicitly asserts that the points have a limit, and is true conditional on that premise; the second implicitly asserts only that the projections have a limit, and is false.
We can also distinguish between two different senses of the word "is". The epistemological sense, "the room is noisy", expresses something about the speaker's perception. The ontological sense, "there is noise in the room", asserts the physical existence of something. Mistaking one's thoughts and sensations for external realities is called the "mind projection fallacy", and causes much trouble in probability theory and elsewhere.
(This is Jaynes' way of reminding us that the map is not the territory. It is not meant to imply that one's thoughts and sensations are completely unrelated to the real world.)
We will not attempt to make our robot grasp common language.
1.8.2. Nitpicking
Sometimes people question the second strong syllogism, "A implies B; not-B; therefore not-A". But Jaynes is happy to use it, noting among other things that if we exhibit a counterexample to a supposed theorem, then the supposed theorem is considered disproved. A new logic might lead to results which Aristotelian logic can't talk about, and that's just what we're trying to create here. But if a new logic was found to disagree with Aristotelian logic, then we would consider that a fatal flaw in the new logic.
There are attempts to develop multiple-value logics. But arguments presented in appendix A suggest that there is no new content in these. An $n$-valued logic applied to a set of propositions is equivalent to a two-valued logic applied to a larger set; or it is inconsistent.
Posted on 02 February 2014
There's a fairly well-known (as these things go) IOCCC entry ("westly.c") to calculate pi. It looks like this:
#define _ F-->00 || F-OO--;
long F=00,OO=00;
main(){F_OO();printf("%1.3f\n", 4.*-F/OO/OO);}F_OO()
{
_-_-_-_
_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_
_-_-_-_
}
This prints 3.141, but you could get more digits by increasing the size of the circle (and changing the printf call).
I recently decided to port this to python. Here's the result:
class F:pass
s=F();_=F();
a=[];3;d=[];
A=(lambda:a.
append(1) or
s);D=(lambda
:d.append(1)
or s);l=len;
_.__neg__=(#
(lambda: _))
_.__sub__=(#
lambda x: (D
() and A()))
s. __sub__=\
lambda x:A()
-_
_-_-_-_
_-_-_-_-_-_
-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_
-_-_-_-_-_-_-_-_-_-_
-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_
-_-_-_-_-_-_-_-_-_-_-_
-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_
-_-_-_-_-_-_-_
_-_-_-_-_-_
_-_-_-_
-_
print"%.3f"%(8.*l(a+d)/l(d)**2)
The rest of this post is spoilers, so stop reading if you'd like to figure out what's going on for yourself.
How it works
The original C version works by counting the area of the circle in variable F, and its diameter in variable OO. Then it calculates $ π = {4A \over d^2} $, and prints this to 3 decimal places.
In more detail, _ is #defined to the statement F-->00 || F-OO--;. Since F is never positive in this program, by default this decrements both F and OO. But _ mostly appears in the context of -_, which becomes -F-->00 || F-OO--;, which will only decrement F (since F is negative whenever this statement occurs).
So the diameter of the ascii circle is counted as the number of lines, and its area is counted as the total number of _s it contains.
The python version uses a similar algorithm, but implemented differently. There's no preprocessor, so instead I use operator overloading. _ is a variable. Lines like _-_-_ count the total number of _s minus one into variable a, and count one into variable d. Then we use $ π = {8(a+d) \over d^2} $. 8 because it's only a semicircle, and $(a+d)$ because a is missing one _ per line.
How it does that is easy enough. _ has a __sub__ method which increments both a and d, and returns a new variable, s. s has a __sub__ method which just increments a and returns s. So a line like _-_-_-_ becomes s-_-_ becomes s-_ becomes s, and on the way it counts 1 into d and 3 into a.
We can also start a line with a -. -_ is defined to just return _, so it doesn't change anything. The top and bottom lines, -_, are just there for aesthetic reasons, they don't do anything at all.
As an implementation detail, a and d are actually lists. You can't call a += 1 in a python lambda, because python's lambdas are crippled (limited to a single expression, whereas a+=1 is a statement). So I have functions
A = lambda: a.append(1) or s
D = lambda: d.append(1) or s
which are slightly shorter, easier to lay out, and I think more in the spirit of the thing, than writing def A(): global a; a+=1; return s. (There's also some historical motivation. I originally used I = lambda x:x.append(1) or s and called I(a) and I(d). But I had difficulty fitting it into the square block, and in the course of experimentation ended up with what I have now. I could have gone with a and d as one-element arrays and def I(x): x[0]+=1; return s. But, whatever.)
Then in the final line, l(a+d) is just len(a+d), and summing arrays concatenates them so that's the same as len(a)+len(d). And l(d) is just len(d).
Here's an unobfuscated version of the header:
class F:pass
s = F()
_ = F()
a = []
d = []
A = lambda: a.append(1) or s
D = lambda: d.append(1) or s
l=len
_.__neg__ = lambda: _
_.__sub__ = lambda x: D() and A()
s.__sub__ = lambda x: A()
Incidentally, I only have one code block, class F:pass. When I decided to lay the header out in a rectangle, because yay geometric shapes, that line forced the width because I can't put anything before or after it. I wanted to go with F = type('', (), {}). But for some reason, that makes subtraction throw a TypeError:
>>> class F: pass
...
>>> x = F()
>>> x.__sub__ = lambda x: 3
>>> x-x
3
>>> G=type('',(),{})
>>> y = G()
>>> y.__sub__ = lambda x: 3
>>> y - y
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for -: '' and ''
I don't know why that happens.
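My best guess: in python 2, class F: pass creates an old-style class, and old-style instances look up operators on the instance itself. type('', (), {}) always creates a new-style class, and new-style classes look up special methods like __sub__ on the type, ignoring instance attributes. Consistent with that, putting __sub__ on the class instead makes the subtraction work:
>>> G = type('', (), {})
>>> G.__sub__ = lambda self, other: 3
>>> y = G()
>>> y - y
3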
You might be wondering, if I'm using the same algorithm as the C version, why is my semicircle so much larger than the original circle? That's an annoying quirk of mathematics.
With a circle, we have $ π = { 4a \over d^2 } $. For fixed $d$, we need an area $ a = { πd^2 \over 4 } $. But our area is an integer. So for fixed $d$, we can get an upper and lower bound on $π$, and we choose to display one of these. As we increase $d$, the bounds get tighter, but the lower bound doesn't move monotonically up, and the upper bound doesn't move monotonically down. $ d=16 $ is the first time that either bound gives us three decimal places, with an area of 201.
I had to go with a semicircle, because of significant whitespace, so I instead need an area $ a = { πd^2 \over 8 } $. With a diameter of 16, I could choose between an area of 100 or 101. These give π as 3.125 or 3.15625, which aren't even correct to two decimal places. If I wanted three decimal places, I needed a diameter of 32 and an area of 402. (My semicircle is actually 34 lines tall, but as noted above, the first and last lines don't do anything.)
Pedants will note that π rounded to three decimal places should actually be 3.142. I could have achieved that with a diameter of 34 and an area of 454. With a full circle, we could go for diameter 17 and area 227.
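These sizes are easy to find with a quick search (a sketch; the function names are mine):

import math

def best_area(d, k):
    # k = 4 for a full circle, 8 for a semicircle; take whichever integer
    # bound on the area gives the better estimate of pi.
    lo = int(math.pi * d * d / k)
    return min((lo, lo + 1), key=lambda a: abs(float(k) * a / d / d - math.pi))

def first_diameter(k, target):
    # Smallest diameter whose best area prints the target to 3 decimal places.
    d = 1
    while "%.3f" % (float(k) * best_area(d, k) / d / d) != target:
        d += 1
    return d, best_area(d, k)

print(first_diameter(4, "3.141"))  # (16, 201)
print(first_diameter(8, "3.141"))  # (32, 402)
print(first_diameter(4, "3.142"))  # (17, 227)
print(first_diameter(8, "3.142"))  # (34, 454)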
Posted on 18 January 2014
Cross-posted to Less Wrong.
It feels like most people have a moral intuition along the lines of "you should let people do what they want, unless they're hurting other people". We follow this guideline, and we expect other people to follow it. I'll call this the permissiveness principle, that behaviour should be permitted by default. When someone violates the permissiveness principle, we might call them a fascist, someone who exercises control for the sake of control.
And there's another moral intuition, the harm-minimising principle: "you should not hurt other people unless you have a good reason". When someone violates harm-minimisation, we might call them a rake, someone who acts purely for their own pleasure without regard for others.
But sometimes people disagree about what counts as "hurting other people". Maybe one group of people believes that tic-tacs are sentient, and that eating them constitutes harm; and another group believes that tic-tacs are not sentient, so eating them does not hurt anyone.
What should happen here is that people try to work out exactly what it is they disagree about and why. What actually happens is that people appeal to permissiveness.
Of course, by the permissiveness principle, people should be allowed to believe what they want, because holding a belief is harmless as long as you don't act on it. So we say something like "I have no problem with people being morally opposed to eating tic-tacs, but they shouldn't impose their beliefs on the rest of us."
Except that by the harm-minimising principle, those people probably should impose their beliefs on the rest of us. Forbidding you to eat tic-tacs doesn't hurt you much, and it saves the tic-tacs a lot of grief.
It's not that they disagree with the permissiveness principle, they just think it doesn't apply. So appealing to the permissiveness principle isn't going to help much.
I think the problem (or at least part of it) is, depending how you look at it, either double standards or not-double-enough standards.
I apply the permissiveness principle "unless they're hurting other people", which really means "unless I think they're hurting other people". I want you to apply the permissiveness principle "unless they're hurting other people", which still means "unless I think they're hurting other people".
Meanwhile, you apply the permissiveness principle unless you think someone is hurting other people; and you want me to apply it unless you think they're hurting other people.
So when we disagree about whether or not something is hurting other people, I think you're a fascist because you're failing to apply the permissiveness principle; and you think I'm a rake because I'm failing to apply the harm-minimisation principle; or vice-versa. Neither of these things is true, of course.
It gets worse, because once I've decided that you're a fascist, I think the reason we're arguing is that you're a fascist. If you would only stop being a fascist, we could get along fine. You can go on thinking tic-tacs are sentient, you just need to stop being a fascist.
But you're not a fascist. The real reason we're arguing is that you think tic-tacs are sentient. You're acting exactly as you should do if tic-tacs were sentient, but they're not. I need to stop treating you like a fascist, and start trying to convince you that tic-tacs are not sentient.
And, symmetrically, you've decided I'm a rake, which isn't true, and you've decided that that's why we're arguing, which isn't true; we're arguing because I think tic-tacs aren't sentient. You need to stop treating me like a rake, and start trying to convince me that tic-tacs are sentient.
I don't expect either of us to actually convince the other, very often. If it was that easy, someone would probably have already done it. But at least I'd like us both to acknowledge that our opponent is neither a fascist nor a rake, they just believe something that isn't true.
Posted on 04 January 2014
If you hang out on /r/scifi for long enough, you'll hear someone say that Star Wars is not science fiction. It's fantasy in space, or space opera, or whatever, but it isn't science fiction.
(The example that inspired this post, nine months ago, was this thread.)
And, well, they're wrong.
I want to call it a "no true Scotsman" fallacy. Unfortunately, there's no good definition of science fiction that I can point to and say "whatever Star Wars has or doesn't have, that you think makes it not science fiction, is not part of the definition".
Worse, the person in question usually has a definition of science fiction. (In this case, "science fiction embraces change".) And indeed, it's a definition that Star Wars does not attain.
But what this should tell us is not "Star Wars is not science fiction". It's "your definition of science fiction is broken". Consider:
"No Scotsman takes sugar with their porridge!"
"But my uncle Hamish takes sugar with his porridge."
"No, sorry, I was unclear. I define a Scotsman as a person of Scottish citizenship who does not take sugar with their porridge."
It's a bit ridiculous, isn't it?
Star Wars isn't as bad as that, for (at least) two reasons. Since there's no agreed-upon definition of science fiction, choosing your own is more reasonable than choosing your own definition of Scotsman. (Though, I wonder just how well-defined "Scotsman" is. Is the important factor citizenship, birth, personal identity? A mix? Is a "Scotsman" actually necessarily male, with "Scotsperson" being gender neutral? Still, it's better defined than science fiction.)
The other reason is that the proposed definition of "Scotsman" is a silly definition even to consider. Why are you talking about the conjunction of "Scottish citizenship" and "doesn't take sugar with their porridge"? Is there anything you can say about "people of Scottish citizenship who do not take sugar with their porridge", that doesn't follow trivially from what you know of "people of Scottish citizenship" and "people who take sugar with their porridge"? Whereas the proposed definition of science fiction is often quite a reasonable definition of something that we might want to talk about.
Just not a reasonable definition of science fiction: it doesn't include Star Wars.
I don't think that it's bad, in general, to take a poorly-defined concept and attempt to give it a rigorous definition. And sometimes we might find that in doing so, there are some things that we've been thinking of as frantles that don't seem to fit in very well with all the other things that we think of as frantles, and maybe we should stop calling them frantles. I think something like that happened with the question of whether or not 1 is a prime number, for example.
But I think that if you do that to science fiction, and eliminate Star Wars, you've done something wrong.
My reasoning for this is basically "Star Wars is obviously science fiction, duh". But a more compelling argument is: if you eliminate Star Wars, you're bound to be eliminating a whole load of other stuff. For example, if the Force disqualifies Star Wars for being too magical, then presumably Stranger in a Strange Land doesn't count either. It's been a while since I read the 2001 series, but I think David Bowman would disqualify those as well.
There are stories that I consider science fiction that I wouldn't make this complaint about. If someone's definition excluded Ted Chiang's Seventy-Two Letters, that would be fine. But when you say that Star Wars isn't science fiction, you're no longer speaking the same language as your audience, unless your audience is composed mostly of the sort of people who try to come up with rigorous definitions of science fiction.
It's also important to note that whether or not a work counts as science fiction is fundamentally unimportant. If you read a book, and you're not sure whether it counts as science fiction, it's not because you're ignorant about the book. Learning that it is or isn't science fiction won't teach you anything new.
But still. Communicating clearly is important, and if you talk about science fiction and don't intend to include Star Wars, you're failing at that.
Posted on 22 December 2013
Chicken is a well-known two-player game in which the players get inside different cars, drive towards each other, and hope that the other player swerves out of the way. If both players swerve, they both lose small amounts of charisma. If one player swerves and the other doesn't, the swerver loses a slightly larger amount of charisma, and the non-swerver gains a small amount. If neither player swerves, both lose a large (occasionally fatal) amount of HP, and a small amount of charisma.
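For concreteness, here's a toy payoff matrix consistent with that description (the numbers are mine, and nothing below depends on them):

# (row player, column player) payoffs; "straight" means not swerving.
PAYOFFS = {
    ("swerve",   "swerve"):   (-1,  -1),   # both lose a little charisma
    ("swerve",   "straight"): (-3,   1),   # swerver loses more; the other gains a bit
    ("straight", "swerve"):   ( 1,  -3),
    ("straight", "straight"): (-20, -20),  # crash: a large HP loss, plus charisma
}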
It is sometimes said[citation needed] that there is an optimal strategy to playing Chicken: after pointing your car at your opponent, remove your steering wheel. Now you can't swerve, so she knows that she can either swerve or crash.
Hearing this strategy, you head down to your local Chicken field, armed with your trusty screwdriver, and eager for a game. You quickly find an opponent. The two of you perform the traditional squaring-off ritual, enter your respective vehicles, and wheel around to face each other. You take your screwdriver in hand, smile sweetly at your opponent... and realise that she has had the same idea, and is already removing her steering wheel.
Are you still going to remove your own wheel? Or are you going to swerve?
It's easy to think that Chicken can turn into a game of "who can precommit first". We race to remove our steering wheels; if I win, then you have to swerve or crash, and if you win, then I have to swerve or crash.
But what if I put on a blindfold? Now if you remove your steering wheel, I won't know about it. What good does it do you then? Even if you always remove your steering wheel, assuming your opponent doesn't remove theirs first - and even if I know that you do that, and you know that I know, and so on - do I know that you're still going to do it when I can't see you do it? You'd like me to think that you will, but how do you convince me? And even if you do convince me, I can still remove my own steering wheel, and if I do it while blindfold I won't know if you beat me to it.
With a blindfold, I'm not precommitting not to swerve, although I can do that as well if I choose. I'm precommitting to ignore your precommitments. (Sort of. I'm precommitting not to receive any sensory input about your precommitments. I can still react to your precommitments, e.g. if both you and my model of you precommit despite me not seeing you do it.)
Of course, you can also put on a blindfold. So maybe we race to do that? But then I can tell you before the game that even if you put on a blindfold, I'm still removing my steering wheel. And you say there's no reason for me to do that, so you don't believe that I will. And I say, "try me". If I've said this in front of a large audience, the cost to me of swerving just went up. Moreso if some of the audience are potential future blindfold opponents.
And you can say the same thing to me. But this meta-level is different. If you've removed your steering wheel, you can't swerve. If you've blindfolded yourself, you can't see me precommit. Those are credible precommitments. There's no object-level reason for me to copy you after you've already done it. But just telling me of your precommitment is far less credible: you can always change your mind.
So if you tell me that you're going to ignore my precommitment to ignore your precommitment, I can tell you the same thing, hoping to change your mind.
Assuming neither of us claims to back down, what happens now? We face off against each other, and if I blindfold myself, I don't know whether you're going to do the same, and I don't know whether you're going to remove your steering wheel, and you know I don't know, and I know you know I don't know... and it seems to me that we're back at the original game.
(Incidentally, I've ignored the possibility that a steering wheel might be replaceable; or that a blindfold can be removed, or light enough to see through from the inside. These possibilities make your precommitments less credible.)
Posted on 07 December 2013
(Cross-posted to LessWrong.)
Human brains are bad at evaluating consequences. Sometimes we want to do something, and logically we're pretty sure we won't die or anything, but our lizard hindbrains are screaming at us to flee. Comfort Zone Expansion (CoZE) is an exercise that CFAR teaches to get our lizard hindbrains to accept that what we're doing is actually pretty safe.
Roughly it involves two steps. One: do something that makes our lizard hindbrains get pretty antsy. Two: don't get eaten as a result.
I organised a CoZE exercise for LessWrong London on September 1st. We had a total of eight participants; I think I was the only one who'd done any structured CoZE before.
My plan was: we meet at 11am, have a discussion about CoZE, play some improv games to warm up, and then head to the nearby mall for 1.30 pm. In reality, the discussion started closer to 12pm, with some people showing up part way through or not until it was finished.
After finishing the discussion, we didn't end up doing any improv games. We also became slightly disorganised; we agreed on a meeting time an hour and a half in the future, but then our group didn't really split up until about twenty minutes after that. I could have handled this better. We got distracted by the need for lunch, which I could have made specific plans for. (Ideally I would have started the discussion after lunch, but shops close early on Sunday.)
Report
My solo activities went considerably less well than I'd expected. My first thing was to ask a vendor in the walkway for some free chocolate, which annoyed her more than I'd expected. Maybe she gets asked that a lot? It was kind of discouraging.
After that I wanted to go into a perfume shop and ask for help with scent, because I don't know anything about it. I wandered past the front a couple of times, deciding to go when the shop was nearly empty, but then when that happened I still chickened out. That, too, was kind of discouraging.
Then I decided to get back in state by doing something that seemed easy: making eye contact with people and smiling. It turns out that "making eye-contact" is a two-player game, and nobody else was playing. After some minutes of that I just gave up for the day.
In my defense: I had a cold that day and was feeling a bit shitty (while wandering near the perfume shop I got a nose-bleed, and had to divert to the bathroom temporarily), and that might have drained my energy. I did also get a few minor victories in. The most notable is that I did some pull-ups on some scaffolding outside, and someone walking past said something encouraging like "awesome". (I'd like to do these more often, but the first time I tried there was ick on the scaffolding. There wasn't this time, so I should collect more data.)
[[I spoke with Critch from CFAR a few days afterwards, and he gave me a new perspective: if I go in expecting people to respond well to me, then when they don't, that's going to bother me. If I go in expecting to annoy people, but remembering that annoying people, while bad, doesn't correspond to any serious consequences, then it's going to be easier to handle. For any given interaction, I should be trying to make it go well, but I should choose the interactions such that they won't all go well. (Analogy: in a game of Go, the stronger player might give a handicap to the weaker player, but once play starts they'll do their best to win.)
He also gave me a potential way to avoid chickening out: if I imagine myself doing something, and then I try to do it and it turns out to be scary, then that feels like new information and a reason to actually not do it. If I imagine myself doing something, being scared and doing it anyway, then when it turns out to be scary, that no longer counts as an excuse. I haven't had a chance to try this yet.]]
Other people had more success. We'd primed ourselves by talking about staring contests a lot previously, so a few people asked strangers for those. I think only one stranger accepted. Trying to get high-fives was also common; one person observed that he sometimes does that anyway, and has a much higher success rate than he did in the mall. One person went into a high-end lingerie store and asked what he could buy on a budget of £20 (answer: nothing). And of course there were several other successes that I've forgotten. I got the impression that most people did better than me.
There was interest in doing this again. At the time I was hesitant but realised that I would probably become less hesitant with time. I've now reached a point where I, too, would quite like to do it again. We haven't got any specific plans yet.
Things to take away:
- Have some well-defined way to transition from "about-to-start" to "started".
- Having an audience makes some things much easier. This is potentially a way to escalate difficulty slowly.
- When I did CoZE at the CFAR workshop I was hanging out with someone, who encouraged me to actually do things as well as provided an audience. At the time I wondered whether the fact that we were together meant I did less, but I did far more with her than by myself.
- We didn't do anything beforehand to get in state, like improv games. I can't say whether this would have helped, but it seems like it might have done.
- We had a nonparticipant volunteer to be a meeting point if participants wanted to finish early. If she hadn't been there, I might not have quit twenty minutes early, but given that I did, it was nice to be able to hang out. This might be a good way to enable longer sessions.
An objection
During the discussion about ethics, someone brought up an objection, which I think cashes out as: there are good places to expand our comfort zones into, there are bad places, and there are lame places. A lot of the stuff on the recommended list of activities is lame (do we really need to be better at asking people for staring contests?), and it's not clear how much it generalises to good stuff. Under the novice-driver line of thought, bothering people is an acceptable cost of CoZE; but if we're using CoZE for lame things, the benefits become small, and maybe it's no longer worth the cost.
I guess this boils down to a question of how much the lame stuff generalises. I'm optimistic; for example, it seems to me that a lot of the lame stuff is going to be overcoming a feeling of "I don't want to bother this person", which is also present in the good stuff, so that particular feature should generalise. (It may also be that the lame stuff is dominated by the good stuff, so there's no reason to ever practice anything lame; but that seems a sufficiently complicated hypothesis to be low prior.)
(There's also a question of whether or not people are actually bothered by strangers asking for staring contests. My initial assumption was not, but after doing the exercise, I'm not sure.)
Posted on 18 November 2013
|
Comments
(Fri, 6 Dec 2013: Importing this post from its original home as a gist.)
This post is an attempt to calculate e to a given precision in bash, a challenge given in a job listing that I saw recently. I kind of got nerd sniped. I wrote this as I went along, so there may be inconsistencies.
First attempt
The obvious method to compute e is as the infinite sum of 1/n!, n from 0 to ∞. This converges quickly, but how far do we have to calculate to get the n'th digit of e? We can deal with that later.
We obviously need a factorial function.
fac() {
if [[ $1 -eq 0 ]]; then
echo 1
else
echo $(( $1 * `fac $(($1 - 1))` ))
fi
}
Since we only have integer division, we obviously can't calculate 1/2!. But note that x/a! + y/(a+1)! = (x(a+1) + y)/(a+1)!. We can use this to recursively calculate (sum 1/k! [0 ≤ k ≤ n]) as numer(n)/n!, where numer(0) = 1 and numer(k) = k*numer(k-1) + 1.
numer() {
if [[ $1 -eq 0 ]]; then
echo 1
else
echo $(( $1 * `numer $(($1 - 1))` + 1 ))
fi
}
So now we can calculate the partial sums. Since we still only have integer division, we need to multiply them by a power of 10 to get rid of the decimal point.
nthsum() {
echo $(( 10**($1-1) * `numer $1` / `fac $1` ))
}
Note that this fails for n=0 (10**-1 isn't allowed), but we can live with that.
So this kind of works:
$ for i in `seq 1 15`; do nthsum $i; done
2
25
266
2708
27166
271805
2718253
27182787
271828152
2718281801
27182818261
2252447557
-1174490000
104582974
1946803
Up to n=11, we accurately calculate at least the first n-3 digits of e (usually the first n-2). For n=12 and above, we get integer overflows.
It doesn't look like we can go very far with this: the numbers we're working with are simply too large.
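(To see the limit directly: bash arithmetic is signed 64-bit, and it wraps around silently rather than signalling an error.)
$ echo $(( 2**63 - 1 ))
9223372036854775807
$ echo $(( 2**63 ))
-9223372036854775808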
Second attempt
If you google "algorithm to calculate a specific digit of e", this paper comes up: http://eprints.utas.edu.au/121/1/Calculation_of_e.pdf. It provides a simple algorithm using (mostly) integer arithmetic, implemented in ALGOL. It's simple enough to translate into bash:
ecalc() {
let n=$1
echo -n 2.
for (( j = n; j >= 2; j-- )); do
coef[j]=1
done
for (( i = 1; i <= n; i++ )); do
let carry=0
for (( j = n; j >= 2; j-- )); do
let temp=coef[j]*10+carry
let carry=temp/j
let coef[j]=temp-carry*j
done
echo -n $carry
done
echo
}
This isn't quite accurate: the original algorithm calculates m such that m! > 10^(n+1), and the loops over j go from m to 2 instead of n to 2. This means the algorithm is inaccurate for small n. (For n ≥ 27, n! > 10^(n+1) so it works out okay; for 22 ≤ n ≤ 26, we have 10^(n-1) < n! < 10^(n+1) and the result is accurate anyway. It seems like the algorithm is unnecessarily conservative, but we might also find that m! > 10^(n-1) is insufficient for larger n.) For large n, we do unnecessary work, but get the correct result.
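For n in the comfortable range, it should produce exactly the truncated decimal expansion; for example, I'd expect:
$ ecalc 30
2.718281828459045235360287471352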
We can fix both these problems, but this algorithm isn't especially nice anyway. Its time complexity is O(n^2). Can we do better?
Third attempt
(Spoiler alert: this doesn't go so well.)
The same google search also gives this page: http://www.hulver.com/scoop/story/2004/7/22/153549/352 which hints at an algorithm without providing it explicitly. We can adapt our first attempt for this.
Write numer(n) as a_n, so a_0 = 1 and a_n = n * a_(n-1) + 1. This gives 1/0! + 1/1! + ... + 1/n! = a_n / n!. We know that e = lim [n→∞] a_n / n!; but more than this, we can show that for any n ≥ 1, a_n / n! < e < (a_n + 1)/n!.
(Proof of this: (a_n+1) / n! = 1/0! + 1/1! + ... + 1/n! + 1/n!. This is greater than e if 1/n! > 1/(n+1)! + 1/(n+2)! + ..., which holds if 1 > 1/(n+1) (1 + 1/(n+2) (1 + ... )). For n ≥ 1, RHS is ≤ 1/2 (1 + 1/3 (1 + ... )) which we know is e-2 < 1.)
So if a_n / n! and (a_n + 1) / n! agree up to k decimal places, these must be the first k decimal places of e.
Moreover, we can extract specific decimal places while keeping to integer division: the fractional part of x/y is (x%y)/y, so the first decimal digit is int( (10*(x%y))/y ) or int( (x%y)/(y/10) ) (assuming y%10 = 0), and we can extract further digits by doing the same thing again.
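For instance, with stand-in values x = 27182818 and y = 10^7:
$ x=27182818; y=10000000
$ echo $(( 10*(x%y)/y ))
7
$ x=$(( 10*(x%y) )); echo $(( 10*(x%y)/y ))
1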
This gives us an algorithm for calculating e to n decimal places, one digit at a time:
ecalc() {
let a=1
let b=1
let d=0
let k=1
let n=$1
while (( d <= n )); do
while (( a/b != (a+1)/b || b%10 != 0 )); do
let a=k*a+1
let b*=k
let k+=1
done
echo -n $(( a / b ))
let d+=1
let a%=b
let b/=10
done
echo
}
Unfortunately, this only works up to three decimal places before we get overflows. The problem is that b only gets a new power of 10 every time k%5 = 0. Unfortunately 24!/10000 overflows, so we only get digits from k=5, 10, 15, 20. (In fact, ecalc 4 is also correct; but this seems to be just coincidence.)
We can delay the inevitable by keeping track of powers of ten explicitly: when we generate a new digit, if b%10 != 0, increment a counter and consider (10^powten * a)/b and (10^powten * (a+1))/b. This gives us a few more digits, but before long 10^powten * a overflows.
So, how to get around this? Why not just implement arbitrary-precision integers in bash?
It sounds crazy, but we don't actually need a complete implementation. The only operations we need are:
- Add one to a bigint.
- Multiply a bigint by an int.
- Divide a bigint by 10.
- Modulo a bigint by 10.
- Integer division of bigints, with small ratio.
- Modulo a bigint by another bigint, also with small ratio.
The latter two can be implemented with subtraction and comparison, so it shouldn't be too hard.
Let's represent a big integer as an array of numbers, each smaller than 2^32. Since bash can represent numbers up to 2^63 - 1, we can raise n up to 2^31 - 1 before overflows become a serious problem. As of 2010, e was only known up to about 2^40 digits, so this is an acceptable limit. But it's admittedly quite arbitrary, and there's no reason to hardcode it.
We'll want some way of taking an array of numbers and turning it into a bigint, if some of its elements are greater than MAXINT or less than 0. I don't think there's a convenient way of passing around arrays in bash, so let's use a register-based approach, and operate destructively on the variable $res. (This is for convenience: $res will be the output variable of other operations, and we expect normalisation to be the last thing they do.)
MAXINT=$(( 2**32 )) # each array element holds a value in [0, MAXINT)
normalise() {
local i
for (( i = 0; i < ${#res[@]}; i++ )); do
if (( res[i] >= MAXINT )); then
let res[i+1]+=(res[i] / MAXINT)
let res[i]=(res[i] % MAXINT)
elif (( res[i] < 0 && ${#res[@]} > i+1 )); then
let res[i+1]-=1
let res[i]+=MAXINT
fi
done
}
This doesn't handle every case; for example, a term smaller than -MAXINT will break things. But it will be sufficient for our purposes.
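A quick check, using the array (4294967300 0) to stand for the single value 2^32 + 4:
$ res=(4294967300 0)
$ normalise; echo ${res[@]}
4 1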
With this, addition and subtraction are easy. We only need addition of an int and a bigint, so will call this addi (i for integer) and operate on the variable $op1.
addi() {
res=( ${op1[@]} )
let res[0]+=$1
normalise
}
Subtraction needs to be defined between two bigints, but we only need positive results.
sub() {
local i
res=()
for (( i = 0; i < ${#op1[@]}; i++ )); do
let res[i]=op1[i]-op2[i]
done
normalise
}
Multiplication and division follow similarly. (We only need to divide a bigint by 10, but allowing an arbitrary int is no harder.)
muli() {
local i
res=(${op1[@]})
for (( i = 0; i < ${#res[@]}; i++ )); do
let res[i]*=$1
done
normalise
}
divi() {
local i
res=(${op1[@]})
for (( i = ${#res[@]}-1; i > 0; i-- )); do
let res[i-1]+="MAXINT*(res[i] % $1)"
let res[i]/=$1
done
let res[0]/=$1
normalise
}
(We note that muli might break if the multiplicand is close to 2^32: if two adjacent terms in $res are sufficiently large, normalise might cause overflows. But we're assuming the multiplicand is at most 2^31 - 1, and testing indicates that this works fine.)
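As a sanity check, dividing 2^32 (represented as the array (0 1)) by 10:
$ op1=(0 1)
$ divi 10; echo ${res[@]}
429496729 0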
For modi, even though the result is a normal integer, we'll return it in $res like a bigint. The other option would be to echo it, but then we'd need to spawn a subshell to use it. (Test this yourself: compare echo 'echo hi' | strace -f bash to echo 'echo $(echo hi)' | strace -f bash. The first doesn't fork at all, because echo is a builtin command; but the second forks a subshell to run echo hi.) Forking isn't cheating, but it seems worth avoiding.
modi() {
local i
local pow=1
let res=0
for (( i = 0; i < ${#op1[@]}; i++ )); do
# Reduce mod $1 at every step; accumulating (MAXINT % $1)**i directly
# would overflow 64 bits once the array grows past a couple dozen terms.
let res="(res + op1[i] % $1 * pow) % $1"
let pow="pow * (MAXINT % $1) % $1"
done
}
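Checking against a value whose remainder is easy to compute by hand (the array (6 1) stands for 6 + 2^32 = 4294967302):
$ op1=(6 1)
$ modi 10; echo $res
2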
For division and modulo, we need a ≤ operation; we can use its exit code for the return value. (We return 0 (true) if op1 ≤ op2, and 1 (false) otherwise.)
le() {
local i
local len=${#op1[@]}
(( len < ${#op2[@]} )) && len=${#op2[@]}
for (( i = len-1; i >= 0; i-- )); do
if (( op1[i] > op2[i] )); then
return 1
elif (( op1[i] < op2[i] )); then
return 0
fi
done
return 0
}
Finally we can implement division and modulo. We'll just define a mod operator, which can store the division result in a variable $div.
mod() {
local temp=( ${op1[@]} )
let div=0
res=( ${op1[@]} )
until le; do
let div+=1
sub
op1=( ${res[@]} )
done
op1=( ${temp[@]} )
}
So mod stores $op1 % $op2 in $res, and $op1 / $op2 in $div. Since we know $op1 / $op2 will always be less than 10, we could maybe get a slight speed improvement with a binary search, but I really doubt that's going to be a bottleneck.
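A quick single-limb check (7 = 2*3 + 1):
$ op1=(7); op2=(3)
$ mod; echo "$div ${res[@]}"
2 1
(One edge case to be aware of: since le treats equality as "done", mod stops a subtraction early when op1 is an exact multiple of op2, returning remainder op2 and a quotient one too small. I don't expect a to ever be an exact multiple of b here, but it's worth knowing if you reuse the code.)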
It would be foolish not to test these. These Haskell functions (entered into GHCI, which only accepts one-line definitions) will help:
let x = 2^32
let splitint n = if n < x then (show n) else (show (n `mod` x)) ++ " " ++ splitint (n `div` x)
let unsplit s = sum $ map (\(a,b) -> b*x^a) $ zip [0..] $ map read $ words s
splitint turns an arbitrary-precision Integer into a string that we can copy into a bash array. unsplit does the opposite, taking a space-separated list of integers and turning them into an arbitrary-precision Integer. (Haskell is good for this because it has arbitrary-precision arithmetic. I originally tried this in Python, and wasted a lot of time trying to track down a bug in my bash code before realising that Python was wrong.) So we choose a few arbitrary large numbers and verify that everything works as expected. (Unsurprisingly, I caught a few bugs when I actually tried this.)
Having implemented bigints to the extent necessary, we can hopefully extract more digits from e. Arithmetic is ugly now, so we'll split off some functions, all using variables $a and $b.
Some things to note:
- b_has_power_10 is faster than got_next_digit, and more likely to fail, so we test it first.
- To avoid repeating computations, echo_next_digit and reduce_a_b simply use the results of a mod b calculated in got_next_digit.
got_next_digit() {
op1=( ${a[@]} )
addi 1
op1=( ${res[@]} )
op2=( ${b[@]} )
mod
div1=$div
op1=( ${a[@]} )
mod
(( div1 == div ))
}
echo_next_digit() {
echo -n $div
}
b_has_power_10() {
op1=( ${b[@]} )
modi 10
(( res == 0 ))
}
increase_a_b() {
op1=( ${a[@]} )
muli $1
op1=( ${res[@]} )
addi 1
a=( ${res[@]} )
op1=( ${b[@]} )
muli $1
b=( ${res[@]} )
}
reduce_a_b() {
a=( ${res[@]} )
op1=( ${b[@]} )
divi 10
b=( ${res[@]} )
}
ecalc() {
a=(1)
b=(1)
d=0
k=1
n=$1
while (( d <= n )); do
until b_has_power_10 && got_next_digit; do
increase_a_b $k
let k+=1
done
echo_next_digit
reduce_a_b
let d+=1
done
echo
}
This still seems to act as O(n^2). Which, in hindsight, shouldn't be too surprising: arithmetic is O(log b); b grows as O(k!); and k grows to approximately 4n. (k! needs to have n powers of 5, which means n ≈ k/5 + k/25 + k/125 + ... = k/4.) Since there are O(n) arithmetic operations, we should actually find that this is O(n^2 log n) if we look close enough. That's disappointing; and the algorithm is considerably slower than the previous one (7 minutes versus 40 seconds to calculate 100 digits), which is even more disappointing.
But maybe we can still salvage something. (Or maybe we're just doubling down on a sunk cost.) The bottleneck right now is probably the powers of 10 in b. There's an easy way to see this: ask for e up to 20 places. Then take the values of a and b, put them into Haskell, and see just how many digits we'd actually generated at this point.
It turns out to be about 130. (And this only took twelve seconds, so we're beating the second attempt considerably.)
So if we can extract all these digits without needing so many powers of 10 in b, we can do a lot better. We might even be able to beat O(n^2), if k grows slower than O(n). So let's try to do that.
Fourth attempt
We can't simply multiply a by 10 every time we'd like to divide b by 10. That would break the algorithm, for one thing: we'd have to keep track of what power of 10 to multiply a by, and only use it when checking to see if we've got the next digit, not in increase_a_b. (It's okay for b because b only ever gets multiplied, so it doesn't matter whether we do that before or after dividing by 10. But when we do a = k*a + 1, it matters that we haven't already multiplied a by 10.)
That's a minor problem. More severely, our division algorithm was designed for small ratios. If we know a/b < 10, it's okay to examine a, a-b, a-2b, ... to see when we get below b. That won't work so well if a/b could be in the thousands.
Fortunately, we can improve division by using the digits we've already calculated. If we have e = 2.71828… and we haven't reduced a or b at all, then we know a / b = 2.71828…, 10a / b = 27.1828…, 100a / b = 271.828…, etc.
And we know (a-2b)/b = 0.71828, so 10(a-2b)/b = 7.1828…; and (10a - 27b)/b = .1828… so 10(10a - 27b)/b = 10(10(a - 2b) - 7b)/b = 1.828…; and so on.
In short, if we know that a/b = d_0 . d_1 d_2 d_3 … d_n …, then we can extract unknown d_(n+1) by:
let x = a
for i from 0 to n:
    x -= d_i * b
    x *= 10
d_(n+1) = x/b
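As a plain-integer sanity check (made-up values, not our bigints): take a = 2718281 and b = 1000000, with digits 2, 7, 1 already known; we should get 8 out:
$ a=2718281; b=1000000
$ x=$a
$ for d in 2 7 1; do x=$(( (x - d*b) * 10 )); done
$ echo $(( x / b ))
8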
These operations are all reasonably fast. (It is, however, O(n), which means we're not going to beat O(n^2).) So, let's try it. We'll store digits of e in an array named e_digits.
next_digit_helper() {
local i
local tmp1=( ${op1[@]} )
local tmp2=( ${op2[@]} )
local x=( ${op1[@]} )
local y=( ${op2[@]} )
for (( i = 0; i < ${#e_digits[@]}; i++ )); do
op1=( ${y[@]} )
muli ${e_digits[$i]}
op1=( ${x[@]} )
op2=( ${res[@]} )
sub
op1=( ${res[@]} )
muli 10
x=( ${res[@]} )
done
op1=( ${x[@]} )
op2=( ${y[@]} )
mod
op1=( ${tmp1[@]} )
op2=( ${tmp2[@]} )
}
got_next_digit() {
op1=( ${a[@]} )
addi 1
op1=( ${res[@]} )
op2=( ${b[@]} )
next_digit_helper
div1=$div
op1=( ${a[@]} )
next_digit_helper
(( div1 == div ))
}
found_digit() {
echo -n $div
e_digits[${#e_digits[@]}]=$div
}
ecalc() {
a=(1)
b=(1)
e_digits=()
d=0
k=1
n=$1
while (( d <= n )); do
until got_next_digit; do
increase_a_b $k
let k+=1
done
found_digit
let d+=1
done
echo
}
Now this works, but it's even slower than the last attempt. We could improve things by reducing a and b as before when possible, but that's not going to gain us much.
There is one other thing we can do, though it seems potentially unsafe. There's a lot of repeated work involved in figuring out how many digits we've accurately calculated. If we guess in advance how high we need to take k, we can save ourselves a lot of work.
Fifth attempt
Recall that we have n digits if a_k/k! and (a_k + 1)/k! agree up to n decimal places. The difference between these is 1/k!. If k! > 10^(n+1), then the decimal expansion of 1/k! will start with at least n+1 zeros. The only way a_k/k! and (a_k + 1)/k! could disagree at or before the nth decimal place is if the digits of a_k/k! in positions n+1, n+2, ..., log10(k!) are all 9. If additionally a_k/k! disagrees with e at the nth decimal place, it follows that the digits of e in positions n+1, n+2, ..., log10(k!) are all 0.
e is conjectured to be normal, which would mean that any arbitrarily long string of 0s can be found in its decimal expansion. But the probability of l 0s starting at a given position is 10^-l; so if we ask for, say, 100 digits and take k up to 100, then since log_10(100!) = 157, the probability of getting a digit incorrect is 10^-57, which I feel is acceptable. So let's ignore the problem for now.
extract_many_digits() {
let d=0
op1=( ${a[@]} )
op2=( ${b[@]} )
while (( d <= $1 )); do
mod
echo -n $div
op1=( ${res[@]} )
muli 10
op1=( ${res[@]} )
let d+=1
done
}
ecalc() {
a=(1)
b=(1)
k=1
n=$1
while (( k <= n )); do
increase_a_b $k
let k+=1
done
extract_many_digits $n
echo
}
It's a lot faster, but still slightly slower than the second attempt. (And has the same caveat that as written, it only works for sufficiently large n.)
I've now run out of ideas, which is a bit anticlimactic. (At least, I've run out of ideas that I think will do any good. We could get a more accurate bound on how high to take k; but that could be applied to our second attempt as well. We could reduce a as we go along; but that would make increase_a_b slower, probably gain us very little overall, and certainly not improve on O(n^2).)
So let's return to the second attempt.
Second attempt, take two
Recall that this is the algorithm we're using:
ecalc() {
let n=$1
echo -n 2.
for (( j = n; j >= 2; j-- )); do
coef[j]=1
done
for (( i = 1; i <= n; i++ )); do
let carry=0
for (( j = n; j >= 2; j-- )); do
let temp=coef[j]*10+carry
let carry=temp/j
let coef[j]=temp-carry*j
done
echo -n $carry
done
echo
}
It seems foolish to rely on an algorithm without understanding it, so how does it work? The paper doesn't make it entirely clear, but what's going on is this:
We approximate e as 2 + 1/2! + 1/3! + 1/4! + ... + 1/m!, where in our implementation m=n. Rewrite this as 2 + 1/2 (1 + 1/3 (1 + ... 1/(n-1) (1 + 1/n) ... )).
This in turn is 2 + 1/10 (1/2 (10 + 1/3 (10 + ... 1/(n-1) (10 + 1/n (10)) ... ))). Some of the 10/k terms are greater than 1; we refactor so that, for example, 1/2 (10 + 1/3 (10 + ...)) becomes 1/2 (13 + 1/3 (1 + ...)) and then 6 + 1/2 (1 + 1/3 (1 + ...)). Eventually we have e = 2 + 1/10 (c + 1/2 (c_2 + 1/3 (c_3 + ... 1/(n-1) (c_(n-1) + 1/n (c_n)) ... ))) where each c_k < k. It follows that the 1/2 (c_2 + ... ) term is less than 1, so c must be 7, the first decimal digit of e.
Rewriting again, e = 2.7 + 1/100 (1/2 (10c_2 + 1/3 (10c_3 + ... 1/(n-1) (10c_(n-1) + 1/n (10c_n)) ... ))). We apply the same procedure to get the second digit of e, and so on.
The algorithm's coef[j] takes the place of these c_j. To get each digit, we recalculate the c_j in one pass starting from the right; the digit is whatever term is left over outside of the 1/2 (...).
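(To make that concrete, here's the first pass with m=4, so coef[2..4] all start at 1: j=4 gives temp=10, carry=2, coef[4]=2; j=3 gives temp=12, carry=4, coef[3]=0; j=2 gives temp=14, carry=7, coef[2]=0; and the 7 that pops out is indeed the first decimal digit of 2 + 1/2(1 + 1/3(1 + 1/4)) ≈ 2.708.)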
It's worth noting that this algorithm has the same probability of making mistakes as our fifth attempt. So it's probably worth thinking about this probability in more detail.
However, we note that if the calculated sequence of digits ends with d 9s, then only the final d+1 digits have a chance of being incorrect. If they are, then the 9s should all be 0s, and the one preceding them should be one higher. This is reassuring; it permits us to say "if you really can't tolerate the slightest inaccuracy, then ask for more digits than you need, and throw away the final ones as appropriate".
(We could automate the process, but that would require us to guess a new n, then run the algorithm again from scratch. If we want that behaviour, we really ought to write it as a separate program. In this sense, the fifth attempt was better, because you could store intermediate results and use them to get a head start on later ones.)
I also note that I've spent too much time on this already, and that the implementation as it stands chooses m larger than necessary (except for small n, which we shall soon fix), massively reducing the probability of an error. (For n>50 or so, the probability of an error is smaller than the probability of winning the lottery; if an error does appear, it seems more likely to be from a mistake somewhere else.) And whatever the actual probability of error is, it was originally small enough that the authors of the paper didn't notice, and after cursory examination I haven't been able to find any instances where the original algorithm made a mistake. (Digit 327 looked promising, being followed by three 0s; but it turned out okay in the end.)
So while I'd like to go into more depth on this issue, I won't do so at this point.
It remains to fix the algorithm for small n. We simply calculate to at least 22 decimal places' worth of precision. This is a little slower than necessary, but the small-n case hardly seems worth optimising.
ecalc() {
let n=$1
let m=n
(( n < 22 )) && m=22
echo -n 2.
for (( j = m; j >= 2; j-- )); do
coef[j]=1
done
for (( i = 1; i <= n; i++ )); do
let carry=0
for (( j = m; j >= 2; j-- )); do
let temp=coef[j]*10+carry
let carry=temp/j
let coef[j]=temp-carry*j
done
echo -n $carry
done
echo
}
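I'd expect small inputs to behave now:
$ ecalc 5
2.71828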
We could at this point try to calculate the value of m actually needed. We could even use our arbitrary-precision arithmetic to do it; we haven't implemented logarithms, but we can get an upper bound using the inequality log(sum(ai * (2^32)^i)) < sum(log(ai) + i*log(2^32)).
But this would impose O(n^2 log n) startup cost, so is decidedly not a good tradeoff. There may well be better ways to approximate log_10(m!), but again, I've spent too much time on this.
This has been interesting to write, even if more than half of it turns out to be useless.
Posted on 16 June 2012
|
Comments
(Fri, 6 Dec 2013: Importing this post from its original home as a gist.)
The recent post on Hacker News about #! semantics surprised me. I had always assumed that a shebang line like
#! /usr/bin/prog -a -b
would be equivalent to calling
$ /usr/bin/prog -a -b <file>
- but instead, it's
$ /usr/bin/prog '-a -b' <file>
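An easy way to see this single-argument behaviour directly (at least on Linux with GNU echo) is to abuse echo as an interpreter. If the shebang arguments were split, the -n would be swallowed as an option; instead the whole string arrives as one argument and gets printed literally:
$ cat /tmp/demo
#! /bin/echo -n hello
$ chmod +x /tmp/demo; /tmp/demo
-n hello /tmp/demo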
This could only surprise me because I hadn't found out the hard way, so maybe it's not a big deal. Most scripts that I write don't have even a single argument on the shebang line. Most of the rest are Perl scripts, and perl is clever when it comes to parsing a single argument that looks like multiple arguments:
$ echo hi there | perl '-a -n -l -e print $F[0]'
hi
But this behaviour does have consequences, especially with the use of higher-order commands such as sudo, nice and env. For example, the following shebang lines will not work as intended:
#! /usr/bin/sudo -u phil sh
#! /usr/bin/nice -n 19 sh
#! /usr/bin/env perl -n
(Scripts using sudo and nice in a shebang seem unlikely to be distributed, but might find use in site-local maintenance scripts. env can be used to make a script more portable, in case a program isn't in a consistent location across systems.)
So I got to thinking about a program that would act like env for this purpose, but splitting its arguments on whitespace, or even doing full shell-like parsing of quotes.
Of course, such a program already exists: its name is shell. sh accepts the -c option to pass a shell expression on the command line. If this expression comes from a shebang line, word-splitting will be performed just like when typing directly into a shell. As a bonus (arguably), you even get to use things like pipelines, output redirection, shell built-in commands, and forking to the background, all in the shebang line of a script.
There is one downside: normally with a shebang line you can think of the script name and any arguments as being implicitly appended. This no longer holds: sh -c takes an expression, not a program name, and expressions don't take arguments in the same way that programs do. Instead you need to access these arguments through shell variables $0 through $9, $* and $@.
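For example:
$ sh -c 'echo $0: $@' myname a b
myname: a b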
Alas, my first tests failed. It seems that -c requires its argument to be, well, a separate argument, so it's not much use with a shebang. (This is the case when sh is linked to Bash. Perhaps other shells are different, but if it doesn't work in Bash's sh-emulation mode, it probably can't be considered portable.)
So I went ahead and wrote a small script to get this behaviour. I even improved on what I could have done with sh -c: by default the script name and arguments are implicitly appended, but passing -c at the start of the first argument disables this.
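For illustration, the core idea might look something like the following sketch. (This is my reconstruction, not the actual contents of the repository linked at the end; the real script is more careful about edge cases.)
#! /bin/bash
# The kernel invokes us as: cmd '<everything after the interpreter>' <script> <args...>
expr=$1
shift
if [[ $expr == "-c "* ]]; then
# -c mode: run the expression as-is; it refers to the script and its
# arguments as "$0" and "$@" itself.
exec sh -c "${expr#-c }" "$@"
else
# default mode: implicitly append the script name and arguments.
exec sh -c "$expr \"\$@\"" sh "$@"
fi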
I named this script /usr/bin/cmd, so for example the following shebang lines are now possible, and do what you would expect:
#! /usr/bin/cmd sudo -u phil sh
#! /usr/bin/cmd nice -n 19 sh
#! /usr/bin/cmd perl -n
But you can also do things like
#! /usr/bin/cmd grep -v '^#' | perl
to strip comments from the input before you process it. Or perhaps
#! /usr/bin/cmd -c perl "$0" | xargs grep "$@"
to generate a list of filenames in perl and search them for a pattern given on the command line. On a similar note,
#! /usr/bin/cmd -c perl "$0" "$@" | xgraph -nl -m &
might save having a separate file just to pipe the output of a perl script into xgraph.
I have a lisp program which expects an S-expression as input, but I can use
#! /usr/bin/cmd (echo \(&&perl -lne'chomp,print"($_)"'&&echo \)) | sbcl --script
and now it expects plain text. (It could be cleaner, but I wanted to keep it less than 80 characters, and semicolons don't interact well with parentheses and emacs' paredit-mode. This example is probably a little more extreme than is sensible.)
There are also some pitfalls to cmd. If you have a system which does split shebang lines, I think normal behaviour will still work, but anything fancy - any use of -c, or shell processing - will fail. I don't think it would even be possible to port, unless there's some way to tell where the shebang arguments stop and the command-line arguments begin. (You could work this out in most cases by checking which argument names an executable file with cmd on the shebang, but that seems fragile.)
You need to be careful in -c mode to quote properly. Otherwise it will seem to work, but break mysteriously when an argument contains a literal space or wildcard. I think "$0" and "$@" are the constructs you want in almost all cases: everything else I've tried fails, and I haven't found anywhere that these fail. (Perhaps an option to cmd which would cause it to replace % or something with "$0" "$@" would be a good idea.)
If you want to be portable, you also need to worry about the length of your shebang lines. A limit of 127 bytes seems to be accepted on all modern systems, but I'll admit that I don't recognise (except perhaps in passing) many of the names in the originally linked article. (But if you want to be portable, you also want to wait until cmd is installed as standard on most systems. This might take a while.)
One pitfall that seems to have been avoided: I was worried that perl (which performs its own parsing of the shebang line) would be too clever for cmd, and recognise switches intended for programs later in a pipeline. This doesn't seem to happen:
#!/usr/bin/cmd -c perl "$0"; echo -l
print "hi";
produces output "hi-l", as intended. But I don't know exactly what rules perl uses, so there may be edge cases.
And I'm sure there are problems that I haven't anticipated. Which is the worst kind of problem.
Ultimately, I'm not sure how much use cmd will be. But until a few days ago, I don't think I'd ever thought about using a shell script in a shebang line, so I guess there's territory yet to explore. I'd be interested in hearing more ideas on the subject.
If you're interested, cmd can be found at http://github.com/ChickenProp/cmd.
Posted on 27 July 2010
|
Comments