Calculating e in bash

(Fri, 6 Dec 2013: Importing this post from its original home as a gist.)

This post is an attempt to calculate e to a given precision in bash, a challenge given in a job listing that I saw recently. I kind of got nerd sniped. I wrote this as I went along, so there may be inconsistencies.

First attempt

The obvious method to compute e is as the infinite sum of 1/n!, n from 0 to ∞. This converges quickly, but how far do we have to calculate to get the n'th digit of e? We can deal with that later.

We obviously need a factorial function.

fac() {
    if [[ $1 -eq 0 ]]; then
        echo 1
    else
        echo $(( $1 * `fac $(($1 - 1))` ))
    fi
}

Since we only have integer division, we obviously can't calculate 1/2!. But note that x/a! + y/(a+1)! = (x(a+1) + y)/(a+1)!. We can use this to recursively calculate (sum 1/k! [0 ≤ k ≤ n]) as numer(n)/n!, where numer(0) = 1 and numer(k) = k*numer(k-1) + 1.
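
(For example, numer(1) = 1·1 + 1 = 2 and numer(2) = 2·2 + 1 = 5; and indeed 1/0! + 1/1! + 1/2! = 5/2.)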

numer() {
    if [[ $1 -eq 0 ]]; then
        echo 1
    else
        echo $(( $1 * `numer $(($1 - 1))` + 1 ))
    fi
}

So now we can calculate the partial sums. Since we still only have integer division, we need to multiply them by a power of 10 to get rid of the decimal point.

nthsum() {
    echo $(( 10**($1-1) * `numer $1` / `fac $1` ))
}

Note that this fails for n=0 (10**-1 isn't allowed), but we can live with that.

So this kind of works:

$ for i in `seq 1 15`; do nthsum $i; done
2
25
266
2708
27166
271805
2718253
27182787
271828152
2718281801
27182818261
2252447557
-1174490000
104582974
1946803

Up to n=11, we accurately calculate the first (n-3) digits of e. For n=12 and above, we get integer overflows.
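
The overflows are just bash's 64-bit signed arithmetic silently wrapping around: at n=12, the intermediate product 10**11 * numer(12) is around 1.3 * 10^20, comfortably past 2^63 - 1. You can watch the wraparound directly:

$ echo $(( 2**63 - 1 ))
9223372036854775807
$ echo $(( 2**63 ))
-9223372036854775808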

It doesn't look like we can go very far with this: the numbers we're working with are simply too large.

Second attempt

If you google "algorithm to calculate a specific digit of e", this paper comes up: http://eprints.utas.edu.au/121/1/Calculation_of_e.pdf. It provides a simple algorithm using (mostly) integer arithmetic, implemented in ALGOL. It's simple enough to translate into bash:

ecalc() {
    let n=$1
    echo -n 2.

    for (( j = n; j >= 2; j-- )); do
        coef[j]=1
    done

    for (( i = 1; i <= n; i++ )); do
        let carry=0
        for (( j = n; j >= 2; j-- )); do
            let temp=coef[j]*10+carry
            let carry=temp/j
            let coef[j]=temp-carry*j
        done
        echo -n $carry
    done
    echo
}

This translation isn't quite faithful: the original algorithm calculates m such that m! > 10^(n+1), and the loops over j go from m to 2 instead of n to 2. This means the algorithm is inaccurate for small n. (For n ≥ 27, n! > 10^(n+1) so it works out okay; for 22 ≤ n ≤ 26, we have 10^(n-1) < n! < 10^(n+1) and the result is accurate anyway. It seems like the algorithm is unnecessarily conservative, but we might also find that m! > 10^(n-1) is insufficient for larger n.) For large n, we do unnecessary work, but get the correct result.
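
To see how conservative this is, a quick GHCi one-liner finds the smallest m with m! > 10^(n+1) (minm is my name for it, not the paper's):

let minm n = head [m | m <- [1..], product [1..m] > 10^(n+1)]

For instance, minm 100 is 71, so at n=100 the loops run from 100 down to 2 where 71 would do.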

We can fix both these problems, but this algorithm isn't especially nice anyway. Its time complexity is O(n^2). Can we do better?

Third attempt

(Spoiler alert: this doesn't go so well.)

The same google search also gives this page: http://www.hulver.com/scoop/story/2004/7/22/153549/352 which hints at an algorithm without providing it explicitly. We can adapt our first attempt for this.

Write numer(n) as a_n, so a_0 = 1 and a_n = n * a_(n-1) + 1. This gives 1/0! + 1/1! + … + 1/n! = a_n / n!. We know that e = lim [n→∞] a_n / n!; but more than this, we can show that for any n ≥ 1, a_n / n! < e < (a_n + 1)/n!.

(Proof of this: (a_n+1) / n! = 1/0! + 1/1! + … + 1/n! + 1/n!. This is greater than e if 1/n! > 1/(n+1)! + 1/(n+2)! + …, which holds if 1 > 1/(n+1) (1 + 1/(n+2) (1 + … )). For n ≥ 1, RHS is ≤ 1/2 (1 + 1/3 (1 + … )) which we know is e-2 < 1.)

So if a_n / n! and (a_n + 1) / n! agree up to k decimal places, these must be the first k decimal places of e.

Moreover, we can extract specific decimal places while keeping to integer division: the fractional part of x/y is (x%y)/y, so the first decimal digit is int( (10*(x%y))/y ) or int( (x%y)/(y/10) ) (assuming y%10 = 0), and we can extract further digits by doing the same thing again.
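
For example, pulling digits of 22/7 = 3.142857… this way:

$ x=22; y=7
$ echo $(( x / y )); x=$(( (x % y) * 10 ))
3
$ echo $(( x / y )); x=$(( (x % y) * 10 ))
1
$ echo $(( x / y )); x=$(( (x % y) * 10 ))
4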

This gives us an algorithm for calculating e to n decimal places, one digit at a time:

ecalc() {
    let a=1
    let b=1
    let d=0
    let k=1
    let n=$1

    while (( d <= n )); do
        while (( a/b != (a+1)/b || b%10 != 0 )); do
            let a=k*a+1
            let b*=k
            let k+=1
        done

        echo -n $(( a / b ))
        let d+=1
        let a%=b
        let b/=10
    done

    echo
}

Unfortunately, this only works up to three decimal places before we get overflows. The problem is that b only gets a new power of 10 every time k%5 = 0. Unfortunately 24!/10000 overflows, so we only get digits from k=5, 10, 15, 20. (In fact, ecalc 4 is also correct; but this seems to be just coincidence.)

We can delay the inevitable by keeping track of powers of ten explicitly: when we generate a new digit, if b%10 != 0, increment a counter and consider (10^powten * a)/b and (10^powten * (a+1))/b. This gives us a few more digits, but before long 10^powten * a overflows.

So, how to get around this? Why not just implement arbitrary-precision integers in bash?

It sounds crazy, but we don't actually need a complete implementation. The only operations we need are:

- adding a small integer to a bigint;
- subtracting one bigint from another;
- multiplying a bigint by a small integer;
- dividing a bigint by a small integer, and taking a bigint modulo a small integer;
- dividing one bigint by another;
- taking one bigint modulo another.

The latter two can be implemented with subtraction and comparison, so it shouldn't be too hard.

Let's represent a big integer as an array of numbers, each smaller than 2^32. Since bash can represent numbers up to 2^63 - 1, we can raise n up to 2^31 - 1 before overflows become a serious problem. As of 2010, e was only known up to about 2^40 digits, so this is an acceptable limit. But it's admittedly quite arbitrary, and there's no reason to hardcode it.

let MAXINT=2**32

We'll want some way of taking an array of numbers and turning it into a bigint, if some of its elements are greater than MAXINT or less than 0. I don't think there's a convenient way of passing around arrays in bash, so let's use a register-based approach, and operate destructively on the variable $res. (This is for convenience: $res will be the output variable of other operations, and we expect normalisation to be the last thing they do.)

normalise() {
    local i

    for (( i = 0; i < ${#res[@]}; i++ )); do
        if (( res[i] >= MAXINT )); then
            let res[i+1]+=(res[i] / MAXINT)
            let res[i]=(res[i] % MAXINT)
        elif (( res[i] < 0 && ${#res[@]} > i+1 )); then
            let res[i+1]-=1
            let res[i]+=MAXINT
        fi
    done
}

This doesn't handle every case; for example, a term smaller than -MAXINT will break things. But it will be sufficient for our purposes.
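
As a quick sanity check (remember that arrays hold the least significant term first):

$ res=(4294967296 0); normalise; echo ${res[@]}
0 1
$ res=(-3 2); normalise; echo ${res[@]}
4294967293 1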

With this, addition and subtraction are easy. We only need addition of an int and a bigint, so we'll call this addi (i for integer) and operate on the variable $op1.

addi() {
    res=( ${op1[@]} )
    let res[0]+=$1
    normalise
}

Subtraction needs to be defined between two bigints, but we only need positive results.

sub() {
    local i
    res=()
    for (( i = 0; i < ${#op1[@]}; i++ )); do
        let res[i]=op1[i]-op2[i]
    done
    normalise
}

Multiplication and division follow similarly. (We only need to divide a bigint by 10, but allowing an arbitrary int is no harder.)

muli() {
    local i
    res=(${op1[@]})
    for (( i = 0; i < ${#res[@]}; i++ )); do
        let res[i]*=$1
    done
    normalise
}

divi() {
    local i
    res=(${op1[@]})
    for (( i = ${#res[@]}-1; i > 0; i-- )); do
        let res[i-1]+="MAXINT*(res[i] % $1)"
        let res[i]/=$1
    done
    let res[0]/=$1
    normalise
}

(We note that muli might break if the multiplier is close to 2^32: if two adjacent terms in $res are sufficiently large, normalise might cause overflows. But we're assuming the multiplier is at most 2^31 - 1, and testing indicates that this works fine.)

For modi, even though the result is a normal integer, we'll return it in $res like a bigint. The other option would be to echo it, but then we'd need to spawn a subshell to use it. (Test this yourself: compare echo 'echo hi' | strace -f bash to echo 'echo $(echo hi)' | strace -f bash. The first doesn't fork at all, because echo is a builtin command; but the second forks a subshell to run echo hi.) Forking isn't cheating, but it seems worth avoiding.

modi() {
    local i
    let res=0
    for (( i = 0; i < ${#op1[@]}; i++ )); do
        let res+="${op1[i]}%$1 * (MAXINT%$1)**i"
    done
    let res%=$1
}

For division and modulo, we need a ≤ operation; we can use its exit code for the return value. (We return 0 (true) if op1 ≤ op2, and 1 (false) otherwise.)

le() {
    local i
    local len=${#op1[@]}
    (( len < ${#op2[@]} )) && len=${#op2[@]}

    for (( i = len-1; i >= 0; i-- )); do
        if (( op1[i] > op2[i] )); then
            return 1
        elif (( op1[i] < op2[i] )); then
            return 0
        fi
    done
    return 0
}

Finally we can implement division and modulo. We'll just define a mod operator, which can store the division result in a variable $div.

mod() {
    local temp=( ${op1[@]} )
    let div=0
    res=( ${op1[@]} )

    until le; do
        let div+=1
        sub
        op1=( ${res[@]} )
    done

    op1=( ${temp[@]} )
}

So mod stores $op1 % $op2 in $res, and $op1 / $op2 in $div. Since we know $op1 / $op2 will always be less than 10, we could maybe get a slight speed improvement with a binary search, but I really doubt that's going to be a bottleneck.
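
A quick check with small numbers:

$ op1=(27); op2=(10); mod
$ echo $div: ${res[@]}
2: 7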

It would be foolish not to test these. These Haskell functions (entered into GHCi, which only accepts one-line definitions) will help:

let x = 2^32
let splitint n = if n < x then (show n) else (show (n `mod` x)) ++ " " ++ splitint (n `div` x)
let unsplit s = sum $ map (\(a,b) -> b*x^a) $ zip [0..] $ map read $ words s

splitint turns an arbitrary-precision Integer into a string that we can copy into a bash array. unsplit does the opposite, taking a space-separated list of integers and turning them into an arbitrary-precision Integer. (Haskell is good for this because it has arbitrary-precision arithmetic. I originally tried this in Python, and wasted a lot of time trying to track down a bug in my bash code before realising that Python was wrong.) So we choose a few arbitrary large numbers and verify that everything works as expected. (Unsurprisingly, I caught a few bugs when I actually tried this.)
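
For example, a round trip through muli: in GHCi, splitint (2^100) gives "0 0 0 16" and splitint (3 * 2^100) gives "0 0 0 48"; correspondingly, in bash:

$ op1=(0 0 0 16); muli 3; echo ${res[@]}
0 0 0 48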

Having implemented bigints to the extent necessary, we can hopefully extract more digits from e. Arithmetic is ugly now, so we'll split off some functions, all using variables $a and $b.

Some things to note:

- these functions communicate through globals: got_next_digit leaves the candidate digit in $div (which echo_next_digit prints) and a % b in $res (which reduce_a_b picks up as the new a);
- got_next_digit and b_has_power_10 report success via their exit codes, so they can be used directly as loop conditions.

got_next_digit() {
    op1=( ${a[@]} )
    addi 1
    op1=( ${res[@]} )

    op2=( ${b[@]} )
    mod
    div1=$div

    op1=( ${a[@]} )
    mod
    (( div1 == div ))
}

echo_next_digit() {
    echo -n $div
}

b_has_power_10() {
    op1=( ${b[@]} )
    modi 10
    (( res == 0 ))
}

increase_a_b() {
    op1=( ${a[@]} )
    muli $1
    op1=( ${res[@]} )
    addi 1
    a=( ${res[@]} )

    op1=( ${b[@]} )
    muli $1
    b=( ${res[@]} )
}

reduce_a_b() {
    a=( ${res[@]} )

    op1=( ${b[@]} )
    divi 10
    b=( ${res[@]} )
}

ecalc() {
    a=(1)
    b=(1)
    d=0
    k=1
    n=$1

    while (( d <= n )); do
        until b_has_power_10 && got_next_digit; do
            increase_a_b $k
            let k+=1
        done

        echo_next_digit
        reduce_a_b
        let d+=1
    done

    echo
}

This still seems to act as O(n^2). Which, in hindsight, shouldn't be too surprising: arithmetic is O(log b); b grows as O(k!); and k grows to approximately 4n. (k! needs to have n powers of 5, which means n ≈ k/5 + k/25 + k/125 + … = k/4.) Since there are O(n) arithmetic operations, we should actually find that this is O(n^2 log n) if we look close enough. That's disappointing; and the algorithm is considerably slower than the previous one (7 minutes versus 40 seconds to calculate 100 digits), which is even more disappointing.

But maybe we can still salvage something. (Or maybe we're just doubling down on a sunk cost.) The bottleneck right now is probably the powers of 10 in b. There's an easy way to see this: ask for e up to 20 places. Then take the values of a and b, put them into Haskell, and see just how many digits we'd actually generated at this point.
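
One way to do that check, using unsplit from above (agreed is just a throwaway name, and the 10^300 is just an arbitrary amount of headroom):

let agreed a b = length $ takeWhile id $ zipWith (==) (show (a * 10^300 `div` b)) (show ((a+1) * 10^300 `div` b))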

It turns out to be about 130. (And this only took twelve seconds, so we're beating the second attempt considerably.)

So if we can extract all these digits without needing so many powers of 10 in b, we can do a lot better. We might even be able to beat O(n^2), if k grows slower than O(n). So let's try to do that.

Fourth attempt

We can't simply multiply a by 10 every time we'd like to divide b by 10. That would break the algorithm, for one thing: we'd have to keep track of what power of 10 to multiply a by, and only use it when checking to see if we've got the next digit, not in increase_a_b. (It's okay for b because b only ever gets multiplied, so it doesn't matter whether we do that before or after dividing by 10. But when we do a = k*a + 1, it matters that we haven't already multiplied a by 10.)

That's a minor problem. More severely, our division algorithm was designed for small ratios. If we know a/b < 10, it's okay to examine a, a-b, a-2b, … to see when we get below b. That won't work so well if a/b could be in the thousands.

Fortunately, we can improve division by using the digits we've already calculated. If we have e = 2.71828… and we haven't reduced a or b at all, then we know a / b = 2.71828…, 10a / b = 27.1828…, 100a / b = 271.828…, etc.

And we know (a-2b)/b = 0.71828…, so 10(a-2b)/b = 7.1828…; and (10a - 27b)/b = 0.1828…, so 10(10a - 27b)/b = 10(10(a - 2b) - 7b)/b = 1.828…; and so on.

In short, if we know that a/b = d_0 . d_1 d_2 d_3 … d_n …, then we can extract unknown d_(n+1) by:

let x = a
for i from 0 to n:
    x -= d_i * b
    x *= 10
d_(n+1) = x/b

These operations are all reasonably fast. (It is, however, O(n), which means we're not going to beat O(n^2).) So, let's try it. We'll store digits of e in an array named e_digits.

next_digit_helper() {
    local i
    local tmp1=( ${op1[@]} )
    local tmp2=( ${op2[@]} )
    local x=( ${op1[@]} )
    local y=( ${op2[@]} )

    for (( i = 0; i < ${#e_digits[@]}; i++ )); do
        op1=( ${y[@]} )
        muli ${e_digits[$i]}
        op1=( ${x[@]} )
        op2=( ${res[@]} )
        sub
        op1=( ${res[@]} )
        muli 10
        x=( ${res[@]} )
    done

    op1=( ${x[@]} )
    op2=( ${y[@]} )
    mod

    op1=( ${tmp1[@]} )
    op2=( ${tmp2[@]} )
}

got_next_digit() {
    op1=( ${a[@]} )
    addi 1
    op1=( ${res[@]} )
    op2=( ${b[@]} )

    next_digit_helper
    div1=$div

    op1=( ${a[@]} )
    next_digit_helper

    (( div1 == div ))
}

found_digit() {
    echo -n $div
    e_digits[${#e_digits[@]}]=$div
}

ecalc() {
    a=(1)
    b=(1)
    e_digits=()
    d=0
    k=1
    n=$1

    while (( d <= n )); do
        until got_next_digit; do
            increase_a_b $k
            let k+=1
        done

        found_digit
        let d+=1
    done

    echo
}

Now this works, but it's even slower than the last attempt. We could improve things by reducing a and b as before when possible, but that's not going to gain us much.

There is one other thing we can do, though it seems potentially unsafe. There's a lot of repeated work involved in figuring out how many digits we've accurately calculated. If we guess in advance how high we need to take k, we can save ourselves a lot of work.

Fifth attempt

Recall that we have n digits if a_k/k! and (a_k+1)/k! agree up to n decimal places. The difference between these is 1/k!. If k! > 10^(n+1), then the decimal expansion of 1/k! will start with at least n+1 zeros. The only way a_k/k! and (a_k+1)/k! could disagree at or before the nth decimal place is if the digits of a_k/k! in positions n+1, n+2, … log_10(k!) are all 9. If additionally a_k/k! disagrees with e at the nth decimal place, it follows that the digits of e in positions n+1, n+2, …, log_10(k!) are all 0.

e is conjectured to be normal, which would mean that any arbitrarily long string of 0s can be found in its decimal expansion. But the probability of l 0s starting at a given position is 10^-l; so if we ask for, say, 100 digits and take k up to 100, then since log_10(100!) = 157, the probability of getting a digit incorrect is 10^-57, which I feel is acceptable. So let's ignore the problem for now.

extract_many_digits() {
    let d=0
    op1=( ${a[@]} )
    op2=( ${b[@]} )
    while (( d <= $1 )); do
        mod
        echo -n $div
        op1=( ${res[@]} )
        muli 10
        op1=( ${res[@]} )
        let d+=1
    done
}

ecalc() {
    a=(1)
    b=(1)
    k=1
    n=$1

    while (( k <= n )); do
        increase_a_b $k
        let k+=1
    done

    extract_many_digits $n
    echo
}

It's a lot faster, but still slightly slower than the second attempt. (And has the same caveat that as written, it only works for sufficiently large n.)

I've now run out of ideas, which is a bit anticlimactic. (At least, I've run out of ideas that I think will do any good. We could get a more accurate bound on how high to take k; but that could be applied to our second attempt as well. We could reduce a as we go along; but that would make increase_a_b slower, probably gain us very little overall, and certainly not improve on O(n^2).)

So let's return to the second attempt.

Second attempt, take two

Recall that this is the algorithm we're using:

ecalc() {
    let n=$1
    echo -n 2.

    for (( j = n; j >= 2; j-- )); do
        coef[j]=1
    done

    for (( i = 1; i <= n; i++ )); do
        let carry=0
        for (( j = n; j >= 2; j-- )); do
            let temp=coef[j]*10+carry
            let carry=temp/j
            let coef[j]=temp-carry*j
        done
        echo -n $carry
    done
    echo
}

It seems foolish to rely on an algorithm without understanding it, so how does it work? The paper doesn't make it entirely clear, but what's going on is this:

We approximate e as 2 + 1/2! + 1/3! + 1/4! + … + 1/m!, where in our implementation m=n. Rewrite this as 2 + 1/2 (1 + 1/3 (1 + … 1/(n-1) (1 + 1/n) … )).

This in turn is 2 + 1/10 (1/2 (10 + 1/3 (10 + … 1/(n-1) (10 + 1/n (10)) … ))). Some of the 10/k terms are greater than 1; we refactor so that, for example, 1/2 (10 + 1/3 (10 + …)) becomes 1/2 (13 + 1/3 (1 + …)) and then 6 + 1/2 (1 + 1/3 (1 + …)). Eventually we have e = 2 + 1/10 (c + 1/2 (c_2 + 1/3 (c_3 + … 1/(n-1) (c_(n-1) + 1/n (c_n)) … ))) where each c_k < k. It follows that the 1/2 (c_2 + … ) term is less than 1, so c must be 7, the first decimal digit of e.

Rewriting again, e = 2.7 + 1/100 (1/2 (10c_2 + 1/3 (10c_3 + … 1/(n-1) (10c_(n-1) + 1/n (10c_n)) … ))). We apply the same procedure to get the second digit of e, and so on.

The algorithm's coef[j] takes the place of these c_j. To get each digit, we recalculate the c_j in one pass starting from the right; the digit is whatever term is left over outside of the 1/2 (…).
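
To see it in action, take m = 4, approximating e as 2 + 1/2! + 1/3! + 1/4! ≈ 2.7083. The coefficients start as coef[2] = coef[3] = coef[4] = 1, and the first pass goes:

j=4: temp = 1*10 + 0 = 10, carry = 10/4 = 2, coef[4] = 10 - 2*4 = 2
j=3: temp = 1*10 + 2 = 12, carry = 12/3 = 4, coef[3] = 12 - 4*3 = 0
j=2: temp = 1*10 + 4 = 14, carry = 14/2 = 7, coef[2] = 14 - 7*2 = 0

The final carry, 7, is the first decimal digit. The next two passes produce 0 and 8, matching 2.708…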

It's worth noting that this algorithm has the same probability of making mistakes as our fifth attempt. So it's probably worth thinking about this probability in more detail.

However, we note that if the calculated sequence of digits ends with d 9s, then only the final d+1 digits have a chance of being incorrect. If they are, then the 9s should all be 0s, and the one preceding them should be one higher. This is reassuring; it permits us to say "if you really can't tolerate the slightest inaccuracy, then ask for more digits than you need, and throw away the final ones as appropriate".

(We could automate the process, but that would require us to guess a new n, then run the algorithm again from scratch. If we want that behaviour, we really ought to write it as a separate program. In this sense, the fifth attempt was better, because you could store intermediate results and use them to get a head start on later ones.)

I also note that I've spent too much time on this already, and that the implementation as it stands chooses m larger than necessary (except for small n, which we shall soon fix), massively reducing the probability of an error. (For n>50 or so, the probability of an error is smaller than the probability of winning the lottery; if an error does appear, it seems more likely to come from a mistake somewhere else.) And whatever the actual probability of error is, it was originally small enough that the authors of the paper didn't notice, and after cursory examination I haven't been able to find any instances where the original algorithm made a mistake. (Digit 327 looked promising, being followed by three 0s; but it turned out okay in the end.)

So while I'd like to go into more depth on this issue, I won't do so at this point.

It remains to fix the algorithm for small n. We simply calculate to at least 22 decimal places' worth of precision. This is a little slower than necessary, but the small-n case hardly seems worth optimising.

ecalc() {
    let n=$1
    let m=n
    (( n < 22 )) && m=22

    echo -n 2.

    for (( j = m; j >= 2; j-- )); do
        coef[j]=1
    done

    for (( i = 1; i <= n; i++ )); do
        let carry=0
        for (( j = m; j >= 2; j-- )); do
            let temp=coef[j]*10+carry
            let carry=temp/j
            let coef[j]=temp-carry*j
        done
        echo -n $carry
    done
    echo
}
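
For example:

$ ecalc 30
2.718281828459045235360287471352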

We could at this point try to calculate the value of m actually needed. We could even use our arbitrary-precision arithmetic to do it; we haven't implemented logarithms, but we can get an upper bound using the inequality log(sum(a_i * (2^32)^i)) < sum(log(a_i) + i*log(2^32)).

But this would impose O(n^2 log n) startup cost, so is decidedly not a good tradeoff. There may well be better ways to approximate log_10(m!), but again, I've spent too much time on this.

This has been interesting to write, even if more than half of it turns out to be useless.

Posted on 16 June 2012

Tagged: software; math; silly