A Reasonable Approximation

Latest posts

Variadic functions in Hindley Milner

I previously mentioned an interest in adding variadic functions to Haskenthetical, the "Haskell with a Lisp syntax" I'm vaguely working on.

It sounds difficult. This is an attempt to figure out just how difficult, partly by looking at the state of the art.

A variadic function is one which can accept argument lists of different length. A well-known example in many languages is printf. Here are a bunch of functions I might want to be variadic in Haskenthetical, along with examples in Haskell of types I might want to use them as:

(I'm using Haskell syntax for the types, rather than Haskenthetical syntax. Partly because Haskenthetical has no concept of typeclasses or rank-2 types, which some of the examples use. But also partly because it's probably easier to read. Which you might think would be a reason not to make Haskenthetical, but here we are.)

I'm not, to be clear, saying that all or even any of these are good ideas. I mostly don't miss variadic functions in Haskell; they can be implemented hackily like in Text.Printf linked above, but I'm not sure I've ever felt the need to, and I've rarely-if-ever used that printf. But it seems worth starting to explore the space of the sorts of things I might want to consider trying to support, before making any decisions.

Not Hindley-Milner

The first paper I'm going to look at is the most recent one I've found: Strickland, Tobin-Hochstadt and Felleisen, Practical Variable-Arity Polymorphism (hereafter PVAP; 2009, doi: 10.1007/978-3-642-00590-0_3). I linked this in my previous post. It implements typechecking for variadic functions in Typed Scheme, I think specifically meaning Typed Racket? I'm not familiar with the language (I have done a little untyped Racket in the past), but from the sounds of things, its type system is fundamentally different from Hindley-Milner, and the implementation won't easily transfer. (Both compile to System F, but I don't think that helps.)

But it does help me make sense of the space. It divides the functions it can type into two: uniform and non-uniform. Let's call the optional arguments the "rest parameter", as in the parameter which holds "the rest of the parameters". Uniform functions are those whose rest parameter is a homogeneous list, such that they could be replaced (at cost to ergonomics) with a function accepting a list. In my above examples, that's the arithmetic functions plus list and list'. In Typed Racket syntax, the types of these functions would be

(: + (-> Number Number Number * Number)) # also -, *, /
(: list (All (a) (-> a * (Listof a))))
(: list' (All (a) (-> a * (Listof a) (Listof a))))

With the * indicating "zero or more of the preceding type". These seem simple enough. (Though list' takes an argument after the variadic part, which makes things more complicated. Racket calls that function list* but I couldn't find a type declaration for it to be sure it's actually valid.)

Then the other other functions handled by the paper are "non-uniform". Of my examples, I think that's just map, trace, and maybe zip and unzip natively.

(: map (All (a b ...) (-> (-> b ... b a) (Listof b) ... b (Listof a))))
(: trace (All (a b ...)) (-> String (-> b ... b a) b ... b a))
(: zip (All (a ...)) (-> (Listof a) ... a (Listof (Tuple a ... a))))
(: unzip (All (a ...)) (-> (Listof (Tuple a ... a)) (Tuple (Listof a) ... a)))

For zip and unzip, I'm inventing the type Tuple here to refer to a collection with known size and types. (Tuple Number String Bool) would be equivalent to Haskell's (Number, String, Bool). I don't know if anything like it actually exists, or can exist, in Typed Racket already.

These are a bit more involved. The ... is doing two related but syntactically different things.1 In the variable list of All (the (a b ...) and (a ...)) it says the preceding variable corresponds to a list of types. In the body of All, it combines with both the preceding and following syntax elements. t ... b means: "b was followed by ... in the type signature, so it corresponds to a list of types. Use a copy of t for each one of those types, and inside each copy, substitute b with the corresponding type from the list".

So if b ... corresponds to Integer String Number, then (Listof b) ... b corresponds to (Listof Integer) (Listof String) (Listof Number).

I don't know if we strictly need the trailing variable in the body. You're only allowed one ... in the variable list (right at the end), and the trailing variable is required to have been in the variable list followed by a ..., so as far as I can tell it's unambiguous. (At least as long as there are no higher-rank types, which I don't think I've seen mentioned in this context.)

printf, appF, appA and puring would also fit into this schema if it weren't for the constraints. But as far as I know Typed Racket has nothing analagous to Haskell's constraints. I don't know how much they complicate matters.

That leaves four examples. sort and renderCsv don't fit the scheme because they can only accept one or two optional arguments, not an arbitrary number. (Typed Racket does support optional arguments, they're just not covered by this paper.)

» and « don't fit because the type of each argument can depend on the type of the preceding one. For example, we might call

(«) (&) 1 (+ 1) Just (fmap (* 3)) (: [])
(») ($) (: []) (fmap (* 3)) Just (+ 1) 1

There's a pattern to the types, but not the pattern we need.

So: this paper describes a way of typechecking a few of the functions we might like to typecheck, in a completely different type system than the one we want to use. What can we do in Hindley-Milner?

There's a brief discussion of that, mostly of the form "here's another paper that made some progress in an HM system. It's not as powerful as what we have here". But those other papers fully exhaust what I've managed to find on the subject; and actually, I can't find a copy of Moggi, Arity polymorphism and dependent types (2000). That leaves three to look at.

Infinitary Tuples

First up: Dzeng and Haynes, Type reconstruction for variable-arity procedures (1994, doi: 10.1145/182590.182484). This uses a thing called "infinitary tuples" to handle optional arguments and both uniform and non-uniform variadic functions. PVAP lists some limitations:

Most importantly, since their system does not support first-class polymorphic functions, they are unable to type many of the definitions of variable-arity functions, such as map or fold. Additionally, their system requires full type inference to avoid exposing users to the underlying details of row types, …

I don't understand this first one: map is explicitly given as an example, with type2

\[ (\mathit{pre} · ((γ∘δ) → α) :: (γ ∘ (\underline{\mathtt{list}}\ δ))) → \mathtt{list}\ α \]

But this might help illustrate the second problem, which I think is saying: type annotations are complicated, users won't want to deal with them.

I do notice two limitations myself. One is that you're not allowed to pass in a variadic function and apply it with two different argument counts: ((λ (f) (* (f 1) (f 2 3))) -) is forbidden, even if (- 1) and (- 2 3) are both okay. Combining this system with higher-rank types might avoid this limitation, but I don't know if they're compatible. The authors list two other ways to potentially avoid it, but both would need more research. I don't know how big a deal this would be, but it feels minor. It's similar to (perhaps the same as) the restriction that lambda bindings are monomorphic, and I don't know if I've ever had a problem with that in practice.

The other is: there's only one thing you can do with a "rest argument" (i.e. the collection of the arbitrarily-many arguments at the end), and that's to use it as another rest argument. There's even special syntax for that: you'd define a variadic function as (λ (x &rest xs) ...) (i.e. taking one required argument and arbitrarily many following that), and inside the body you'd call some other variadic function with (g a b c &rest xs). So variadic functions need to be either built-in or defined in terms of other variadic functions.

This feels like a bigger limitation, but again I don't really know. The paper only talks about type-checking - it contains zero implementations of variadic functions. Maybe a small handful of built-in ones would let us define everything we want.

Ignoring that problem for now, and considering whether we can type the example functions from above:

So how does it work? It's based on extensible records, which I haven't yet looked at in depth. (Extensible records with scoped labels is the paper I was planning to look into if I tried to add them to Haskenthetical, but it might not take a compatible approach.) A row is an infinite sequence of (mark, type) pairs, where a mark is either "present", "absent" or a mark variable. Rows can be equivalently constructed as pairs of (mark row, type row), where those are infinite sequences of the respective things. At some point all the marks become "absent", and then the types don't matter. Recall the type of map from above,

\[ (\mathit{pre} · ((γ∘δ) → α) :: (γ ∘ (\underline{\mathtt{list}}\ δ))) → \mathtt{list}\ α \]

This is quantified over three type variables. \(γ\) is quantified over mark rows, and \(δ\) over type rows, with \(γ∘δ\) combining them into a row. And \(α\) is quantified over proper types.

Everything before the outermost \(→\) is a row. (Functions in this system are of kind "row → type", not kind "type → type".) It has \(\mathit{pre} · ((γ∘δ) → α)\) as its head, a single field marked present ("pre") with type \((γ∘δ) → α \). \(γ∘δ\) is itself a row. Then for the outer row, the tail \(γ ∘ (\underline{\mathtt{list}}\ δ)\) has the same sequence of marks as the argument to its head, meaning the same number of arguments. \(\underline{\mathtt{list}}\ δ\) is a type row of "apply list to the types in \(δ\)".

Then we might instantiate \(γ\) at \(\mathit{pre} :: \mathit{pre} :: \underline{\mathit{abs}}\), i.e. "two present fields and then nothing". And we might instantiate \(δ\) at \(\mathtt{int} :: \mathtt{string} :: δ'\), i.e. "an int, then a string, then another type row", and \(α\) at \(\mathtt{bool}\). Then we'd have something like the Haskell type

(Int -> String -> Bool) -> [Int] -> [String] -> [Bool]

but it would be rendered in this type system as

\[ (\mathit{pre} · ((\mathit{pre} · \mathtt{int} :: \mathit{pre} · \mathtt{string} :: \underline{\mathit{abs}} ∘ δ') → \mathtt{bool}) :: \mathit{pre} · \mathtt{list\ int} :: \mathit{pre} · \mathtt{list\ string} :: \underline{\mathit{abs}} ∘ (\underline{\mathtt{list}}\ δ')) → \mathtt{list\ bool} \]

Which, let us say, I do not love.4 And this kind of thing infects all functions, not just variadic ones! And I don't know how it interacts with partial application. Still, maybe it can be made ergonomic. I think this paper deserves further study, though I'm unlikely to do it myself.

The Zip Calculus

Next is Tullsen, The Zip Calculus (2000, doi: 10.1007/10722010_3). This extends typed lambda calculus to get variadic tuples.

This paper seems to come out of the "program transformation community", which I'm not familiar with. I suspect it's talking about things I know a little about in ways I don't recognize.

The specific typed lambda calculus it extends is "\(\mathrm{F}_ω\)". Fortunately I recently came across the lambda cube so I know how to find out what that means. It's System F, which (as above) is what GHC compiles to as an intermediate step, plus the ability for the user to define their own types, which… I'd always assumed GHC's intermediate System F also had that, but maybe not, or maybe GHC actually compiles to \(\mathrm{F}_ω\)? But… later in the paper it talks about dependent typing, which \(\mathrm{F}_ω\) doesn't have, so maybe the paper is using nonstandard terminology? Argh.

Anyway, I think of System F as a more powerful version of Hindley-Milner, but too powerful to automatically type check, so you need lots of type annotations. If we're making it more powerful I guess we're still going to need those.

I confess I have trouble figuring out what this paper offers in any detail. I think I'd be able to eventually, but I'd need to put in more effort than I felt like right now.

It does give us a short list of functions that, if the language defines them built-in, we can build other functions out of. These are list and what it calls seqTupleL and seqTupleR. These aren't described except for their (identical) type signatures

\[ \mathtt{Monad\ m} ⇒ ×⟨^{i} \mathtt{a}_{.i} → \mathtt{m\ b}_{.i}⟩ → ×\mathtt{a} → \mathtt{m}(×\mathtt{b}) \]

which I think in Haskell correspond to the types

seqTuple_ :: Monad m => (a1 -> m b1) -> a1 -> m b1
seqTuple_ :: Monad m => (a1 -> m b1, a2 -> m b2) -> (a1, a2) -> m (b1, b2)
seqTuple_
  :: Monad m
  => (a1 -> m b1, a2 -> m b2, a3 -> m b3)
  -> (a1, a2, a3)
  -> m (b1, b2, b3)

If I'm right about that, I'm pretty sure the semantics are "zip the tuple of functions with the tuple of parameters, apply them each in turn and sequence the effects (left-to-right / right-to-left)".

Given these functions, we're specifically told we can implement zip, unzip, map5 and appF6. I'm pretty sure arithmetic, list', appA, puring and trace will be possible, and I weakly guess that printf will be as well, while « and » won't be. I'm not sure about sort or renderCsv.

One thing is that all of these functions are defined with variadic tuples, so that e.g. map would actually be accepted at types like

map :: (a -> b) -> [a] -> [b]
map :: ((a, b) -> c) -> ([a], [b]) -> [c]
map :: ((a, b, c) -> d) -> ([a], [b], [c]) -> [d]

which I assume leaves no room for partial application. It might also be awkward when the final argument needs to be handled separately; I'm not sure if we could get

list' :: [a] -> [a]
list' :: (a, [a]) -> [a]
list' :: (a, a, [a]) -> [a]

or if we'd be stuck with

list' :: () -> [a] -> [a]
list' :: a -> [a] -> [a]
list' :: (a, a) -> [a] -> [a]

Given this limitation, it would be convenient to have variadic curry and uncurry functions,

curry :: (a -> b -> c) -> (a, b) -> c
curry :: (a -> b -> c -> d) -> (a, b, c) -> d
curry :: (a -> b -> c -> d -> e) -> (a, b, c, d) -> e

uncurry :: ((a, b) -> c) -> a -> b -> c
uncurry :: ((a, b, c) -> d) -> a -> b -> c -> d
uncurry :: ((a, b, c, d) -> e) -> a -> b -> c -> d -> e

but no such luck. We're specifically told curry can't be typed, that's an area for future research. And uncurry would break us out of the variadic-tuple paradigm entirely.

I'd be really interested to see a more approachable-to-me version of this paper.

Typeclasses

Finally McBride, Faking It: Simulating Dependent Types in Haskell (2002, doi: 10.1017/S0956796802004355).

I'm happy to say I understand (the relevant parts of) this paper. It adds variadic map directly to Haskell using typeclasses. It then explores further in directions I'm going to ignore. The downside, as PVAP correctly notes, is that you need to explicitly pass in the number of variadic arguments as an initial argument.

The code is simple enough. It felt a bit dated to me: in the past twenty years GHC has added a bunch of new features that seemed like they'd help make it more ergonomic. But I couldn't find a much better way to write it using those, so whatever.

The original is (equivalent to):

{-# LANGUAGE FunctionalDependencies, FlexibleInstances, UndecidableInstances #-}

data Zero = Zero
data Suc n = Suc n

one = Suc Zero :: Suc Zero
two = Suc one :: Suc (Suc Zero)

class ManyApp n fst rest | n fst -> rest where
  manyApp :: n -> [fst] -> rest

instance ManyApp Zero t [t] where
  manyApp Zero fs = fs

instance ManyApp n fst rest => ManyApp (Suc n) (t -> fst) ([t] -> rest) where
  manyApp (Suc n) fs xs = manyApp n (zipWith ($) fs xs)

nZipWith :: ManyApp n fst rest => n -> fst -> rest
nZipWith n f = manyApp n (repeat f)

Here manyApp is the version where instead of just one function, we provide a list of them. The instances give us

manyApp Zero :: [t]           -> [t]
manyApp one  :: [s -> t]      -> [s] -> [t]
manyApp two  :: [r -> s -> t] -> [r] -> [s] -> [t]
--      ^ n     ^ fst            ^ rest

We define it recursively, in terms of its simpler definitions. Then nZipWith is easy to define in terms of manyApp, where it's not easy to define recursively in terms of itself.

So what else can we already implement in Haskell with typeclasses? I think most of my original list of functions, and for several of them I think explicitly specifying the variadic count would be unnecessary.

I tried out a few of them, see here. It's possible there are ways to improve on what I found, but my current sense is:

Here, I consider something to "work" if you can make it do what you want, and work "better" if it doesn't need much type hinting. "works fine" means you don't need type hinting in any places I wouldn't normally expect it in Haskell, and I think the error messages might even be pretty normal?

So I think that with explicit counts, most of these work fine. Without the explicit counts, several of them work pretty okay, but they'll have difficulty if you use them in a weakly-constrained context. Like, print (list "a" "b" "c") won't work because it doesn't know if you want the Show instance on [String] or on String -> [String] or…. (Probably only one of these instances exists, but we can't rely on that.) But then you just need to add a type annotation, print (list "a" "b" "c" :: [String]). list' and zip need a lot more type annotations than is normal, for reasons I maybe have a shallow understanding of but not a deep one.

Partial application

Here's something I think I was vaguely aware of before I started this, but it's clearer now: partial application makes this harder.

Consider the Haskell code

zipWith (\a b c -> a + b + c) [1,2] [3,4]

That uses the existing 2-ary zipWith function, which means it has type [Int -> Int]7. (Its value is [(+ 4), (+ 6)].) We could instead have used the existing 3-ary zipWith3, and then it would have type [Int] -> [Int]. If we used a variadic function, what type would it have? All the papers I looked at had some way of answering the question.

In Racket, as far as I know there's no implicit partial application. A function that takes two arguments is different from a function that takes one argument and returns a function that takes one argument. So under PVAP, to choose between the two interpretations, you'd need something like:

(map (λ (a b) (λ (c) (+ a b c))) '(1 2) '(3 4)) ; [Int -> Int]
(λ (cs) (map (λ (a b c) (+ a b c)) '(1 2) '(3 4) cs)) ; [Int] -> [Int]

The Zip Calculus preserves partial application for fixed-arity functions, but not variadic ones. (More precisely, it has partial application for functions but not for variadic tuples.) The two interpretations would be written along the lines of

nZipWith (\(a, b) c -> a + b + c) ([1,2], [3,4]) -- [Int -> Int]
\cs -> nZipWith (\(a, b, c) -> a + b + c) ([1,2], [3,4], cs) -- [Int] -> [Int]

Which is basically the same as the Racket version, apart from syntax.

And "infinitary tuples" has to do something similar. For the list of functions, we'd pass "a function taking a row with two variables and returning a function" as the first argument. For the function on lists we'd pass "a function taking a row with three variables". I guess the system must lose Haskell's equivalence between \a b -> ... and \a -> \b -> ....

The typeclass solution to this is completely different. The inner function is the same for both interpretations, what changes is the initial "variadic count" argument.

nZipWith two (\a b c -> a + b + c) [1,2] [3,4] -- [Int -> Int]
nZipWith three (\a b c -> a + b + c) [1,2] [3,4] -- [Int] -> [Int]

This feels to me like the cleanest of the bunch. I introduced this initial argument as a downside before, but that might be kind of unfair - like, maybe it's only a downside compared to an imaginary world that I hadn't yet realized was impossible.

How else might we solve this problem?

One option is to assume that nZipWith will never want to return a list of functions. I don't love that solution, but here's someone implementing it for appF (which they call liftAn). I haven't checked how well it works.

Something else that might work is to wrap the return value of the initial function in some singleton type.

nZipWith (\a b -> Solo $ \c -> a + b + c) [1,2] [3,4] -- [Int -> Int]
nZipWith (\a b c -> Solo $ a + b + c) [1,2] [3,4] -- [Int] -> [Int]

This is kind of similar to "no partial applications", except you can still do partial applications everywhere else.

And I don't know if this would be possible, but I could imagine using a type annotation to distinguish. If you need to annotate every call to the function, I think I'd rather just specify the number of arguments or something. But if you only need them when it's ambiguous that could be neat.

This isn't a problem for every variadic function. In the previous section, only about half the functions absolutely needed a variadic count, and this is why. (For unzip, it's about the ambiguity in [((a, b), c)]. Do you want to unzip one level to ([(a, b)], [c]), or two to (([a], [b]), [c])?) Other functions had difficulty without the count, but I think that's for unrelated reasons. I think I have a decent intuitive sense of which functions will have this problem and which won't, but I couldn't give a brief description of it. Maybe something to do with matching structures between the initial argument and the rest of the type?

Conclusion

The main thing I take away from this is that I'm basically going to drop variadic functions from my radar for Haskenthetical. Infinitary tuples and the zip calculus don't feel like directions I want to go in. This might affect whether, when and how I add support for typeclasses and various surrounding machinary.

  1. It also makes it really hard to write code outlines while omitting certain parts. 

  2. Note that I've swapped from a postfix \( α\ \mathtt{list} \) syntax to a prefix \( \mathtt{list}\ α \) that I'm more used to. Also the \( \underline{\mathtt{list}} \) was originally rendered \( \underline{\mathit{list}} \) but I think that was a mistake. 

  3. I'm not sure. It looks like there's nothing stopping us from constructing types corresponding to optional-before-required. But the paper describes a language syntax that forbids it. My weak guess is such types would break the inference algorithm. 

  4. Sorry it's not indented nicely, I'm not sure how to make Mathjax do that. I wouldn't love it however nicely laid out, though. 

  5. Like with the previous paper, PVAP said map wouldn't be possible: "The presented limitations of the Zip Calculus imply that it cannot assign a variable-arity type to the definition of zipWith (Haskell's name for Scheme’s map) without further extension". As far as I can tell it was simply wrong. 

  6. appF is described with Monad constraints, but I gather the Applicative typeclass wasn't introduced at the time. I expect the seqTuple_ functions could take Applicative instead of Monad, and then so could appF

  7. Pedantic note: I'm pretending here that integer literals have type Int, not type Num a => a

Posted on 02 April 2022

Tagged: haskenthetical; software

Comments elsewhere: /r/ProgrammingLanguages; /r/haskell; LessWrong