You may want to inform yourself about human rights in China.

On Nix's Language: Closures

tags: computing nix
date: 2022-11-16
update: 2024-05-01

We saw by the end of the previous article a first example of “closure”, which you can grossly think of as “function”. We’re going to progressively explore this notion in-depth here: we’ll see in particular how they can be used to encode all kinds of data structures: lists, integers, booleans, objects, graphs, etc.

Some tiny programming languages like the λ-calculus, of which we’ll implement an interpreter to conclude this series, is essentially reduced to closures: this minimalism is convenient for theoretical/mathematical analysis, or for programming practice. But it’s naturally ill-suited for real-life programming.

Besides, we’ll also re-implement many ordinary functions on lists, using an ad hoc, closure-based list implementation, instead of genuine Nix’s lists, genuine lists that we’ll explore soon enough.

As always, the code for this series is available on github.


Krestovsky Island shrouded in mist (Крестовский остров в тумане); oil on canvas

Krestovsky Island shrouded in mist (Крестовский остров в тумане); oil on canvas by Ivan Ivanovich Shishkin (Ива́н Ива́нович Ши́шкин, 1832-1898) through wikiart.orgPublic domain

Objects

First-class functions

Recall what a basic function looks like:

#!/usr/bin/env -S nix-instantiate --eval
let
	hw = w: "hello "+w;
in
	hw "reader"
"hello reader"

In Nix, functions are first class-citizens, meaning, they can be provided as arguments, or returned:

#!/usr/bin/env -S nix-instantiate --eval
with builtins;
let
	callFunWith = f: x: (y: f x y);
	add         = x: y: x + y;
	add3To      = callFunWith add 3;
in
	trace(add3To 7)
	trace((callFunWith add 7) 3)
	"ok"
trace: 10
trace: 10
"ok"

This, this ability to pass functions around, is the key to use functions to encode structured data. Let’s see how.

Structures

You may be familiar with the notion of struct in C or C-like languages: those are simple collections of (named) items:

/* C */
struct user {
	char *name;
	int  age;
}
/* Go */
type user struct {
	name string
	age  int
}

Closures can actually be used to implement such a struct:

#!/usr/bin/env -S nix-instantiate --eval
with builtins;
let
	mkUser = name: age: (x: if x then name else age);
	u = mkUser "Bob" 42;
in
	trace (u true)
	trace (u false)
	"ok"
trace: Bob
trace: 42
"ok"

The “trick” is to observe that the parameters to the mkUser function (name and age) are available to the function returned by mkUser, without being explicit parameters of this returned function: the set of all such implicit parameters is called an environment, and the key idea here is to use this environment to store data.

In the previous example, we returned a function whose single boolean parameter allows to fetch either value, name or age, from the environment. Saying it otherwise, we parametrized the returned function by providing it some specific values for name and age.

Let’s see if you’ve understood the previous struct example correctly:


Exercise: Add an email field to the our previous struct; make sure you can access all three values.

Tip:[+]

Solution:[+]

Methods, state, inheritance

Let’s push things one step further: we can store values, and return them, but can we define methods on a struct, to simulate a loose object?


Exercise: Well, yes we can, and you should be able to figure it out without too much trouble: add to our previous three-fields struct a get method to retrieve the value of the fields, and an isAdult method, returning true if the user is above 18, false otherwise.

Tip:[+]

Solution:[+]

Exercise: Slightly more complicated: add a new method set, which allows to set fields from the struct.

Tip:[+]

Solution:[+]

Exercise: Okay, just for fun: create a second type of object, say teacher, who inherits from our previous user object, but whose name field is now unalterable via set (why not…), and who is equipped with an extra field attribute, which can be read/written via get/set.

Solution:[+]

Not fantastic syntactically: better languages, equipped with advanced macros, can improve the situation. But it should nevertheless be enough to convince you that it should be possible to build full-blown objects.

This is actually ancient programming knowledge, captured by a famous Koan:

The venerable master Qc Na was walking with his student, Anton. Hoping to prompt the master into a discussion, Anton said “Master, I have heard that objects are a very good thing - is this true?” Qc Na looked pityingly at his student and replied, “Foolish pupil - objects are merely a poor man’s closures.”

Chastised, Anton took his leave from his master and returned to his cell, intent on studying closures. He carefully read the entire “Lambda: The Ultimate…” series of papers and its cousins, and implemented a small Scheme interpreter with a closure-based object system. He learned much, and looked forward to informing his master of his progress.

On his next walk with Qc Na, Anton attempted to impress his master by saying “Master, I have diligently studied the matter, and now understand that objects are truly a poor man’s closures.” Qc Na responded by hitting Anton with his stick, saying “When will you learn? Closures are a poor man’s object.” At that moment, Anton became enlightened.

– By Anton van Straaten

Lists

Definition

Let’s now see how we can implement lists using closures. The difficulty is that, while a struct has a fixed, predefined number of fields, a list, by design, doesn’t. But we’ve just learnt how to jump from struct to closures, so let’s recall how lists are typically implemented in C.

Of course, there are more than one approach, but we’re in particular interested in the prevailing notion of a (simply) linked list, which is represented by that kind of struct:

typedef struct list list;
struct list {
	void *value;
	list *next;
}

To which we’ll add the convention that NULL corresponds to the empty list. This is all almost identical to the usual recursive definition of a list, which permeates functional programming, where a list is defined as being either:

  1. the empty list;
  2. a value called the head, attached to another list, called the tail.

Okay, let me give you one teeny tiny thing: I’ve explained that in C, NULL is used as a synonym for the empty list. We can easily translate this to Nix as follow, and equip ourselves with a function to test the emptiness of a list:

	...
	# Empty list
	nil     = null;

	# Check whether a list is empty or not
	isEmpty = l: l == nil;
	...

Exercise: Using your previouisly acquired expertise, convert this new struct to a closure “constructor”.

Tip:[+]

Solution:[+]

Exercise: Let’s see if you’ve understood the previous definition: define a few “variables” containing the following lists:

Solution:[+]

Note: The name “cons” refers to a similar operation in Lisp; while our definition and usage is different from Lisp’s, I’ll keep similar naming conventions, as was the case for nil, the empty list, or will be for car/cdr, which are used as synonyms for head/tail, respectively to get, well, the head/tail of a list.


Exercise: Now we have everything we need to implement car and cdr. Let me say it again: those functions act on a list which may be empty or not, and return respectively the head (first element) or the tail (everything but the first element wrapped in a list) of that list.

Tip:[+]

Solution:[+]

Because it’s going to be annoying to repeat all our primitive list functions, I’m going to store them in a module that we can import from other .nix files; I’ll revisit this pattern in detail later: you may want to consider it a magic spell for now. Say the first panel is the content of a file list.nix; the second will import this previous file, allowing the use the functions defined in list.nix.

rec {
	cons    = h: t: (x: if x then h else t);

	nil     = null;
	isEmpty = l: l == nil;
	access  = x: l: if isEmpty l
		then throw "list is empty"
		else l x
	;

	car = access true;
	cdr = access false;
}
#!/usr/bin/env -S nix-instantiate --eval
with builtins;
with (import ./list.nix);
...

Exercise: Write a function cadr which returns the second element of a list. Write a function caddr which returns the third element of a list. Write a function cdar which returns the tail of the first element of a list (cadr’s input is a list of lists then).

Try to use the list.nix module technique I’ve just presented.

Solution:[+]

Note: Those are common shortcuts in Lisp-like languages.

Believe it or not, we now have everything we need to deal with list: we have a precise definition, and a few primitive operations, which are sufficient to build all the usual functions you can think of on lists (calculating their length, filtering elements, etc.)

Iteration pattern

Well, almost everything. There’s one more thing I need to tell you about before we can start writing code: there’s a pattern to recursively travel through a list which naturally arises from (our) definition of a list:

This can be done with or without an accumulator: this is conceptually identical to what we’ve done when implementing fold on integer sequences, foldln. In Nix pseudo-code, this yields something like:

#!/usr/bin/env -S nix-instantiate --eval
with builtins;
let
	f = l:
		if isEmpty l then
			# empty list case:
			...
		else let
			x  = car l;
			xs = cdr l;
		in
			# do something with x and often,
			# recursively apply f to xs
			...
	;
	...
in
	...
#!/usr/bin/env -S nix-instantiate --eval
with builtins;
let
	f = l: let aux = acc: l:
		if isEmpty l then
			# empty list case, generally something like:
			acc
		else let
			x  = car l;
			xs = cdr l;
		in
			# Do something with x via actOnHead, glue this with
			# the accumulator somehow via addToAcc, and
			# eventually (maybeRecurse) recurse on the tail/cdr
			(maybeRecurse aux (addToAcc (actOnHead x) acc) xs)
	;
	...
in
	...
Rain in oak forest (Дождь в дубовом лесу), 1891, 124×203cm, oil on canvas

Rain in oak forest (Дождь в дубовом лесу), 1891, 124×203cm, oil on canvas by Ivan Ivanovich Shishkin (Ива́н Ива́нович Ши́шкин, 1832-1898) through wikimedia.orgPublic domain

Standard list functions

Let’s re-write a bunch of"standard" list functions to exercise, but not using Nix’s lists, but using our previous ad hoc list implementation. The code is for the most part a direct application of the previous recursive patterns; until they both become second-nature, you may want to try solving each exercise twice, once with each pattern.

Exercise: Write a (recursive) function isMember e xs, which returns true if e exists in xs, false otherwise.

Solution:[+]

Exercise: Write a (recursive) function length xs which computes the length of a list. Try implementing it with and without an accumulator.

\[\begin{aligned} (\text{length}\ [x_0\ x_1\ ...\ x_n]) = n+1 \end{aligned}\]

Solution:[+]

Exercise: Write a (recursive) print xs function which prints a list xs. You can either use builtins.trace or dump it to a string. Tests your function on the lists we’ve defined in an earlier exercise:

	x = nil;
	y = cons 3 nil;
	z = cons "hello" (cons 3 (cons (x: x) nil));
	# We're still being "lazy", but you may to be fancier
	w = cons nil (cons nil nil);

The goal is for you to navigate the list, the output format doesn’t matter much, it can be a bit clumsy-looking.

Tip:[+]

Solution:[+]

Exercise: Write a (recursive) function range n m which creates a list containing the integers from n to m, both included.

\[\begin{aligned} (\text{range}\ n\ m) = [n\ (n+1)\ ...\ m] \end{aligned}\]

Solution:[+]

Exercise: Write a (recursive) function reverse xs which inverts the order of the elements of xs.

\[\begin{aligned} (\text{reverse}\ [x_0\ x_1\ ...\ x_n]) = [x_n\ x_{n-1}\ ...\ x_0] \end{aligned}\]

Solution:[+]

Exercise: Write a (recursive) function append xs ys which concatenates two lists xs and ys. Write a version with and without an accumulator.

Tip:[+]

Solution:[+]

Exercise: Write a (recursive) function map f xs mapping the function f to all elements of xs; more precisely:

\[\begin{aligned} (\text{map}\ \phi\ [x_0\ x_1\ ...\ x_n]) = [\phi(x_0)\ \phi(x_1)\ ...\ \phi(x_n)] \end{aligned}\]

Tip:[+]

Solution:[+]

Exercise: Write a (recursive) function foldl f s xs folding xs via f, with s as an initial value; more precisely:

\[\begin{aligned} (\text{foldl}\ \phi\ [x_0\ x_1\ ...\ x_n]) = (\phi\ (...\ (\phi\ (\phi\ s\ x_0)\ x_1)\ ...\ )\ x_n) \end{aligned}\]

Solution:[+]

Note: There’s a famous, old (1999) paper about fold by Graham Hutton that you may enjoy studying.


Exercise: Rewrite isMember, length, print, reverse, append and map, all using foldl. The fact that we can do that is rather significant: fold really is a fundamental recursive list operator.

Solution:[+]

Exercise: Rewrite foldr, both directly and via foldl. Try rewriting map with foldr. As a reminder, foldr is mathematically defined by:

\[\begin{aligned} (\text{foldr}\ \phi\ s\ [x_0\ x_1\ ...\ x_n]) = (\phi\ x_0\ (\phi\ x_1\ (...\ (\phi\ x_n\ s)\ ...))) \end{aligned}\]

Tip:[+]

Tip:[+]

Tip:[+]

Solution:[+]

Exercise: Use foldl to write a flatten function, which takes a list that may contain lists, and flatten them all to a single list. Write a deepFlatten function which does so recursively.

Use foldr to rewrite flatten; how does it compare to the foldl-based implementation?

Tip:[+]

Solution:[+]

Exercise: Use foldl to write a zipWith f xs ys function which returns a list, pairing elements from xs and ys via f:

\[\begin{aligned} (\text{zipWith}\ \phi\ [x_0\ x_1\ ...\ x_n]\ [y_0\ y_1\ ...\ y_n]) = [(\phi\ x_0\ y_0)\ (\phi\ x_1\ y_1)\ ...\ (\phi\ x_n\ y_n)] \end{aligned}\]

Write a zip xs ys function via zipWith which pairs adjacent elements from each list.

\[\begin{aligned} (\text{zip}\ \phi\ [x_0\ x_1\ ...\ x_n]\ [y_0\ y_1\ ...\ y_n]) = [\ [x_0\ y_0]\ [x_1\ y_1]\ ...\ [x_n\ y_n]\ ] \end{aligned}\]

Tip:[+]

Solution:[+]

Exercise: Last but not least, we’ve already mentioned apply a few times. Try to remember what it did, to re-implement it to work with our bespoke lists.

Tip:[+]

Solution:[+]
Winter forest (oil on canvas?)

Winter forest (oil on canvas?) by Ivan Ivanovich Shishkin (Ива́н Ива́нович Ши́шкин, 1832-1898) through dardant.euPublic domain

Booleans (single bit)

Exercise: I would expect finding how to implement booleans on your own to be too difficult: maybe give it a shot for 10/15min before you start unrolling those tips; trying is more important than succeeding. Let me give you a first advise: if it behaves like a boolean, then it’s a boolean (duck typing). Think about how we can get/set a boolean’s value.

But as for the list, there’s no need for any object-oriented feature (well, at least for the standard approach we’re going to take).

Tip:[+]

Tip:[+]

Tip:[+]

Solution:[+]

Exercise: Implement a few common boolean operations on those Church booleans: and, or, xor, not. Try implementing them both with and without ifelse.

Tip:[+]

Tip:[+]

Solution:[+]

Alright, let’s have some fun with our lists again. Looking at our implementation once again, you can see that we were relying genuine Nix booleans:

rec {
	cons    = h: t: (x: if x then h else t);

	nil     = null;
	isEmpty = l: l == nil;
	access  = x: l: if isEmpty l
		then throw "list is empty"
		else l x
	;

	car = access true;
	cdr = access false;
}

Let’s try to progressively replace those booleans by our ad hoc, closure-based booleans. This will imply also getting rid of using null for the empty list.


Exercise: I guess the most straightforward is the one used in cons, so start with this one. Because all the functions on lists we wrote earlier are independent of the implementation details our our list.nix modules, they should all keep working (that’s your tests)

Solution:[+]

Note: There’s some “strange” behavior with Nix when exporting functions/objects that are defined by the language (e.g. map, true, false). For instance, the true/false returned by isMember above are Nix booleans, not our redefined booleans. Perhaps using different names would have been wiser, but highlighting this corner case isn’t necessarily a bad idea either.


Exercise: This one is more delicate: let’s get rid of all Nix booleans, and implement our lists fully with closures. There’s an important consequence to this: our lists will become, like our booleans, functions: they will be executable.

Now you may think, but isn’t cons already executable? Well yes, but cons itself is not a list, it’s a list constructor, and nil is clearly not a function either.

Saying it otherwise, this means that, as we can execute true and false, we’ll want to be able to execute nil and cons h t (again, not cons, but the lists we construct via cons).

Don’t bother too much: keep the throw, and focus on adjusting the list implementation (you can ignore the list library).

Tip:[+]

Tip:[+]

Tip:[+]

Tip:[+]

Solution:[+]

Integers

Exercise: We have list. We have booleans. Don’t we have integers?

Solution:[+]

Note: Once we have integers, we also have strings, as they can be considered (and are, for instance in Haskell) as lists of characters, with syntactic sugar and other goodies to help handle character sets.


Exercise: Propose another simple integer encoding, relying on lists only.

Tip:[+]

Solution:[+]

Exercise: Implement a function to multiply two such integers. You don’t have to handle negative integers (I won’t).

Tip:[+]

Tip:[+]

Solution:[+]

A third’s time a charm: forget our lists and try to focus solely on implementing integers, directly from closures. A common way to proceed is to use Church numerals. We could devise other methods, but Church numerals have a handy property: it’s not too complicated to write common operations (addition, multiplication, etc.) on them. Let’s see how we could encode the first few (natural) integers:

	zero  = x: y: x;
	one   = x: y: y x;
	two   = x: y: y (y x);
	three = x: y: y (y (y x));

Essentially, we represent a number as a function taking two arguments. The number encoded correspond to the number of time the second argument, a function, is called when expanding the expression of the number (e.g. the three calls y three times).

Exercise: Try to find, in this order, how to:

  1. Get a visual representation of a number (print them somehow);
  2. Add two numbers;
  3. Multiply two numbers.
Tip:[+]

Tip:[+]

Tip:[+]

Tip:[+]

Solution:[+]

Sets, graphs

Exercise: But, don’t we already have ways to implement them already? Albeit inefficiently.

Solution:[+]

I don’t think there’s much more to say: we could try implementing them, either from lists or from scratch, but this wouldn’t bring much more to the conversation.

In the wild north… (На севере диком…), 1891, 161×118cm, oil on canvas

In the wild north… (На севере диком…), 1891, 161×118cm, oil on canvas by Ivan Ivanovich Shishkin (Ива́н Ива́нович Ши́шкин, 1832-1898) through wikimedia.orgPublic domain

Conclusion

This article mainly was a way to practice using the λ-calculus, that hasn’t been introduce properly, yet. We’ve started from a “frankenstein” implementation of lists, exhaustively practiced with lists, before progressively studying how we could make this implementation become more and more closure-based.

I don’t think this can be considered an easy topic, especially if you’re coming from an imperative background. The differences between the fold* can be subtle, and wrapping one’s head on using functions to encode anything, is likely to be disturbing. If this is still foggy, don’t hesitate to wait a few days before looking at the material again.

We’ll revisit in greater depth the λ-calculus the last article of this series anyway.

The next two articles should feel like a breeze in comparison to this one!


In the series:


Comments

By email, at mathieu.bivert chez:

email