You may want to inform yourself about human rights in China.

On Nix's Language: Pragmatism, Laziness

tags: computing nix
date: 2022-12-05
update: 2024-04-01

Lately, we’ve been hanging around a very theoretical corner of computer science. However interesting and insightful it can be, in practice, it must almost always be balanced. For instance, Church booleans/integers are, at least on modern ordinary hardware, cumbersome in many ways compared to fixed-size 32/64 bits integers.

Often in engineering, pragmatism is a decisive way to select between multiple options, an approach which often leads to simple(r/st) choices.

We’ll use some aspects of the design of Nix itself to exemplify this point. Our roadmap for this article will include some more details on laziness, as in lazy evaluation (and not as a variant of “pragmatism”), and genuine Nix lists/sets. Genuine in the sense, not some hand-implemented ones out of closures.

Again, the code for this series is available on github.


Oak grove (Дубовая роща, 1887), oil on canvas, 193×183 cm

Oak grove (Дубовая роща, 1887), oil on canvas, 193×183 cm by Ivan Ivanovich Shishkin (Ива́н Ива́нович Ши́шкин, 1832-1898) through wikimedia.orgPublic domain

Lists

So, here’s how real Nix lists behave:

Exercise: Nixpkgs, NixOS’s package manager, is built with Nix. It comes with a library of additional functions on lists. Find it and spend some time reading its code. Try to find foldl'’s code. How does it compare to our implementation?

Solution:[+]

Here’s a simple list exercise to warm you up, this time, on real Nix lists. We’ll need this function for the following section.

Exercise: Write a function take n xs which returns the list made out of the first n elements of the list xs. Implement it as a recursive function thrice: once with head/tail, once with elemAt, and once with foldl'.

\[\begin{aligned} (\text{take}\ n\ [x_0\ x_1\ ...\ x_{n-1}\ ...\ \ x_m]) = [x_0\ x_1\ ...\ x_{n-1}] \end{aligned}\]

Solution:[+]

Exercise: Writing simple code is not always easy; often, simplicity is achieved progressively, by taking the time to identify and remove superfluous elements.

“I have made this [letter] longer than usual because I have not had time to make it shorter.”

Blaise Pascal

Most of the code I demonstrate in this series has gone through such a refinement process. For instance, try to simplify the following piece of code; which function is it?

	XXX = f: l:
		if l == [] then
			(f)
		else let
			x = head l;
			xs = tail l;
		in
			if xs == [] then
				(f x)
			else
				foldl' (x: y: x y) (f x) xs;
Solution:[+]

Lazy lists?

A common pattern in lazy languages such as Haskell, is the ability to generate “infinite” lists. The idea is to write infinitely recursive code: in practice, it will only be executed as far as we need to. Remember, this works because laziness will delay the evaluation as much as possible.

For example, the iterate function in Haskell is used to generate infinite lists by iteratively applying a function to a previous iteration, starting at a given value.

\[\begin{aligned} (\text{iterate}\ \phi\ e) = [\phi(e)\ (\underbrace{\phi\circ\phi}_{=:\phi^2})(e))\ \phi^3(e)\ ... ] \end{aligned}\]

A common way to “stop” the infinite process is to use take, to select the first n elements of such an infinite list. Here’s a typical Haskell implementation of iterate with an example:

-- ghc -Wno-tabs -o /tmp/a.out $fn && /tmp/a.out

iter :: (a -> a) -> a -> [a]
iter f e = (f e):(iter f (f e))

main = do
	print (take 5 (iter (*2) 1))
[1 of 1] Compiling Main             ( /tmp/t.hs, /tmp/t.o )
Linking /tmp/a.out ...
[2,4,8,16,32]

Note: While laziness presented as such can be intellectually delightful, its value for engineering is debatable: it can for instance hinder reasoning on the behavior or performances of realistically more complex programs.

Now, let’s try to do the same thing with Nix, this shouldn’t be too hard.

Exercise: Write a version of iterate in Nix (using Nix’s lists).

Tip:[+]

Solution:[+]

The reason why this happens is that Nix’s lists are actually fixed-size arrays. In the same way that we choose to use 32/64 bits fixed-size integers instead of systematically using “bignums” (or Church integers…), Nix’s authors made the choice to used fix-sized arrays instead of unbounded lists (linked lists was another option for instance).

They’ve traded “expressivity” for efficiency. Given the fact that most applications don’t need unbounded lists, it’s a fairly reasonable choice. The same kind of reasoning led the authors to implement foldl' or map directly in C++, without recursivity: we’ve demonstrated earlier that such common patterns can be used as building blocks for more advanced lists manipulation code: the latter will directly benefit from an efficient implementation.

So, is Nix a lazy language? Well, it is, to up to some pragmatic considerations. The links between the list elements aren’t lazily evaluated, but the elements themselves are. Don’t worry if this last point is a bit mysterious, we’ll come back to it shortly.


Exercise: Find the code of foldl'. Try to understand what it does.

Solution:[+]

Exercise: tail'’s usage is discourage is the builtins documentation. Try to find its code, and try to understand what’s happening, and why elemAt is more idiomatic in Nix.

Also try to find how/where lists are stored/created.

Solution:[+]

Exercise: Find the stack used for function calls.

Tip:[+]

Tip:[+]

Solution:[+]

Exercise: Recall the first list implementation we had in the previous article (you could use any latter ones, but this one is sufficient here):

rec {
	cons    = h: t: (x: if x then h else t);

	nil     = null;
	isEmpty = l: l == nil;
	access  = x: l: if isEmpty l
		then throw "list is empty"
		else l x
	;

	car = access true;
	cdr = access false;
}

Implement a zeroes “function”, which generates an infinite list containing only zeroes (or, implement iterate f n, as you wish). Implement take n xs for such lists. Does it work? Why/why not?

Solution:[+]
Forest distant view (Лесные дали, 1884), oil on canvas, 113×164 cm

Forest distant view (Лесные дали, 1884), oil on canvas, 113×164 cm by Ivan Ivanovich Shishkin (Ива́н Ива́нович Ши́шкин, 1832-1898) through artpoisk.info wikimedia.orgPublic domain

Exercise: There’s yet another way to proceed: if you’ve understood the key point of the previous solution, then you should be able to find a way to use use Nix’s lists to implement lazy lists.

Again, suffice to check that calling take on zeroes work; if your list implementation has the same interface as before (cons, car, nil, cdr, etc.), then zeroes and take are already written.

What’s the advantage of this approach compared to the previous one?

Solution:[+]

Exercise: Recall the implementation of Eratosthenes’ sieve we had in the second article of this series, which allows to compute prime numbers up to a given, fixed number:

#!/usr/bin/env -S nix-instantiate --eval
with builtins;
let
	foldlnseq = x: f: s: n: let aux = acc: i:
		if i > n then
			acc
		else
			aux (f acc (x i)) (i + 1)
		; in aux s 1;

	# foldln is now a special case of foldlnseq
	foldln = foldlnseq (x: x);

	toFloat = n: n + 0.1 - 0.1;
	isDiv = i: j: ceil(toFloat(i) / j) * j == i;

	sieve = n: foldln (p: i:
		if p i then
			trace("${toString i} is prime!")
			(j : !(isDiv j i) && (p j))
		else
			trace("${toString i} is not prime")
			p
	) (i: i != 1) n;
in
	sieve 50
trace: 1 is not prime
trace: 2 is prime!
trace: 3 is prime!
trace: 4 is not prime
trace: 5 is prime!
trace: 6 is not prime
trace: 7 is prime!
trace: 8 is not prime
trace: 9 is not prime
trace: 10 is not prime
trace: 11 is prime!
trace: 12 is not prime
trace: 13 is prime!
trace: 14 is not prime
trace: 15 is not prime
trace: 16 is not prime
trace: 17 is prime!
trace: 18 is not prime
trace: 19 is prime!
trace: 20 is not prime
trace: 21 is not prime
trace: 22 is not prime
trace: 23 is prime!
trace: 24 is not prime
trace: 25 is not prime
trace: 26 is not prime
trace: 27 is not prime
trace: 28 is not prime
trace: 29 is prime!
trace: 30 is not prime
trace: 31 is prime!
trace: 32 is not prime
trace: 33 is not prime
trace: 34 is not prime
trace: 35 is not prime
trace: 36 is not prime
trace: 37 is prime!
trace: 38 is not prime
trace: 39 is not prime
trace: 40 is not prime
trace: 41 is prime!
trace: 42 is not prime
trace: 43 is prime!
trace: 44 is not prime
trace: 45 is not prime
trace: 46 is not prime
trace: 47 is prime!
trace: 48 is not prime
trace: 49 is not prime
trace: 50 is not prime
<LAMBDA>

Re-write sieve to use an infinite list instead of a bounded sequence, that is sieve should return an infinite list containing all prime numbers. Use take to select only some of them.

Tip:[+]

Solution:[+]

Sets

Here’s a small primer on Nix’s “attribute sets”, which can be assimilated to read-only maps, mapping a string index to an arbitrary, lazy-evaluated expression (number, function, etc.).

Exercise: We previously discussed three different implementations of take n xs. The last one relied on foldl', and used a list of two items as an accumulator. Try to re-implement that last version, but using a set instead of a list as an accumulator.

This is a simple exercise to practice basic set manipulation, nothing fancy.

Solution:[+]

Exercise: Use sets to try to implement lazy lists, as we previously did/attempted. Again, use either zeroes or iterate, and take to break free from the infinite recursion.

Solution:[+]

“Modules”

Let’s recall our most basic hello world:

#!/usr/bin/env -S nix-instantiate --eval
"hello world"
"hello world"

As we’ve already discussed, the value printed by Nix corresponds to the (single) value returned by the script. A “module” can be considered as a script that evaluates to a set containing a bunch of functions.

Such a set can then be imported from another .nix file via import. One can import the whole module, or just parts of it:

# Note that rec is mandatory here
rec {
	fib = n:
		if n == 0 || n == 1 then
			n
		else
			fib(n - 1)+fib(n - 2)
	;
}
#!/usr/bin/env -S nix-instantiate --eval
with builtins;
let
	m = (import ./mod-fib.nix);
	f = (import ./mod-fib.nix).fib;
	xs = [ (m.fib 10) (f 10) ];
in
	deepSeq xs xs
[ 55 55 ]

We’ll see a few examples in the following section.

Noon. Neighborhoods of Moscow (Полдень. В окрестностях Москвы, 1869), oil on canvas, 111×80 cm

Noon. Neighborhoods of Moscow (Полдень. В окрестностях Москвы, 1869), oil on canvas, 111×80 cm by Ivan Ivanovich Shishkin (Ива́н Ива́нович Ши́шкин, 1832-1898) through wikimedia.orgPublic domain

Strings, bytes

The next two articles will involve writing small parsers. That is, we’ll need to know how to process strings. But Nix’s strings weren’t really designed for what we’re going to inflict upon them.

So, we’ll start here by understanding their limitations, and write a few functions to help us with more sophisticated code in the future.

Exercise: I’ve previously pointed you to Nixpkgs' lists.nix; there’s also a strings.nix, containing a bunch of functions, and showing you how idiomatic Nix’s strings handling is to be done. So have a look at it, don’t hesitate to read Nix’s language documentation either.

Again, reading code really is key to an efficient progress.


Exercise: Now, we’d want to access the ASCII code corresponding to a single character. For instance, if we want to parse a number, say "12345", the common approach is to rely on the ASCII codes, and to be able to detect whether a character corresponds to a digit.

So write isNum c, which returns true iff the character c corresponds to a digit, by first implementing two functions toInt c and toChar i, which respectively return the ASCII code corresponding to a character c, and the character corresponding to the ASCII code i.

Write also a function isWhite c, which returns true iff c is a white character (space, newline, etc.)

Store all those functions in an ascii.nix module.

Tip:[+]

Solution:[+]

Exercise: Write the following functions, and store them in a strings.nix module:

Solution:[+]

Exercise: Here’s an interesting exercise for an article entitled “Pragmatism, laziness”.

The goal is to re-implement a delightfully simple recursive regular expression matcher, written by Rob Pike (Go, UTF-8, Sam, etc.) for the book The Practice of Programming. Brian Kernighan (Unix, AWK, etc.), despite his decades of work in computer science, and his considerable contributions to the field, said of it:

I was amazed by how compact and elegant this code was when Rob Pike first wrote it – it was much smaller and more powerful than I had thought possible

There’s little reason to doubt Kernighan’s humility: over-thinking is a real problem in computer science, and the “best” solutions often are deceptively simple. The same goes in other domains I guess: think how bothersome it was not to have a zero in math, or how simple relativity’s core principles are.

Parenthesis aside, the regular expression matcher should support:

Tip:[+]

Tip:[+]

Tip:[+]

Tip:[+]

Solution:[+]

“OOP”, state

A common exercise in early Lisp education is to write a small object system using closures, where “object” is to be understood as in your typical mainstream programming language, wrapping variables and methods on such variables (not in the Smalltalk sense of message passing). This is all rather similar to what we’ve done earlier.

For instance in the SICP §3.1, a basic “object” is defined as a closure whose argument corresponding to a method to be called on this “object”. Upon calling some methods, they may alter the underlying object: Scheme/Lisp are impure languages, meaning, there’s a notion of altering a variable. While we can’t do that with Nix, you should now have some workarounds in mind.

Note: While the SICP sticks to a very simple example, it’s actually possible to build very refined object systems in Scheme/Lisp by leveraging their macro systems, hence allowing for instance to express inheritance, class/instance variables, etc. This is again a feature we don’t have access to with Nix, at least directly.

Exercise: Try to implement such a small “mover” object. It should have two instance variables: a string and a pointer to that string (an integer), and the following methods:

  1. prev: move one byte backward;
  2. next: move one byte forward;
  3. get: retrieve current byte;
  4. setp q: set the pointer to the (integer value) q;
  5. tail: returns the string located after the current position of the pointer.

Finally, write a chain o xs method allowing to iteratively call the methods described in the list xs, using an object o as the first target.

Tip:[+]

Tip:[+]

Solution:[+]
Teutoburg forest (Тевтобургский лес, 1865), oil on canvas

Teutoburg forest (Тевтобургский лес, 1865), oil on canvas by Ivan Ivanovich Shishkin (Ива́н Ива́нович Ши́шкин, 1832-1898) through wikiart.orgPublic domain

Conclusion

We’re now ready to start writing more interesting pieces of code. In the following article, we’re going to study a few simple parsers/interpreters (Brainf*ck and two mathematical expressions interpreters).

We’ll then conclude this series with one last article, focusing on writing a λ-calculus interpreter.


In the series:


Comments

By email, at mathieu.bivert chez:

email