We’ve previously explored
how to run Nix code, and basic Nix features: variables, conditions,
“print/debug” “statement”, etc. But, we still haven’t saw anything
analogue to loops, providing us with iteration. This, will be
our subject for today, and will allow us to start writing something
more substantial than a hello world.
I assume little to no experience in functional programming from
the reader; I however assume familiarity with programming.
Note: I won’t bother with performances in general
in this series: the focus is on correctness (well, hopefully…),
and on developing an understanding of functional programming. But
why can’t I still care about performances, and why do people often
treat them as second-class citizens?
I think there are two main reasons. In tutorials, focusing on
performances adds noise, thus making the transmission of core concepts
more difficult. Furthermore, in real life, they are to be
treatedcautiously:
optimizations have a cost: they can make the code more subtle, more
complex, more difficult to write and maintain;
intuition can be wrong: you may think some function is slow
and spend considerable time fixing it, but proper instrumentation/measurements
may reveal unexpected bottlenecks.
Systematically improving performances locally, without proper instrumentation,
is a great way to degrade the quality of a codebase.
Parenthesis aside, the code for this series is available on
github.
Functional programming favors the use of recursion
over iteration, itself favored by imperative languages. Both
approaches are theoretically equivalent,
which means, all that can be expressed through recursion, can also
be expressed iteratively, and reciprocally.
For most people, recursion is less intuitive than iteration;
it seems that iteration is more natural, perhaps because it’s closer
to everyday perception. There are exceptions though: some people are
strangely fit for recursion.
We’ve saw before how to define basic functions:
#!/bin/nix-instantiatelet hw = w: "hello "+w;
in hw "reader"
"hello reader"
Defining and using recursive functions requires no more syntax than
what precedes (e.g. no need for special keywords). Hopefully, you’re
familiar with the factorial function:
#!/usr/bin/env -S nix-instantiate --evallet fact = n: if n ==0then1else n*fact(n - 1);
in fact 5
#!/usr/bin/env -S nix-instantiate --evallet fact = n: if n ==0then1else n*fact(n - 1);
in fact (fact 5)
0
In most programming languages, default integer types have a fixed
size. Consider:
#!/usr/bin/env -S nix-instantiate --evallet f = n: m:
if n *2==0|| n *2<=0then m
else f (n *2) (m+1)
;
in f 11
63
So it seems that Nix's integers seem, at least on a x86_64 Linux
machine, are stored on 64 bits (1 bit is reserved for the sign; there's
no distinction between signed & unsigned integers in Nix, they are all
signed).
A common way to deal with arbitrary large integers, is
arbitrary precision arithmetic,
colloquially referred to as "bignum".
Exercise: Implement a pair of mutually recursive functions
odd and even, both taking an integer as input, and returning
a boolean such that:
odd returns true when the number is odd, false
otherwise;
and well, even returns true when the number is even,
false otherwise.
#!/usr/bin/env -S nix-instantiate --evalwith builtins;
let odd = n:
if n <0then odd (- n)
elseif n ==0thenfalseelse even (n - 1)
;
even = n:
if n <0then even (- n)
elseif n ==0thentrueelse odd (n - 1)
;
# toString by default for boolean make an empty# string for false and 1 for true. Interestingly,# we can override the existing definition seamlessly. toString = b: if b then"true"else"false";
a = even 10;
b = even 5;
c = odd 10;
d = odd 5;
e = odd (-5);
in"(even 10)=${toString a}; (even 5)=${toString b} (odd 10)=${toString c}"+"; (odd 5)=${toString d} (odd -5)=${toString e}"
The Ackermann function is a specific kind of recursive
function, which can be very slow
early on to compute. And which returns
gigantic numbers.
You should observe, both for (ack 4 1) and (ack 5 5)
an extremely slow computation process.
In the previous example, the expressions (ack 4 1) and (ack 5 5)
weren't evaluated, thanks to laziness. But once we want to print
them, they must first be evaluated. Which triggers the issue.
After a fair bit of trial and error I’ve discovered that people who
struggle to code don’t just struggle on big problems, or even smallish
problems (i.e. write an implementation of a linked list). They struggle
with tiny problems.
So I set out to develop questions that can identify this kind of
developer and came up with a class of questions I call “FizzBuzz
Questions” named after a game children often play (or are made to
play) in schools in the UK. An example of a Fizz-Buzz question is
the following:
Write a program that prints the numbers from 1 to 100. But for multiples
of three print “Fizz” instead of the number and for the multiples of five
print “Buzz”. For numbers which are multiples of both three and five print
“FizzBuzz”.
Most good programmers should be able to write out on paper a program which
does this in under a couple of minutes. Want to know something scary?
The majority of comp sci graduates can’t. I’ve also seen self-proclaimed
senior programmers take more than 10-15 minutes to write a solution.
Note the use of an auxiliary function; I'll revisit this
pattern later on.
#!/usr/bin/env -S nix-instantiate --evalwith builtins;
let mod = a: b: a - (b * (div a b));
fizzbuzz = n: let aux = i:
if i > n then"end"else trace(
if (mod i 3) ==0&& (mod i 5) ==0then"FizzBuzz"elseif (mod i 3) ==0then"Fizz"elseif (mod i 5) ==0then"Buzz"else i
) aux (i +1); in aux 1 ;
in fizzbuzz 100
Consider again the Ackermann–Péter function.
What do you think (ack 0) is? Well, let’s try it out. Let’s
try to “print” it using toString:
#!/usr/bin/env -S nix-instantiate --evalwith builtins;
let ack = m: n:
if m ==0then n +1elseif n ==0then ack (m - 1) 1else ack (m - 1) (ack m (n - 1))
;
a = (ack 01);
b = (ack 0);
c = (b 1);
in"(ack 0)=${toString b}; ((ack 0) 1)=${toString c};"+"(ack 0 1)=${toString a}"
error:
… while calling the 'toString' builtin
at ./nix/rec/ack-curry.nix:16:13:
15| in
16| "(ack 0)=${toString b}; ((ack 0) 1)=${toString c};" | ^
17| +"(ack 0 1)=${toString a}" … while evaluating the first argument passed to builtins.toString
error: cannot coerce a function to a string: «lambda ack @ ./nix/rec/ack-curry.nix:4:11»
Well, it failed. But it nevertheless answered our question: (ack 0)
is a function.
Exercise: There’s a typeOf e function in the
builtins package, which returns a string corresponding of the expression
e. Try to use it instead of relying on an error message to check that
(ack 0) indeed is a function.
You can use either toString or make a list and use
deepSeq to view the result. Remember that we've used
the latter in an exercise of the previous article of this series.
But which function is (ack 0) exactly? Well, it’s the function
which takes an integer n as argument, and returns
ack 0 n. This technique is commonly referred to as
partial application: the function ack requires
two arguments to be completely applied, but only one is provided, so
the function is said to have been partially applied.
The process is often called “currying”: while both
terms are often used interchangeably, strictly speaking they aren’t equivalent
(you may want to refer to the related Wikipedia page for more).
We’ll soon see more example of partial application/currying, in this
article and in this series. As a curiosity, I’ll leave you with the following
archetypal example: you don’t need understand it to proceed further:
#!/usr/bin/env -S nix-instantiate --evalwith builtins;
let apply = f: l: foldl' (x: y: x y) f l;
concat = x: y: z: (x + y + z);
in apply concat ["hello"" ""world"]
"hello world"
Note:builtins.foldl' is an, efficient, strict implementation of
foldl on Nix’s lists. I’ll progressively explain that previous sentence
in detail in this series, starting with foldl.
fold: a first encounter
In functional programming, there are a few famous functions, and fold
(you may also find it sometimes called reduce)
must be one of the most prominent first class citizen
(pun intended). It’s generally used
on lists, and we’ll soon study it in such a context in a
later article, but
we can find ways to make it work on some sequences of integers.
From now on, forget everything you know about Nix’s lists and lists
in general. Consider the following warm-up exercise:
Exercise: Write a program sum which takes an integer
as input, and returns the sum of integers from 1 to \(n\), or:
\[\begin{aligned}
(\text{sum}\ n) = \sum_{i=1}^n i = 1 + 2 + ... + n
\end{aligned}\]
Bonus: Can you find a way to write that program without using
recursion (or loops, or any form of iteration)?
Exercise: Now let’s try to factorize sum and fact
by introducing a new function, foldln f s n, where n is integer
marking the end of our integer sequence. Try to understand what f
and s represent, and implement foldln; use it to re-implement
sum and fact to check your results.
#!/usr/bin/env -S nix-instantiate --evalwith builtins;
let foldln = f: s: n: let aux = acc: i:
if i > n then acc
else aux (f acc i) (i +1)
; in aux s 1;
sum = n: foldln add 0 n;
fact = foldln mul 1;
xs = [ (sum 10) (fact 5) ];
in deepSeq xs xs
[55120]
There are a few noteworthy elements in that last exercise:
partial application: note that fact is built as a
partial application of foldln, by comparison with sum.
Obviously, this is arbitrary, and just to highlight how
partial application can occur in practice.
Contrary to what I’ve been doing so far (except for the FizzBuzz exercise)
I’ve used an auxiliary function, defined in the scope on the main function,
which progressively fills an accumulator at every step of the iteration.
This is a common pattern in functional programming. Note that we could have
done without it. There are goodreasons for
using it though, but we won’t focus on them: I just think
it’s useful to get familiar with the pattern. For the curious, you may
find details in the following video:
Exercise: Write a program sumsquare n (via foldln, of course!)
which takes an integer n as input, and returns the sum of the squares of integers
from 1 to n:
#!/usr/bin/env -S nix-instantiate --evalwith builtins;
let foldln = g: s: n: let aux = acc: i:
if i > n then acc
else aux (g acc i) (i +1)
; in aux s 1;
foldrn = g: s: n: let aux = acc: i:
if i > n then acc
else g i (aux acc (i +1))
; in aux s 1;
# (sub (sub (sub (sub (sub 0 1) 2) 3) 4) 5)# = -1-2-3-4-5# (sub 1 (sub 2 (sub 3 (sub 4 (sub 5 0)))))# = 1-(2-(3-(4-5)))# = 1-(2-(3-4+5))# = 1-(2-3+4-5)# = 1-2+3-4+5 xs = [
(foldln sub 05)
(foldrn sub 05)
];
in deepSeq xs xs
[ -15 3]
Exercise: Write the following programs:
sumfib n which sums the first n terms of the fibonacci
sequence;
sumfact n, which sums the factorials of the integers from 1 to n;
sumr m n which sums the integers between m and n, included;
sumeven n and sumodd n, which respectively sums even and odd numbers
between 1 and n.
Identify generic patterns underlying all those functions, that is, implement
them using a handful of generic functions.
foldlnseq x f s n, where seq stands for sequence;
x is a function taking an integer and returning an integer,
thereby generating the aforementioned sequence, starting from \([1\ 2\ ...\ n]\);
I'm being voluntarily scare on details, in the tips as in the answer,
so as to give you room to work things out. We're going to step up
in difficulty with the next exercise and with the next article, so
let's make sure you're prepared.
#!/usr/bin/env -S nix-instantiate --evalwith builtins;
let foldlnseq = x: f: s: n: let aux = acc: i:
if i > n then acc
else aux (f acc (x i)) (i +1)
; in aux s 1;
foldlnp = p: f:
foldlnseq
(i: if p(i) then i elsenull)
(x: y: if y ==nullthen x else f x y);
# foldln is now a special case of foldlnseq foldln = foldlnseq (x: x);
# foldln sample usage sum = foldln add 0;
fact = foldln mul 1;
# Importing fib fib = n:
if n ==0|| n ==1then n
else fib(n - 1)+fib(n - 2)
;
# Nix idiosyncracies toFloat = n: n +0.1 - 0.1;
isDiv = i: j: ceil(toFloat(i) / j) * j == i;
isEven = n: isDiv n 2;
isOdd = n: !(isEven n);
# Helpers (we could do without, but they make things clear) foldlnr = f: s: m: foldlnp (i: i >= m) f s;
foldlneven = foldlnp isEven;
foldlnodd = foldlnp isOdd;
# ------ answers ------ sumfib = foldlnseq fib add 0;
sumfact = foldlnseq fact add 0;
sumr = foldlnr add 0;
sumeven = foldlneven add 0;
sumodd = foldlnodd add 0;
xs = [
(sum 10)
(fact 5)
(sumr 510)
(sumr 14)
(sumeven 10)
(sumodd 10)
# https://www.wolframalpha.com/input?i2d=true&i=Sum%5Bi%21%2C%7Bi%2C1%2C5%7D%5D (sumfact 5)
# https://www.wolframalpha.com/input?i2d=true&i=Sum%5BF_i%2C%7Bi%2C1%2C10%7D%5D (sumfib 10)
];
in deepSeq xs xs
[5512045103025153143]
Note: Don’t hesitate to spend time toying with recursive functions,
foldln & cie, until they become second nature. Recursivity in functional programming
languages is as fundamental as while/for loops in imperative languages, so you want
to make sure to understand it adequately. fold encompasses a key pattern to
recursively go through structured data is also something definitely worth studying
extensively.
We’ll revisit all of this in later articles anyway.
Eratosthenes’ sieve
Finally, to conclude and progressively transition with the
following article on closures,
let me show you again the last example from the section on partial
application/currying. Now that we’ve covered fold, even though only
on some integer sequences, you may find it less cryptic:
#!/usr/bin/env -S nix-instantiate --evalwith builtins;
let apply = f: l: foldl' (x: y: x y) f l;
concat = x: y: z: (x + y + z);
in apply concat ["hello"" ""world"]
"hello world"
The key idea is as follow: our accumulator is now a function, and we’re
progressively building the final function by creating intermediate partial
applications, until we’ve exhausted the list of arguments.
Exercise:Eratosthenes’ sieve
computes prime numbers up to a certain fixed integer. Have a look at
the Wikipedia page, and use it to implement a function sieve n which
prints (trace) whether a given number is prime or not. Your implementation
must rely on foldln & friends. Do not use lists or sets.
The trick here is that to implement the sieve while iterating
on the numbers, we need some kind of ways to keep track of the
numbers we already saw, which would typically be implemented
with a map/hashmap/set.
But as we can't use sets here, the only way we have to track
earlier numbers is to use a function. More precisely, we
want, as it is done for apply, a foldln,
whose accumulator is a function.
Start with an accumulator set to (x: x != 1). Then,
once you find a prime number, update the accumulator accordingly,
otherwise, the accumulator need not be changed. Try to understand
how the code would behave on 2, 3, 4 and 5.
While the following does what it's supposed to do, the final
predicate is "wrong". Can you figure out why? This will be
a part of the following exercise.
#!/usr/bin/env -S nix-instantiate --evalwith builtins;
let foldlnseq = x: f: s: n: let aux = acc: i:
if i > n then acc
else aux (f acc (x i)) (i +1)
; in aux s 1;
# foldln is now a special case of foldlnseq foldln = foldlnseq (x: x);
toFloat = n: n +0.1 - 0.1;
isDiv = i: j: ceil(toFloat(i) / j) * j == i;
sieve = n: foldln (p: i:
if p i then trace("${toString i} is prime!")
(j : !(isDiv j i) && (p j))
else trace("${toString i} is not prime")
p
) (i: i !=1) n;
in sieve 50
trace: 1 is not prime
trace: 2 is prime!
trace: 3 is prime!
trace: 4 is not prime
trace: 5 is prime!
trace: 6 is not prime
trace: 7 is prime!
trace: 8 is not prime
trace: 9 is not prime
trace: 10 is not prime
trace: 11 is prime!
trace: 12 is not prime
trace: 13 is prime!
trace: 14 is not prime
trace: 15 is not prime
trace: 16 is not prime
trace: 17 is prime!
trace: 18 is not prime
trace: 19 is prime!
trace: 20 is not prime
trace: 21 is not prime
trace: 22 is not prime
trace: 23 is prime!
trace: 24 is not prime
trace: 25 is not prime
trace: 26 is not prime
trace: 27 is not prime
trace: 28 is not prime
trace: 29 is prime!
trace: 30 is not prime
trace: 31 is prime!
trace: 32 is not prime
trace: 33 is not prime
trace: 34 is not prime
trace: 35 is not prime
trace: 36 is not prime
trace: 37 is prime!
trace: 38 is not prime
trace: 39 is not prime
trace: 40 is not prime
trace: 41 is prime!
trace: 42 is not prime
trace: 43 is prime!
trace: 44 is not prime
trace: 45 is not prime
trace: 46 is not prime
trace: 47 is prime!
trace: 48 is not prime
trace: 49 is not prime
trace: 50 is not prime
<LAMBDA>
Note: In that last exercise, because of how sieve
is implemented, it’s not very hard for us to perform computations via
foldlnp such as computing the sum of the squares of all prime numbers
between 0 and 100.
Exercise: You may want to give it a quick shot; there’s a subtle
adjustment to perform for our previous version to work as expected.
#!/usr/bin/env -S nix-instantiate --evalwith builtins;
let foldlnseq = x: f: s: n: let aux = acc: i:
if i > n then acc
else aux (f acc (x i)) (i +1)
; in aux s 1;
foldlnp = p: f:
foldlnseq
(i: if p(i) then i elsenull)
(x: y: if y ==nullthen x else f x y);
foldln = foldlnseq (x: x);
toFloat = n: n +0.1 - 0.1;
isDiv = i: j: ceil(toFloat(i) / j) * j == i;
sieve = n: foldln (p: i:
if i !=1&& p i then (j : (j <= i ||!(isDiv j i)) && (p j))
else p
) (i: i !=1) n;
foldlprime = f: s: n: foldlnp (sieve n) f s n;
in# The value is OK for for 5 and 10; I guess it's OK# for 100 foldlprime (x: y: x + (y*y)) 0100
65796
It’s okay if you’re a bit befuddled by what’s happening here: we’re
going to spend the next article
studying this peculiar idea of closure in greater depth.
Comments
By email, at mathieu.bivert chez: