On Libc, Rust experiments, and Risc-V

2020-11-15

LibC

Recently I have continued the exercise of re-implementing certain base system utilities in C, uncovering more and more differences between BSD and GNU libc implementations. At this point it's really easy to say that I have a strong preference for the BSD extensions to the libc standard as opposed to the GNU extensions. The Berkeley programmers, when they added features, added truly useful features.

Take random number generation as an example. On Linux it is fairly common to open and read from /dev/urandom. BSD, meanwhile, offers a family of functions in libc (arc4random) that combine pseudo-random number generation with cryptography to produce pseudo-random data that is truly difficult to guess, with low CPU overhead.
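As a rough sketch of the difference, the helper below fills a buffer with random bytes both ways. The name fill_random is hypothetical, and the list of platform macros is only an assumption about where arc4random_buf can be expected in libc; adjust it for your own targets.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/* Fill buf with len random bytes. Returns 0 on success, -1 on failure.
 * fill_random is a hypothetical helper name, not part of any libc. */
int fill_random(void *buf, size_t len)
{
#if defined(__OpenBSD__) || defined(__FreeBSD__) || \
    defined(__NetBSD__) || defined(__APPLE__)
    /* BSD libc: a single call, no file descriptor, no error path. */
    arc4random_buf(buf, len);
    return 0;
#else
    /* The common Linux idiom: open the device and read until full. */
    unsigned char *p = buf;
    int fd = open("/dev/urandom", O_RDONLY);

    if (fd < 0)
        return -1;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n < 0 && errno == EINTR)
            continue;           /* interrupted by a signal, try again */
        if (n <= 0) {
            close(fd);
            return -1;          /* read error or unexpected end of file */
        }
        p += n;
        len -= (size_t)n;
    }
    close(fd);
    return 0;
#endif
}
```

A caller would simply do something like "unsigned char key[32]; fill_random(key, sizeof key);" and check the return value. Note how much of the Linux branch is bookkeeping that the BSD call makes unnecessary.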

Another small place where I found the BSD libc superior was in writing my own version of the mktemp utility. The BSD versions of the mkstemp and mkdtemp functions accept a template with a variable-length suffix, so the caller can choose anywhere between quick (a shorter suffix) and as secure as desired (a longer one). Each added character multiplies the number of possible filenames, so the name space grows exponentially with suffix length, making temp files a more difficult vector of attack.

The GNU (and, actually, musl) versions of these functions only allow a six-character random suffix, period. In fact, my earlier port of mktemp from BSD did not function as expected in this regard, and the GNU coreutils programmers had to roll their own functions to match their BSD counterparts. I have elected not to abstract this deficiency away in my code, and the HHL version of the mktemp utility uses the available versions of mkstemp and mkdtemp, with a six-character suffix. I would rather maintain the simplicity of the program than risk introducing bugs by writing my own functions here.
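For illustration only, here is a minimal sketch of the portable six-character form. This is not the HHL mktemp source; the helper name make_temp and the /tmp/hhl. prefix are made up for the example.

```c
#define _POSIX_C_SOURCE 200809L

#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Create a temp file from a template ending in exactly six X characters,
 * the form that works across GNU, musl and BSD libc alike.
 * On success the chosen name is left in path and the open descriptor is
 * returned; on failure the result is -1.
 * make_temp is a hypothetical helper name. */
int make_temp(char *path, size_t size)
{
    static const char tmpl[] = "/tmp/hhl.XXXXXX";

    if (size < sizeof tmpl)
        return -1;              /* caller's buffer is too small */
    memcpy(path, tmpl, sizeof tmpl);
    return mkstemp(path);       /* rewrites the X's in place */
}
```

The caller owns both the descriptor and the file, so it should close() the former and unlink() the latter when finished.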

In any event, the work that I have been doing here makes a strong case for a future port of musl, with the additions of some BSD functions ported to work with musl, as our base C library. As attractive as BSD libc is, a direct port is unrealistic as so much of it is tied to kernel interfaces that differ on Linux. I just find it a shame that Linux users are mostly saddled with such a piece of crap (bloated while still being less functional) in such an important position in the software stack.

Rust

There are a lot of programming languages, and they tend to come and go regularly. As programming languages go, it's a safe bet that C, C++, Java and Python will be sticking around. Of the less pervasive languages, Go and Rust seem to be well ahead of the pack in terms of mass adoption. Go is interesting, largely because of what it leaves out. Go is simple. Go avoids the feature creep that seems to be inevitable in all modern software. However, Go achieves memory safety by using garbage collection. I want to like Go, but that one anti-feature prevents me from even trying it.

Rust, on the other hand, has feature upon feature, to the point where I really do hope that they finish the language someday. That said, I'm intrigued by Rust for a number of reasons. I have been toying with learning the language for a while, doing some simple learning exercises and reading the Rust Book. I recently decided to just dive in and start writing my own programs using Rust, as that seems to be the only way to truly learn a language. My first full Rust program is a clone of a C program that I wrote for generating sine lookup tables, useful in DSP applications. The C version is perfectly fine, but I basically wanted to try porting a known working program and see how well the control flow translated.

The process was not completely smooth, and it took quite a few iterations to get to the point where the program compiled and actually functioned. However, the documentation is excellent and provided solutions to most of my issues. The other thing that really impressed me was the compiler messages. Often, the compiler will tell you literally the exact change needed to make your program compile. Coming from C, this is a breath of fresh air. And when we're talking about Rust, chances are that if it compiles it's also going to function exactly as expected.

Rust is also elegant in a way that C never will be. Consider the following C code:

for (i = 0; i < 10; i++) {
  // some code
}

And the equivalent Rust:

for i in 0..10 {
  // some code
}

One thing that requires a different mindset when moving from an old-school language like C to a modern language like Rust is that in C et al, the standard library tries to be all-inclusive. Programmers (at least the good ones) try to avoid adding dependencies on anything other than the standard library. Rust, however, has Cargo. Cargo is, in effect, a built-in package manager. This is not unheard of; Python has had pip for years. In Rust, however, the standard library is quite small and pretty much everyone freely uses Cargo to access functionality that is not included in libstd. In my case I used Cargo to pull in the excellent clap crate, which parses command line arguments and also provides built-in documentation. Compared to getopt in C, clap both does more and does it with little effort.

At this point in time, however, Rust is not going to be landing in the HitchHiker base system, as the only viable Rust compiler uses llvm as a backend. At some point in the future this may well change anyway, as I have been growing increasingly frustrated with both Glibc and gcc (a brief skim of some earlier blog posts will quickly turn up examples). The obvious replacements are musl and the clang compiler, which, being based on llvm, would make the further inclusion of rustc trivial.

Risc-V

I do a fair amount of electronics experimenting and often employ various microcontrollers in my projects. Some are ATmega-based, while I have more recently fallen in love with the ATSAMD51 Arm Cortex-M4 based boards from Adafruit. Just this week I acquired something new to me: a Risc-V based MCU from Seeed Studio, the Sipeed Longan Nano. It's a brilliant chip, although quite lacking in documentation and example code. I'll be playing with it a lot.

The Nano, however, is a microcontroller, not a full SoC capable of running Linux. Right now there are few options for developers wanting to experiment with Risc-V under a full operating system, and the best ones are all prohibitively expensive for their capability. Due out next year, however, is the PicoRio board from Rios labs, which is described as close in form factor and capabilities to most Arm boards based around the RPI format. Provided this board is not vaporware and actually does materialize at a reasonable cost, I intend for HitchHiker Linux to be running on it as soon as feasible. I intend to make HitchHiker one of the very first options independent of distros such as Debian and Fedora, and to make Risc-V a first class citizen. It is my hope that in doing so HitchHiker might help to push along the adoption of truly open hardware on which to run open software.


Tags for this post:

C programming Rust NonGNU RiscV Glibc Utilities