Restarting make, buildworld cleanup, default software

2020-08-03

One of the deficiencies in the HitchHiker build tree up to this point has been dealing with packages that require some manual intervention after the "make install" step. Originally there was no unified infrastructure for this scenario. Very quickly during testing I realized that this omission was going to cause problems, as the scenario is much more common than I at first thought, and implementing solutions on a per-package basis is both error prone and likely to introduce bugs into the build process. Up until now the workarounds have involved redefining the .DEFAULT_GOAL variable and creating custom targets to finish the installation, while making the basic installation a dependency of said targets.

A common problem that arose out of this hodgepodge approach showed up whenever a build was stopped and then resumed. If the new targets were ".PHONY" targets (targets that are not actual files) or symlinks, the resumed build would either error out, because the timestamps of dependencies were the same as or newer than their targets, or fail to run a target because it involved moving a file that had already been moved the first time it ran. This made testing new features a laborious process, as any error often required removing the entire rootfs and restarting buildworld from the beginning.

When I implemented hhl.cprog.mk, I took steps to remedy this situation. If one defines the variable "finish" then the system will know that there are steps to be taken after "make install". Then any additional steps can be added to the new target {objdir}/.finished, which is the second dependency of the "finish" target after "install". This worked well in practice, so I have implemented the same functionality in targets.mk which is used by all "external" source packages (those packages for which the source is not included in the HitchHiker build tree but downloaded as tarballs during the build).
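To make the idea concrete, here is a minimal sketch of the pattern in GNU Make. It is not the actual contents of hhl.cprog.mk or targets.mk. The default value of objdir, the placeholder "install" rule and the recipe bodies are assumptions for illustration only.

```make
# Sketch only, not the real targets.mk.
# objdir's default and the recipe bodies below are assumptions.
objdir ?= obj

# Placeholder for the normal installation step.
.PHONY: install
install:
	@echo "make install would run here"

# Only wire up the extra step when the package Makefile defines "finish".
ifdef finish
.DEFAULT_GOAL := finish
.PHONY: finish
# In a serial build, prerequisites are processed left to right:
# first "install", then the stamp file.
finish: install $(objdir)/.finished
endif

# A real file records that the post-install steps have already run, so a
# restarted build skips them instead of repeating them and failing.
$(objdir)/.finished:
	mkdir -p $(objdir)
	@echo "post-install steps would run here"
	touch $@
```

Because $(objdir)/.finished is an ordinary file rather than a phony target, make can tell on a restart that the work is already done.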

As a result of this, after considerable tweaking, the build can now be restarted from any point after stopping and will run to completion. I consider this to be an important milestone, as it makes future development much less arduous.

An important goal of HitchHiker is that the base system includes everything needed to bootstrap a full system. As such, it is unfortunately required to include certain packages that will not necessarily be required under all circumstances. A good example of this is the networking stack, as it is next to impossible to bootstrap a full system without a network connection, but we don't know if that connection is going to be wired, wireless, secured wireless, static or dynamic, etc. The first test builds of the complete system ignored this completely and only included what is needed to manually set up a static, wired connection.

This situation is also now remedied, as the officially distributed builds will include the dhcpcd, wireless-tools and wpa_supplicant packages. However, it is entirely possible to exclude any or all of these packages by setting appropriate variables in the file config.mk in the top level of the source tree.

There are now a number of things that can be tweaked when building the system from source. One can specify a list of locales to compile for glibc rather than the default of compiling all supported locales. By setting the variable rpi to true, you get the Raspberry Pi Foundation kernel instead of mainline Linux. A custom kernel configuration file can also be substituted for the default HitchHiker one. None of these tweaks should change the essence of the system being HitchHiker; they just allow the user some extra choice and the ability to slim the system down a fair bit, as well as speeding up the build somewhat.
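As a rough illustration, a config.mk using these knobs might look something like the fragment below. Only the rpi variable is named in this post. Every other variable name here is made up for the example and is left commented out, so check the real config.mk for the actual names.

```make
# config.mk (illustrative sketch)

# Named in the post: build the Raspberry Pi Foundation kernel.
rpi = true

# Everything below uses HYPOTHETICAL variable names.
# Limit the glibc locales instead of building all of them:
# locales = en_US.UTF-8 de_DE.UTF-8
# Substitute a custom kernel configuration:
# kernel_config = /path/to/my-kernel.config
# Leave optional networking packages out of the base system:
# no_dhcpcd = true
# no_wireless_tools = true
# no_wpa_supplicant = true
```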

Lastly, I want to talk a little bit about the roadmap for HitchHiker. Regarding the build tree itself, at the moment we have the file targets.mk, which includes most of the logic to build all of the packages that are downloaded and compiled as part of the base system. This file's internal logic defaults to using GNU autotools, but is flexible enough to build packages in other ways by resetting the appropriate variables. This approach does work, but requires some deep knowledge of how it is all put together. As I would eventually love to see my work find a larger audience, I would like to make the infrastructure a little easier to understand, so that, for example, when we graduate from pkgsrc to our own ports tree there could be some community involvement in fleshing it out.

To that end, a future goal is to move the autotools logic out into an "autotools.mk" file and to create, for instance, "cmake.mk" or "waf.mk", which will know how to build packages using those build systems. At that point one could set the build system to "waf" via a variable assignment and then include hhl.ports.mk, which would then pull in "waf.mk". Doing this cleanly is an important step on the way to creating a ports tree that is easy to debug, maintain and update.
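A sketch of how that dispatch could work is shown below. None of this exists yet. The variable name build_system is an assumption, and the two fragments would live in separate files.

```make
# --- A port's Makefile (hypothetical) ---
# Pick the build system, then hand over to the shared framework.
build_system = waf
include hhl.ports.mk

# --- Inside hhl.ports.mk (hypothetical) ---
# Fall back to autotools when the port does not choose, then pull in the
# file that knows how to drive that build system (waf.mk, cmake.mk, ...).
build_system ?= autotools
include $(build_system).mk
```

The appeal of this layout is that a port author only needs to read one small file per build system rather than all of targets.mk.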

This theoretical "hhl.ports.mk" will also need logic for tracking everything installed by each and every package, uninstalling packages and upgrading to newer versions. In short, it will need much of the logic of a package manager. I have actually implemented most of this previously in the original HitchHiker system of ten years ago. At that point installed files were tracked by creating a timestamp file before "make install" and then using the find utility to catch any files with a newer timestamp. This had the attractive benefit of being completely agnostic to the build system, but is not without its drawbacks. Chief among them is that it is a rather time-consuming approach, but there is also the danger that we will miss a file if it is created with the wrong timestamp, or capture an unrelated file that was created by another running process. For those reasons I'm planning to implement file tracking by installing into a DESTDIR instead. The problems here are that one needs to be familiar with multiple build systems to know how to achieve this, and that the concept of a DESTDIR is not universally implemented in all pieces of software out there. For the former, the solution is good old fashioned grunt work to learn the ins and outs of each build system as it is encountered. For the latter, the approach is going to be patching the build system and submitting the patches upstream, with the hope that the authors will be receptive.
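For a package whose build system honors DESTDIR, the staging approach could be sketched roughly as follows. The names srcdir, stagedir, plist and the "stage" target are assumptions, not part of the current build tree.

```make
# Sketch: install into a scratch directory, then record what landed there.
# All variable and target names here are assumptions.
srcdir   ?= src
stagedir := $(CURDIR)/stage
plist    := $(CURDIR)/files.list

.PHONY: stage
stage:
	rm -rf $(stagedir)
	$(MAKE) -C $(srcdir) DESTDIR=$(stagedir) install
	# List everything except directories, as paths relative to /.
	cd $(stagedir) && find . ! -type d | sed 's|^\.||' | sort > $(plist)
```

Since only this one package ever writes into the staging directory, the resulting list cannot pick up files created by unrelated processes, and it does not depend on file timestamps at all.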

I'm going to interject here to point out that pkgsrc uses an entirely different approach: every package directory must include a PLIST file, which places the burden squarely on the shoulders of the maintainer to populate this file with every file the package will install on your system. FreeBSD also uses this approach. Functionality could be added to "hhl.ports.mk" to use a PLIST file if it exists, skipping any automatic file discovery.
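That fallback is simple to express in GNU Make. A minimal sketch, in which objdir, the generated file name and the "show-plist" target are assumed names:

```make
# Sketch: prefer a maintainer-supplied PLIST, otherwise use a generated list.
objdir ?= obj

# $(wildcard PLIST) expands to "PLIST" if the file exists, else to nothing.
ifneq ($(wildcard PLIST),)
plist := PLIST
else
plist := $(objdir)/files.list
endif

.PHONY: show-plist
show-plist:
	@echo "file list comes from: $(plist)"
```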

Then there is another elephant in the room: dependency tracking. The previous incarnation of HitchHiker kept a flat file database of every file that every package in the ports tree might install, ran "file" on every file in the new package to determine its file type, then ran ldd on every ELF file and compared the results to the database. It was quite effective at catching any binary dependencies. It was also slower than shit on large packages and extremely overkill much of the time. Knowing a bit better now, I have no intention of going down that rabbit hole again. Instead, dependencies are just going to be manually specified by the developer in each port's Makefile. While this is somewhat error prone, it can be refined over time to be near perfect and has the advantage of speed. In addition, not all dependencies are library dependencies. Some dependencies are of the nature of expecting another program to be available to exec() at runtime, and some dependencies are simply build time requirements.
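In a port's Makefile, manually declared dependencies might look something like this. The variable names, the example package names and the one-directory-per-port layout are all assumptions made for the sketch.

```make
# Port Makefile fragment (hypothetical names throughout).
# The three kinds of dependency are kept apart on purpose.
build_deps = cmake pkgconf
lib_deps   = zlib openssl
run_deps   = bash

# Assumes every port lives in its own directory under ports_root.
ports_root ?= ..

.PHONY: deps
deps:
	for d in $(build_deps) $(lib_deps) $(run_deps); do \
		$(MAKE) -C $(ports_root)/$$d install || exit 1; \
	done
```

Keeping build, library and runtime dependencies in separate variables would let a binary install skip the build-only ones.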

Now here's the point where I'm going to get controversial. BSD ports implement all of this using just make and for years had nothing like the traditional Linux package manager, instead relying on simple command line utilities to manipulate packages. I -don't- want a package manager in HitchHiker. I don't want to reinvent the wheel and write a new one, and none of the existing ones are going to be suitable. I think they're generally just complicated, poorly understood by the average user, and since there are already so many implementations it's virtually impossible at this time for one of them to become "standard", thus basically making the idea of the Linux desktop an unachievable dream. No company is going to bother supporting dpkg, rpm, Slackware tarballs, Arch tarballs, Flatpak and Ubuntu snaps. It's not happening. At most they're going to release an rpm, a deb or a snap and just ignore everybody else. The ideas of the commercial world embracing desktop Linux, the year of desktop Linux, etc. are all basically pipe dreams.

Here's what I picture instead. If we can turn source archives into binary packages and then install those packages, all using make, we can frankly do the same with pre-built binary packages. The BSD ports trees and pkgsrc proved years ago that this works. In fact, in FreeBSD you can install binary packages for the dependencies of the port that you are building rather than building them from source, or build ports that don't have a binary package available when installing binary packages. This system functioned just fine for a long time. Their mistake a few years ago was jumping on the Linux bandwagon and writing yet another package manager, pkg. Since that time I've noticed quite a lot more issues keeping FreeBSD packages up to date, to the point where I even abandoned an installation at one point. So we're skipping it entirely, and instead I'm going to provide a "binary-install" target in "hhl.ports.mk" that will fetch and install binary packages. Easy peasy, no package manager required, unless you consider the ports tree itself to be a package manager.
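As a very rough sketch of what such a target could look like: nothing below is implemented, and the package name, the URL, the archive format and every variable name are placeholders. A real version would also need to verify checksums and record the installed files.

```make
# Sketch of a "binary-install" target. All names and the URL are placeholders.
name      = hello
version   = 1.0
pkg_site ?= https://example.org/packages
pkg_file  = $(name)-$(version).tar.xz

.PHONY: binary-install
binary-install: $(pkg_file)
	# Unpack relative to the root, or under DESTDIR when one is given.
	tar -C $(DESTDIR)/ -xJf $(pkg_file)

# Fetch the pre-built archive only if it is not already present.
$(pkg_file):
	curl -fLO $(pkg_site)/$(pkg_file)
```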


Tags for this post:

GNU Make, Milestones, Roadmap