Continued porting of BSD userland while remaining self-hosting

2020-09-03

At this juncture we're well past the halfway point in providing a predominantly BSD userland in HitchHiker, with over 90 console utilities ported against 59 remaining programs brought in from GNU coreutils. The ported count includes, of course, some programs that are not part of GNU coreutils. Some of the utilities replace standalone GNU packages, such as grep and diff, while others are utilities that may not be present on most Linux distributions, such as pax or leave. In general, the BSD utilities have compatible functionality, as both GNU and BSD have mostly embraced the POSIX standard, with a few caveats where either side has extended the standard. The biggest differences reflect a difference in philosophy: the BSD utils are quite small and include only what they must, while the GNU versions are all (yes all) larger and accept more options. The GNU versions in particular accept long options and in general have better internationalization. They also accept the --help option, which gives a brief summary of their function. BSD utilities deliberately omit this, as each command comes with a man page and a --help summary would only duplicate it.

One of the dangers in blindly replacing the usual GNU userland on Linux is that when the accepted options differ slightly, some scripts may not run as expected. This can also affect building packages from source if care is not taken to stick to POSIX-specified options when writing Makefiles. To some extent I have been working to minimize the effects here by taking the time to verify that each utility works as expected, and in some cases by implementing options that most users expect, such as the -v switch (verbose) for the ln and rmdir utilities. In other cases, such as the much abused -a (--archive) option of GNU cp, implementing equivalent functionality in the BSD port is definitely non-trivial. Many people who use 'cp -a' are just trying to copy directories recursively while preserving symlinks, and do not actually want to preserve ownership or modification times. Those cases can be served by using BSD cp as 'cp -R'. If preserving attributes is the desired behavior, then this is one of the tradeoffs being made, and extra steps will have to be taken with chown and touch.
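As a rough sketch of what those extra steps might look like, the function below copies a tree with 'cp -R' and then restores modification times with 'touch -r', using only options specified by POSIX. The name copy_tree is hypothetical and not part of HitchHiker. It assumes the destination does not exist yet and that no file names contain newlines. Ownership is left alone here, since changing it with chown generally requires root.

```sh
#!/bin/sh
# copy_tree SRC DST (hypothetical helper, for illustration only)
# Recursively copy SRC to a not-yet-existing DST, keeping symlinks as
# symlinks, then restore modification times with POSIX-only options.
copy_tree() {
    src=$1
    dst=$2

    # -R copies recursively; -P asks cp not to follow symlinks in the tree.
    cp -R -P "$src" "$dst" || return 1

    # Walk the source tree and copy each timestamp across. Symlinks are
    # skipped because touch would follow them and modify their targets.
    (cd "$src" && find . ! -type l -print) | while IFS= read -r path; do
        touch -r "$src/$path" "$dst/$path"
    done
}
```

A call such as 'copy_tree pkg-1.0 pkg-1.0.orig' would then stand in for the 'cp -a' that a script written against GNU cp might have used.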

To ensure that HitchHiker always remains self-hosting, I have been bootstrapping the system from itself multiple times during the porting effort. A few issues have been found and fixed along the way, but overall it has been a smooth transition. In one case, I had actually misused 'cp -a' myself in the busybox build (busybox is only built as part of the temporary toolchain).

One particularly special case is glibc, which has a hard dependency on GNU awk at build time. As I have a preference for the historical (yet still maintained) Nawk, this requires building a statically compiled Gawk early in the bootstrap process as part of the temporary tools in /toolchain. That Gawk is discarded along with the rest of the temporary tools after the bootstrap phase and does not become part of the base system. This is another example of the "Sheer F%$king Hubris" of much GNU software, which often will only build with other GNU tools, even though other tools exist with equivalent (and sometimes superior) functionality.
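A bootstrap script could guard against picking up the wrong awk before the glibc build starts. The sketch below is only an illustration: is_gawk is a hypothetical helper, and the /toolchain/bin/gawk path in the commented usage line is an assumption about where the temporary Gawk would be installed. It relies on GNU awk printing "GNU Awk" in its --version output.

```sh
#!/bin/sh
# is_gawk CMD (hypothetical helper, for illustration only)
# Succeed only if CMD identifies itself as GNU awk in its --version output.
is_gawk() {
    "$1" --version 2>/dev/null </dev/null | grep -q 'GNU Awk'
}

# Possible guard before building glibc; the path is an assumption:
# is_gawk /toolchain/bin/gawk || { echo "glibc needs GNU awk" >&2; exit 1; }
```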


Tags for this post:

Porting Roadmap NonGNU