Let's chart a course to a usable system while usable systems can still be built, shall we?
Definitions first
By Republican, we do not simply mean that the item shall be used by the Republic. We mean that the item shall be constructed in such a way that it embodies, serves, and furthers Republican ideology in computer operating systems. As there can be no meaning absent a structure of authority, "Republican OS" is predicated on the Republic being the structure of authority within which said item is defined.[1] Thus relying on "upstream"[2] for anything is sin. Fallen creatures that we are, I do not expect us to eradicate sin in an instant. Yet inaction is not an option. Inaction is death, both of the Republic and ourselves. From our options we cleave both immediate perfection and tolerance of anything which does not serve us. What remains is a ratchet; it begins loose and ever tightens, squeezing out the unnecessary, the confounding, the redundant, and leaves us with something small enough to fit in our hand.[3]
By operating system, we mean a complete set[4] of self-hosting software capable of running on computers which can be obtained with reasonable effort. Self-hosting means that the system can be used to edit, build, and collaborate on next iterations of the software. It means that we turn the ratchet ourselves. Understand that the first build of this self-hosting system incurs the bootstrapping problem, and understand that we do not currently have a solution to this problem.[5]
Hardware selection
We must target particular hardware on which to build this system. This means either AMD or ARM64 machines, and to date it isn't clear which.[6] Whichever hardware we choose, should we be forced to rely on items we cannot ourselves build, alter, and maintain,[7] we proceed in sin, and leave debts to pay.
Compiler
To build software, we must choose a compiler capable of producing binaries for our target architecture. There has been much discussion already on the precise place to break. Absent deep research on the matter, one can only lean on empirical results, and by those, gcc 4.x seems to have been capable of meeting our needs to date. This is a weak recommendation, and given the hulking mass of the pile in question, it might be all we have the lifetimes to achieve. gcc-4.9.4 being the last bugfix release before the next major revision, it seems as good a place as any to pull the handbrake. It's worth pointing out that this version made the following change:
GNAT switched to Ada 2012 instead of Ada 2005 by default.
Let's weigh the thing before we move on.
gcc-4.9.4 cloc .
   76640 text files.
   75753 unique files.
Complex regular subexpression recursion limit (32766) exceeded at /usr/bin/cloc line 7327.
Complex regular subexpression recursion limit (32766) exceeded at /usr/bin/cloc line 7327.
Complex regular subexpression recursion limit (32766) exceeded at /usr/bin/cloc line 7327.
Complex regular subexpression recursion limit (32766) exceeded at /usr/bin/cloc line 7327.
    4671 files ignored.

github.com/AlDanial/cloc v 1.70  T=338.66 s (212.7 files/s, 30282.2 lines/s)
------------------------------------------------------------------
Language                       files     blank   comment      code
------------------------------------------------------------------
C                              23126    444405    462198   2326698
C/C++ Header                   10280    155822    158227    758024
Ada                             5052    264536    349229    756828
Java                            6346    169198    646043    682350
C++                            17407    137657    176833    612276
Bourne Shell                     152     81229     68484    440850
Markdown                         346     40753         0    339466
Go                              2170     37979     49221    298060
HTML                             330     33462      5663    148765
Fortran 90                      4071     17275     31443    101132
m4                               198      8644      2443     77132
Assembly                         544     13218     32088     58482
XML                               60      5961       563     44147
Windows Module Definition        122      4104        38     30404
make                             157      4088      1653     25751
Expect                           282      5596      9909     22452
Objective C                      512      4852      3066     16604
TeX                                2      1480      6149     11675
Fortran 77                       431      1077      3850     10659
Objective C++                    242      2388      1502      8020
Perl                              30       903      1349      4686
MSBuild script                     7         1         0      4675
Pascal                            13       790      3261      3465
Python                            10       804       736      2855
XSLT                              20       563       436      2805
awk                               18       374       589      2393
OCaml                              3       310       416      2279
Bourne Again Shell                15       415       654      1865
CSS                                9       332       143      1428
yacc                               2       107       119       977
C#                                 9       230       506       879
Tcl/Tk                             1        72       112       393
lex                                1        34        30       156
CMake                              1        27        31       153
NAnt script                        2        17         0       132
JavaScript                         2        20        81       122
Haskell                           36        15         0       112
Windows Resource File              2         3         2        67
SAS                                1        14        22        32
DTD                                3        28        70        26
Fortran 95                         2        10         8        21
Lisp                               1         4        12         8
MATLAB                             1         0         0         5
DOS Batch                          2         0         0         4
XHTML                              1         6        16         3
------------------------------------------------------------------
SUM:                           72022   1438803   2017195   6799316
------------------------------------------------------------------
The damned thing broke the tool I was using, so this may be an incomplete result! At any rate, hefty...
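Since cloc hit its regex recursion limit on this tree, a cruder independent count can serve as a sanity check on the totals. What follows is only a minimal sketch, not a replacement for cloc: it counts blank and non-blank lines per file extension and makes no attempt to tell comments from code. The function name `count_lines` and the grouping by extension are my own assumptions, not anything cloc does.

```python
import os
from collections import defaultdict


def count_lines(root):
    """Walk root and return {extension: (files, blank, nonblank)}.

    Deliberately crude: no regexes and no comment detection, so there
    is no recursion limit to exceed. Files are read as bytes so that
    odd encodings cannot abort the walk.
    """
    totals = defaultdict(lambda: [0, 0, 0])
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks, sockets, and anything else that is not a plain file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            ext = os.path.splitext(name)[1].lower() or "(none)"
            try:
                with open(path, "rb") as fh:
                    data = fh.read()
            except OSError:
                continue  # unreadable file: leave it out of the count
            entry = totals[ext]
            entry[0] += 1
            for line in data.splitlines():
                if line.strip():
                    entry[2] += 1  # non-blank line
                else:
                    entry[1] += 1  # blank line
    return {ext: tuple(v) for ext, v in totals.items()}
```

Calling `count_lines("gcc-4.9.4")` on an unpacked tree gives per-extension totals whose sum of blank and non-blank lines can be set against cloc's blank + comment + code figure. A large gap would suggest the cloc run did skip material.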
Bootloader
The system must boot, and therefore we must select a bootloader. Here are the options. If you know of another, write in.
Syslinux
Apparently this thing can boot far more than LiveCDs these days. If this claim holds true, it might be a way of avoiding having multiple bootloaders present in the system (one for install media, and the other for the installed system). Things like being able to display graphics could probably be chopped out without too much effort. It supports EFI, which permits us use of disks larger than 2tb. It's large, but appears to be a somewhat modular collection of bootloaders. It'd be worthwhile to discover how heavy extlinux is by itself.
syslinux-6.03 cloc .
    2905 text files.
    2760 unique files.
     344 files ignored.

github.com/AlDanial/cloc v 1.70  T=14.98 s (171.1 files/s, 41420.3 lines/s)
------------------------------------------------------------------
Language                       files     blank   comment      code
------------------------------------------------------------------
C                               1147     50307     78851    255734
C/C++ Header                     856     22631     46656    108013
HTML                               8      3974        11     13407
Assembly                          95      1816      4745      9239
D                                296         0         0      5825
Pascal                            30       607        20      4809
Perl                              40       720       828      3546
make                              65       997      1140      2896
xBase Header                       7       308       844      1212
Python                             1        34         6       267
Bourne Shell                       7        40        17       179
Lua                                2        20         5       161
Bourne Again Shell                 3        15        29       150
CSS                                2        18         0       105
XML                                1         0         0        39
diff                               1         7        34        32
Windows Resource File              1         2         0        24
------------------------------------------------------------------
SUM:                            2562     81496    133186    405638
------------------------------------------------------------------
Lilo
It only supports MBR partition schemes (which incur the aforementioned 2tb limitation). I'm not aware of any way to boot from CDs using lilo, though one may exist, but I've had no problem using it to boot USB sticks. And the thing is damned slim!
lilo-24.2 cloc .
     251 text files.
     240 unique files.
     113 files ignored.

github.com/AlDanial/cloc v 1.70  T=1.29 s (106.7 files/s, 34785.2 lines/s)
------------------------------------------------------------------
Language                       files     blank   comment      code
------------------------------------------------------------------
C                                 19      1702       558     11821
Assembly                          21      1683      2627      9178
HTML                              34       131        67      5701
TeX                                8       550       394      3238
C/C++ Header                      22       532       787      1379
Perl                               4       262       356      1295
make                              12       195       148       592
Bourne Again Shell                 1        87        34       578
Bourne Shell                      15       133       195       473
CSS                                2         9         4       280
------------------------------------------------------------------
SUM:                             138      5284      5170     34535
------------------------------------------------------------------
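The 2tb ceiling mentioned for MBR follows from the partition table storing sector addresses and lengths as 32-bit values. A quick worked check, assuming the conventional 512-byte logical sector (a drive that exposes larger logical sectors moves the ceiling accordingly):

```python
# MBR partition entries hold start and length as 32-bit sector counts.
SECTOR_BITS = 32
SECTOR_BYTES = 512  # assumption: conventional 512-byte logical sectors

max_bytes = (2 ** SECTOR_BITS) * SECTOR_BYTES
max_tib = max_bytes / 2 ** 40

print(max_bytes)  # 2199023255552
print(max_tib)    # 2.0
```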
Elilo
This obviates the 2tb drive limit, and is even smaller than classical lilo! That said, much of the complexity may be hiding in EFI. I hear there's inquiry into EFI ongoing, so I'll await spyked's report for now.
elilo-3.16 cloc .
     119 text files.
     113 unique files.
      25 files ignored.

github.com/AlDanial/cloc v 1.70  T=0.46 s (204.5 files/s, 52649.6 lines/s)
------------------------------------------------------------------
Language                       files     blank   comment      code
------------------------------------------------------------------
C                                 46      2746      3715     10905
C/C++ Header                      35       792      1643      2969
Assembly                           5        89       407       430
make                               8       111       213       184
------------------------------------------------------------------
SUM:                              94      3738      5978     14488
------------------------------------------------------------------
Grub
There are two paths to consider here, the "legacy" branch and the 2.x branch. The former is quite manageably-sized, but does not support EFI.
grub-0.97 cloc .
     193 text files.
     184 unique files.
      45 files ignored.

github.com/AlDanial/cloc v 1.70  T=1.03 s (144.2 files/s, 84368.0 lines/s)
------------------------------------------------------------------
Language                       files     blank   comment      code
------------------------------------------------------------------
C                                 65      7117      9156     32871
Bourne Shell                      12      1242      1625      9938
C/C++ Header                      49      1374      2986      5915
TeX                                1       590      2464      4032
Assembly                           9       807      1324      1913
m4                                 3       308        89      1700
make                               8        99        54       459
Perl                               1        91        87       339
------------------------------------------------------------------
SUM:                             148     11628     17785     57167
------------------------------------------------------------------
Meanwhile the 2.x branch is unsurprisingly plump, but does support EFI.
grub-2.04 cloc .
    2228 text files.
    2157 unique files.
     377 files ignored.

github.com/AlDanial/cloc v 1.70  T=11.32 s (164.6 files/s, 50901.3 lines/s)
------------------------------------------------------------------
Language                       files     blank   comment      code
------------------------------------------------------------------
C                                931     44555     37610    264712
C/C++ Header                     563     10793     21301     53226
Bourne Shell                      18      6803      4182     32919
make                              10      2987       469     25152
m4                               150      1485      1504     20611
TeX                                2      1512      6521     12598
Assembly                         172      3186      6809     10329
Windows Module Definition          5       651        21      3627
Python                             3       168       213      1379
lex                                1        54        34       305
yacc                               1        44        44       267
awk                                1         6        16        80
C++                                1         8        17        80
sed                                5         3         0        59
Lisp                               1         3         0        50
------------------------------------------------------------------
SUM:                            1864     72258     78741    425394
------------------------------------------------------------------
Refind
This one's slim, supports EFI, and appears to be the work of a single guy. Should we proceed down the path of EFI, it appears to be worth further investigation.
refind-0.11.4 cloc .
     312 text files.
     307 unique files.
     114 files ignored.

github.com/AlDanial/cloc v 1.70  T=1.71 s (115.7 files/s, 51557.9 lines/s)
------------------------------------------------------------------
Language                       files     blank   comment      code
------------------------------------------------------------------
C                                 70      6296      8340     36255
C/C++ Header                      84      2769      7447     14805
HTML                              20      2170         2      6581
Bourne Again Shell                 5       106       347      1524
Python                             5       148       222       494
make                              10       125       123       313
CSS                                1        13         2        73
Bourne Shell                       2         8        25        29
Perl                               1         1         0         6
------------------------------------------------------------------
SUM:                             198     11636     16508     60080
------------------------------------------------------------------
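To put the bootloader candidates side by side, the sketch below sorts the SUM "code" figures from the cloc runs above and expresses each as a multiple of the smallest. The numbers are copied from those runs; the dictionary layout and variable names are only illustrative, and raw line count says nothing about how much of each tree a given configuration would actually build.

```python
# cloc "code" totals (SUM rows) from the runs above.
code_lines = {
    "syslinux-6.03": 405638,
    "lilo-24.2": 34535,
    "elilo-3.16": 14488,
    "grub-0.97": 57167,
    "grub-2.04": 425394,
    "refind-0.11.4": 60080,
}

# Smallest first, each shown relative to the smallest candidate.
ranked = sorted(code_lines.items(), key=lambda kv: kv[1])
smallest = ranked[0][1]
for name, lines in ranked:
    print(f"{name:15} {lines:8} {lines / smallest:6.1f}x")
```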
Kernel
Our chosen hardware will require a kernel which supports it. Regardless of which kernel version we choose,[8] this is likely to be the second-hardest component of the system, after hardware fabrication, to maintain by Republican effort alone in the near term. The codebase is gargantuan even for older versions. Let's first take a look at 2.6.39.2, from the final 2.6 series.
linux-2.6.39.2 cloc .
   36684 text files.
   36131 unique files.
    4074 files ignored.

github.com/AlDanial/cloc v 1.70  T=244.28 s (133.5 files/s, 56149.6 lines/s)
------------------------------------------------------------------
Language                       files     blank   comment      code
------------------------------------------------------------------
C                              16087   1501167   1531763   7743098
C/C++ Header                   13589    314737    536239   1632163
Assembly                        1217     39835     92468    204275
XML                              139      3119       948     40974
make                            1390      6004      6374     22654
Perl                              41      2973      2462     13900
Bourne Shell                      61       638      1475      3644
yacc                               5       453       322      2987
Python                            21       594       343      2727
C++                                1       209        57      1521
lex                                5       202       237      1318
TeX                                1       108         3       911
awk                                8        90        79       714
Bourne Again Shell                28        74        55       446
HTML                               2        58         0       378
NAnt script                        1        87         0       356
Pascal                             3        49         0       231
Lisp                               1        63         0       218
Objective C++                      1        55         0       189
ASP                                1        33         0       137
XSLT                               6        13        27        70
sed                                1         0         3        30
vim script                         1         3        12        27
------------------------------------------------------------------
SUM:                           32610   1870564   2172867   9672968
------------------------------------------------------------------
The thing weighs in at over 9 million lines. Start reading...
Since then, the line-count has almost doubled. Here's the latest version as of the publish date of this article.
linux-5.4.6 cloc .
   65658 text files.
   65224 unique files.
   13254 files ignored.

github.com/AlDanial/cloc v 1.70  T=478.71 s (109.5 files/s, 53144.6 lines/s)
------------------------------------------------------------------
Language                       files     blank   comment      code
------------------------------------------------------------------
C                              27621   2731603   2273195  13904947
C/C++ Header                   19699    527802    950568   4273634
Assembly                        1315     46707    101106    227391
JSON                             272         1         0    159315
Bourne Shell                     557     12529      9357     50235
make                            2509      9400     10531     41322
Perl                              56      5593      4073     27924
Python                           110      4443      4131     24035
YAML                             188      3039       880     15426
HTML                               5       665         0      5508
yacc                               9       692       355      4627
lex                                8       326       300      2014
C++                                8       300        82      1873
Bourne Again Shell                51       354       296      1748
awk                               10       140       116      1058
NAnt script                        2       147         0       556
Windows Module Definition          2        15         0       109
m4                                 1        15         1        95
CSS                                1        27        28        72
XSLT                               5        13        26        61
vim script                         1         3        12        27
Ruby                               1         4         0        25
INI                                1         1         0         6
sed                                1         2         5         5
------------------------------------------------------------------
SUM:                           52433   3343821   3355062  18742013
------------------------------------------------------------------
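For the record, the "almost doubled" claim checks out against the SUM rows of the two kernel runs. The figures below are copied from those runs; the variable names are mine.

```python
old_code = 9672968    # linux-2.6.39.2, cloc "code" SUM
new_code = 18742013   # linux-5.4.6, cloc "code" SUM

ratio = new_code / old_code
added = new_code - old_code

print(f"{ratio:.2f}x, {added} lines added")  # 1.94x, 9069045 lines added
```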
In either case, absent a multi-billion dollar business alongside, major changes to this artifact are not happening. The open-source "dorks with laptops changing the world" approach never happened, and it's not about to now. Absent control of hardware production, I don't even think a kernel version can be specified. As soon as the chosen hardware becomes unavailable, one will be forced to either attempt backports of drivers,[9] or give up and move to a newer kernel version.
Thus I propose a controversial alternative. Instead of targeting an older kernel which may not support available hardware, target the latest available and see what may be sawed off without disrupting needed hardware drivers. The kernel is quite modular, and there may be some hope in reducing the effective line count by several multiples by eradicating architectures, drivers, and other optional items that are unneeded.
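One way to gauge how much could be sawed off, before touching any code, is to measure how a tree's lines are spread across the subdirectories of arch/ or drivers/ and then ask what share disappears under a given keep-list. The following is a minimal sketch under loud assumptions: `lines_by_subdir` and `share_removed` are hypothetical helpers of my own, they count raw lines in .c, .h and .S files rather than cloc-style code lines, and they know nothing of build dependencies between directories.

```python
import os

SOURCE_EXTS = (".c", ".h", ".S")


def lines_by_subdir(root):
    """Return {immediate subdirectory of root: raw line count of source files}."""
    totals = {}
    for entry in sorted(os.listdir(root)):
        top = os.path.join(root, entry)
        if not os.path.isdir(top):
            continue
        count = 0
        for dirpath, _dirs, files in os.walk(top):
            for name in files:
                if name.endswith(SOURCE_EXTS):
                    try:
                        with open(os.path.join(dirpath, name), "rb") as fh:
                            count += sum(1 for _ in fh)
                    except OSError:
                        pass  # unreadable file: ignore it
        totals[entry] = count
    return totals


def share_removed(totals, keep):
    """Fraction of lines dropped if only the subdirectories in keep survive."""
    total = sum(totals.values())
    kept = sum(n for name, n in totals.items() if name in keep)
    return (total - kept) / total if total else 0.0
```

For instance, `share_removed(lines_by_subdir("linux-5.4.6/arch"), keep={"x86"})` estimates the fraction of arch/ that goes away when only x86 is kept. The same call pointed at drivers/ with a keep-list of needed driver classes gives the corresponding figure there.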
And now for a breather...
More choices remain before we have even the vague outline of a bootable, self-hosting, self-editing, viable system. I'll be back tomorrow with the next article. Take care!
1. If you read circularity there, read again.
2. The stallmans, poetterings, *buntus.
3. Recall the "fits in head". I intentionally reference and correct it. I don't give a flying fuck what fits in the imagination if it cannot be wielded.
4. Complete means that all components are wholly included in the set, and not merely referenced.
5. Though glimmers of a way out appear in the darkness.
6. I will leave this portion unspecified for now, as sadly all we have is a vague notion that a certain vintage AMD chip might be usable.
7. Binary-blob firmware, drivers, yes... but don't dare absolve yourself of your inability to fabricate your own hardware!
8. ...and indeed, which kernel.
9. Which sums to a backport of significant parts of the newer kernel version.
I subscribe to those definitions. And the whole is quite a pleasure to read, I'll re-read it a few more times too.
At first pass though:
1. What does ARM64 architecture have exactly to recommend it? Last time I looked it wasn't particularly neat or anything and on top of that, it also comes with its own set of additional problems (e.g. no working GNAT for it).
2. If that Refind is potentially interesting, maybe it's worth contacting the single guy - who knows, maybe he had enough of working alone or something?
3. Among all the other elephants in the room, it would seem to me that the biggest you identify there is still the hardware production really - or at least I would certainly call it the biggest trouble indeed. But I'm not sure - do you consider that going with the latest hardware really does much to mitigate this until the time it can be properly addressed? I can see perhaps some advantage in "the latest" mainly as it makes it easier to spread essentially but the trend as I see it is for the latest to last very little anyway. I am certainly not against considering at least the option.
Comment by Diana Coman — 2019/12/29 @ 5:05 a.m.
1. ARM64 has lots of very cheap embedded hardware, great for disposables. GNAT should be able to be cross-compiled from e.g. an AMD64 GNAT.
2. Yeah, maybe I'll reach out to him.
3. I'm proposing specifying the process by which a kernel is inserted into the system, and proposing we make this very easy, rather than e.g. pouring cement around 2.6. We cannot understand the implications of the kernel choice at the birth of the system, and therefore we would benefit from being agile should we have chosen incorrectly, either due to sudden unavailability of hardware or to one of these rearing its awful tentacled head: https://www.cvedetails.com/product/47/Linux-Linux-Kernel.html?vendor_id=33
Comment by trinque — 2019/12/29 @ 5:05 p.m.
So the "fit in head" symbolically becomes fit in hand, huh. I can see it.
> or ARM64 machines
I didn't realise those were still in the running.
As far as I recall, it was specifically non-64 ARMs that were considered at some point, but they don't seem to have crossed the head-to-hand barrier ever -- and while operator ineptitude was certainly involved in that failure, I'm still gonna deem they failed.
> Complex regular subexpression recursion limit (32766) exceeded at /usr/bin/cloc line 7327.
This would render the count dubious, neh ?
cloc btw is... well, it regularly chokes on things like a ./* (as in, paths) quoted inside code.
> Refind
How about you say hi to the guy ?
> In either case, absent a multi-billion dollar business alongside, major changes to this artifact are not happening.
The linux effort is not actually married to any billion-dollar businesses, nor ever historically was. It did manage to (briefly) disrupt the indolence and stupidity of a few thousand people, it is true, but that had relatively little to do with money or business. It did have relatively a lot to do with naggum's notion of luck, which is to say it happened to coincidentally occur at a time white society was both still collimated enough and still hierarchical enough such that young men on collegiate campuses connected to the internet could do remarkable things ; meaning it's not so likely to repeat naturally, "by itself" as it were -- but then again that's neither here nor there.
I still stand by the dozen people eighteen months story, in the above context.
> Absent control of hardware production, I don't even think a kernel version can be specified.
This is a major problem -- linus had a certain irreproducible wind blowing up his sails in the early days, namely that hardware was somewhat specified and pinned down by the necessities and attritions of poverty. He fucked up all that, leaving us to start over in markedly more adverse circumstances, it's true ; but I still don't think the actual standard is as hard as literal fabrication.
Imagine what the world would be like if in order to find a good fucktoy you actually had to impregnate her mother yourself. If I can get away with a lower standard than literal fabrication, I expect so could you.
> or give up and move
It really doesn't take THAT much to achieve pre-eminence over the current linux crowd.
> target the latest available
I'm never moving off the 2.something. I really don't care what happens, come hell or high water, whatever. 2.6 has been good enough for like fifteen years -- trilema is served by one, my gaming station runs one, it's good the fuck enough, forget about it. Unless quantum computers actually start selling at the store it will stay good enough, permanently.
Comment by Mircea Popescu — 2019/12/30 @ 3:08 p.m.
@Trinque Re ARM64 & GNAT, the "should be" is not enough - we found out already that compiling it with sjlj for ARM64 fails: http://logs.ossasepia.com/log/trilema/2019-02-15#1897027
Comment by Diana Coman — 2019/12/30 @ 3:19 p.m.
Footnote three is a solid point. It makes me wonder where the extension'll go in the future, if at all --are there more places to (by necessity) load a thing?
Probably I miss something between "In either case, absent a multi-billion dollar business alongside, major changes to this artifact are not happening." and "...target the latest available and see what may be sawed off without disrupting needed hardware drivers". Does sawing off not result in major changes? 'Cause if not, and the argument is that both presented kernels are irresolvably rotten and also sport sloughable surface gunk, the lowest-oldest-known-est-smallest-simplest rotten core wins no matter what else's going on, no?
Comment by hanbot — 2019/12/30 @ 7:30 p.m.
I agree that it is hard to foresee the implications of kernel version choice at the start.
1. Hardware support. Fundamentally, I have nothing against using the most recent one, but I would additionally stress that this does not imply that we will be chasing kernel versions for hardware support. IMO using the latest kernel instead of a slightly older one only delays the potential problems of hardware support, and sooner or later a working solution will have to be found. Not depending on the upstream implies that users who bring unsupported devices would have to bring their drivers as well -- a backport of a driver can be anything from a one-liner to a significant PITA (miss one line where the new kernel API should be used instead of the old one, enjoy strange bugs), but 'give up and move on' is anti-ratchet.
Anyhow, I will review which drivers (useful on x86, without looking into other arches) were added in each kernel version 4.10-5.4; maybe this will give a bit more information about how much backporting could happen from version to version. I do not expect that many drivers useful for servers were added in the meantime, but a lot of code churn for GPUs.
2. Minimizing the count of lines. Given that most of the weight of the kernel is in drivers, not in architectures, one could potentially start there and drop drivers which are not used on x86.
3. CVEs: a dedicated person would have to look at the announcements and patch accordingly, the fixes are typically not intrusive, but require attention to not miss anything. Picking "LTS" kernel can help with this at least a bit.
Comment by bvt — 2020/01/11 @ 6:00 a.m.
I see initial definitions are something I neglected and I subscribe to yours. Thanks for laying them out and picking up the slack.
While no one has yet tried compiling Gales with tcc, the cross-bootstrap procedure looks like a good starting point, and Gales uses GCC 4.7.4, which your link in footnote 5 indicated tcc can build. Perhaps this is closer than we think.
Re : which GCC, Gales uses 4.7.4 because it's the last version that can be bootstrapped purely from C, i.e. no C++. Along with 4.9.4, 4.4.7 is also on the table as that's what Mircea Popescu reported to mainly use.
No one yet owns GCC, primarily because I've been waiting to see if ave1 will resurface, but at this point we've waited longer than long enough without a peep from him and in the next week or so it's a priority to have someone take ownership and start mapping it out properly.
Re ARM64, one more data point to add to those above is that the rockchips used by Pizarro are ARM64. I got everything but boost built on my quest to build TRB there, prior to Piz falling over. I think it's definitely doable to run trb on ARM64, but my quest there is sidelined for the foreseeable future.
I can say hi to the rEFInd guy on Monday and invite him to #trilema, apparently he works for Canonical.
Comment by Robinson Dorion — 2020/01/11 @ 4:59 p.m.