trinque

2019/12/28

A Republican OS - Part 1

Filed under: Uncategorized — trinque @ 9:35 p.m.

Let's chart a course to a usable system while usable systems can still be built, shall we?

Definitions first

By Republican, we do not simply mean that the item shall be used by the Republic. We mean that the item shall be constructed in such a way that it embodies, serves, and furthers Republican ideology in computer operating systems. As there can be no meaning absent a structure of authority, "Republican OS" is predicated on the Republic being the structure of authority within which said item is defined.[1] Thus relying on "upstream"[2] for anything is sin. Fallen creatures that we are, I do not expect us to eradicate sin in an instant. Yet inaction is not an option. Inaction is death, both of the Republic and ourselves. From our options we cleave both immediate perfection and tolerance of anything which does not serve us. What remains is a ratchet; it begins loose and ever tightens, squeezing out the unnecessary, the confounding, the redundant, and leaves us with something small enough to fit in our hand.[3]

By operating system, we mean a complete set[4] of self-hosting software capable of running on computers which can be obtained with reasonable effort. Self-hosting means that the system can be used to edit, build, and collaborate on next iterations of the software. It means that we turn the ratchet ourselves. Understand that the first build of this self-hosting system incurs the bootstrapping problem, and understand that we do not currently have a solution to this problem.[5]

Hardware selection

We must target particular hardware on which to build this system. This means AMD or ARM64 machines, and to date it isn't clear which.[6] Whichever hardware we choose, should we be forced to rely on items we cannot ourselves build, alter, and maintain,[7] we proceed in sin, and leave debts to pay.

Compiler

Building software, we must choose a compiler which is capable of producing binaries for our target architecture. There has been much discussion already on the precise place to break. Absent deep research on the matter, one can only lean on empirical results, and by those, gcc 4.x seems to have been capable of meeting our needs to date. This is a weak recommendation, and given the hulking mass of the pile in question, it might be all we have the lifetimes to achieve. gcc-4.9.4 being the last bugfix release before the next major revision, it seems as good a place as any to pull the handbrake. It's worth pointing out that the 4.9 series made the following change:

GNAT switched to Ada 2012 instead of Ada 2005 by default.

Let's weigh the thing before we move on.

gcc-4.9.4 cloc .
   76640 text files.
   75753 unique files.
Complex regular subexpression recursion limit (32766) exceeded at /usr/bin/cloc line 7327.
Complex regular subexpression recursion limit (32766) exceeded at /usr/bin/cloc line 7327.
Complex regular subexpression recursion limit (32766) exceeded at /usr/bin/cloc line 7327.
Complex regular subexpression recursion limit (32766) exceeded at /usr/bin/cloc line 7327.
    4671 files ignored.

github.com/AlDanial/cloc v 1.70  T=338.66 s (212.7 files/s, 30282.2 lines/s)
---------------------------------------------------------------------------------------
Language                             files          blank        comment           code
---------------------------------------------------------------------------------------
C                                    23126         444405         462198        2326698
C/C++ Header                         10280         155822         158227         758024
Ada                                   5052         264536         349229         756828
Java                                  6346         169198         646043         682350
C++                                  17407         137657         176833         612276
Bourne Shell                           152          81229          68484         440850
Markdown                               346          40753              0         339466
Go                                    2170          37979          49221         298060
HTML                                   330          33462           5663         148765
Fortran 90                            4071          17275          31443         101132
m4                                     198           8644           2443          77132
Assembly                               544          13218          32088          58482
XML                                     60           5961            563          44147
Windows Module Definition              122           4104             38          30404
make                                   157           4088           1653          25751
Expect                                 282           5596           9909          22452
Objective C                            512           4852           3066          16604
TeX                                      2           1480           6149          11675
Fortran 77                             431           1077           3850          10659
Objective C++                          242           2388           1502           8020
Perl                                    30            903           1349           4686
MSBuild script                           7              1              0           4675
Pascal                                  13            790           3261           3465
Python                                  10            804            736           2855
XSLT                                    20            563            436           2805
awk                                     18            374            589           2393
OCaml                                    3            310            416           2279
Bourne Again Shell                      15            415            654           1865
CSS                                      9            332            143           1428
yacc                                     2            107            119            977
C#                                       9            230            506            879
Tcl/Tk                                   1             72            112            393
lex                                      1             34             30            156
CMake                                    1             27             31            153
NAnt script                              2             17              0            132
JavaScript                               2             20             81            122
Haskell                                 36             15              0            112
Windows Resource File                    2              3              2             67
SAS                                      1             14             22             32
DTD                                      3             28             70             26
Fortran 95                               2             10              8             21
Lisp                                     1              4             12              8
MATLAB                                   1              0              0              5
DOS Batch                                2              0              0              4
XHTML                                    1              6             16              3
---------------------------------------------------------------------------------------
SUM:                                 72022        1438803        2017195        6799316
---------------------------------------------------------------------------------------

The damned thing broke the tool I was using, so this may be an incomplete result! At any rate, hefty...

Bootloader

The system must boot, and therefore we must select a bootloader. Here are the options. If you know of another, write in.

Syslinux

Apparently this thing can boot far more than LiveCDs these days. If this claim holds true, it might be a way of avoiding having multiple bootloaders present in the system (one for install media, and the other for the installed system). Things like being able to display graphics could probably be chopped out without too much effort. It supports EFI, which permits us the use of disks larger than 2TB. It's large, but appears to be a somewhat modular collection of bootloaders. It'd be worthwhile to discover how heavy extlinux is by itself.

syslinux-6.03 cloc .
    2905 text files.
    2760 unique files.
     344 files ignored.

github.com/AlDanial/cloc v 1.70  T=14.98 s (171.1 files/s, 41420.3 lines/s)
-----------------------------------------------------------------------------------
Language                         files          blank        comment           code
-----------------------------------------------------------------------------------
C                                 1147          50307          78851         255734
C/C++ Header                       856          22631          46656         108013
HTML                                 8           3974             11          13407
Assembly                            95           1816           4745           9239
D                                  296              0              0           5825
Pascal                              30            607             20           4809
Perl                                40            720            828           3546
make                                65            997           1140           2896
xBase Header                         7            308            844           1212
Python                               1             34              6            267
Bourne Shell                         7             40             17            179
Lua                                  2             20              5            161
Bourne Again Shell                   3             15             29            150
CSS                                  2             18              0            105
XML                                  1              0              0             39
diff                                 1              7             34             32
Windows Resource File                1              2              0             24
-----------------------------------------------------------------------------------
SUM:                              2562          81496         133186         405638
-----------------------------------------------------------------------------------

Lilo

It only supports MBR partition schemes (which incur the aforementioned 2TB limitation). I'm not aware of any way to boot from CDs using lilo, though one may exist, but I've had no problem using it to boot USB sticks. And the thing is damned slim!

lilo-24.2 cloc .
     251 text files.
     240 unique files.
     113 files ignored.

github.com/AlDanial/cloc v 1.70  T=1.29 s (106.7 files/s, 34785.2 lines/s)
--------------------------------------------------------------------------------
Language                      files          blank        comment           code
--------------------------------------------------------------------------------
C                                19           1702            558          11821
Assembly                         21           1683           2627           9178
HTML                             34            131             67           5701
TeX                               8            550            394           3238
C/C++ Header                     22            532            787           1379
Perl                              4            262            356           1295
make                             12            195            148            592
Bourne Again Shell                1             87             34            578
Bourne Shell                     15            133            195            473
CSS                               2              9              4            280
--------------------------------------------------------------------------------
SUM:                            138           5284           5170          34535
--------------------------------------------------------------------------------
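
The 2tb ceiling attributed to MBR above can be reproduced with a little arithmetic. What follows is a minimal sketch, assuming the classic MBR layout (32-bit sector fields in each partition entry) and 512-byte logical sectors. Neither figure is stated in this article, and the names in the code are purely illustrative.

```python
# Assumption: a classic MBR partition entry stores its start sector and its
# sector count as 32-bit values, and the disk exposes 512-byte logical sectors.
SECTOR_BYTES = 512
LBA_FIELD_BITS = 32


def mbr_max_bytes(sector_bytes=SECTOR_BYTES, lba_bits=LBA_FIELD_BITS):
    """Largest span, in bytes, that a single sector-count field can describe."""
    return (2 ** lba_bits) * sector_bytes


limit = mbr_max_bytes()
print(limit, "bytes")
print(limit / 2 ** 40, "TiB")
```

Under those assumptions the limit is a property of the partition table format rather than of lilo itself, which is why the EFI-capable loaders discussed here are not bound by it.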

Elilo

This obviates the 2TB drive limit, and is even smaller than classical lilo! That said, much of the complexity may be hiding in EFI. I hear there's inquiry into EFI ongoing, so I'll await spyked's report for now.

elilo-3.16 cloc .
     119 text files.
     113 unique files.
      25 files ignored.

github.com/AlDanial/cloc v 1.70  T=0.46 s (204.5 files/s, 52649.6 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
C                               46           2746           3715          10905
C/C++ Header                    35            792           1643           2969
Assembly                         5             89            407            430
make                             8            111            213            184
-------------------------------------------------------------------------------
SUM:                            94           3738           5978          14488
-------------------------------------------------------------------------------

Grub

There are two paths to consider here, the "legacy" branch and the 2.x branch. The former is quite manageably sized, but does not support EFI.

grub-0.97 cloc .
     193 text files.
     184 unique files.
      45 files ignored.

github.com/AlDanial/cloc v 1.70  T=1.03 s (144.2 files/s, 84368.0 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
C                               65           7117           9156          32871
Bourne Shell                    12           1242           1625           9938
C/C++ Header                    49           1374           2986           5915
TeX                              1            590           2464           4032
Assembly                         9            807           1324           1913
m4                               3            308             89           1700
make                             8             99             54            459
Perl                             1             91             87            339
-------------------------------------------------------------------------------
SUM:                           148          11628          17785          57167
-------------------------------------------------------------------------------

Meanwhile the 2.x branch is unsurprisingly plump, but does support EFI.

grub-2.04 cloc .
    2228 text files.
    2157 unique files.
     377 files ignored.

github.com/AlDanial/cloc v 1.70  T=11.32 s (164.6 files/s, 50901.3 lines/s)
---------------------------------------------------------------------------------------
Language                             files          blank        comment           code
---------------------------------------------------------------------------------------
C                                      931          44555          37610         264712
C/C++ Header                           563          10793          21301          53226
Bourne Shell                            18           6803           4182          32919
make                                    10           2987            469          25152
m4                                     150           1485           1504          20611
TeX                                      2           1512           6521          12598
Assembly                               172           3186           6809          10329
Windows Module Definition                5            651             21           3627
Python                                   3            168            213           1379
lex                                      1             54             34            305
yacc                                     1             44             44            267
awk                                      1              6             16             80
C++                                      1              8             17             80
sed                                      5              3              0             59
Lisp                                     1              3              0             50
---------------------------------------------------------------------------------------
SUM:                                  1864          72258          78741         425394
---------------------------------------------------------------------------------------

Refind

This one's slim, supports EFI, and appears to be the work of a single guy. Should we proceed down the path of EFI, it appears to be worth further investigation.

refind-0.11.4 cloc .
     312 text files.
     307 unique files.
     114 files ignored.

github.com/AlDanial/cloc v 1.70  T=1.71 s (115.7 files/s, 51557.9 lines/s)
--------------------------------------------------------------------------------
Language                      files          blank        comment           code
--------------------------------------------------------------------------------
C                                70           6296           8340          36255
C/C++ Header                     84           2769           7447          14805
HTML                             20           2170              2           6581
Bourne Again Shell                5            106            347           1524
Python                            5            148            222            494
make                             10            125            123            313
CSS                               1             13              2             73
Bourne Shell                      2              8             25             29
Perl                              1              1              0              6
--------------------------------------------------------------------------------
SUM:                            198          11636          16508          60080
--------------------------------------------------------------------------------
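
To weigh the candidates side by side, the SUM "code" figures from the cloc tables above can be put in one place and ranked. This is only a sketch: the numbers are copied from the tables, while the helper names are my own. Bear in mind that cloc's totals include documentation (HTML, TeX) and build scripts, so they overstate what would actually end up in a boot path.

```python
# SUM "code" column for each bootloader, copied from the cloc tables above.
BOOTLOADER_CODE_LINES = {
    "syslinux-6.03": 405638,
    "lilo-24.2": 34535,
    "elilo-3.16": 14488,
    "grub-0.97": 57167,
    "grub-2.04": 425394,
    "refind-0.11.4": 60080,
}


def ranked(sizes):
    """Return (name, lines) pairs ordered from smallest to largest."""
    return sorted(sizes.items(), key=lambda item: item[1])


def relative_to_smallest(sizes):
    """Express every size as a multiple of the smallest entry."""
    smallest = min(sizes.values())
    return {name: lines / smallest for name, lines in sizes.items()}


ratios = relative_to_smallest(BOOTLOADER_CODE_LINES)
for name, lines in ranked(BOOTLOADER_CODE_LINES):
    print(f"{name:15s} {lines:8d} {ratios[name]:6.1f}x")
```

The ordering says nothing about which loader is the right choice. It only makes plain how far apart the slim candidates and the plump ones are.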

Kernel

Our chosen hardware will require a kernel which supports it. Regardless of which kernel version we choose,[8] this is likely to be the second-hardest component of the system after hardware fabrication to maintain by Republican effort alone in the near term. The codebase is gargantuan even for older versions. Let's first take a look at 2.6.39.2, from the final 2.6 series.

linux-2.6.39.2 cloc .
   36684 text files.
   36131 unique files.
    4074 files ignored.

github.com/AlDanial/cloc v 1.70  T=244.28 s (133.5 files/s, 56149.6 lines/s)
--------------------------------------------------------------------------------
Language                      files          blank        comment           code
--------------------------------------------------------------------------------
C                             16087        1501167        1531763        7743098
C/C++ Header                  13589         314737         536239        1632163
Assembly                       1217          39835          92468         204275
XML                             139           3119            948          40974
make                           1390           6004           6374          22654
Perl                             41           2973           2462          13900
Bourne Shell                     61            638           1475           3644
yacc                              5            453            322           2987
Python                           21            594            343           2727
C++                               1            209             57           1521
lex                               5            202            237           1318
TeX                               1            108              3            911
awk                               8             90             79            714
Bourne Again Shell               28             74             55            446
HTML                              2             58              0            378
NAnt script                       1             87              0            356
Pascal                            3             49              0            231
Lisp                              1             63              0            218
Objective C++                     1             55              0            189
ASP                               1             33              0            137
XSLT                              6             13             27             70
sed                               1              0              3             30
vim script                        1              3             12             27
--------------------------------------------------------------------------------
SUM:                          32610        1870564        2172867        9672968
--------------------------------------------------------------------------------

The thing weighs in at over 9 million lines. Start reading...

Since then, the line-count has almost doubled. Here's the latest version as of the publication date of this article.

linux-5.4.6 cloc .
   65658 text files.
   65224 unique files.
   13254 files ignored.

github.com/AlDanial/cloc v 1.70  T=478.71 s (109.5 files/s, 53144.6 lines/s)
---------------------------------------------------------------------------------------
Language                             files          blank        comment           code
---------------------------------------------------------------------------------------
C                                    27621        2731603        2273195       13904947
C/C++ Header                         19699         527802         950568        4273634
Assembly                              1315          46707         101106         227391
JSON                                   272              1              0         159315
Bourne Shell                           557          12529           9357          50235
make                                  2509           9400          10531          41322
Perl                                    56           5593           4073          27924
Python                                 110           4443           4131          24035
YAML                                   188           3039            880          15426
HTML                                     5            665              0           5508
yacc                                     9            692            355           4627
lex                                      8            326            300           2014
C++                                      8            300             82           1873
Bourne Again Shell                      51            354            296           1748
awk                                     10            140            116           1058
NAnt script                              2            147              0            556
Windows Module Definition                2             15              0            109
m4                                       1             15              1             95
CSS                                      1             27             28             72
XSLT                                     5             13             26             61
vim script                               1              3             12             27
Ruby                                     1              4              0             25
INI                                      1              1              0              6
sed                                      1              2              5              5
---------------------------------------------------------------------------------------
SUM:                                 52433        3343821        3355062       18742013
---------------------------------------------------------------------------------------
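
For the record, "almost doubled" can be checked directly against the two SUM lines. A trivial sketch, using only the cloc totals reported above (the variable names are mine):

```python
# SUM "code" totals copied from the two kernel cloc tables above.
LINUX_2_6_39_2_CODE = 9672968
LINUX_5_4_6_CODE = 18742013

growth_factor = LINUX_5_4_6_CODE / LINUX_2_6_39_2_CODE
added_lines = LINUX_5_4_6_CODE - LINUX_2_6_39_2_CODE

print(f"growth factor: {growth_factor:.2f}")
print(f"lines added:   {added_lines}")
```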

In either case, absent a multi-billion dollar business alongside, major changes to this artifact are not happening. The open-source "dorks with laptops changing the world" approach never happened, and it's not about to now. Absent control of hardware production, I don't even think a kernel version can be specified. As soon as the chosen hardware becomes unavailable, one will be forced to either attempt backports of drivers,9 or give up and move to a newer kernel version.

Thus I propose a controversial alternative. Instead of targeting an older kernel which may not support available hardware, target the latest available and see what may be sawed off without disrupting needed hardware drivers. The kernel is quite modular, and there may be some hope of cutting the effective line count severalfold by eradicating architectures, drivers, and other optional items that are unneeded.
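
Before sawing anything off, it helps to know where the weight actually sits. Below is a rough sketch of how one might tally lines per top-level directory of an unpacked kernel tree. It is not a replacement for cloc: it counts raw lines (blanks and comments included), and it looks only at files ending in .c, .h, or .S, which is my own assumption about what matters. The function names are hypothetical.

```python
import os
from collections import Counter

# Assumption: raw line counts of C, header and assembly files are a good
# enough proxy for "weight"; cloc's "code" column would be somewhat lower.
SOURCE_SUFFIXES = (".c", ".h", ".S")


def lines_per_top_dir(tree_root):
    """Sum raw line counts of source files, grouped by top-level directory."""
    totals = Counter()
    for dirpath, _dirnames, filenames in os.walk(tree_root):
        rel = os.path.relpath(dirpath, tree_root)
        top = "." if rel == "." else rel.split(os.sep)[0]
        for name in filenames:
            if not name.endswith(SOURCE_SUFFIXES):
                continue
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symlinks so nothing is counted twice
            with open(path, "rb") as handle:
                totals[top] += sum(1 for _ in handle)
    return totals


def report(tree_root):
    """Print directories from heaviest to lightest."""
    for top, count in lines_per_top_dir(tree_root).most_common():
        print(f"{top:20s} {count:10d}")
```

Calling report() with the path to an unpacked kernel tree would list the heaviest directories first, which is one way of judging how much there is to gain from dropping unneeded drivers and architectures before committing to the effort.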

And now for a breather...

More choices remain before we have even the vague outline of a bootable, self-hosting, self-editing, viable system. I'll be back tomorrow with the next article. Take care!

  1. If you read circularity there, read again.
  2. The stallmans, poetterings, *buntus.
  3. Recall the "fits in head". I intentionally reference and correct it. I don't give a flying fuck what fits in the imagination if it cannot be wielded.
  4. Complete means that all components are wholly included in the set, and not merely referenced.
  5. Though glimmers of a way out appear in the darkness.
  6. I will leave this portion unspecified for now, as sadly all we have is a vague notion that a certain vintage AMD chip might be usable.
  7. Binary-blob firmware, drivers, yes... but don't dare absolve yourself of your inability to fabricate your own hardware!
  8. ...and indeed, which kernel.
  9. Which sums to a backport of significant parts of the newer kernel version.

7 Comments »

  1. I subscribe to those definitions. And the whole is quite a pleasure to read, I'll re-read it a few more times too.

    At first pass though:

    1. What does ARM64 architecture have exactly to recommend it? Last time I looked it wasn't particularly neat or anything and on top of that, it also comes with its own set of additional problems (e.g. no working GNAT for it).

    2. If that Refind is potentially interesting, maybe it's worth contacting the single guy - who knows, maybe he had enough of working alone or something?

    3. Among all the other elephants in the room, it would seem to me that the biggest you identify there is still the hardware production really - or at least I would certainly call it the biggest trouble indeed. But I'm not sure - do you consider that going with the latest hardware really does much to mitigate this until the time it can be properly addressed? I can see perhaps some advantage in "the latest" mainly as it makes it easier to spread essentially but the trend as I see it is for the latest to last very little anyway. I am certainly not against considering at least the option.

    Comment by Diana Coman — 2019/12/29 @ 5:05 a.m.

  2. 1. ARM64 has lots of very cheap embedded hardware, great for disposables. GNAT should be able to be cross-compiled from e.g. an AMD64 GNAT.

    2. Yeah, maybe I'll reach out to him.

    3. I'm proposing specifying the process by which a kernel is inserted into the system, and proposing we make this very easy, rather than e.g. pouring cement around 2.6. We cannot understand the implications of the kernel choice at the birth of the system, and therefore we would benefit from being agile should we have chosen incorrectly, either due to sudden unavailability of hardware or to one of these rearing its awful tentacled head: https://www.cvedetails.com/product/47/Linux-Linux-Kernel.html?vendor_id=33

    Comment by trinque — 2019/12/29 @ 5:05 p.m.

  3. So the "fit in head" symbolically becomes fit in hand, huh. I can see it.

    > or ARM64 machines

    I didn't realise those were still in the running.

    As far as I recall, it was specifically non-64 ARMs that were considered at some point, but they don't seem to have crossed the head-to-hand barrier ever -- and while operator ineptitude was certainly involved in that failure, I'm still gonna deem they failed.

    > Complex regular subexpression recursion limit (32766) exceeded at /usr/bin/cloc line 7327.

    This would render the count dubious, neh ?

    cloc btw is... well, it regularly chokes on things like a ./* (as in, paths) quoted inside code.

    > Refind

    How about you say hi to the guy ?

    > In either case, absent a multi-billion dollar business alongside, major changes to this artifact are not happening.

    The linux effort is not actually married to any billion-dollar businesses, nor ever historically was. It did manage to (briefly) disrupt the indolence and stupidity of a few thousand people, it is true, but that had relatively little to do with money or business. It did have relatively a lot to do with naggum's notion of luck, which is to say it happened to coincidentally occur at a time white society was both still colimated enough and still hierarchical enough such that young men on collegiate campuses connected to the internet could do remarkable things ; meaning it's not so likely to repeat naturally, "by itself" as it were -- but then again that's neither here nor there.

    I still stand by the dozen people eighteen months story, in the above context.

    > Absent control of hardware production, I don't even think a kernel version can be specified.

    This is a major problem -- linus had a certain irreproducible wind blowing up his sails in the early days, namely that hardware was somewhat specified and pinned down by the necessities and attritions of poverty. He fucked up all that, leaving us to start over in markedly more adverse circumstances, it's true ; but I still don't think the actual standard is as hard as literal fabrication.

    Imagine what the world would be like if in order to find a good fucktoy you actually had to impregnate her mother yourself. If I can get away with a lower standard than literal fabrication, I expect so could you.

    > or give up and move

    It really doesn't take THAT much to achieve pre-eminence over the current linux crowd.

    > target the latest available

    I'm never moving off the 2.something. I really don't care what happens, come hell or high water, whatever. 2.6 has been good enough for like fifteen years -- trilema is served by one, my gaming station runs one, it's good the fuck enough, forget about it. Unless quantum computers actually start selling at the store it will stay good enough, permanently.

    Comment by Mircea Popescu — 2019/12/30 @ 3:08 p.m.

  4. @Trinque Re ARM64 & GNAT, the "should be" is not enough - we found out already that compiling it with sjlj for ARM64 fails: http://logs.ossasepia.com/log/trilema/2019-02-15#1897027

    Comment by Diana Coman — 2019/12/30 @ 3:19 p.m.

  5. Footnote three is a solid point. It makes me wonder where the extension'll go in the future, if at all --are there more places to (by necessity) load a thing?

    Probably I miss something between "In either case, absent a multi-billion dollar business alongside, major changes to this artifact are not happening." and "...target the latest available and see what may be sawed off without disrupting needed hardware drivers". Does sawing off not result in major changes? 'Cause if not, and the argument is that both presented kernels are irresolvably rotten and also sport sloughable surface gunk, the lowest-oldest-known-est-smallest-simplest rotten core wins no matter what else's going on, no?

    Comment by hanbot — 2019/12/30 @ 7:30 p.m.

  6. I agree that it is hard to foresee the implications of kernel version choice at the start.

    1. Hardware support. Fundamentally, I have nothing against using the most recent one, but I would additionally stress that this does not imply that we will be chasing kernel versions for hardware support. IMO using the latest kernel instead of a bit older one only delays the potential problems of hardware support, and sooner or later a working solution will have to be found. Not depending on the upstream implies that users that bring non-supported devices would have to bring their drivers as well -- a backport of a driver can be anything from one-liner to significant PITA (miss one line where new kernel API should be used instead of old one, enjoy strange bugs), but 'give up and move on' is anti-ratchet.

    Anyhow, I will make a review of what drivers (useful on x86, without looking into other arches) were added in each kernel version 4.10-5.4; maybe this will give a bit more information about how much backporting could happen from version to version. I do not expect that many drivers useful for servers were added in the meantime, but a lot of code churn for GPUs.

    2. Minimizing the count of lines. Given that most of the weight of the kernel is in drivers, not in architectures, one could potentially start there and drop drivers which are not used on x86.

    3. CVEs: a dedicated person would have to look at the announcements and patch accordingly, the fixes are typically not intrusive, but require attention to not miss anything. Picking "LTS" kernel can help with this at least a bit.

    Comment by bvt — 2020/01/11 @ 6:00 a.m.

  7. I see initial definitions are something I neglected and I subscribe to yours. Thanks for laying them out and picking up the slack.

    > Understand that the first build of this self-hosting system incurs the bootstrapping problem, and understand that we do not currently have a solution to this problem.

    While no one has yet tried compiling Gales with tcc, the cross-bootstrap procedure looks like a good starting point, and Gales uses GCC 4.7.4, which your link in footnote 5 indicated tcc can build. Perhaps this is closer than we think.

    Re : which GCC, Gales uses 4.7.4 because it's the last version that can be bootstrapped purely from C, i.e. no C++. Along with 4.9.4, 4.4.7 is also on the table as that's what Mircea Popescu reported to mainly use.

    No one yet owns GCC, primarily because I've been waiting to see if ave1 will resurface, but at this point we've waited longer than long enough without a peep from him and in the next week or so it's a priority to have someone take ownership and start mapping it out properly.

    Re ARM64, one more data point to add to those above is the rockchips used by Pizarro are ARM64. I got everything but boost built on my quest to build TRB there, prior to Piz falling over. I think it's definitely doable to run trb on ARM64, but my quest there is sidelined for the foreseeable future.

    I can say hi to the rEFInd guy on Monday and invite him to #trilema, apparently he works for Canonical.

    Comment by Robinson Dorion — 2020/01/11 @ 4:59 p.m.
