Let's chart a course to a usable system while usable systems can still be built, shall we?
Definitions first
By Republican, we do not simply mean that the item shall be used by the Republic. We mean that the item shall be constructed in such a way that it embodies, serves, and furthers Republican ideology in computer operating systems. As there can be no meaning absent a structure of authority, "Republican OS" is predicated on the Republic being the structure of authority within which said item is defined. Thus relying on "upstream" for anything is sin. Fallen creatures that we are, I do not expect us to eradicate sin in an instant. Yet inaction is not an option. Inaction is death, both of the Republic and ourselves. From our options we cleave both immediate perfection and tolerance of anything which does not serve us. What remains is a ratchet; it begins loose and ever tightens, squeezing out the unnecessary, the confounding, the redundant, and leaves us with something small enough to fit in our hand.
By operating system, we mean a complete set of self-hosting software capable of running on computers which can be obtained with reasonable effort. Self-hosting means that the system can be used to edit, build, and collaborate on next iterations of the software. It means that we turn the ratchet ourselves. Understand that the first build of this self-hosting system incurs the bootstrapping problem, and understand that we do not currently have a solution to this problem.
Hardware selection
We must target particular hardware on which to build this system. This means AMD or ARM64 machines, and to date it isn't clear which. Whichever hardware we choose, should we be forced to rely on items we cannot ourselves build, alter, and maintain, we proceed in sin, and leave debts to pay.
Compiler
Building software, we must choose a compiler which is capable of producing binaries for our target architecture. There has been much discussion already on the precise place to break. Absent deep research on the matter, one can only lean on empirical results, and by those, gcc 4.x seems to have met our needs to date. This is a weak recommendation, and given the hulking mass of the pile in question, it might be all we have the lifetimes to achieve. gcc-4.9.4 being the last bugfix release of the 4.9 series, and 4.9 the last series before the jump to 5.x, it seems as good a place as any to pull the handbrake. It's worth pointing out that the 4.9 series made the following change:
GNAT switched to Ada 2012 instead of Ada 2005 by default.
Let's weigh the thing before we move on.
gcc-4.9.4 cloc .
76640 text files.
75753 unique files.
Complex regular subexpression recursion limit (32766) exceeded at /usr/bin/cloc line 7327.
Complex regular subexpression recursion limit (32766) exceeded at /usr/bin/cloc line 7327.
Complex regular subexpression recursion limit (32766) exceeded at /usr/bin/cloc line 7327.
Complex regular subexpression recursion limit (32766) exceeded at /usr/bin/cloc line 7327.
4671 files ignored.
github.com/AlDanial/cloc v 1.70 T=338.66 s (212.7 files/s, 30282.2 lines/s)
---------------------------------------------------------------------------------------
Language files blank comment code
---------------------------------------------------------------------------------------
C 23126 444405 462198 2326698
C/C++ Header 10280 155822 158227 758024
Ada 5052 264536 349229 756828
Java 6346 169198 646043 682350
C++ 17407 137657 176833 612276
Bourne Shell 152 81229 68484 440850
Markdown 346 40753 0 339466
Go 2170 37979 49221 298060
HTML 330 33462 5663 148765
Fortran 90 4071 17275 31443 101132
m4 198 8644 2443 77132
Assembly 544 13218 32088 58482
XML 60 5961 563 44147
Windows Module Definition 122 4104 38 30404
make 157 4088 1653 25751
Expect 282 5596 9909 22452
Objective C 512 4852 3066 16604
TeX 2 1480 6149 11675
Fortran 77 431 1077 3850 10659
Objective C++ 242 2388 1502 8020
Perl 30 903 1349 4686
MSBuild script 7 1 0 4675
Pascal 13 790 3261 3465
Python 10 804 736 2855
XSLT 20 563 436 2805
awk 18 374 589 2393
OCaml 3 310 416 2279
Bourne Again Shell 15 415 654 1865
CSS 9 332 143 1428
yacc 2 107 119 977
C# 9 230 506 879
Tcl/Tk 1 72 112 393
lex 1 34 30 156
CMake 1 27 31 153
NAnt script 2 17 0 132
JavaScript 2 20 81 122
Haskell 36 15 0 112
Windows Resource File 2 3 2 67
SAS 1 14 22 32
DTD 3 28 70 26
Fortran 95 2 10 8 21
Lisp 1 4 12 8
MATLAB 1 0 0 5
DOS Batch 2 0 0 4
XHTML 1 6 16 3
---------------------------------------------------------------------------------------
SUM: 72022 1438803 2017195 6799316
---------------------------------------------------------------------------------------
The damned thing broke the tool I was using, so this may be an incomplete result! At any rate, hefty...
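Since cloc choked on some files, a cruder counter that uses no regular expressions at all can serve as a cross-check. The following is a minimal sketch, not a cloc replacement: the function name and the extension list are my own assumptions, and it counts every non-blank line without stripping comments, so its totals should land above cloc's "code" column rather than match it.

```python
import os
from collections import Counter


def count_nonblank_lines(root, extensions=(".c", ".h", ".adb", ".ads")):
    """Count non-blank lines per file extension under root.

    No comment stripping is attempted, so totals run higher than
    cloc's "code" column; the point is only a rough sanity check.
    The default extensions (C sources, C headers, Ada bodies and
    specs) are an illustrative assumption, not a complete list.
    """
    totals = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1].lower()
            if ext not in extensions:
                continue
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                continue  # skip broken symlinks and other oddities
            # Read as bytes so odd encodings in old sources cannot raise.
            with open(path, "rb") as handle:
                totals[ext] += sum(1 for line in handle if line.strip())
    return totals
```

Calling `count_nonblank_lines("gcc-4.9.4")` from the directory holding the unpacked tarball would give per-extension totals to hold up against cloc's blank-excluded figures. A large shortfall on cloc's side would suggest the recursion errors cost us real files.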
Bootloader
The system must boot, and therefore we must select a bootloader. Here are the options. If you know of another, write in.
Syslinux
Apparently this thing can boot far more than LiveCDs these days. If this claim holds true, it might be a way of avoiding having multiple bootloaders present in the system (one for install media, and the other for the installed system). Things like being able to display graphics could probably be chopped out without too much effort. It supports EFI, and with it GPT partitioning, which permits us the use of disks larger than 2 TB. It's large, but appears to be a somewhat modular collection of bootloaders. It'd be worthwhile to discover how heavy extlinux is by itself.
syslinux-6.03 cloc .
2905 text files.
2760 unique files.
344 files ignored.
github.com/AlDanial/cloc v 1.70 T=14.98 s (171.1 files/s, 41420.3 lines/s)
-----------------------------------------------------------------------------------
Language files blank comment code
-----------------------------------------------------------------------------------
C 1147 50307 78851 255734
C/C++ Header 856 22631 46656 108013
HTML 8 3974 11 13407
Assembly 95 1816 4745 9239
D 296 0 0 5825
Pascal 30 607 20 4809
Perl 40 720 828 3546
make 65 997 1140 2896
xBase Header 7 308 844 1212
Python 1 34 6 267
Bourne Shell 7 40 17 179
Lua 2 20 5 161
Bourne Again Shell 3 15 29 150
CSS 2 18 0 105
XML 1 0 0 39
diff 1 7 34 32
Windows Resource File 1 2 0 24
-----------------------------------------------------------------------------------
SUM: 2562 81496 133186 405638
-----------------------------------------------------------------------------------
Lilo
It only supports MBR partition schemes (which incur the aforementioned 2 TB limitation). I'm not aware of any way to boot from CDs using lilo, though one may exist, but I've had no problem using it to boot USB sticks. And the thing is damned slim!
lilo-24.2 cloc .
251 text files.
240 unique files.
113 files ignored.
github.com/AlDanial/cloc v 1.70 T=1.29 s (106.7 files/s, 34785.2 lines/s)
--------------------------------------------------------------------------------
Language files blank comment code
--------------------------------------------------------------------------------
C 19 1702 558 11821
Assembly 21 1683 2627 9178
HTML 34 131 67 5701
TeX 8 550 394 3238
C/C++ Header 22 532 787 1379
Perl 4 262 356 1295
make 12 195 148 592
Bourne Again Shell 1 87 34 578
Bourne Shell 15 133 195 473
CSS 2 9 4 280
--------------------------------------------------------------------------------
SUM: 138 5284 5170 34535
--------------------------------------------------------------------------------
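For those wondering where the 2 TB ceiling on MBR comes from, the arithmetic is short. MBR partition entries hold sector addresses and counts in 32-bit fields, and the sketch below assumes the classic 512-byte logical sector; drives with larger logical sectors would move the ceiling accordingly.

```python
SECTOR_BYTES = 512      # assumption: classic 512-byte logical sectors
LBA_FIELD_BITS = 32     # MBR partition entries use 32-bit sector fields

max_sectors = 2 ** LBA_FIELD_BITS
max_bytes = max_sectors * SECTOR_BYTES

print(max_bytes)              # 2199023255552
print(max_bytes / 2 ** 40)    # 2.0 (binary terabytes, TiB)
print(max_bytes / 10 ** 12)   # roughly 2.2 (decimal terabytes)
```

So the limit is exactly 2 TiB, or about 2.2 TB as drive vendors count.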
Elilo
This obviates the 2 TB drive limit, and is even smaller than classical lilo! That said, much of the complexity may be hiding in EFI. I hear there's inquiry into EFI ongoing, so I'll await spyked's report for now.
elilo-3.16 cloc .
119 text files.
113 unique files.
25 files ignored.
github.com/AlDanial/cloc v 1.70 T=0.46 s (204.5 files/s, 52649.6 lines/s)
-------------------------------------------------------------------------------
Language files blank comment code
-------------------------------------------------------------------------------
C 46 2746 3715 10905
C/C++ Header 35 792 1643 2969
Assembly 5 89 407 430
make 8 111 213 184
-------------------------------------------------------------------------------
SUM: 94 3738 5978 14488
-------------------------------------------------------------------------------
Grub
There are two paths to consider here: the "legacy" branch and the 2.x branch. The former is quite manageably sized, but does not support EFI.
grub-0.97 cloc .
193 text files.
184 unique files.
45 files ignored.
github.com/AlDanial/cloc v 1.70 T=1.03 s (144.2 files/s, 84368.0 lines/s)
-------------------------------------------------------------------------------
Language files blank comment code
-------------------------------------------------------------------------------
C 65 7117 9156 32871
Bourne Shell 12 1242 1625 9938
C/C++ Header 49 1374 2986 5915
TeX 1 590 2464 4032
Assembly 9 807 1324 1913
m4 3 308 89 1700
make 8 99 54 459
Perl 1 91 87 339
-------------------------------------------------------------------------------
SUM: 148 11628 17785 57167
-------------------------------------------------------------------------------
Meanwhile the 2.x branch is unsurprisingly plump, but does support EFI.
grub-2.04 cloc .
2228 text files.
2157 unique files.
377 files ignored.
github.com/AlDanial/cloc v 1.70 T=11.32 s (164.6 files/s, 50901.3 lines/s)
---------------------------------------------------------------------------------------
Language files blank comment code
---------------------------------------------------------------------------------------
C 931 44555 37610 264712
C/C++ Header 563 10793 21301 53226
Bourne Shell 18 6803 4182 32919
make 10 2987 469 25152
m4 150 1485 1504 20611
TeX 2 1512 6521 12598
Assembly 172 3186 6809 10329
Windows Module Definition 5 651 21 3627
Python 3 168 213 1379
lex 1 54 34 305
yacc 1 44 44 267
awk 1 6 16 80
C++ 1 8 17 80
sed 5 3 0 59
Lisp 1 3 0 50
---------------------------------------------------------------------------------------
SUM: 1864 72258 78741 425394
---------------------------------------------------------------------------------------
Refind
This one's slim, supports EFI, and appears to be the work of a single guy. Should we proceed down the path of EFI, it appears to be worth further investigation.
refind-0.11.4 cloc .
312 text files.
307 unique files.
114 files ignored.
github.com/AlDanial/cloc v 1.70 T=1.71 s (115.7 files/s, 51557.9 lines/s)
--------------------------------------------------------------------------------
Language files blank comment code
--------------------------------------------------------------------------------
C 70 6296 8340 36255
C/C++ Header 84 2769 7447 14805
HTML 20 2170 2 6581
Bourne Again Shell 5 106 347 1524
Python 5 148 222 494
make 10 125 123 313
CSS 1 13 2 73
Bourne Shell 2 8 25 29
Perl 1 1 0 6
--------------------------------------------------------------------------------
SUM: 198 11636 16508 60080
--------------------------------------------------------------------------------
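To put the candidates side by side, here is a small sketch that ranks them by the "code" figure in each SUM row above. The EFI flags are my reading of the text (lilo is marked as lacking EFI because only MBR is mentioned for it), and the totals include documentation and build scripts, so treat the ordering as a rough guide rather than a verdict.

```python
from typing import NamedTuple


class Bootloader(NamedTuple):
    name: str
    code_lines: int  # "code" column of the SUM row in each cloc report
    efi: bool        # EFI support as described in the text


CANDIDATES = [
    Bootloader("syslinux-6.03", 405638, True),
    Bootloader("lilo-24.2", 34535, False),   # text mentions MBR only
    Bootloader("elilo-3.16", 14488, True),
    Bootloader("grub-0.97", 57167, False),
    Bootloader("grub-2.04", 425394, True),
    Bootloader("refind-0.11.4", 60080, True),
]


def rank(candidates, require_efi=False):
    """Return candidates sorted by ascending code size, optionally EFI only."""
    pool = [c for c in candidates if c.efi or not require_efi]
    return sorted(pool, key=lambda c: c.code_lines)


for entry in rank(CANDIDATES, require_efi=True):
    print(f"{entry.name:<15} {entry.code_lines:>8}")
```

With EFI required, elilo and refind come out well ahead of syslinux and grub 2.x on size alone. How much weight EFI itself adds is not captured here.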
Kernel
Our chosen hardware will require a kernel which supports it. Regardless of which kernel version we choose, this is likely to be the second-hardest component of the system to maintain by Republican effort alone in the near term, after hardware fabrication. The codebase is gargantuan even for older versions. Let's first take a look at 2.6.39.2, a point release from 2.6.39, the last of the 2.6 series.
linux-2.6.39.2 cloc .
36684 text files.
36131 unique files.
4074 files ignored.
github.com/AlDanial/cloc v 1.70 T=244.28 s (133.5 files/s, 56149.6 lines/s)
--------------------------------------------------------------------------------
Language files blank comment code
--------------------------------------------------------------------------------
C 16087 1501167 1531763 7743098
C/C++ Header 13589 314737 536239 1632163
Assembly 1217 39835 92468 204275
XML 139 3119 948 40974
make 1390 6004 6374 22654
Perl 41 2973 2462 13900
Bourne Shell 61 638 1475 3644
yacc 5 453 322 2987
Python 21 594 343 2727
C++ 1 209 57 1521
lex 5 202 237 1318
TeX 1 108 3 911
awk 8 90 79 714
Bourne Again Shell 28 74 55 446
HTML 2 58 0 378
NAnt script 1 87 0 356
Pascal 3 49 0 231
Lisp 1 63 0 218
Objective C++ 1 55 0 189
ASP 1 33 0 137
XSLT 6 13 27 70
sed 1 0 3 30
vim script 1 3 12 27
--------------------------------------------------------------------------------
SUM: 32610 1870564 2172867 9672968
--------------------------------------------------------------------------------
The thing weighs in at over 9 million lines. Start reading...
Since then, the line count has almost doubled. Here's the latest version as of the publication date of this article.
linux-5.4.6 cloc .
65658 text files.
65224 unique files.
13254 files ignored.
github.com/AlDanial/cloc v 1.70 T=478.71 s (109.5 files/s, 53144.6 lines/s)
---------------------------------------------------------------------------------------
Language files blank comment code
---------------------------------------------------------------------------------------
C 27621 2731603 2273195 13904947
C/C++ Header 19699 527802 950568 4273634
Assembly 1315 46707 101106 227391
JSON 272 1 0 159315
Bourne Shell 557 12529 9357 50235
make 2509 9400 10531 41322
Perl 56 5593 4073 27924
Python 110 4443 4131 24035
YAML 188 3039 880 15426
HTML 5 665 0 5508
yacc 9 692 355 4627
lex 8 326 300 2014
C++ 8 300 82 1873
Bourne Again Shell 51 354 296 1748
awk 10 140 116 1058
NAnt script 2 147 0 556
Windows Module Definition 2 15 0 109
m4 1 15 1 95
CSS 1 27 28 72
XSLT 5 13 26 61
vim script 1 3 12 27
Ruby 1 4 0 25
INI 1 1 0 6
sed 1 2 5 5
---------------------------------------------------------------------------------------
SUM: 52433 3343821 3355062 18742013
---------------------------------------------------------------------------------------
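The growth between the two kernel snapshots can be checked directly from the SUM rows. The snippet below uses only the two "code" totals reported above; the variable names are mine.

```python
LINES_2_6_39_2 = 9_672_968   # SUM "code" column, linux-2.6.39.2
LINES_5_4_6 = 18_742_013     # SUM "code" column, linux-5.4.6

growth = LINES_5_4_6 / LINES_2_6_39_2
added = LINES_5_4_6 - LINES_2_6_39_2

print(f"growth factor: {growth:.2f}")   # growth factor: 1.94
print(f"lines added:   {added}")        # lines added:   9069045
```

That is roughly nine million lines piled on between the two releases.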
In either case, absent a multi-billion dollar business alongside, major changes to this artifact are not happening. The open-source "dorks with laptops changing the world" approach never happened, and it's not about to now. Absent control of hardware production, I don't even think a kernel version can be specified. As soon as the chosen hardware becomes unavailable, one will be forced to either attempt backports of drivers, or give up and move to a newer kernel version.
Thus I propose a controversial alternative. Instead of targeting an older kernel which may not support available hardware, target the latest available and see what may be sawed off without disrupting needed hardware drivers. The kernel is quite modular, and there may be some hope of cutting the effective line count several times over by eradicating architectures, drivers, and other optional items that are unneeded.
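As a first step toward finding what can be sawed off, one could tally lines by subtree and see how much lives outside the parts we intend to keep. The sketch below assumes the usual kernel layout with top-level `arch/` and `drivers/` directories; the function names and the `keep` list are hypothetical, and it counts raw lines, comments and blanks included.

```python
import os
from collections import Counter

SOURCE_EXTENSIONS = (".c", ".h", ".S")


def lines_by_component(kernel_root, depth=2):
    """Tally raw line counts keyed by the first `depth` path components.

    With depth=2 the keys look like "arch/x86" or "drivers/net", the
    granularity at which whole architectures or driver families could
    be dropped. Files nearer the root are keyed by their own directory.
    """
    totals = Counter()
    for dirpath, _dirs, files in os.walk(kernel_root):
        rel = os.path.relpath(dirpath, kernel_root)
        parts = [] if rel == "." else rel.split(os.sep)
        key = "/".join(parts[:depth]) or "."
        for name in files:
            if not name.endswith(SOURCE_EXTENSIONS):
                continue
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                continue  # skip broken symlinks
            with open(path, "rb") as handle:
                totals[key] += sum(1 for _ in handle)
    return totals


def removable_share(totals, keep):
    """Fraction of counted lines living outside the `keep` prefixes."""
    total = sum(totals.values())
    kept = sum(
        count for key, count in totals.items()
        if any(key == k or key.startswith(k + "/") for k in keep)
    )
    return (total - kept) / total if total else 0.0
```

For instance, `removable_share(lines_by_component("linux-5.4.6"), keep=["arch/x86", "kernel", "mm", "fs"])` would estimate the fraction of lines outside that illustrative keep list. Whether the remainder still builds and boots is a separate question that only the build system can answer.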
And now for a breather...
More choices remain before we have even the vague outline of a bootable, self-hosting, self-editing, viable system. I'll be back tomorrow with the next article. Take care!