trinque

2019/12/29

A Republican OS - Part 2

Filed under: Uncategorized — trinque @ 9:53 p.m.

We began with definitions last time. Let's add a few more before we press on.

Complexity

The English sources at my disposal underwhelm, so let's instead take a stab at the Latin beneath. Plecto means "I weave" or "I twist."1 Com- is the familiar prefix denoting "with". Therefore we may not sin too much calling complexity the measure of how weaved-with an item is, either in itself or with its surroundings.2 In software, there are two kinds of complexity worth mentioning, accidental and essential.3 Essential complexity is that which cannot be reduced without changing your algorithm. Accidental complexity is an accumulation of human error, of engineering tradeoffs and other failings.4 Given the depth-first-search toposort, the essential complexity is linear with the number of vertices and edges, O(V + E). Notice that nowhere do you find terms for whether you used a recursive or iterative solution, whether your recursive solution was implemented on a system with tail-recursion optimization, or whether your program's process was suspended while the operating system received an interrupt. In those invisible terms lie the accidental complexity, and they comprise most of the hulking mass of modern computing.

Ownership

The old language had an entire declension for ownership.5 To own something is for that thing to be an extension of yourself, for your causes to flow through it. It means the object is ready to your hand. What does it mean when you say "my country"? Does the country move when you move? What do you mean by "my computer"? Does your light shine out the other side of it, amplified, or does it confound you, complicate your actions, and bury you in ever-increasing cost?


We're a few lost lifetimes away from a functional Linux system yet. Let's continue.

Libc

We're building an operating system around the Linux kernel, which exposes an interface of system calls to user-space programs. We could conceivably implement these system calls ourselves, along with the myriad utility functions present in most libc libraries, but we'd be treading ground already trod at great cost. Of the available options, I'll discuss these two.

musl

This thing is slim, performs well, and has good compatibility with POSIX6. See here for a comparison with other libc libraries. Moreover, consider this from the musl FAQ:

When will it be finished?
When there's nothing left to remove.

musl-1.1.24 cloc .

    2521 text files.
    2331 unique files.
     247 files ignored.

github.com/AlDanial/cloc v 1.70  T=9.07 s (252.8 files/s, 11741.6 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
C                             1448           6314           7295          48270
C/C++ Header                   537           4316            273          30931
Assembly                       299            490            539           6485
Bourne Shell                     4            111            160            618
awk                              3             56             74            301
make                             1             68             10            157
sed                              1              0              6              9
-------------------------------------------------------------------------------
SUM:                          2293          11355           8357          86771
-------------------------------------------------------------------------------

After weighing the thing, I'm truly in love.

glibc

Meanwhile, lets strap a loathsome pig down and take a few measurements.

glibc-2.30 cloc .
   16983 text files.
   16253 unique files.
Complex regular subexpression recursion limit (32766) exceeded at /usr/bin/cloc line 7327.
    3020 files ignored.

github.com/AlDanial/cloc v 1.70  T=69.28 s (201.9 files/s, 25976.2 lines/s)
---------------------------------------------------------------------------------------
Language                             files          blank        comment           code
---------------------------------------------------------------------------------------
C                                     8781         114862         172458         683447
C/C++ Header                          2882          38763          64340         282848
Assembly                              1821          39782          93919         235315
Bourne Shell                            82           1991           2558          15567
make                                   203           2498           2476          11080
Python                                  44           1173           2376           9013
TeX                                      1            813           3697           7162
m4                                      50            349            144           3467
awk                                     38            291            422           2054
Windows Module Definition               17            190              0           2024
C++                                     32            258            310           1238
Perl                                     3             87            142            632
Korn Shell                               1             55             68            435
yacc                                     1             56             36            290
Pascal                                   7             82            326            182
JSON                                     2              0              0             91
Expect                                   8              0              0             77
Bourne Again Shell                       1              8             31             77
sed                                     11              8             15             70
---------------------------------------------------------------------------------------
SUM:                                 13985         201266         343318        1255069
---------------------------------------------------------------------------------------

86k vs 1.23m lines. I get physically ill looking at wanton messes like these. The utter contempt for human comprehension is abhorrent.7

If somebody else solves the same problem as you sparing over an order of magnitude, you congratulate them and get out of their way.


We still don't have that blinking cursor, but we're getting closer. There are still explosions of accidental complexity to avoid in the days ahead. See you tomorrow.

  1. And because the language is so damned orderly, also "I punish". []
  2. Latin would happily allow you to disambiguate the two if you wanted! []
  3. These are much better discussed here http://curtclifton.net/papers/MoseleyMarks06a.pdf, and I encourage you to go give that a read before continuing. []
  4. Consider that mathematical truths are discovered, while products of beings-an-engineer are built. []
  5. The genitive or "begetting" form. []
  6. POSIX, first codified in 1988, is a standard for the APIs and utility programs of UNIX-like operating systems. Adhering to the standard means that a huge amount of software designed for POSIX can be ported to a given operating system. []
  7. Yes, we glossed over the plump, moo-eyed smiles of the Linux kernel and gcc because we have no alternatives. That's no excuse. []

3 Comments »

  1. Tiny nitpick/clarification re "Essential complexity is that which cannot be reduced without changing your algorithm." - according to the paper you ref, afaik this would be so if you mean there users' "algorithm" aka the problem's own (as seen/presumed known -and that's a big presumption- by the users). Was this what you had in mind behind those (hard to notice at the time really) prods re users of TMSR-OS? Because I realised this only now that I read this article of yours.

    (I have experienced that physical sickness @ the sudden grasp of the size&spread of a code-mess of such proportions)

    Comment by Diana Coman — 2019/12/30 @ 2:21 p.m.

  2. I must say I'm actually quite happy with the complexity definitions.

    > The utter contempt for human comprehension is abhorrent

    It's not just the plain size, either. Musl comments are 10%. glibc comments 2-something%. And this isn't even merely as bad as the straight comparison indicates, but significantly worse : a sub 100k libc getting away with 10% comment by mass is very different from a >1mn libc getting away with anything like HALF commentary by mass -- much in the vein of how my coffee maker may come with a single page of printed manual, 1 ppm of the entire packaging, but my BMW'd better come with complete fucking docs, because guess what, if something goes awry I have no way to intuit the correct approach to significantly longer, utterly more complex mechanics.

    As a rule of thumb I would propose that every thousand lines of code added to a codebase DOUBLES the required PERCENTAGE of commentary included. So a 1001 LOC codepile'd better include at least TWO comment lines, a 10k LOC codepile'd better include 10% commentary by line count, and a 1mn codepile can not possibly exist, because 21000 exceeds 106

    Comment by Mircea Popescu — 2020/01/02 @ 10:23 a.m.

  3. Thanks for the definitions and measurements.

    I'll link in spyked's piece On Intellectual Ownership to supplement.

    Comment by Robinson Dorion — 2020/01/12 @ 1:33 p.m.

RSS feed for comments on this post. TrackBack URL

Leave a comment

+