Rendered at 23:31:39 GMT+0000 (Coordinated Universal Time) with Cloudflare Workers.
PaulHoule 2 days ago [-]
You can learn a lot developing a language and runtime but you will reach a point when you'll realize you can go back and do it all better.
matheusmoreira 2 days ago [-]
No doubt. The second attempt is always better. My initial plan was to write a complete first implementation in C so that it's always possible to bootstrap the language, then write a compiler inside the lisp itself, or write a Rust version. Hope I somehow manage to do it all before I die of old age.
exe34 2 days ago [-]
Aha I've been semi-vibe-coding a scheme for esp32c3 and linux at the same time, and forgoing the libc too, so baremetal c. I went with a slightly awkward approach for allocation, with heap being seen as pages, and within pages, fixed size objects of size 8, 16 or 32 bytes. A pair is two words, and either bitpacked or pointer to another object. I'm not far in, vaguely following the Peter Michaux approach.
Lone itself is a second system: it's the spiritual successor of liblinux. I suppose the scope did increase... I'll try to be careful.
mpyne 2 days ago [-]
In fairness, with the waterfall methodology that pervaded back then, the "first" system you shipped was actually the second. "Build one to throwaway; you will, anyhow".
db48x 2 days ago [-]
Actual waterfall development was far more iterative than most people realize. If you want a primary source, I recommend Sunbust and Luminary by Don Eyles. It recounts the development of the software for the Apollo lunar lander.
cultofmetatron 2 days ago [-]
this is my favorite thing about agentic coding. its super easy to build a v1 and get the system to a point where it does what you want which means you learn what you need for the v2 much faster
mhb 2 days ago [-]
Also building furniture.
NetMageSCW 2 days ago [-]
Flashbacks to when I implemented a new storage system for a Lisp/Prolog system while porting the core code from Pascal to C. We stuck with linked list of pages so we could keep pointers and gain some speed over array indexing for every object access.
matheusmoreira 2 days ago [-]
That's cool! Can you post more details? In my benchmarks the contiguous heap dramatically improved cache utilization. How did you obtain higher performance with linked lists?
throwaway58670 2 days ago [-]
Such a delight to read a technical article, and about something outside the hype cycle.
I also liked the color scheme.
matheusmoreira 2 days ago [-]
Thank you! I'm glad you enjoyed it. I'm already writing the next one!
Someone 2 days ago [-]
FTA: “Lone is a lisp interpreter written in freestanding C. There is no dynamic memory allocation in freestanding C. There is no such thing as malloc. There is no libc. There is only me and the code. If I wanted malloc, I would have to write it myself.
[…]
mmap is awesome but Linux's got something even more awesome: mremap, which makes it almost trivial to manage the page-based heap.“*
⇒ they are using “freestanding” in a non-standard way. Usually, it means running on bare metal without an OS (https://en.wikipedia.org/wiki/Standalone_program: “A standalone program, also known as a freestanding program, is a computer program that does not load any external module, library function or program and that is designed to boot with the bootstrap procedure of the target processor – it runs on bare metal”
matheusmoreira 2 days ago [-]
Freestanding as in freestanding C. Lone passes the -ffreestanding -nostdlib flags. No external modules, libraries, functions or programs are loaded by the interpreter.
It's true that lone doesn't run on bare metal. I chose Linux for pragmatic and philosophical reasons. I'm not sure lone would ever have printed hello world if I hadn't chosen to build on top of Linux, which is also the only kernel with a stable binary interface. No other system lets you avoid the libc.
Someone 14 hours ago [-]
Of course you’re free to make your own choices, but I find it weird to accept relying on many lines of Linux kernel code and then not wanting to reuse a relatively small number of lines of an existing malloc.
The reverse (running on bare metal and tweaking an existing malloc to run on it) looks way more logical to me.
> No other system lets you avoid the libc.
On most OSes [1] it’s relatively easy to write a libOS that just wraps the system calls. Your only dependency would be on the mapping to syscall numbers.
As a Brazilian, i'll let my nationalism escape a little bit to say that it is awesome to see such an incredible work done by a fellow brasileiro. This gives me the same feeling of awesomeness that I get when I see José Valim's work on Elixir.
I only wish I could be as diligent and focused to do a thing like this, because I think it would be so much fun. I bought the Crafting Interpreters book a while ago, but still haven't got time to actually go through with it.
Well, maybe one day...
matheusmoreira 1 days ago [-]
Thanks, that's high praise! I highly recommend Crafting Interpreters. Bob Nystrom is among my favorite authors. You should read his blog too.
quibono 1 days ago [-]
If I'm not mistaken there's also the author of Starlette / FastAPI
dullcrisp 2 days ago [-]
I don’t know too much about this, but strictly based on the name I assumed that you should use a heap. Is that not right?
That is indeed nowhere to be found in the article. Heap is also used to refer to dynamically allocated memory, which used to be implemented via a heap segment.
Yeah upon looking it up it appears to have nothing to do with the data structure.
When they were doing a linear scan to find a large enough free block I was waiting for a heap to come into the picture, but apparently it’s not done that way.
matheusmoreira 2 days ago [-]
I think the term "heap" is used in the sense of "pile" which evokes the image of a somewhat disorganized heap of things you can grab in any order. Contrasts with the disciplined LIFO stack.
Maybe actual heaps were used to implement this at some point. To be perfectly honest, I'm not sure.
adrian_b 2 days ago [-]
The first use in computing of the word "heap", which refers to a data structure used for priority queues or for sorting, i.e. a kind of binary tree, was in an article published in June 1964, about the algorithm "Heapsort" (by John William Joseph Williams).
The second use in computing of the word "heap", which has no relationship whatsoever with the other meaning, occurred first in the report about the programming language ALGOL 68, which was published in December 1968 by Adriaan van Wijngaarden et al.
In ALGOL 68, "heap" is used as a keyword, to indicate that a new variable must be allocated in the heap, instead of being allocated in the stack, which is the default.
So "heap" in the second sense, which is used by you and traditionally in UNIX, was coined to oppose "stack", as not being constrained by a LIFO allocate/free policy.
The word stack had been popularized by another Dutch, Edsger W. Dijkstra, in May 1960, who explained how to use a stack and a stack pointer for implementing the blocks of ALGOL 60. So both "stack" and "heap", in the senses related to memory management, come from Dutch computer scientists, and they have been used first in connection with the languages ALGOL 60 and, respectively, ALGOL 68.
Before ALGOL 68, the language IBM PL/I had used since December 1964 the terms "automatic storage" instead of "stack" (the source of the C keyword "auto") and "controlled storage" instead of "heap".
John McCarthy published the description of the garbage collector used by LISP in April 1960, but he did not use any special word for the heap, because in LISP that was the only kind of memory used for data, so unlike in PL/I or ALGOL 68 there was no need to distinguish it from memory areas that were managed in a different way.
Um, see The Second System Effect.
https://en.wikipedia.org/wiki/Second-system_effect
I also liked the color scheme.
[…]
mmap is awesome but Linux's got something even more awesome: mremap, which makes it almost trivial to manage the page-based heap.“*
⇒ they are using “freestanding” in a non-standard way. Usually, it means running on bare metal without an OS (https://en.wikipedia.org/wiki/Standalone_program: “A standalone program, also known as a freestanding program, is a computer program that does not load any external module, library function or program and that is designed to boot with the bootstrap procedure of the target processor – it runs on bare metal”
It's true that lone doesn't run on bare metal. I chose Linux for pragmatic and philosophical reasons. I'm not sure lone would ever have printed hello world if I hadn't chosen to build on top of Linux, which is also the only kernel with a stable binary interface. No other system lets you avoid the libc.
The reverse (running on bare metal and tweaking an existing malloc to run on it) looks way more logical to me.
> No other system lets you avoid the libc.
On most OSes [1] it’s relatively easy to write a libOS that just wraps the system calls. Your only dependency would be on the mapping to syscall numbers.
[1] OpenBSD is/may be (I don’t know the status of these) an exception. See https://man.openbsd.org/pinsyscalls.2, https://lwn.net/Articles/949078/
These are unstable in every operating system other than Linux.
I've written about it:
https://www.matheusmoreira.com/articles/linux-system-calls
Linux is the only kernel that promises stability in this area. On every other operating system, one must program against the system libraries.
"C built to compile with freestanding mode", rather than "C built to run bare metal".
[0] https://gcc.gnu.org/onlinedocs/gcc/C-Dialect-Options.html#in...
I only wish I could be as diligent and focused to do a thing like this, because I think it would be so much fun. I bought the Crafting Interpreters book a while ago, but still haven't got time to actually go through with it.
Well, maybe one day...
https://en.wikipedia.org/wiki/Heap_(data_structure)
That is indeed nowhere to be found in the article. Heap is also used to refer to dynamically allocated memory, which used to be implemented via a heap segment.
https://en.wikipedia.org/wiki/C_dynamic_memory_allocation
https://en.wikipedia.org/wiki/Data_segment#Heap
When they were doing a linear scan to find a large enough free block I was waiting for a heap to come into the picture, but apparently it’s not done that way.
Maybe actual heaps were used to implement this at some point. To be perfectly honest, I'm not sure.
The second use in computing of the word "heap", which has no relationship whatsoever with the other meaning, occurred first in the report about the programming language ALGOL 68, which was published in December 1968 by Adriaan van Wijngaarden et al.
In ALGOL 68, "heap" is used as a keyword, to indicate that a new variable must be allocated in the heap, instead of being allocated in the stack, which is the default.
So "heap" in the second sense, which is used by you and traditionally in UNIX, was coined to oppose "stack", as not being constrained by a LIFO allocate/free policy.
The word stack had been popularized by another Dutch, Edsger W. Dijkstra, in May 1960, who explained how to use a stack and a stack pointer for implementing the blocks of ALGOL 60. So both "stack" and "heap", in the senses related to memory management, come from Dutch computer scientists, and they have been used first in connection with the languages ALGOL 60 and, respectively, ALGOL 68.
Before ALGOL 68, the language IBM PL/I had used since December 1964 the terms "automatic storage" instead of "stack" (the source of the C keyword "auto") and "controlled storage" instead of "heap".
John McCarthy published the description of the garbage collector used by LISP in April 1960, but he did not use any special word for the heap, because in LISP that was the only kind of memory used for data, so unlike in PL/I or ALGOL 68 there was no need to distinguish it from memory areas that were managed in a different way.