Building an Interpreter from Scratch
I've built an interpreter twice, once in Go, once in C. Both times I went in thinking it'd be a fun side project, and both times it ended up being one of the more humbling things I've done as an engineer.
Why bother?
Honestly, curiosity. I was writing Go day to day and at some point I just wondered: what actually happens when I type x := 5 and hit run? Like, what is the computer actually doing with that? I'd been using languages for years without really knowing the answer.
So I picked up Thorsten Ball's Writing An Interpreter In Go and started there.
The pipeline
At a high level, a tree-walking interpreter like my Go one pushes your code through the same few stages:
Source Code → Lexer → Tokens → Parser → AST → Evaluator → Result
- The lexer breaks your source text into tokens. let x = 5 + 3 becomes something like [LET, IDENT(x), ASSIGN, INT(5), PLUS, INT(3)].
- The parser turns that flat list of tokens into a tree, the Abstract Syntax Tree. This is where operator precedence gets encoded.
- The evaluator walks the tree and actually computes stuff.
Each stage is pretty self-contained, which makes the whole thing more approachable than it sounds.
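To make the first stage concrete, here's a minimal lexer sketch. It isn't the code from the book: the Token struct, the Lex function, and the kind names are made up for illustration, and it only understands let, identifiers, integers, = and +.

```go
package main

import "fmt"

// Token is a hypothetical token shape: a kind plus the text it matched.
type Token struct {
	Kind string
	Text string
}

func isLetter(c byte) bool {
	return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_'
}

func isDigit(c byte) bool { return c >= '0' && c <= '9' }

// Lex splits src into tokens in a single left-to-right pass.
func Lex(src string) []Token {
	var toks []Token
	for i := 0; i < len(src); {
		c := src[i]
		switch {
		case c == ' ' || c == '\t' || c == '\n':
			i++ // skip whitespace
		case c == '=':
			toks = append(toks, Token{"ASSIGN", "="})
			i++
		case c == '+':
			toks = append(toks, Token{"PLUS", "+"})
			i++
		case isDigit(c):
			start := i
			for i < len(src) && isDigit(src[i]) {
				i++
			}
			toks = append(toks, Token{"INT", src[start:i]})
		case isLetter(c):
			start := i
			for i < len(src) && isLetter(src[i]) {
				i++
			}
			word := src[start:i]
			if word == "let" {
				toks = append(toks, Token{"LET", word}) // keyword
			} else {
				toks = append(toks, Token{"IDENT", word})
			}
		default:
			// anything unrecognized becomes an ILLEGAL token
			toks = append(toks, Token{"ILLEGAL", string(c)})
			i++
		}
	}
	return toks
}

func main() {
	for _, t := range Lex("let x = 5 + 3") {
		fmt.Printf("%s(%s) ", t.Kind, t.Text)
	}
	fmt.Println()
}
```

The lexer knows nothing about grammar. It never checks that the = comes after an identifier, for instance. That's the parser's job, which is part of why the stages stay so separate.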
The part that messed me up
The parser. Specifically, implementing a Pratt parser. The idea is that each token type carries its own parsing behavior: how it acts at the start of an expression (prefix position) versus in the middle of one (infix position). Once it clicked it felt really elegant, but getting there took me way longer than I expected.
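Here's a stripped-down sketch of the core loop, under a few assumptions. A fuller Pratt parser typically registers a prefix function and an infix function per token type. This version condenses that into a switch for the prefix case and a precedence table for the infix case. The Parser type, the precedence values, and the choice to return a parenthesized string instead of a real AST are all illustrative.

```go
package main

import "fmt"

// Binding powers for infix operators: higher binds tighter.
// The exact numbers are arbitrary; only their order matters.
var precedence = map[string]int{"+": 1, "-": 1, "*": 2, "/": 2}

// Parser walks a pre-split token slice; a real one would pull from a lexer.
type Parser struct {
	toks []string
	pos  int
}

func (p *Parser) peek() string {
	if p.pos < len(p.toks) {
		return p.toks[p.pos]
	}
	return "" // end of input
}

func (p *Parser) next() string {
	t := p.peek()
	p.pos++
	return t
}

// parsePrefix handles a token that appears at the start of an expression.
func (p *Parser) parsePrefix() string {
	t := p.next()
	switch t {
	case "-":
		// prefix minus binds tighter than every infix operator in the table
		return "(-" + p.ParseExpr(3) + ")"
	case "(":
		inner := p.ParseExpr(0)
		p.next() // consume ")"; error handling omitted in this sketch
		return inner
	default:
		return t // integer literal or identifier
	}
}

// ParseExpr parses an expression, consuming infix operators only while
// they bind tighter than minPrec.
func (p *Parser) ParseExpr(minPrec int) string {
	left := p.parsePrefix()
	for {
		op := p.peek()
		prec, isInfix := precedence[op]
		if !isInfix || prec <= minPrec {
			return left
		}
		p.next() // consume the operator
		right := p.ParseExpr(prec)
		left = "(" + left + " " + op + " " + right + ")"
	}
}

func main() {
	p := &Parser{toks: []string{"1", "+", "2", "*", "3"}}
	fmt.Println(p.ParseExpr(0)) // (1 + (2 * 3))
}
```

The same - token shows up in both roles: parsePrefix treats it as negation, the loop treats it as subtraction. Precedence falls out of the minPrec argument. When the parser is working on the right side of a +, a following * is allowed to grab the operand first, but another + is not, which is also what makes 1 - 2 - 3 group to the left.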
Closures were the other thing. To make functions first-class, a function value can't be just a parameter list and a body, because it might get called much later and somewhere else entirely. You have to capture the environment at the time the function was defined and drag it along with the function value. Getting that to work correctly with recursive functions took a few embarrassing bug-fixing sessions.
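A rough sketch of the environment-capture idea, with some loud assumptions: Env, Function, and Apply are hypothetical names, and the function body is a plain Go func standing in for "evaluate this AST in the given environment". The two helpers mimic what an evaluator would do for a curried adder and for a recursive factorial.

```go
package main

import "fmt"

// Value is any runtime value in this toy setup (ints and functions).
type Value any

// Env is one scope plus a link to the scope that encloses it.
type Env struct {
	vars  map[string]Value
	outer *Env
}

func NewEnv(outer *Env) *Env {
	return &Env{vars: map[string]Value{}, outer: outer}
}

// Get looks a name up in this scope, then in each enclosing scope.
func (e *Env) Get(name string) Value {
	for s := e; s != nil; s = s.outer {
		if v, ok := s.vars[name]; ok {
			return v
		}
	}
	return nil // unbound name; a real evaluator would report an error
}

func (e *Env) Set(name string, v Value) { e.vars[name] = v }

// Function is a closure: code plus the environment it was defined in.
type Function struct {
	Param string
	Body  func(env *Env) Value // stand-in for evaluating the body's AST
	Env   *Env                 // captured at definition time
}

// Apply runs fn in a fresh scope whose parent is the captured
// environment, not the caller's environment.
func Apply(fn *Function, arg Value) Value {
	local := NewEnv(fn.Env)
	local.Set(fn.Param, arg)
	return fn.Body(local)
}

// addViaClosure mimics: let newAdder = fn(x) { fn(y) { x + y } }
func addViaClosure(a, b int) int {
	global := NewEnv(nil)
	newAdder := &Function{Param: "x", Env: global, Body: func(env *Env) Value {
		// the inner function captures the scope where x is bound
		return &Function{Param: "y", Env: env, Body: func(inner *Env) Value {
			return inner.Get("x").(int) + inner.Get("y").(int)
		}}
	}}
	adder := Apply(newAdder, a).(*Function)
	return Apply(adder, b).(int)
}

// factorial mimics: let fact = fn(n) { if n <= 1 { 1 } else { n * fact(n - 1) } }
func factorial(n int) int {
	global := NewEnv(nil)
	fact := &Function{Param: "n", Env: global, Body: func(env *Env) Value {
		n := env.Get("n").(int)
		if n <= 1 {
			return 1
		}
		self := env.Get("fact").(*Function) // found via the captured env
		return n * Apply(self, n-1).(int)
	}}
	global.Set("fact", fact) // the name is bound after the function value exists
	return Apply(fact, n).(int)
}

func main() {
	fmt.Println(addViaClosure(2, 3)) // 5
	fmt.Println(factorial(5))        // 120
}
```

The recursive case works here only because the function holds a pointer to its defining environment instead of a copy. The name fact gets bound after the function value is created, so a snapshot taken at definition time wouldn't contain it and the recursive lookup would fail.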
Go vs C
The Go version was genuinely fun. Garbage collection, interfaces, a good standard library. I could focus on the language semantics and not much else.
The C version (a bytecode VM, following Robert Nystrom's Crafting Interpreters) was a totally different experience. Every allocation is yours to manage. You write your own hash table. You implement a garbage collector. At one point I had a bug where my GC was collecting objects that were still alive. That was a fun afternoon.
It's painful, but by the end there's nothing left you don't understand.
What I got out of it
Building something twice, in two different languages with two different approaches, taught me more about how languages work than years of just using them. I also have a lot more respect for language designers now. The tradeoffs involved are subtle and the decisions compound.
If you're thinking about doing this, just start. Both books I mentioned are really good and surprisingly readable.