One-pass Compiler

(keleshev.com)

163 points | by halst 7 days ago

14 comments

  • ufo 6 days ago

    Lua is an example of a programming language that still uses a one-pass compiler. It keeps the implementation small, which is useful in embedded contexts. Additionally, not constructing an AST helps when parsing very large files. Lua files are sometimes used to store data, sort of like JSON.

    • lebuffon 6 days ago

      Reminds me of "Let's Build a Compiler"

      https://compilers.iecc.com/crenshaw/ circa 1995

      And a new C version https://github.com/lotabout/Let-s-build-a-compiler

      • traes 6 days ago

        If anyone's interested in a simple single-pass compiler for a full language, you can also check out munificent's Crafting Interpreters[0]. It walks you through a Java-based AST interpreter and a C-based bytecode compiler.

        [0] http://craftinginterpreters.com/

        • nils-m-holm 6 days ago

          There are a few trivial optimizations that you can do even in a one-pass compiler. See http://t3x.org/t3x/comp.html

          Details are described in the book Write Your Own Compiler, http://t3x.org/t3x/book.html

          • eatonphil 6 days ago

            Cool stuff! Register management is a pain to do. It's nice to be able to leave that to LLVM or treat x86 as a stack machine.

            I wrote a compiler series exploring all three of these variations for a lisp dialect, with the compiler written in JavaScript.

            https://notes.eatonphil.com/compiler-basics-an-x86-upgrade.h...

            • AlexeyBrin 6 days ago

              Tangential - in your work-in-progress compiler book https://keleshev.com/compiling-to-assembly-from-scratch-the-... are you going to write the lexer and parser yourself, or use Flex and Bison like in the article?

              • halst 6 days ago

                In the book, lexing and parsing is done from scratch. I think tools like Flex and Bison are very handy, but for the book, my focus is on learning value.

              • stevekemp 6 days ago

                I wrote something very similar recently, also a single-pass, but with a trivial internal representation:

                https://github.com/skx/math-compiler

                The real difference between the compiler in the example, and mine, is that I handle floating-point operations and also explicitly use printf to show the output. (Because otherwise your return value is limited to an 8bit number.)

                • tom_mellior 5 days ago

                  This is nice. The -debug flag is a cool feature. But if you first build a representation of the whole program in memory, then traverse that representation to generate code, it's not really single-pass.

                  • stevekemp 5 days ago

                    Sorry, my comment was wrong: I should have said "not a single-pass". That's why I mentioned the trivial internal representation.

                    Too late to edit now!

                  • userbinator 6 days ago

                    Because otherwise your return value is limited to an 8bit number

                    Only on Linux; Windows lets you use the full 32 bits.

                    • stevekemp 5 days ago

                      That's true. Though I guess even there, there are limitations, because you can't return non-integers :)

                  • userbinator 6 days ago

                    I remember some of the old DOS compilers would generate similar code, using the stack as an actual expression evaluation stack when registers weren't sufficient. It's not a bad idea even now, since x86 has an internal cache of the top of the stack so operations on it are specifically optimised, and the opcodes are very small: push/pop reg is 1 byte, push s8 is 2.
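
                    A minimal sketch of that codegen style: compiling an expression tree by using the hardware stack as the evaluation stack. The tree shape and register choices here are invented for illustration, not taken from any of the compilers mentioned.

```python
# Sketch: compile a tiny made-up expression tree to x86-style
# assembly text, using the hardware stack as the expression
# evaluation stack. Node shapes ("lit", "add", "mul") and the
# register choices are invented for illustration.

def emit(node, out):
    if node[0] == "lit":                      # ("lit", 7)
        out.append(f"    push {node[1]}")     # push imm is a tiny encoding
    else:                                     # ("add" | "mul", left, right)
        emit(node[1], out)                    # evaluate operands onto the stack
        emit(node[2], out)
        out.append("    pop rbx")             # right operand
        out.append("    pop rax")             # left operand
        op = "add rax, rbx" if node[0] == "add" else "imul rax, rbx"
        out.append(f"    {op}")
        out.append("    push rax")            # result goes back on the stack

lines = []
emit(("add", ("lit", 3), ("mul", ("lit", 4), ("lit", 5))), lines)
print("\n".join(lines))
```

                    No register allocator is needed: the invariant is simply that each subexpression leaves exactly one value on the stack.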

                    • azhenley 6 days ago

                      I'm currently working on a 3-part blog series on writing a one-pass compiler in Python that emits C. It is quite a bit longer, though, because we build the lexer and parser from scratch. I started it because my students were having trouble with other tutorials due to all the jargon, theory, and libraries required.

                      http://web.eecs.utk.edu/~azh/blog/teenytinycompiler1.html

                      • Someone 6 days ago

                        Good text, but let’s nitpick. I’m too lazy/confident/arrogant to verify, so let me potentially embarrass myself here.

                            def getToken(self):
                                self.skipWhitespace()
                                self.skipComment()
                        
                        How does that handle multiple consecutive comments? Whitespace following a comment?

                        (returning ‘comment’ and ‘whitespace’ tokens would fix this, and would make it possible to reuse the lexer for pretty-printing/syntax coloring)

                        • azhenley 6 days ago

                          It works because a comment by definition goes until a newline, so you can’t have consecutive comments without a newline token in between.

                          I’ll look into whether there’s a better way to do it based on your suggestion, thanks!
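
                          One robust way to handle any interleaving of comments and whitespace is to loop until neither skip makes progress. A sketch with hypothetical lexer internals (not the tutorial's actual code):

```python
# Sketch of a lexer front end that tolerates any interleaving of
# whitespace and comments by looping until neither skip advances.
# Illustrative code, not the tutorial's implementation.

class Lexer:
    def __init__(self, source):
        self.source = source
        self.pos = 0

    def skip_whitespace(self):
        while self.pos < len(self.source) and self.source[self.pos] in " \t\r\n":
            self.pos += 1

    def skip_comment(self):
        # Comments run from '#' to the end of the line.
        if self.pos < len(self.source) and self.source[self.pos] == "#":
            while self.pos < len(self.source) and self.source[self.pos] != "\n":
                self.pos += 1

    def skip_trivia(self):
        while True:
            before = self.pos
            self.skip_whitespace()
            self.skip_comment()
            if self.pos == before:        # no progress: a real token is next
                break

lex = Lexer("  # first comment\n   # second\n  x")
lex.skip_trivia()
print(lex.source[lex.pos])  # → x
```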

                        • eatonphil 6 days ago

                          I've done the same thing for the same reason but in JavaScript and compiling a lisp dialect.

                          https://notes.eatonphil.com/compiler-basics-lisp-to-assembly...

                        • tails4e 6 days ago

                          That's very well done, thanks for sharing. I really like the straightforward writing style. I've never attempted to write a compiler, and have only done basic parsers for day-to-day tasks, so I like the clean, simple approach.

                          • azhenley 6 days ago

                            Very glad to hear it! Part 2 will be out in a few days and part 3 the next week.

                        • nonsince 6 days ago

                          I’ve been writing a one-pass optimising compiler for Wasm. It’s not perfect, but the code it outputs currently outperforms Cranelift's by a factor of 2-10x for many of the files in the Wasm specification testsuite.

                          • timClicks 6 days ago

                            Wow, that sounds incredible. Is that a .wasm -> .wasm compiler?

                            • jashmatthews 6 days ago

                              Is that due to SIMD or something? I got about 80% of LLVM performance when I was using CraneLift.

                            • foobar_ 7 days ago

                              That was a short and epic read.

                              > This limited both the language features that were possible and the quality of the produced code.

                              What are the limitations?

                              • tom_mellior 7 days ago

                                Older versions of C had the restriction that variables could only be declared at the start of a block. Supposedly this was due to the original C compiler being one-pass. It's easier to keep track of the size of the stack frame if you see all declarations in one place. Though I think with a frame pointer you could make it work anyway. You have essentially the same complications if you allow programmers to open a block anywhere and declare variables there. I don't know if the very first versions of C allowed this.

                                Maybe more exotically, here is a legal Haskell program:

                                    compute = f x y
                                    x = 3
                                    y = 4
                                    f = (+)
                                
                                Using typed identifiers before they are declared would not be possible in a single-pass compiler. You wouldn't know what code to emit for compute since you wouldn't even know its type until you have seen the definitions of x, y, and f.

                                Interestingly, this restriction does not apply to goto labels, which you can use first and define later: The compiler can just emit the label as written into the assembly code and let the assembler worry about patching up the jump target.

                                • klyrs 6 days ago

                                  > Interestingly, this restriction does not apply to goto labels, which you can use first and define later: The compiler can just emit the label as written into the assembly code and let the assembler worry about patching up the jump target.

                                  A one-pass compiler can manage this by maintaining a mapping of undefined labels to a list of goto statements that refer to them. Once the definition is located, unwind the list and fill in the jump values. Types are trickier because the size is unknown.
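
                                  That bookkeeping can be sketched in a few lines, assuming code is emitted into a mutable list rather than printed straight out (the instruction format here is invented):

```python
# Sketch of backpatching forward gotos in a one-pass code generator.
# Code is emitted into a mutable list so jump targets can be patched
# once the label's definition is finally seen.

code = []                    # emitted instructions, index = address
labels = {}                  # label name -> address
fixups = {}                  # undefined label -> list of jump sites

def emit_goto(label):
    if label in labels:
        code.append(("JMP", labels[label]))
    else:
        fixups.setdefault(label, []).append(len(code))
        code.append(("JMP", None))          # placeholder target

def define_label(label):
    labels[label] = len(code)
    for site in fixups.pop(label, []):      # unwind the fixup list
        code[site] = ("JMP", labels[label])

emit_goto("end")             # used before it is defined
code.append(("OP", "work"))
define_label("end")          # patches the earlier JMP
print(code)                  # → [('JMP', 2), ('OP', 'work')]
```

                                  Labels work because a jump target has a fixed, known size; a forward reference to a typed variable would leave the amount of stack space to reserve unknown.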

                                  • tom_mellior 6 days ago

                                    There is no way to "fill in the jump values" if you have already emitted the goto to a place you can't modify later. Like the featured article does, emitting its code with printf. I agree that you could emit code to an intermediate buffer, fix it up later, and still call this (a slightly more relaxed version of) one-pass compilation.

                                    • klyrs 6 days ago

                                      Fair point... kinda feels like passing the buck, but not having that memory overhead is nice

                                  • benibela 6 days ago

                                    >Older versions of C had the restriction that variables could only be declared at the start of a block.

                                    Pascal also has that restriction

                                    Although Delphi does not have it anymore. But FreePascal still has it, even in Delphi-compatible mode. The FreePascal developers have said they will keep that restriction to improve readability: the code could not be read anymore if variables were placed willy-nilly everywhere.

                                    • renox 5 days ago

                                      I’ve heard the readability justification for the C restriction, but it doesn’t make sense: it prevents ‘constification’ of variables (making a variable hold only one value).

                                      • jasonlhy 5 days ago

                                        I wonder if Delphi is still a one-pass compiler?

                                  • jjice 6 days ago

                                    I'm actually working on a Pascal compiler right now. Register management has been a really tricky part for me, but I didn't even think of treating an x86 processor as a stack-based machine. Very clever, especially when you just want a simple compiler and not anything super optimized.

                                    • lebuffon 6 days ago

                                      A simple optimization is to make a stack machine with the top of stack cached in a register. This can improve speed by about 10% in general operation.
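
                                      A toy illustration of the idea (invented opcode names): the top of stack lives in a local variable, so a binary operation touches the real stack once instead of three times.

```python
# Toy stack-machine loop with the top of stack cached in a local
# variable ("tos"): a binary op does one pop instead of two pops
# and a push. Opcode names are made up for illustration.

def run(program):
    stack = []
    tos = None                        # cached top of stack
    for op, *args in program:
        if op == "PUSH":
            if tos is not None:
                stack.append(tos)     # spill the old top
            tos = args[0]
        elif op == "ADD":
            tos = stack.pop() + tos   # one stack access, not three
        elif op == "MUL":
            tos = stack.pop() * tos
    return tos

print(run([("PUSH", 3), ("PUSH", 4), ("ADD",)]))  # → 7
```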

                                    • bogomipz 6 days ago

                                      The author states:

                                      >"An optimizing compiler would constant-fold our expression into a single number ahead of time."

                                      Could someone say what is meant by constant-folding here?

                                      • stevekemp 6 days ago

                                        As a concrete example I wrote a simple interpreter a while back, in golang. Operations would be compiled to a simple bytecode which would be executed at runtime. So:

                                             3 + 4
                                        
                                        Would get "compiled" to:

                                             PUSH_INT 3
                                             PUSH_INT 4
                                             OP_PLUS
                                        
                                        As you can imagine this was a virtual stack-machine. After a while I started folding these operations:

                                        * If there was a "push int"

                                        * And another "push int"

                                        * And a maths operation.

                                        * Then replace.

                                        So in my case I'd first generate the naive version, then have a separate "optimise" pass which would rewrite the program:

                                             PUSH_INT 7
                                             NOP
                                             NOP
                                             NOP
                                             NOP
                                        
                                        A later optimization pass would remove consecutive NOP operations. I made a brief blog-post here:

                                        https://blog.steve.fi/adventures_optimizing_a_bytecode_based...

                                        There were a couple of bugs found along the way, but the actual implementation of this optimization pass was pretty simple:

                                        https://github.com/skx/evalfilter/pull/66/files

                                        The biggest issue I had to face was that "1 + 3 => 4" is trivial, but my virtual machine only supports loading integers. So I couldn't collapse "2 * 3.3" into the constant "6.6".
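
                                        The rewrite described above can be sketched as a peephole pass over a toy instruction list (not the actual evalfilter encoding):

```python
# Peephole pass over a toy bytecode: fold PUSH_INT a, PUSH_INT b,
# OP_PLUS/OP_MUL into one PUSH_INT, padding with NOPs so offsets
# (and therefore jump targets) stay unchanged.
# Toy encoding, not the actual evalfilter bytecode.

def fold_constants(code):
    code = list(code)
    for i in range(len(code) - 2):
        a, b, op = code[i], code[i + 1], code[i + 2]
        if (a[0] == "PUSH_INT" and b[0] == "PUSH_INT"
                and op[0] in ("OP_PLUS", "OP_MUL")):
            value = a[1] + b[1] if op[0] == "OP_PLUS" else a[1] * b[1]
            code[i] = ("PUSH_INT", value)
            code[i + 1] = ("NOP",)        # padding keeps offsets stable
            code[i + 2] = ("NOP",)
    return code

print(fold_constants([("PUSH_INT", 3), ("PUSH_INT", 4), ("OP_PLUS",)]))
# → [('PUSH_INT', 7), ('NOP',), ('NOP',)]
```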

                                        • traes 6 days ago

                                          For anyone wondering, the NOPs are so that jumps don't get messed up. Assembly level GOTOs generally work on byte/word offsets, so if something intended to jump back before the calculation but was expecting two PUSH_INTs and an OP_PLUS (5 words: 3 opcodes, 2 arguments) but there was only a single PUSH_INT (2 words), it would jump back 3 words too far.

                                          • stevekemp 6 days ago

                                            Good clarification, and exactly right!

                                            Later I do walk back over the bytecode and remove the nops. But I have to update all JMP/CALL instructions to cope with the changed destination offsets. Not a hard job in my case, as there are only a couple of instructions which refer to byte-offsets.

                                            (If I allowed "LD A,[BC]", or similar permutations, I'd have a lot more work to do.)
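
                                            That fixup pass can be sketched like this, assuming one list slot per instruction and absolute jump targets (the real bytecode uses byte offsets and more opcodes):

```python
# Sketch: strip NOPs and retarget absolute jumps via an
# old-index -> new-index map. Assumes one list slot per
# instruction; real bytecode with variable-width encodings
# would track byte offsets instead.

def strip_nops(code):
    new_index = {}                  # old instruction index -> new index
    out = []
    for i, ins in enumerate(code):
        new_index[i] = len(out)     # where this (or the next kept) op lands
        if ins[0] != "NOP":
            out.append(ins)
    # second walk: rewrite every instruction that carries a target
    return [("JMP", new_index[ins[1]]) if ins[0] == "JMP" else ins
            for ins in out]

code = [("PUSH_INT", 7), ("NOP",), ("NOP",), ("JMP", 0)]
print(strip_nops(code))  # → [('PUSH_INT', 7), ('JMP', 0)]
```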

                                        • war1025 6 days ago

                                          Basically you have all the info you need at compile time, so you just do the work then rather than having to do it every time while the program is running.

                                          Ex: 4 + 4 would be condensed down to just 8 at compile time rather than writing the code to get 4 and 4 both into registers and add them at execution time.
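
                                          As a sketch, the same idea over a tiny invented expression tree: an operator whose operands both fold to literals is replaced by the computed literal.

```python
# Recursive constant folding over a tiny invented expression tree:
# an operator node whose operands both fold to literals is replaced
# by the computed literal; anything else is left in place.

def fold(node):
    if node[0] not in ("add", "mul"):     # literal, variable, etc.
        return node
    left, right = fold(node[1]), fold(node[2])
    if left[0] == "lit" and right[0] == "lit":
        value = left[1] + right[1] if node[0] == "add" else left[1] * right[1]
        return ("lit", value)
    return (node[0], left, right)         # partially foldable: keep the op

print(fold(("add", ("lit", 4), ("lit", 4))))  # → ('lit', 8)
```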

                                          • tom_mellior 6 days ago

                                            https://en.wikipedia.org/wiki/Constant_folding: "Constant folding is the process of recognizing and evaluating constant expressions at compile time rather than computing them at runtime. Terms in constant expressions are typically simple literals, such as the integer literal 2, but they may also be variables whose values are known at compile time. Consider the statement:

                                                i = 320 * 200 * 32;
                                            
                                            Most compilers would not actually generate two multiply instructions and a store for this statement. Instead, they identify constructs such as these and substitute the computed values at compile time (in this case, 2,048,000)."

                                            • wlib 6 days ago

                                              Say you have a simple expression within your code: 4+3*2. Rather than computing the result at run time, you can reduce (fold) that expression into 10, because it's a simple constant that only needs to be computed once.

                                            • non-e-moose 6 days ago

                                                A one-pass compiler is going to generate code which is slower than a modern JIT/interpreter. Why is a one-pass compiler interesting?

                                              • saagarjha 6 days ago

                                                Because they're really fast. And the lower tiers of modern JITs genuinely need compilation speed, so being able to do code generation in one pass there is a huge boon.

                                                • Iwan-Zotow 6 days ago

                                                  How so?

                                                  Say, the F# compiler is close to single-pass, I believe, producing CIL, which goes to a JIT/AOT compiler later.