Remix.run Logo
kragen 3 days ago

I agree that static and especially gradual typing add complexity, but it's a very much smaller amount of complexity than we're talking about here, so in fact it is very common to encounter dynamically-typed scripting languages that are much more complex than some languages with excellent type systems.

I think you can implement a Hindley–Milner type checker in about a page of code, not the 2000 pages of code you're talking about.

I'm not sure what you mean by "complete". H–M is complete in the sense that it's decidable and doesn't leave any holes: programs that check statically are guaranteed not to have type errors at runtime. It handles higher-order functions and parametric polymorphism (generics) out of the box, it doesn't suffer from null pointers, and it can even handle mutability. And it's fully inferrable. There are various extensions to make it more expressive (GADTs, typeclasses, subtyping, linearity, tagged arguments) but even the basic HM system is already a lot more powerful than something like the type system of C or Java 1.7.

pansa2 3 days ago | parent [-]

> it is very common to encounter dynamically-typed scripting languages that are much more complex than some languages with excellent type systems.

You're right, that statement was too general. Python is a dynamically-typed scripting language (if you exclude external tools like MyPy), and is one of the most complex languages out there.

I should have been more specific: "Any language with a reasonably-complete type system is inevitably going to be much more complex than Lua".

> I agree that static and especially gradual typing add complexity, but it's a very much smaller amount of complexity than we're talking about here [...] I think you can implement a Hindley–Milner type checker in about a page of code

aw1621107's comment shows that Luau's type checker (the "Analysis" directory) is ~60% of the project's code. Maybe there are languages where the equivalent is just a single page, but even then, type checking makes a language implementation more complex in other ways as well.

For example, Luau's AST implementation alone is 75% the size of the whole of Lua 5.1. By deferring type-checking to runtime, Lua can avoid the need to build an AST at all: the compiler can go straight from source code to bytecode.

kragen 2 days ago | parent [-]

Building an AST is also not a large amount of code, especially if you already have GC and some kind of runtime type discrimination. And building an AST simplifies a lot of other things you might want to do, not just type checking. You're speaking as if compilers were invented in the Obama presidency because previously computers didn't have enough memory for all their code, and were invariably many-person projects.

But, in fact, writing a compiler, with an AST and static types, is a common single-person term project for undergraduates, and it's something people have been doing for 70 years, since computers used magnetic drums and acoustic delay lines for RAM. We've learned techniques since then that make it easier, which is why undergraduates can do it now.

One notable example is Stephen C. Johnson's "portable C compiler", which was the main C compiler in the late 70s and early 80s. By my count the current version of it at https://github.com/PortableCC/pcc is a bit under 50,000 lines of code, including C, some kind of C++, and Fortran frontends and backends for 18 architectures, but not including lex and yacc. I just built it here on my phone, which required implementing getw() and putw() (maybe I don't know the right feature test macros) and #including <strings.h> with the "s" in aarch64/local2.c. The executables total about 320K, including C and C++ (no Fortran) and only aarch64.