samwillis 3 days ago

Oliver is doing awesome work here. A few interesting points:

- Porffor can use typescript types to significantly improve the compilation. It's in many ways more exciting as a TS compiler.

- There's no GC yet, and likely will be a while before it gets any. But you can get very far with no GC, particularly if you are doing something like serving web requests. You can fork a process per request and throw it away each time reclaiming all memory, or have a very simple arena allocator that works at the request level. It would be incredibly performant and not have the overhead of a full GC implementation.

- many of the restrictions that people associate with JS are due to VMs being designed to run untrusted code. If you compile your trusted TS/JS to native you can do many new things, such as use traditional threads, fork, and have proper low level memory access. Separating the concept of TS/JS from the runtime is long overdue.

- using WASM as the IR (intermediate representation) is inspired. It is unlikely that many people would run something compiled with Porffor in a WASM runtime, but the portability it brings is very compelling.

This experiment from Oliver doesn't show that Porffor is ready for production, but it does validate that he is on the right track, and that the ideas he is exploring are correct. That's the important takeaway. Give it 12 months and exciting things will be happening.

spankalee 3 days ago | parent | next [-]

I'm very excited by Porffor too, but a lot of what you've said here isn't correct.

> - Porffor can use typescript types to significantly improve the compilation. It's in many ways more exciting as a TS compiler.

Porffor could use types, but TypeScript's type system is very unsound and doing so could lead to serious bugs and security vulnerabilities. I haven't kept track of what Oliver's doing here lately, but I think the best and still safe thing you could do is compile an optimistic, optimized version of functions (and maybe basic blocks) based on the declared argument types, but you'd still need a type guard to fall back to the general version when the types aren't as expected.

This isn't far from what a multi-tier JIT does, and the JIT has a lot more flexibility to generate functions for the actual observed types, not just the declared types. This can be a big help when the declared types are interfaces, but in an execution you only see specific concrete types.
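
A minimal sketch of that "optimistic version plus type guard" idea, written by hand in TypeScript. All function names here are hypothetical, and this is not a description of what Porffor actually emits; it only illustrates the shape of the fallback.

```typescript
// Hypothetical names throughout; not Porffor's real codegen.

// General version: accepts whatever JS semantics allow for `+`.
function addGeneric(a: unknown, b: unknown): unknown {
  return (a as any) + (b as any);
}

// Specialized version: only valid when both inputs really are numbers.
function addNumbersFast(a: number, b: number): number {
  return a + b;
}

// Entry point for a function declared as `add(a: number, b: number)`.
// The declared types select the fast path; the guard keeps it safe.
function add(a: number, b: number): unknown {
  if (typeof a === "number" && typeof b === "number") {
    return addNumbersFast(a, b);
  }
  return addGeneric(a, b);
}

// A caller whose static type is wrong at runtime (forced with a cast).
const misTyped = "4" as unknown as number;

const fast = add(2, 3);            // takes the guarded fast path
const fallback = add(misTyped, 1); // falls back to the general version
console.log(fast, fallback);       // 5 "41"
```

Without the guard, the specialized path would run on a string and the result would depend on whatever the compiled code assumed about the value's representation.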

> or have a very simple arena allocator that works at the request level.

This isn't viable. JS semantics mean that the request handling path can generate objects that are held from outside the request's arena. You can't free them or you'd get use-after-free problems.
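
A small sketch of how that escape happens in ordinary code. The `sessions` map and `handleRequest` function are hypothetical stand-ins for any long-lived state and any request handler.

```typescript
// Hypothetical handler; `sessions` stands in for any long-lived state.
interface Session {
  user: string;
  lastSeen: number;
}

// Lives outside any single request.
const sessions = new Map<string, Session>();

function handleRequest(user: string, now: number): Session {
  // Allocated while handling the request...
  const session: Session = { user, lastSeen: now };
  // ...but stored in a structure that outlives the request.
  sessions.set(user, session);
  return session;
}

const first = handleRequest("ada", 1);

// If a request-level arena were freed at this point, the map entry would
// refer to freed memory. With a GC the object is simply still alive.
const survivor = sessions.get("ada");
console.log(survivor === first); // true
```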

> - many of the restrictions that people associate with JS are due to VMs being designed to run untrusted code

This is true to some extent, but most of the restrictions are baked into the language design. JS is a single-threaded non-shared memory language by design. The lack of threads has nothing to do with security. Other sandboxed languages, famously Java, have threads. Apple experimented with multithreaded JS and it hasn't moved forward not because of security but because it breaks JS semantics. Fork is possible in JS already, because it's a VM concept, not a language concept. Low-level memory access would completely break the memory model of JS and open up even trusted code to serious bugs and security vulnerabilities.

> It is unlikely that many people would run something compiled with Porffor in a WASM runtime

Running JS in WASM is actually the thing I'm most excited about from Porffor. There are more and more WASM runtimes, and JS is handicapped there compared to Rust. Being able to intermix JS, Rust, and Go in a single portable, secure runtime is a killer feature.

samwillis 3 days ago | parent [-]

> I haven't kept track of what Oliver's doing here lately

Please do go and check up on what the state of using types to inform the compiler is (I'm not incorrect).

On the arena allocator, I wasn't clear enough. As stated elsewhere, this was in relation to having something similar to isolates - each having a memory space that's cleaned up on exit.

Python has almost identical semantics to JS, and has threads - there is nothing in the ECMAScript standard that would prevent them.

spankalee 3 days ago | parent [-]

It is absolutely true that it is unsafe to trust TypeScript types. I've chatted briefly with Oliver on socials before and he knows this. So I am a bit confused by this issue: https://github.com/CanadaHonk/porffor/issues/234 which says "presume the types are good and have been validated by the user before compiling". This is just not a thing that's possible. Types are often wrong in subtle ways. Casts throw everything out the window.
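
Two well-known ways that code which type-checks can still hold the wrong kind of value at runtime. The `Animal` and `Dog` types are made up for illustration.

```typescript
interface Animal {
  name: string;
}
interface Dog extends Animal {
  bark(): string;
}

const dogs: Dog[] = [{ name: "Rex", bark: () => "woof" }];

// Arrays are covariant in TypeScript, so this assignment is accepted...
const animals: Animal[] = dogs;
// ...and so is pushing a non-Dog through the alias.
animals.push({ name: "Tom" });

const results: string[] = [];
for (const dog of dogs) {
  try {
    results.push(dog.bark()); // `dog` is statically a Dog
  } catch (err) {
    results.push(err instanceof TypeError ? "TypeError" : "other error");
  }
}
console.log(results); // ["woof", "TypeError"]

// A cast is even more direct: the checker simply believes it.
const n = JSON.parse('"4"') as number;
console.log(typeof n); // "string"
```

A compiler that laid out `dogs` as an array of `Dog` structs based on the declaration alone would be reading a missing method slot on the second element.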

Dart had very similar issues and constraints and they couldn't do a proper AOT compiler that considered types until they made the type system sound. TypeScript can never do that and maintain compatibility with JS.

Isolates are already available as workers. The key thing is that you can't have shared memory, otherwise you can get cross-Isolate references and have all the synchronization problems of threads.

And ECMAScript is simply just specified as a single-threaded language. You break it with shared-memory threads.

In JS, this always logs '4'. With threads that's not always the case.

    let x = 4;
    console.log(x);

nicoburns 3 days ago | parent [-]

> It is absolutely true that it is unsafe to trust TypeScript types... This is just not a thing that's possible.

Well... unsafe and impossible aren't quite the same thing. I guess this is possible if you throw out "safe" as a requirement?

gibolt 3 days ago | parent | prev | next [-]

Based on how much imported libraries are relied upon, it makes sense to treat everything as untrusted. Unless you write every line yourself/in-house, code should be considered untrusted.

I would be curious which attack vectors change or become safe after compiling though.

samwillis 3 days ago | parent | next [-]

The point of the js engine sandbox is to protect the user in the browser - it's completely redundant on the server. Supply chain attacks are real, but only Deno has tried to fix that through permissions/rules.

I don't think anything changes with compile to native on the server.

rafram 3 days ago | parent [-]

Totally disagree. A spec-compliant JS engine has to support the features that allow vulnerabilities like prototype pollution, which can be exploited through user input alone.
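
For readers unfamiliar with it, a minimal sketch of prototype pollution through user input. `naiveMerge` is a hypothetical helper written for this example, in the style of the recursive merge utilities where this class of bug has typically been found.

```typescript
type Json = { [key: string]: unknown };

// A naive recursive merge (hypothetical, deliberately unsafe).
function naiveMerge(target: Json, source: Json): Json {
  for (const key of Object.keys(source)) {
    const value = source[key];
    if (typeof value === "object" && value !== null) {
      if (typeof target[key] !== "object" || target[key] === null) {
        target[key] = {};
      }
      naiveMerge(target[key] as Json, value as Json);
    } else {
      target[key] = value;
    }
  }
  return target;
}

// Attacker-controlled request body. JSON.parse creates "__proto__" as an
// own property, so Object.keys() visits it.
const body = JSON.parse('{"__proto__": {"isAdmin": true}}') as Json;
naiveMerge({}, body);

// An unrelated plain object now appears to have the property.
const unrelated: Json = {};
const polluted = unrelated["isAdmin"] === true;
console.log(polluted); // true

// Undo the damage so the rest of the program is unaffected.
delete (Object.prototype as any).isAdmin;
```

Nothing here depends on a sandbox or a JIT: reading `target["__proto__"]` returns `Object.prototype` because the language says so, so an ahead-of-time compiler that follows the spec inherits the same behavior.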

hinkley 3 days ago | parent | prev [-]

Also none of the third party code will be thread safe. Hell, some of it isn’t even reentrant.

bastawhiz 3 days ago | parent | prev [-]

> many of the restrictions that people associate with JS are due to VMs being designed to run untrusted code. If you compile your trusted TS/JS to native you can do many new things, such as use traditional threads, fork, and have proper low level memory access. Separating the concept of TS/JS from the runtime is long overdue.

This is just outright wrong. JS limitations come from lots of things:

1. The language has almost zero undefined behavior by design. Code will essentially never behave differently on different platforms.

2. JS has traditional threads in the form of web workers. This interface exists not for untrusted code but because of thread safety. That's a language design, like channels in Go, rather than a sandboxing consideration.

3. Pretty much every non-browser JS runtime has the ability to fork.

4. JS is fully garbage collected, of course you don't get your own memory management. You can use buffers to manage your own memory if you really want to. WASM lets you manage your own memory and it can run "untrusted" code in the browser with the WASM runtime; your example just doesn't hold water. There's no way you could fiddle with the stack or heap in JS without making it not JS.

5. The language comes with thirty years of baggage, and the language spec almost never breaks backwards compatibility.
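
As an illustration of point 4, "use buffers to manage your own memory" can look roughly like this. `BumpArena` is a hypothetical helper, not an existing library API.

```typescript
// Hypothetical helper: a bump allocator over one fixed ArrayBuffer.
class BumpArena {
  private readonly view: DataView;
  private offset = 0;

  constructor(sizeInBytes: number) {
    this.view = new DataView(new ArrayBuffer(sizeInBytes));
  }

  // Reserve `bytes` bytes (rounded up to a multiple of 8) and return the offset.
  alloc(bytes: number): number {
    const aligned = (bytes + 7) & ~7;
    if (this.offset + aligned > this.view.byteLength) {
      throw new RangeError("arena exhausted");
    }
    const start = this.offset;
    this.offset += aligned;
    return start;
  }

  writeF64(ptr: number, value: number): void {
    this.view.setFloat64(ptr, value);
  }

  readF64(ptr: number): number {
    return this.view.getFloat64(ptr);
  }

  // "Free" everything at once by rewinding the bump pointer.
  reset(): void {
    this.offset = 0;
  }
}

const arena = new BumpArena(64);
const a = arena.alloc(8); // offset 0
const b = arena.alloc(8); // offset 8
arena.writeF64(a, 1.5);
arena.writeF64(b, 2.5);
const sum = arena.readF64(a) + arena.readF64(b);
console.log(sum); // 4

arena.reset();
const c = arena.alloc(8);
console.log(c === a); // true: the same bytes are handed out again
```

The "pointers" are plain integers into a buffer, so the engine's own heap stays garbage collected, and nothing stops a stale offset from being read after `reset()`.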

Ironically Porffor has no IO at the moment, which is present in literally every JS runtime. It really has nothing to do with untrusted code like you're suggesting.

> You can fork a process per request and throw it away each time reclaiming all memory, or have a very simple arena allocator that works at the request level. It would be incredibly performant and not have the overhead of a full GC implementation.

You also must admit that this would make Porffor incompatible with existing runtimes. Code today can modify the global state, and that state can and does persist across requests. It's a common pattern to keep in-memory caches or to lazily initialize libraries. If every request is fully isolated in the future but not now, you can end up with performance cliffs or a system where a series of requests on Node return different results than a series of requests on Porffor.
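
A toy simulation of that difference. `createApp` is hypothetical and stands in for loading the application's module-level state; calling it once models a long-lived process, calling it per request models full isolation.

```typescript
// Hypothetical app module with a lazily initialised in-memory cache.
function createApp(): (key: string) => string {
  let cache: Map<string, number> | undefined; // module-level state

  return function handle(key: string): string {
    if (cache === undefined) {
      cache = new Map<string, number>(); // lazy initialisation on first request
    }
    const hits = (cache.get(key) ?? 0) + 1;
    cache.set(key, hits);
    return `${key}: seen ${hits} time(s)`;
  };
}

const requests = ["a", "a", "b", "a"];

// Long-lived process: one app instance serves every request.
const persistentApp = createApp();
const persistent = requests.map((key) => persistentApp(key));

// Fresh state per request: every request gets a new instance.
const isolated = requests.map((key) => createApp()(key));

console.log(persistent[3]); // "a: seen 3 time(s)"
console.log(isolated[3]);   // "a: seen 1 time(s)"
```

The same series of requests produces different responses, and the isolated variant pays the lazy initialisation cost on every request.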

As for arena allocation, this makes it even less compatible with Node (if not intractable). It means you can't write (in JS) any code that mutates memory that was initialized during startup. If you store a reference to an object in an arena in an object initialized during startup, at the end of the request when the arena is freed you now have a pointer into uninitialized memory.

How do you tell the developer what they can and cannot mutate? You can't, because any existing variable might be a reference to memory initialized during startup. Your function might receive an object as an argument that was initialized during startup or one that wasn't, and there's no way to know whether it's safe to mutate it.

Long story short, JS must have a garbage collector to free memory, or it's not JS.

> It is unlikely that many people would run something compiled with Porffor in a WASM runtime, but the portability it brings is very compelling.

Node (via SEA in v20), bun, and deno all have built in tooling for generating a self-contained binary. Granted, the runtime needs to work for your OS and CPU, but the exact same thing could be said about a WASM runtime.

And of course there are hundreds of mature bundlers that can compile JS into a single file that runs in various runtimes without ever thinking about platform. It's weird to even consider portability of JS as a benefit because JS is already almost maximally portable.

> This experiment from Oliver doesn't show that Porffor is ready for production, but it does validate that he is on the right track, and that the ideas he is exploring are correct.

It validates that the approach to building a compiler is correct, but it says little about whether the project will eventually be usable and good. It's unlikely it'll get faster, because robust JS compatibility will require more edge cases to be handled than it currently does, and as Porffor's own README says, it's still slower than most JITted runtimes. A stable release might not yield much.

cxr 3 days ago | parent | next [-]

What a strange (and strangely adversarial) comment.

Almost none of your criticisms connects with anything that the other person wrote.

hinkley 3 days ago | parent | prev | next [-]

> JS has traditional threads in the form of web workers.

There is no language I’m aware of where workers behave like “traditional threads”. They’re isolates. Not threads.

samwillis 3 days ago | parent | prev [-]

Web workers don't share memory (other than SAB) with the main thread, they are far from traditional threads. These APIs are designed the way they are to protect end users, stop sites from consuming resources or bad code blocking the main thread. None of that is needed to be that way on the server. There is zero reason that a JS implementation cannot implement proper threads within the same memory space. The issue is that all js engines are derived from the browser where that isn't wanted, so they simply don't have support for it. Traditional threads need careful use.

Nowhere did I say that full, or even any, compatibility with Node is needed - it isn't.

We need to stop conflating JS the language with the runtimes.

A JS runtime absolutely can get by without a GC, you just never dealloc and consume indefinitely. That doesn't change any semantics of the language, if a value/object is inaccessible, it's inaccessible...

An arena allocator provides a route to say embedding a js-to-native app in a single threaded web server like Nginx, you don't need to share memory between what in effect become "isolates".

no_wizard 3 days ago | parent | next [-]

NodeJS has worker threads[0] already

[0]: https://nodejs.org/docs/latest/api/worker_threads.html

crabmusket 3 days ago | parent [-]

These are very similar to web workers, they don't share memory other than via SharedArrayBuffer instances. For anything else you use message passing.
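
A short Node sketch of that distinction, assuming a Node version that supports the `node:` import prefix. The worker body is passed as a string with `eval: true` only to keep the example in one file, and blocking the main thread with `Atomics.wait` is done only to keep the demo linear.

```typescript
import { Worker } from "node:worker_threads";

// Worker body as a CommonJS string so the sketch stays in one file.
const workerSource = `
  const { workerData } = require("node:worker_threads");
  workerData.plain.count = 99;              // mutates the worker's own copy
  const view = new Int32Array(workerData.shared);
  Atomics.store(view, 0, 42);               // writes to memory both sides see
  Atomics.notify(view, 0);
`;

const shared = new SharedArrayBuffer(4);
const view = new Int32Array(shared);
const plain = { count: 0 };

// `workerData` is structured-cloned: `plain` is copied, `shared` is shared.
new Worker(workerSource, {
  eval: true,
  workerData: { shared, plain },
});

// Block until the worker changes view[0] away from 0, or 5 seconds pass.
Atomics.wait(view, 0, 0, 5000);

const sharedValue = Atomics.load(view, 0);
console.log(sharedValue); // 42: the write through the SharedArrayBuffer is visible
console.log(plain.count); // 0: the ordinary object was copied, not shared
```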

bastawhiz 3 days ago | parent | prev [-]

> Web workers don't share memory (other than SAB) with the main thread, they are far from traditional threads. These APIs are designed the way they are to protect end users, stop sites from consuming resources or bad code blocking the main thread. None of that is needed to be that way on the server.

It doesn't protect end users any more than it protects servers. Node could easily expose raw threading, but they don't because nearly the whole language isn't thread safe and everything would break. It has almost nothing to do with protecting users, it's a language design decision that enforces other design constraints.

> We need to stop conflating JS the language with the runtimes

If you're just sharing syntax but the standard library is different and essentially none of the code is compatible, it's not the same language. ECMAScript specifies all of the things you're talking about, and that is JavaScript, irrespective of the runtime.

> A JS runtime absolutely can get by without a GC, you just never dealloc and consume indefinitely. That doesn't change any semantics of the language, if a value/object is inaccessible, it's inaccessible...

If you throw away the whole heap on every request, now every request is definitionally a "cold start". Which negates the singular benefit that this post is calling out. Porffor is still not faster than JITed engines at runtime, and initializing the code still has to happen.

> Nowhere did I say that full, or even any, compatibility with Node is needed - it isn't.

You have to square what you're saying with this statement. What you're describing is JavaScript in syntax only. You're talking about major departures from the formal language spec. Existing JavaScript code is likely to break. Why not just make a new language and call it something else, like Crystal is to Ruby? It works differently, you're saying it doesn't care about compatibility... Why even call it JS then?

samwillis 3 days ago | parent [-]

> ECMAScript specifies all of the things you're talking about, and that is JavaScript, irrespective of the runtime.

I suggest you go and read the ECMAScript standard: https://ecma-international.org/publications-and-standards/st...

There is nothing in there about browser APIs, and in fact it explicitly states that the browser runtime, or any other runtime + api are not ECMAScript.