mikewarot 3 days ago

Code has to have addresses for calls and branches. Debuggers need to be able to control it all.

suspended_state 3 days ago | parent [-]

> Code has to have addresses for calls and branches.

Does it mean that at that level an address has to be an offset in a linear address space?

If you have hardware powerful enough to make addresses abstract, couldn't it also provide the operations to manipulate them abstractly?

cryptonector 3 days ago | parent | next [-]

Is each branch-free run of instructions an object (which in general will be smaller than "function" or "method" objects) that can be abstracted? How does one manage locality ("these objects are the text of this function")?

Maybe one compromises and treats the text of a function as linear address space with small relative offsets. Of course, other issues will crop up. You can't treat code as an array, unless it's an array of the smallest word (bytes, say) even if the instructions are variable length. How do you construct all the pointer+capability values for the program's text statically? The linker would have to be able to do that...

suspended_state 2 days ago | parent [-]

The linking stage could perhaps be performed at process launch by a privileged task?

See this comment thread: https://news.ycombinator.com/item?id=46494183

Isn't a virtual ISA like an intermediate representation? It doesn't have to include static addresses, only symbolic references, which could be resolved at launch time.

cryptonector 2 days ago | parent [-]

> The linking stage could perhaps be performed at process launch by a privileged task?

This gets expensive.

cryptonector 2 days ago | parent | prev | next [-]

Another thing is that array addressing is tricky. You need to support arrays, but then how do you construct pointers to individual elements of the arrays? The cleanest option is to not allow that and force the programmer to always pass around the pointer to the array and an index. The same applies to slices. This definitely makes life harder for porting existing codebases or making existing languages work.
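
A rough sketch of the "array pointer plus index" option, in Python for brevity. The names `ArrayCap` and `ElemRef` are made up for illustration and do not come from any real capability ISA; the point is only that an element reference stays a pair and every access is bounds-checked against the whole array.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArrayCap:
    """Hypothetical capability covering a whole array (not a real ISA type)."""
    _cells: list

    def load(self, index: int):
        # Every access is checked against the bounds of the whole array.
        if not 0 <= index < len(self._cells):
            raise IndexError("index outside the array capability's bounds")
        return self._cells[index]

    def store(self, index: int, value) -> None:
        if not 0 <= index < len(self._cells):
            raise IndexError("index outside the array capability's bounds")
        self._cells[index] = value


@dataclass(frozen=True)
class ElemRef:
    """An 'element pointer' is never a raw address: it is (array, index)."""
    array: ArrayCap
    index: int

    def get(self):
        return self.array.load(self.index)

    def set(self, value) -> None:
        self.array.store(self.index, value)


# Usage: the callee receives the pair, not an interior pointer.
arr = ArrayCap([10, 20, 30])
ref = ElemRef(arr, 1)
ref.set(ref.get() + 1)
```

The cost shows up exactly where the comment says: any existing code that passes `&a[i]` around would have to be rewritten to carry the pair.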

imtringued 2 days ago | parent | prev [-]

This is probably the worst possible way to implement a processor.

Instead of just bumping the instruction pointer by one or an offset, you now need to have code labels in your processor and a way to efficiently look them up.

A `goto "exit"` means the processor needs a lookup table of all the possible nodes in the computational graph. Calling a function requires a global lookup table.

How is that table implemented? Obviously you can't just have an array of code labels, because using a linear data structure would kind of defeat the point.

Instead you need to build hardware that can store a graph and its edges directly and each hardware unit also has a label matcher so you can load a particular node and its edges.

This seems like an absurd amount of effort for moving from one instruction to the next. You know, something that could have been done by doing a single IP+1 or IP+jump_offset.

cryptonector 2 days ago | parent | next [-]

See my sibling comment. Basically each text fragment (either each branch-free run of instructions or each function/method) would need to be an object w/ capability, and the linker-loader must arrange for all text references to have the correct pointers. And then it would work. It's just a complication of the linker-loader, but once a program is loaded there would be no further lookups (unless you use something like `dlsym()`). The compromise to make is that within each such text object you have linear addressing with small offsets.
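
To make that concrete, here is a toy model in Python. Everything in it (`TextObject`, `load_program`, the four-opcode instruction set) is invented for illustration, and it is only a sketch of the idea under those assumptions: names are resolved once by the loader, calls then go through a pre-resolved slot, and jumps inside a text object are plain relative offsets.

```python
class TextObject:
    """Hypothetical text fragment: linear code plus references to other objects."""

    def __init__(self, name, code, symbolic_refs):
        self.name = name
        self.code = code                    # list of (opcode, argument) pairs
        self.symbolic_refs = symbolic_refs  # names, as emitted by the compiler
        self.resolved = []                  # direct references, filled in by the loader


def load_program(objects, entry="main"):
    """Toy linker-loader: the name table exists only during this call."""
    table = {obj.name: obj for obj in objects}
    for obj in objects:
        obj.resolved = [table[name] for name in obj.symbolic_refs]
    return table[entry]


def run(obj, out):
    """Execute one text object; no name lookups happen here."""
    ip = 0
    while ip < len(obj.code):
        op, arg = obj.code[ip]
        if op == "emit":
            out.append(arg)
            ip += 1
        elif op == "call":
            run(obj.resolved[arg], out)     # arg is a slot number, not a label
            ip += 1
        elif op == "jmp":
            ip += arg                       # small relative offset within this object
        elif op == "ret":
            return
        else:
            raise ValueError(f"unknown opcode: {op}")


helper = TextObject("helper", [("emit", "h"), ("ret", None)], [])
main = TextObject(
    "main",
    [("emit", "a"), ("call", 0), ("jmp", 2), ("emit", "skipped"), ("emit", "b"), ("ret", None)],
    ["helper"],
)
trace = []
run(load_program([helper, main]), trace)
```

In real hardware the `resolved` slots would hold capabilities rather than Python object references, but the shape is the same: the expensive part happens once in `load_program`, not on every branch.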

suspended_state 2 days ago | parent | prev [-]

What I meant, and indeed it was poorly explained, is that an address shouldn't be just an integer freely manipulable by any instruction. The microcode will obviously know how to manipulate an address, but the ISA as a whole doesn't have to, and in fact shouldn't, with the exception of a few specific instructions. What I am advocating is that addresses should constitute a separate type, which isn't a simple alias to integers. I think that this is what capabilities are about.
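
As a software analogy only (the `Address` class and its methods are hypothetical, and Python enforces privacy by convention rather than in hardware), this is roughly what "a separate type with a few specific operations" could look like: bounded offsetting, load, and store are defined, while general integer arithmetic and conversion to an integer are not.

```python
class Address:
    """Hypothetical opaque address: not an int, and not convertible to one."""

    __slots__ = ("_mem", "_base", "_length", "_offset")

    def __init__(self, mem, base, length, offset=0):
        if not 0 <= offset < length:
            raise ValueError("offset outside the region this address may reach")
        self._mem = mem
        self._base = base
        self._length = length
        self._offset = offset

    def offset_by(self, delta):
        # One of the few permitted operations: move within the same region.
        # The constructor rejects any result that leaves the region.
        return Address(self._mem, self._base, self._length, self._offset + delta)

    def load(self):
        return self._mem[self._base + self._offset]

    def store(self, value):
        self._mem[self._base + self._offset] = value

    # Deliberately no __add__, __int__, or __index__: ordinary arithmetic
    # "instructions" cannot forge or inspect an address.


memory = list(range(100, 110))
region = Address(memory, base=4, length=3)   # may only reach memory[4:7]
third = region.offset_by(2)
```

Here `region + 1` and `int(region)` both raise `TypeError`, and `region.offset_by(3)` raises `ValueError`, which is the behaviour one would want from the handful of address-aware instructions.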