co_dh 6 days ago

But why do you need serialization? Because the data structure on disk is not the same as in memory. Arthur Whitney's k/q/kdb+ solved this problem by making them the same. An array has the same format in memory and on disk, so there is no serialization step, and even better, you can mmap files straight into memory, so you don't need a cache of your own!
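To make the mmap point concrete, here is a rough Python sketch (not kdb+ itself, just an analogy). It assumes a file that holds nothing but raw native-endian 64-bit integers; the helper names `write_int64_column` and `sum_mapped_column` are made up for illustration.

```python
import mmap
import os
import tempfile
from array import array


def write_int64_column(path, values):
    """Dump the array's in-memory bytes to disk unchanged (no encoding step)."""
    with open(path, "wb") as f:
        array("q", values).tofile(f)


def sum_mapped_column(path):
    """Map the file and read it in place as 64-bit integers, without parsing."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # Reinterpret the mapped bytes as native int64 values.
            with memoryview(mapped) as raw, raw.cast("q") as view:
                return sum(view)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "col.bin")
    write_int64_column(path, [1, 2, 3, 40])
    print(sum_mapped_column(path))
```

The catch is that the file is only valid on machines with the same byte order and integer width as the one that wrote it.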

He also removed the ability to define a structure, which forces you to use a dictionary (structure) of arrays instead of an array of structures.
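For anyone who hasn't seen the two layouts side by side, a small Python sketch of the idea (again an analogy, not q code). The field names `sym`, `price` and `size` and the helper `to_columns` are invented for the example.

```python
from array import array

# Array of structures: one record per row, fields interleaved.
rows = [
    {"sym": "a", "price": 10.0, "size": 100},
    {"sym": "b", "price": 20.5, "size": 200},
    {"sym": "c", "price": 30.0, "size": 300},
]


def to_columns(rows):
    """Turn a list of records into a dictionary of arrays (one array per field)."""
    return {
        "sym": [r["sym"] for r in rows],
        "price": array("d", (r["price"] for r in rows)),  # contiguous float64
        "size": array("q", (r["size"] for r in rows)),    # contiguous int64
    }


columns = to_columns(rows)

# A column-wise computation touches only the two arrays it needs.
notional = sum(p * s for p, s in zip(columns["price"], columns["size"]))
```

Each numeric column is one contiguous buffer, which is what makes it practical to write to disk and map back as-is.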

RossBencina 6 days ago | parent | next [-]

Forget on-disk. Different CPUs represent basic data types with different in-memory representations (endianness). Furthermore, different CPUs have different capabilities with respect to how data must be aligned in memory in order to read or write it (aligned/unaligned access). At least historically, an unaligned access could fault your process. Then there's the problem you allude to: different programming languages use different data layouts (or often a non-standardised layout). If you want communication within a system comprising heterogeneous CPUs and/or languages, you need to standardise on a wire format and/or provide a translation layer, aka serialisation.
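A quick illustration of the endianness point using Python's standard `struct` module. The record layout (a 32-bit id plus a 64-bit float) and the names `WIRE_FORMAT`, `encode_record` and `decode_record` are assumptions made up for this sketch.

```python
import struct

value = 0x01020304

# The same 32-bit integer laid out for little- and big-endian CPUs.
little = struct.pack("<I", value)  # least significant byte first
big = struct.pack(">I", value)     # most significant byte first

# Reading big-endian bytes as if they were little-endian gives a wrong value.
misread = struct.unpack("<I", big)[0]

# A tiny "wire format": fixed byte order ("!" = network/big-endian), no padding.
WIRE_FORMAT = "!Id"  # uint32 id followed by a float64 reading


def encode_record(record_id, reading):
    """Serialise to the agreed layout regardless of the host CPU."""
    return struct.pack(WIRE_FORMAT, record_id, reading)


def decode_record(payload):
    """Translate the wire bytes back into native Python values."""
    return struct.unpack(WIRE_FORMAT, payload)


payload = encode_record(7, 21.5)
```

With an explicit byte order, `struct` also drops the native alignment padding, so both ends agree on the byte layout no matter which CPU or compiler produced it.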

throwaway127482 6 days ago | parent | prev | next [-]

> But why do you need serialization? Because the data structure on disk is not the same as in memory.

Not always - in browser applications, for example, there is no way to directly access the disk, never mind mmap().
