fpoling 3 days ago

The article does not mention memory compression as an alternative to swap, which many Linux distributions enable by default.

On the other hand, the latest SSDs these days are way faster than memory compression, even with LUKS encryption on and even when the compression uses LZ4. Plus, modern SSDs do not suffer from frequent writes the way they used to, so on my laptop I disabled memory compression, and then all the reasoning from the article applies again.

And on a development laptop running compilations/containers/VMs/browser, vm.swappiness does not seem to matter that much if one has enough memory. So I no longer tune it to 100 or more and leave it at the default of 60.
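
For anyone who wants to experiment anyway, checking and changing it is just standard sysctl usage (the value 100 below is only an example):

  $ sysctl vm.swappiness                    # show the current value
  vm.swappiness = 60
  $ sudo sysctl -w vm.swappiness=100        # change it for this boot only
  $ echo 'vm.swappiness=100' | sudo tee /etc/sysctl.d/99-swappiness.conf   # persist across reboots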

vlovich123 3 days ago | parent [-]

> these days latest SSD are way faster than memory compression

That's a really provocative claim. Any benchmarks to support this?

fpoling 3 days ago | parent [-]

On my laptop with a Samsung 990 PRO SSD and an Intel Core Ultra 7 165U CPU with 64 GB RAM under Debian 13:

Read test, copying from the LUKS-encrypted Btrfs filesystem to /tmp, which is a standard RAM disk:

  $ dd of=/tmp/input-90K.jsonl if=input-90K.jsonl conv=fdatasync bs=10M iflag=direct oflag=direct
  2596+1 records in
  2596+1 records out
  27225334502 bytes (27 GB, 25 GiB) copied, 20.4403 s, 1.3 GB/s

Write test:

  $ dd if=/tmp/input-90K.jsonl of=tmp.jsonl conv=fdatasync bs=10M oflag=direct
  2596+1 records in
  2596+1 records out
  27225334502 bytes (27 GB, 25 GiB) copied, 16.8612 s, 1.6 GB/s

Preparing a RAM disk with lz4 zram compression to emulate zram swap:

  $ sudo zramctl --algorithm=lz4 --size=30GiB /dev/zram0
  $ sudo mkfs.ext4 /dev/zram0
  $ sudo mount /dev/zram0 /mnt
  $ df -h /mnt
  Filesystem      Size  Used Avail Use% Mounted on
  /dev/zram0       30G  2.1M   28G   1% /mnt

Write test to lz4-compressed ZRAM:

  $ sudo dd if=/tmp/input-90K.jsonl of=/mnt/tmp.jsonl conv=fdatasync bs=10M oflag=direct iflag=direct
  2596+1 records in
  2596+1 records out
  27225334502 bytes (27 GB, 25 GiB) copied, 93.2813 s, 292 MB/s

Read test from lz4-compressed ZRAM:

  $ dd of=/tmp/input-90K.jsonl if=/mnt/tmp.jsonl conv=fdatasync bs=10M oflag=direct iflag=direct
  2596+1 records in
  2596+1 records out
  27225334502 bytes (27 GB, 25 GiB) copied, 34.8479 s, 781 MB/s

So the SSD with LUKS is about 1.7 times faster than zram for reads (1.3 GB/s vs 781 MB/s) and about 5.5 times faster for writes (1.6 GB/s vs 292 MB/s).

Note that without LUKS, using the SSD's native encryption instead, the SSD will be at least twice as fast. Also, using a recent kernel is important so that LUKS uses the CPU's AES instructions; without that, the SSD under LUKS will be several times slower.
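
To verify that the fast path is in use (generic checks, nothing specific to my machine):

  $ grep -m1 -o -w aes /proc/cpuinfo        # prints "aes" if the CPU has AES-NI
  $ cryptsetup benchmark --cipher aes-xts-plain64 --key-size 512   # should show GiB/s-range numbers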

vlovich123 2 days ago | parent | next [-]

I think this says more about the terrible memory bandwidth and limited compute of Intel mobile CPUs than about the speed of SSDs. Here's a 13900K with 64 GiB and an SN850X SSD with LUKS-encrypted ext4. On my machine RAM compression is still faster. There are also various overheads in this test that make it not a 100% representative sample, although I'm not sure how big the divergence is (namely, zram swap doesn't have a filesystem, it sits deep within the memory-management code, and it doesn't use O_DIRECT).
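
(For reference, real zram swap is set up roughly like this, with no filesystem in the path; the size and priority here are arbitrary examples:)

    $ sudo zramctl --find --algorithm=lz4 --size=30GiB   # prints the allocated device, e.g. /dev/zram0
    $ sudo mkswap /dev/zram0
    $ sudo swapon --priority 100 /dev/zram0              # prefer zram over any disk swap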

Basic memory bandwidth test:

    $ dd if=/dev/zero of=/dev/null bs=10M count=7000
    7000+0 records in
    7000+0 records out
    73400320000 bytes (73 GB, 68 GiB) copied, 1.44856 s, 50.7 GB/s

Read test:

    $ dd if=random.bin of=/tmp/random.bin conv=fdatasync bs=10M iflag=direct oflag=direct
    2500+0 records in
    2500+0 records out
    26214400000 bytes (26 GB, 24 GiB) copied, 9.09728 s, 2.9 GB/s

Write test:

    $ dd if=/tmp/random.bin of=tmp.bin conv=fdatasync bs=10M iflag=direct oflag=direct
    2500+0 records in
    2500+0 records out
    26214400000 bytes (26 GB, 24 GiB) copied, 53.9548 s, 486 MB/s

Not sure why that disk write test was suddenly so bad.

Same zram setup as in the parent comment:

    $ df -h /mnt
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/zram0       30G  2.1M   28G   1% /mnt

Write test to lz4-compressed ZRAM:

    $ sudo dd if=/tmp/random.bin of=/mnt/tmp.bin conv=fdatasync bs=10M iflag=direct oflag=direct
    2500+0 records in
    2500+0 records out
    26214400000 bytes (26 GB, 24 GiB) copied, 7.97006 s, 3.3 GB/s

Read test:

    $ dd of=/tmp/random.bin if=/mnt/tmp.bin conv=fdatasync bs=10M iflag=direct oflag=direct
    2500+0 records in
    2500+0 records out
    26214400000 bytes (26 GB, 24 GiB) copied, 5.16566 s, 5.1 GB/s

fpoling 2 days ago | parent [-]

What is random.bin? I was testing with a JSON dataset that compresses by a factor of about 2.5 with zram. If random.bin is incompressible, then zram does not store compressed data but rather the original, resulting in much faster reads.
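
(The achieved ratio is easy to check on a live device; compare the DATA and COMPR columns, or the raw byte counters:)

  $ zramctl /dev/zram0              # DATA vs COMPR shows the effective compression ratio
  $ cat /sys/block/zram0/mm_stat    # first two fields: orig_data_size and compr_data_size in bytes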

Also, does your SSD have logical 4K sectors or 512-byte sectors? If the latter, Linux distros default to 512-byte LUKS sectors on it, resulting in much slower performance, especially with writes.

I always ensure that LUKS sectors are 4K, even if the SSD reports 512 bytes and does not allow changing that to 4K, like the Samsung 9* series.
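
Roughly like this (the device path is an example; luksFormat wipes the volume and reencrypt rewrites it in place, so back up first):

  $ cat /sys/block/nvme0n1/queue/logical_block_size     # what the SSD reports, e.g. 512
  $ sudo cryptsetup luksFormat --sector-size 4096 /dev/nvme0n1p2   # for a new volume
  $ sudo cryptsetup reencrypt --sector-size 4096 /dev/nvme0n1p2    # for an existing LUKS2 volume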

vlovich123 a day ago | parent [-]

4K sectors. I don’t have a 25 GiB JSON file. Where can I get/generate the dataset you were using?

fpoling a day ago | parent [-]

The dataset is internal, but if you fetch a bunch of web pages from Wikipedia and wrap the HTML into a JSONL file, that should give a rough idea.
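
Something along these lines should produce data in the same ballpark (a rough sketch; the loop count is arbitrary, just repeat until the file is big enough):

  $ for i in $(seq 1 5000); do
      curl -sL "https://en.wikipedia.org/wiki/Special:Random" |
        jq -Rsc '{html: .}'          # wrap each page's HTML as one JSON line
    done > input-big.jsonl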

With 4K LUKS sectors, that write speed is too low for a modern SSD. Check that LUKS uses a fast implementation. I have:

  $ /usr/sbin/cryptsetup benchmark
  ...
  #     Algorithm |       Key |      Encryption |      Decryption
        aes-cbc        128b       664.1 MiB/s      2615.9 MiB/s
  ...
        aes-xts        512b      2708.8 MiB/s      2986.0 MiB/s

and my LUKS setup uses aes-xts 512b.
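
You can confirm what a mapped volume actually uses with cryptsetup status (the mapping name below is an example):

  $ sudo cryptsetup status luks-root | grep -E 'cipher|sector'
    cipher:  aes-xts-plain64
    sector size:  4096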

man8alexd 3 days ago | parent | prev [-]

You are measuring sequential throughput with a block size of 10M. Swap I/O is random 4K pages (with default readahead 32K and clustered swapout 1M), with the read latency being the most important factor.
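
A closer approximation of that pattern would be something like this fio run (a sketch; the file path and runtime are placeholders, queue depth 1 to expose latency):

    $ fio --name=swaplike --filename=/mnt/testfile --size=4G \
          --rw=randread --bs=4k --direct=1 --ioengine=io_uring \
          --iodepth=1 --runtime=30 --time_based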

fpoling 2 days ago | parent [-]

> are measuring sequential throughput

On the SSD I am in fact measuring LUKS performance, since the I/O itself is much faster than the LUKS encryption, even with the specialized CPU instructions. As I wrote, without LUKS the numbers are at least twice as fast, even with random access.

The point is that in 2025, with the latest SSDs, there is no point in using compressed memory. Even with LUKS encryption, SSD swap will be faster than even a highly tuned compressed-memory setup.

In 2022-23, when LUKS was not yet optimized, things were different, so I used hardware encryption on the SSD after realizing that even LZ4 compression was significantly slower than the SSD.

fpoling 2 days ago | parent [-]

EDIT: while it is true that SSD performance degrades badly on purely random 4K I/O, with 32K random reads/writes it is still above 2 GB/s, so in practice it is LUKS that is the bottleneck, not the SSD.
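
For anyone who wants to reproduce that, a sketch (the path is a placeholder; higher queue depth because throughput, not latency, is the question here):

  $ fio --name=rand32k --filename=/mnt/testfile --size=8G \
        --rw=randrw --bs=32k --direct=1 --ioengine=io_uring \
        --iodepth=32 --runtime=30 --time_based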