imtringued 10 days ago

I don't think this is the real explanation. If they gave the filesystem a list of files to fetch in parallel (async file IO), the concept of "seek time" would become almost meaningless. This optimization will make fetching from both HDDs and SSDs faster. They would be going out of their way to make their product worse for no reason.

toast0 2 days ago | parent | next [-]

Solid state drives tend to respond well to parallel reads, so it's not so clear. If you're reading one at a time, sequential access is going to be better though.

But for a mechanical drive, you'll get much better throughput on sequential reads than random reads, even with command queuing. I think earlier discussion showed it wasn't very effective in this case, and taking 6x the space for a marginal benefit for the small percentage of users with mechanical drives isn't worthwhile...

seg_lol 2 days ago | parent [-]

Every storage medium, including RAM, benefits from sequential access. But it doesn't have to be a very long sequential run; the seek time, or block open time, just needs to be short relative to the next block read.

Xss3 10 days ago | parent | prev | next [-]

If they fill your hard drive, you're less likely to install other games. If you see a huge install size, you're less likely to uninstall with plans to reinstall later, because that'd take a long time.

ukd1 2 days ago | parent [-]

Unfortunately this actually is believable. SMH.

pixl97 2 days ago | parent | prev | next [-]

>If they gave the filesystem a list of files to fetch in parallel (async file IO)

This does not work if you're doing tons of small IO and you want something fast.

Let's say we're on an HDD with 200 IOPS and we need to read 3000 small files scattered randomly across the drive.

Well, at minimum this is going to take 15 seconds (3000 reads / 200 IOPS), plus any additional seek time.

Now, let's say we zip up those files into a solid archive. You'll read it in half a second. The problem comes when different levels each need a different set of 3000 files. Then you end up duplicating a bunch of stuff.
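To make the arithmetic above concrete, here is a rough back-of-the-envelope sketch. The 200 IOPS and 3000 files come from the comment; the 16 KiB average file size and 100 MB/s sequential throughput are assumptions picked only to show how a figure near half a second could come about.

```python
# Back-of-the-envelope model; the constants marked "assumed" are illustrative only.
def random_read_seconds(n_files: int, iops: float) -> float:
    """Time to read n_files small files at one I/O operation each."""
    return n_files / iops

def sequential_read_seconds(total_bytes: int, bytes_per_second: float) -> float:
    """Time to stream one contiguous archive."""
    return total_bytes / bytes_per_second

N_FILES = 3000
HDD_IOPS = 200               # figure from the comment above
AVG_FILE_BYTES = 16 * 1024   # assumed average size of a "small" file
SEQ_BYTES_PER_S = 100e6      # assumed sequential HDD throughput

loose = random_read_seconds(N_FILES, HDD_IOPS)
packed = sequential_read_seconds(N_FILES * AVG_FILE_BYTES, SEQ_BYTES_PER_S)
print(f"loose files: {loose:.1f} s, solid archive: {packed:.2f} s")
```

With these assumed numbers the loose files take 15.0 s and the archive about 0.49 s. Real drives will differ, so treat it as an order-of-magnitude estimate.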

Now, where this typically falls apart for modern game assets is that they are getting very large, which makes seek time a much smaller share of the total read time.

imtringued 2 days ago | parent [-]

I haven't found any asynchronous IOPS numbers for HDDs anywhere. The IOPS figures on the internet are just 1000 ms divided by the seek time, with an 8 ms seek time for moving from the outer to the inner track, which is only really relevant for the synchronous file IO case.

For asynchronous IO you can just do inward/outward passes to amortize the seek time over multiple files.
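A toy sketch of the inward/outward pass idea, for illustration only. It models nothing but head travel between hypothetical track numbers (no rotational latency, no transfer time), and the request list and track count are made up.

```python
import random

def head_travel(start: int, order: list[int]) -> int:
    """Total distance the head moves when servicing tracks in the given order."""
    travel, pos = 0, start
    for track in order:
        travel += abs(track - pos)
        pos = track
    return travel

def sweep_order(start: int, pending: list[int]) -> list[int]:
    """One outward pass from the head position, then one inward pass."""
    outward = sorted(t for t in pending if t >= start)
    inward = sorted((t for t in pending if t < start), reverse=True)
    return outward + inward

rng = random.Random(0)
requests = [rng.randrange(100_000) for _ in range(3000)]  # hypothetical track numbers
start = 50_000

fifo = head_travel(start, requests)
swept = head_travel(start, sweep_order(start, requests))
print(f"arrival order: {fifo} tracks, sweep order: {swept} tracks")
```

In this model the sweep never travels more than about one and a half times the width of the platter, no matter how many requests are pending, while arrival order pays a fresh seek for nearly every request. How much of that a real drive recovers depends on queue depth and rotational latency, which this sketch ignores.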

While it may not have been obvious, I have taken archiving or bundling of assets into a bigger file for granted. The obvious benefit is that the HDD knows that it should store game files contiguously. This has nothing to do with file duplication, though, and is a somewhat irrelevant topic, because it costs nothing and only has benefits.

The asynchronous file IO case for bundled files is even better, since you can just hand the internal file offsets to the async file IO operations and get all the relevant data in parallel. Your only constraint is then deciding on an optimal lower bound for the block size, which is high for HDDs and low for SSDs.
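A minimal sketch of reading bundled assets by offset, under some loud assumptions: the bundle layout (payloads back to back plus an in-memory offset index) and all names are hypothetical, and a thread pool stands in for real asynchronous file IO.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

def read_entry(path: str, offset: int, length: int) -> bytes:
    """Read one bundled asset by its byte range; each call uses its own handle."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)

def read_entries(path, index, names, workers=8):
    """Submit all reads up front so the OS and drive are free to reorder them."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {n: pool.submit(read_entry, path, *index[n]) for n in names}
        return {n: f.result() for n, f in futures.items()}

# Build a tiny hypothetical bundle: payloads back to back plus an offset index.
assets = {"a.tex": b"A" * 1000, "b.mesh": b"B" * 2000, "c.wav": b"C" * 500}
index, pos = {}, 0
with tempfile.NamedTemporaryFile(delete=False) as bundle:
    for name, data in assets.items():
        bundle.write(data)
        index[name] = (pos, len(data))
        pos += len(data)

loaded = read_entries(bundle.name, index, ["c.wav", "a.tex"])
os.remove(bundle.name)
print({name: len(data) for name, data in loaded.items()})
```

The point is only that the caller needs offsets and lengths, not file names, to issue many reads at once. A real engine would more likely use the platform's native async IO interface and merge adjacent ranges into larger blocks.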

gruez 2 days ago | parent | next [-]

>I haven't found any asynchronous IOPS numbers for HDDs anywhere. The IOPS figures on the internet are just 1000 ms divided by the seek time, with an 8 ms seek time for moving from the outer to the inner track, which is only really relevant for the synchronous file IO case.

>For asynchronous IO you can just do inward/outward passes to amortize the seek time over multiple files.

Here's a random blog post that has benchmarks for a 2015 HDD:

https://davemateer.com/2020/04/19/Disk-performance-CrystalDi...

It shows 1.5 MB/s for random 4K reads at high queue depth, which works out to just under 400 IOPS. Queue depth 1 (so synchronous) performance is around a third of that.
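The conversion from throughput to IOPS, spelled out. This assumes "4K" means 4 KiB blocks and that 1.5 MB/s means 1.5 million bytes per second; other readings shift the result slightly but keep it under 400.

```python
def iops_from_throughput(bytes_per_second: float, block_bytes: int) -> float:
    """Operations per second implied by a throughput at a fixed block size."""
    return bytes_per_second / block_bytes

HIGH_QD_BYTES_PER_S = 1.5e6   # 1.5 MB/s, the figure quoted above
BLOCK = 4096                  # assumes "4K" means 4 KiB

high_qd = iops_from_throughput(HIGH_QD_BYTES_PER_S, BLOCK)
qd1 = high_qd / 3             # "around a third", per the comment
print(f"high queue depth: {high_qd:.0f} IOPS, queue depth 1: {qd1:.0f} IOPS")
```

That gives roughly 366 IOPS at high queue depth and roughly 122 IOPS at queue depth 1.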

pixl97 2 days ago | parent | prev [-]

>I haven't found any asynchronous IOPS numbers for HDDs anywhere.

As the other user stated, just look up CrystalDiskMark results for both HDDs and SSDs and you'll see hard drives do about 1/3 of a MB/s on random file IO, while the same hard drive will do 400 MB/s on a contiguous read. For things like this, reading a zip and decompressing in memory is "typically" (again, you have to test this) orders of magnitude faster.
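For reference, a small sketch of the "read one archive, decompress in memory" pattern. The archive contents and file names are hypothetical, and it is built in memory here so the example is self-contained. It shows the access pattern only and measures nothing.

```python
import io
import zipfile

def load_archive(raw: bytes) -> dict[str, bytes]:
    """Decompress every member of an already-loaded zip archive in memory."""
    with zipfile.ZipFile(io.BytesIO(raw)) as zf:
        return {name: zf.read(name) for name in zf.namelist()}

# Build a hypothetical archive of 3000 small files, entirely in memory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    for i in range(3000):
        zf.writestr(f"asset_{i:04d}.bin", bytes([i % 256]) * 1024)
raw = buf.getvalue()   # on disk this would be fetched with one contiguous read

files = load_archive(raw)
print(len(files), "files,", len(raw), "archive bytes")
```

On a real drive, `raw` would come from a single large read of the archive file, which is the part that avoids the per-file seeks. Whether it wins overall still has to be benchmarked on the target hardware.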

jayd16 2 days ago | parent | prev | next [-]

The technique has the most impact on games running off a physical disc.

It's a well-known technique, but it happened not to be useful for their use case.

extraduder_ire 2 days ago | parent | prev | next [-]

"97% of the time: premature optimization is the root of all evil."
