jy14898 10 days ago

The post stated that it was believed duplication improved loading times on computers with HDDs rather than SSDs

dontlaugh 10 days ago | parent | next [-]

Which is true. It’s an old technique to avoid seeks, going back to CD-based game consoles.

SergeAx 10 days ago | parent [-]

Is it really possible to control file locations on HDD via Windows NTFS API?

dontlaugh 10 days ago | parent | next [-]

No, not at all. But by putting every asset a level (for example) needs in the same file, you can pretty much guarantee you can read it all sequentially without additional seeks.

That does force you to duplicate some assets a lot. It's also more important the slower your seeks are. This technique is perfect for disc media, since it has a fixed physical size (so wasting space on it is irrelevant) and slow seeks.
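A minimal sketch of the trade-off described above: copying every asset a level needs into that level's bundle, then counting how much space goes to duplicates. The asset names, sizes, and level lists are made up for illustration and do not come from any real game.

```python
from collections import Counter

# Hypothetical asset sizes in MB and per-level asset lists (illustrative only).
ASSET_SIZES_MB = {"rock.tex": 40, "tree.mesh": 25, "boss.anim": 60, "ui.atlas": 10}
LEVEL_ASSETS = {
    "level1": ["rock.tex", "tree.mesh", "ui.atlas"],
    "level2": ["rock.tex", "boss.anim", "ui.atlas"],
}

def bundle_sizes(level_assets, sizes):
    """Size of each per-level bundle when every needed asset is copied into it."""
    return {level: sum(sizes[a] for a in assets)
            for level, assets in level_assets.items()}

def duplication_overhead(level_assets, sizes):
    """Extra MB spent on copies beyond the first copy of each asset."""
    uses = Counter(a for assets in level_assets.values() for a in assets)
    return sum(sizes[a] * (n - 1) for a, n in uses.items())

bundles = bundle_sizes(LEVEL_ASSETS, ASSET_SIZES_MB)
overhead = duplication_overhead(LEVEL_ASSETS, ASSET_SIZES_MB)
print(bundles, overhead)
```

Each bundle can be read front to back, and the cost is the overhead figure: the shared assets are stored once per level that uses them.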

viraptor 2 days ago | parent [-]

> by putting every asset a level (for example) needs in the same file, you can pretty much guarantee you can read it all sequentially

I'd love to see it analysed. Specifically, the average number of non-sequential jumps vs. the overall size of the level. I'm sure you could avoid jumps within megabytes. But if someone ever got closer to filling up the disk in the past, the chances of contiguous gigabytes are much lower. This paper effectively says that long files are almost guaranteed to have gaps: https://dfrws.org/wp-content/uploads/2021/01/2021_APAC_paper... So at that point, you may be better off preallocating the individual files and eating the cost of switching between them.

toast0 2 days ago | parent | next [-]

From that paper, table 4, large files had an average # of fragments around 100, but a median of 4 fragments. A handful of fragments for a 1 GB level file is probably a lot less seeking than reading 1 GB of data out of a 20 GB aggregated asset database.

But it also depends on how the assets are organized, you can probably group the level specific assets into a sequential section, and maybe shared assets could be somewhat grouped so related assets are sequential.
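A back-of-envelope sketch of the fragment-count argument above. It assumes a crude HDD model: a fixed penalty per discontinuity plus transfer time. The 10 ms seek, the 150 MB/s throughput, and the 10,000-read count are assumptions picked for illustration, not measurements.

```python
def read_time_s(size_mb, seeks, seek_ms=10.0, throughput_mb_s=150.0):
    """Rough HDD read time: one seek penalty per discontinuity plus transfer time.

    seek_ms and throughput_mb_s are assumed ballpark figures, not measured ones.
    """
    return seeks * seek_ms / 1000.0 + size_mb / throughput_mb_s

# A 1 GB level file split into 4 fragments (the median quoted above).
few_fragments = read_time_s(1024, seeks=4)

# The same 1 GB fetched as 10,000 scattered reads (hypothetical count).
scattered = read_time_s(1024, seeks=10_000)

print(f"4 fragments: {few_fragments:.1f} s, scattered: {scattered:.1f} s")
```

Under these assumptions a handful of fragments barely registers next to the transfer time, while thousands of scattered reads are dominated by seeking.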

dontlaugh 2 days ago | parent | prev | next [-]

Sure. I’ve seen people that do packaging for games measure various techniques for hard disks typical of the time, maybe a decade ago. It was definitely worth it then to duplicate some assets to avoid seeks.

Nowadays? No. Even those with hard disks will have lots more RAM and thus disk cache. And you are even guaranteed SSDs on consoles. I think in general no one tries this technique anymore.

wcoenen 2 days ago | parent | prev | next [-]

> But if someone ever got closer to filling up the disk in the past, the chances of contiguous gigabytes are much lower.

By default, Windows automatically defragments filesystems weekly if necessary. It can be configured in the "defragment and optimize drives" dialog.

pixl97 2 days ago | parent [-]

Not 'full' defragmentation, though. Microsoft labs did a study and found that beyond 64MB slabs of contiguous data you don't gain much, so they don't bother getting gigabytes fully defragmented.

https://web.archive.org/web/20100529025623/http://blogs.tech...

old article on the process

justsomehnguy 2 days ago | parent | prev | next [-]

> But if someone ever got closer to filling up the disk in the past, the chances of contiguous gigabytes are much lower

Someone installing a 150GB game surely has 150GB+ of free space, and there would be a lot of contiguous free space.

jayd16 2 days ago | parent | prev [-]

It's an optimistic optimization so it doesn't really matter if the large blobs get broken up. The idea is that it's still better than 100k small files.

toast0 2 days ago | parent | prev [-]

Not really. But when you write a large file at once (like with an installer), you'll tend to get a good amount of sequential allocation (unless your free space is highly fragmented). If you load that large file sequentially, you benefit from drive read ahead and OS read ahead --- when the file is fragmented, the OS will issue speculative reads for the next fragment automatically and hide some of the latency.

If you break it up into smaller files, those are likely to be allocated all over the disk; plus you'll have delays on reading because Windows Defender makes opening files slow. If you have a single large file that contains all resources, even if that file is mostly sequential, there will be sections that you don't need, and the read-ahead cache may work against you, as it will tend to read things you don't need.

pjc50 2 days ago | parent | prev | next [-]

Key word is "believed". It doesn't sound like they actually benchmarked.

wongogue 2 days ago | parent [-]

There is nothing to believe. Random 4K reads on an HDD are slow.

debugnik 2 days ago | parent [-]

I assume asset reads nowadays are much heavier than 4 kB though, especially if assets meant to be loaded together are bundled together in one file. So games now should be spending less time seeking relative to their total read size. Combined with HDD caches and parallel reads, this practice of duplicating over 100 GB across bundles is most likely a cargo cult by now.
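A rough sketch of how read size changes the picture, assuming each read pays one seek and then transfers at full speed. The 10 ms seek and 150 MB/s throughput are assumed ballpark figures for a hard disk, not benchmarks.

```python
def effective_mb_s(read_kb, seek_ms=10.0, throughput_mb_s=150.0):
    """Effective throughput when every read of read_kb pays one seek first.

    seek_ms and throughput_mb_s are assumptions chosen for illustration.
    """
    read_mb = read_kb / 1024.0
    seconds = seek_ms / 1000.0 + read_mb / throughput_mb_s
    return read_mb / seconds

for kb in (4, 1024, 65536):
    print(f"{kb:>6} kB per read: {effective_mb_s(kb):.2f} MB/s")
```

In this model 4 kB reads stay under 1 MB/s, 1 MB reads reach about 60 MB/s, and 64 MB reads get close to the raw transfer rate, so the seek penalty shrinks quickly as reads grow.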

Which makes me think: Have there been any advances in disk scheduling in the last decade?

khannn 2 days ago | parent | prev [-]

Who cares? I've installed every graphically intensive game on SSDs since the original OCZ Vertex was released.

teamonkey 2 days ago | parent [-]

Their concern was that one person in a squad loading from an HDD could slow down level loading for every player in the squad, even those on an SSD, so they used a very normal and time-tested optimisation technique to prevent that.

khannn 2 days ago | parent [-]

Their technique makes it so that the normal person with a ~base SSD of 512 GB can't reasonably install the game. Heck of a job, Brownie.

teamonkey 2 days ago | parent [-]

Nonsense. I play it on a 512GB SSD and it’s fine.

khannn 13 hours ago | parent [-]

It's hard for me to use a laptop with win11 and one game (BG3) installed on a 512 GB SSD.