> Which basically expanded back in the day to 65k reads of 1 byte for several MB file. Each fread translated to 65k reads of ReadFile Windows API

What software did that that badly? If the code asks for (up to) 65,536 single byte items, why would you split that into 65,536 calls?

Also, that change alters behavior. The old call could read anything from zero to 65,536 bytes; the new one can only report zero or 65,536 bytes read.

(Reading the source of a few implementations, I think most implementations will fill the output buffer with partial objects if the input doesn’t supply an integral number of them, but the return value of fread cannot signal that to the caller)
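To make the difference concrete, here is a minimal sketch in C. The helper name `fread_from_short_file` is made up for illustration, and it assumes `tmpfile()` works on the platform. It writes a few bytes to a temporary file and then reads them back with whatever size/count pair you pass.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not from any library: write `len` bytes to a
   temporary file, rewind, call fread(out, size, count, f), and return
   fread's result. `out` must have room for size * count bytes. */
static size_t fread_from_short_file(const char *data, size_t len,
                                    size_t size, size_t count, char *out)
{
    FILE *f = tmpfile();
    assert(f != NULL);

    size_t written = fwrite(data, 1, len, f);
    assert(written == len);
    rewind(f);

    /* fread returns the number of complete items read, not bytes. */
    size_t items = fread(out, size, count, f);
    fclose(f);
    return items;
}
```

With a 5-byte file, `size=1, count=16` returns 5, while `size=16, count=1` returns 0, because no complete 16-byte item was available. Whether the 5 bytes still land in the buffer in the second case is not something the return value can tell you.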


micampe 3 hours ago | parent | next [-]

A long time ago I worked with someone who read 1 byte at a time from a socket because they insisted data was cached so the kernel was going to batch it magically somehow. It took me days to convince them to measure it.


quietbritishjim 3 hours ago | parent [-]

That's different: you're talking about the application code, like OP.

But I think the parent comment's point is that the issue is in the implementation of fread itself in the standard library. It's perfectly reasonable for an application to pass it 1, 65536 (i.e. one byte, up to 65536 times) and expect it not to issue 65536 separate OS calls.


b112 2 hours ago | parent [-]

Is it? I get what you're saying, but asking for 1 byte 65536 times, is indeed different than asking for 65536 bytes, 1 time. There may be reasons, such as when you pull off the end of a buffer, it shifts. And the buffer size is 1 byte. Or 10. Or whatever.

No, I'm not saying that's why. I'm simply saying there is a difference between asking for 1 byte or 65k bytes of something. Even dd runs the same under Linux.

dd bs=10k count=1 is faster than bs=1 count=10k

I remember trying to recover some data from a spinning disk, and trying to slowly creep up on the data. So I wanted 1 byte per, I wanted it to nibble, until it hit whatever the errored part was. If I just grabbed the lot, it'd error out from the whole read.
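For what the dd comparison amounts to, here is a rough sketch in C of a dd-style loop, assuming a POSIX system. The function name `read_in_blocks` is invented for this example and is not how dd itself is written. It issues one `read()` per block, so the block size directly sets the number of syscalls.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical dd-like loop: request `count` blocks of `bs` bytes each,
   one read() call per block. Stores the number of read() calls in
   *calls and returns the total bytes read. `buf` must hold bs * count
   bytes. */
static size_t read_in_blocks(int fd, char *buf, size_t bs, size_t count,
                             size_t *calls)
{
    assert(buf != NULL && calls != NULL && bs > 0);

    size_t total = 0;
    *calls = 0;
    for (size_t i = 0; i < count; i++) {
        ssize_t n = read(fd, buf + total, bs);
        (*calls)++;
        if (n <= 0)
            break;          /* EOF or error: stop, no retry in this sketch */
        total += (size_t)n;
    }
    return total;
}
```

Reading 100 bytes with `bs=1, count=100` costs 100 calls; `bs=100, count=1` costs one. It also shows why the nibbling approach helps with a bad sector: with `bs=1` the loop stops at the first byte that fails, with everything before it already in the buffer.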

	quietbritishjim 10 minutes ago | parent | next [-]
		> asking for 1 byte 65536 times, is indeed different than asking for 65536 bytes, 1 time.
		Yes, it's different. As others have noted, the difference is what is returned if fewer than 65536 bytes are available to read in the file: total failure vs. partial read. There is, unsurprisingly, no requirement that fread have an unnecessarily inefficient implementation to meet this behavioral requirement. (The C standard doesn't talk about such things as syscalls but, even if it did, it surely wouldn't require such a thing.)
		The irony is that partial read is actually the default on both Windows and POSIX (i.e. both ReadFile and read() will read up to the number of bytes specified). So a one-syscall implementation of fread would have been easier than multiple calls, and would certainly be standard-compliant.
		The dd example isn't comparable because dd is much lower level, and you really are specifying how the syscalls should be made.
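As a sketch of that point, here is roughly what an unbuffered fread-like function could look like on top of POSIX `read()`. The name `fread_fd` is hypothetical, and real stdio implementations add buffering, locking, and error flags that are left out here. It multiplies the two arguments (with an overflow check), asks for the whole lot at once, and loops only if the kernel returns a short read.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical fread-like function over a raw file descriptor.
   Returns the number of complete `size`-byte items read. */
static size_t fread_fd(void *buf, size_t size, size_t count, int fd)
{
    assert(buf != NULL);
    if (size == 0 || count == 0)
        return 0;
    if (count > SIZE_MAX / size)
        return 0;                       /* size * count would overflow */

    size_t total = size * count;
    size_t done = 0;
    while (done < total) {
        /* Ask for everything still missing in a single call. */
        ssize_t n = read(fd, (char *)buf + done, total - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;               /* interrupted: retry */
            break;                      /* real error: report what we have */
        }
        if (n == 0)
            break;                      /* end of file */
        done += (size_t)n;
    }
    /* Bytes of a trailing partial item stay in buf and the file position
       has moved past them, but they are not counted in the result. */
    return done / size;
}
```

For a regular file with enough data this is a single `read()`. The number of calls depends on how much the kernel hands back each time, not on how the caller split the request into size and count.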
	Someone 2 hours ago | parent | prev | next [-]
		I glanced at https://github.com/busterb/libc-openbsd/blob/master/stdio/fr... and https://chromium.googlesource.com/chromiumos/third_party/gli.... The latter (as usual when comparing OpenBSD and Linux) is more complex, but both multiply count by size and then go their way.
		Also, the API contract allows fread to read fewer bytes than requested. I would expect any implementation to do that. But maybe somebody interpreted the contract differently than the major OSes do, in the sense that a call isn’t allowed to write partial size-sized chunks to user memory and/or advance the file position further than its return value indicates (that, I think, is something that the implementations above can do, and might be considered a bug).
	dspillett 2 hours ago | parent | prev | next [-]
		Another possibility for why it needs to be done that way is dealing with error conditions. I've not looked at the code (or even the man pages) and it is a long time since I touched anything that low level, so this might be completely wrong, but if there is an error before the next 64KiB (including just hitting EOF) then the semantics could be different. Asking for 1x64KiB, I would expect it to just error, as there aren't the requested number of bytes. Asking for 64Ki lots of 1 byte might simply error just the same, or it might at least populate the buffer with what it can read, or, if the meaning of 1,65536 is actually “up to 64Ki lots of 1B”, it would populate the buffer as far as possible and return the amount read rather than an error condition.
		If the per-byte option is slow but still fast enough, and dealing with the semantics is less faff, then people will go for that because the tiny time loss is worth the larger effort reduction. Of course this assumes the underlying system doesn't change, as with the “making local code to run as on-demand networked code” example higher in the thread, which changes the relative performance characteristics of the two calling methods significantly.
	chadgpt3 2 hours ago | parent | prev [-]
		dd is designed to request a certain block size from the kernel. fread is not and should just multiply the two arguments and read that many bytes, just like calloc.


macintux 34 minutes ago | parent | prev [-]

I assumed it was a simple mistake: it's easy to forget which order the two integers go in.