| jeffbee 13 hours ago |
| What's fast about mmap? |
|
| kennethallen 6 hours ago |
| Two big advantages. First, you avoid an unnecessary copy: a normal read system call gets the data from disk hardware into the kernel page cache and then copies it into the buffer you provide in your process memory. With mmap, the page cache is mapped directly into your process memory, with no copy. Second, all running processes share the mapped copy of the file. There are also a lot of downsides to mmap: you lose explicit error handling and fine-grained control over exactly when I/O happens. Consult the classic article on why sophisticated systems like DBMSs do not use mmap: https://db.cs.cmu.edu/mmap-cidr2022/ |
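A minimal sketch of the copy difference described above, in Python and assuming a POSIX-like system. The file contents and the names `payload`, `copied`, and `view` are made up for illustration. `os.read` fills a buffer owned by the process, while a `memoryview` over an `mmap` object refers to the mapped pages without first copying them into a separate bytes object.

```python
import mmap
import os
import tempfile

payload = b"hello mmap\n" * 1000  # illustrative data, not from the thread

# Create a scratch file to read back in two different ways.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    path = tmp.name

fd = os.open(path, os.O_RDONLY)
try:
    # read(): the kernel copies from its page cache into a new bytes object.
    copied = os.read(fd, len(payload))

    # mmap(): the file's pages are mapped into this process's address space.
    with mmap.mmap(fd, 0, access=mmap.ACCESS_READ) as mapped:
        # A memoryview exposes the mapping without an intermediate copy.
        with memoryview(mapped) as view:
            first_line = bytes(view[:11])  # copies only these 11 bytes
            same = view == payload         # compares against the mapping
finally:
    os.close(fd)
    os.unlink(path)
```

Whether the mapped path is actually faster depends on the access pattern and page-fault cost, which is the tradeoff the rest of the thread argues about.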
| |
| commandersaki 2 hours ago | | > you lose explicit error handling I've never had to use mmap, but this has always been the issue in my head. If you're treating I/O as memory pages, what happens when you read a page and it needs to "fault" by reading the backing storage, but the storage fails to deliver? What can be said at that point, or does the program crash? | |
| saidinesh5 5 hours ago | | This is a very interesting link. I didn't expect mmap to be less performant than read() calls. I now wonder which use cases mmap would suit better, if any... > All running processes share the mapped copy of the file. So something like building linkers that deal with read-only shared libraries, "plugins", etc.? | | |
| squirrellous an hour ago | | One reason to use shared memory mmap is to ensure that even if your process crashes, the memory stays intact. Another is to communicate between different processes. |
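A rough sketch of the inter-process use mentioned above, assuming a Unix system where `os.fork` is available. The scratch file and the names `shared` and `message` are hypothetical. A child process writes through a shared file-backed mapping and exits; the parent then sees the data through the same mapping.

```python
import mmap
import os
import tempfile

SIZE = mmap.PAGESIZE  # one page is enough for this illustration

fd, path = tempfile.mkstemp()
try:
    os.ftruncate(fd, SIZE)        # give the file a page of zero bytes
    shared = mmap.mmap(fd, SIZE)  # Unix default: shared, readable and writable

    pid = os.fork()
    if pid == 0:
        # Child: write into the mapping, then exit without running cleanup.
        try:
            shared[:5] = b"hello"
        finally:
            os._exit(0)

    # Parent: wait for the child to exit, then read what it wrote.
    os.waitpid(pid, 0)
    message = shared[:5]
    shared.close()
finally:
    os.close(fd)
    os.unlink(path)
```

The child has already exited by the time the parent reads, so the sketch also shows the data outliving the process that wrote it.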
|
|
|
| rishabhaiover 11 hours ago |
| It allows the program to reference memory without having to manage it in heap space. It would make the program faster in a memory-managed language; otherwise, it would reduce the memory footprint consumed by the program. |
| |
| jeffbee 11 hours ago | | You mean it converts an expression like `buf[i]` into a baroque sequence of CPU exception paths, potentially involving a trap back into the kernel. | | |
| rishabhaiover 11 hours ago | | I don't fully understand the under-the-hood mechanics of mmap, but I can sense that you're trying to convey that mmap shouldn't be used as a blanket optimization technique, as there are tradeoffs in terms of page fault overheads (being at the mercy of OS page cache mechanics). | | |
| StilesCrisis 7 hours ago | | Tradeoffs such as "if an I/O error occurs, the program immediately segfaults." Also, I doubt you're I/O bound to the point where mmap is noticeably better than read, but I guess it's fine for an experiment. | |
| jibal 10 hours ago | | I think he's conveying that he doesn't know what he's talking about. buf[i] generates the same code regardless of whether mmap is being used. The first access to a page will cause a trap that loads the page into memory, but this is also true if the memory is read into. |
|
|
|