Veserv | 9 hours ago
You do not schedule a timeout on each buffered write. You schedule a single timeout on the transition from empty to non-empty, and it is retired either when the timeout fires or when a threshold flush occurs (you may choose not to clear it on threshold flush if timeout management is expensive). So you program at most one timeout per timeout period or threshold flush. The point is to guarantee data gets flushed promptly, which only fails to happen on its own when not enough data gets buffered; the timeout is a fallback that bounds the flush latency.
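A minimal sketch of the pattern described above, with a condvar-woken flusher thread standing in for the timer. The FLUSH_THRESHOLD and MAX_LATENCY constants are hypothetical, and println is a stand-in for the real sink (write(2), io_uring, etc.):

    use std::sync::{Arc, Condvar, Mutex};
    use std::time::Duration;

    const FLUSH_THRESHOLD: usize = 64 * 1024;                // hypothetical size threshold
    const MAX_LATENCY: Duration = Duration::from_millis(5);  // hypothetical latency bound

    struct Shared {
        buf: Mutex<Vec<u8>>,
        nonempty: Condvar, // signalled only on the empty -> non-empty transition
    }

    fn flush(buf: &mut Vec<u8>) {
        if buf.is_empty() {
            return; // timer fired after a threshold flush already drained the buffer
        }
        println!("flushing {} bytes", buf.len()); // stand-in for the real sink
        buf.clear();
    }

    // Hot path: no timer arithmetic per write, just a buffer append plus
    // one notification when the buffer goes from empty to non-empty.
    fn write(shared: &Shared, data: &[u8]) {
        let mut buf = shared.buf.lock().unwrap();
        let was_empty = buf.is_empty();
        buf.extend_from_slice(data);
        if buf.len() >= FLUSH_THRESHOLD {
            flush(&mut buf); // threshold flush; the pending timeout then finds nothing to do
        } else if was_empty {
            shared.nonempty.notify_one(); // "arm" the single fallback timeout
        }
    }

    // Fallback timer: roughly one timeout per empty -> non-empty transition,
    // bounding how long buffered data can sit before it is flushed.
    fn flusher(shared: &Shared) {
        loop {
            let mut buf = shared.buf.lock().unwrap();
            while buf.is_empty() {
                buf = shared.nonempty.wait(buf).unwrap();
            }
            drop(buf); // release the lock while the timeout runs
            std::thread::sleep(MAX_LATENCY);
            flush(&mut shared.buf.lock().unwrap());
        }
    }

    fn main() {
        let shared = Arc::new(Shared {
            buf: Mutex::new(Vec::new()),
            nonempty: Condvar::new(),
        });
        let s = Arc::clone(&shared);
        std::thread::spawn(move || flusher(&s));
        for i in 0u64..1_000 {
            write(&shared, &i.to_le_bytes()); // slow producer: the fallback timer does the flushing
            std::thread::sleep(Duration::from_micros(50));
        }
        std::thread::sleep(MAX_LATENCY * 2); // let the fallback drain the tail
    }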
vlovich123 | 9 hours ago | parent
Yes, that can work, but as I said it has trade-offs. If you flush before the buffer is full, you're sacrificing throughput. The timer firing also carries its own performance cost, especially if you're in libc land and only have SIGALRM available. So when an additional write arrives, you want to push the timer out. But re-arming the timer requires reading the current time, among other things, and at write rates of 10-20 MHz and up reading the wall clock gets expensive. Even rdtsc-based approaches start to struggle at 20-40 MHz. You obviously don't want to do it on every write, but you want to make sure the timer never actually fires if you're producing data quickly enough to fill the buffer within a reasonable time anyway.

Source: I implemented write coalescing in my NoSQL database, which can sustain a few billion 8-byte writes per second into an in-memory buffer. Once the buffer is full or a timeout occurs, a flush to disk is triggered, and I net out at around 100M writes/s (sorting the data for the LSM is one of the main bottlenecks). By comparison, DBs like RocksDB can do ~2M writes/s and SQLite ~800k.
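One way to approximate the "push the timer out without touching the clock on the hot path" behaviour described above (not the referenced implementation; the constants and the writes-per-period heuristic are hypothetical): the write path only bumps a relaxed atomic counter, and a periodic timer thread skips its forced flush whenever the producer has clearly been fast enough for the size threshold to do the work instead.

    use std::sync::atomic::{AtomicU64, Ordering};
    use std::time::Duration;

    // Hypothetical knobs: skip the forced flush if at least this many writes
    // landed during one timer period (the size threshold will flush instead).
    const MIN_WRITES_PER_PERIOD: u64 = 1_000;
    const TIMER_PERIOD: Duration = Duration::from_millis(5);

    static WRITE_GEN: AtomicU64 = AtomicU64::new(0);

    // Hot path: no clock read, no timer re-arm -- a single relaxed increment.
    fn on_write() {
        WRITE_GEN.fetch_add(1, Ordering::Relaxed);
    }

    // Timer thread: the only place that deals with time. If the producer has
    // been fast, the timer is effectively "pushed out" by doing nothing.
    fn timer_loop(flush: impl Fn()) {
        let mut last_gen = WRITE_GEN.load(Ordering::Relaxed);
        loop {
            std::thread::sleep(TIMER_PERIOD);
            let now_gen = WRITE_GEN.load(Ordering::Relaxed);
            if now_gen - last_gen < MIN_WRITES_PER_PERIOD {
                flush(); // producer slow or idle: bound the latency of buffered data
            }
            last_gen = now_gen;
        }
    }

    fn main() {
        std::thread::spawn(|| timer_loop(|| println!("fallback flush")));
        for _ in 0..50_000u64 {
            on_write();
            std::thread::sleep(Duration::from_micros(20));
        }
    }

At the write rates quoted above, a real system would likely also shard the counter per core to avoid cache-line contention on the atomic.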