userbinator 6 hours ago
> But at the moment when it lags the system switches from hardware cursor to software cursor (CGCursorIsDrawnInFramebuffer() goes from 0 to 1) so maybe that transition is stalled somehow on Macbook Neo.

With the disclaimer that I have zero knowledge of the MacBook Neo hardware, but I do know a bit about GPUs in general (including having written some GPU-accelerated drivers for Windows and the associated cursor-handling code), I'm going to make a wild guess: this lag is caused by waiting for the GPU command queue to flush.

As a bit of background information: the GPU is fed commands from a queue that the CPU writes to. These commands perform the drawing operations that the GPU is designed to accelerate. A hardware cursor is basically a small bitmap that can be positioned anywhere on the screen and moved around by simply updating position registers (which is normally done per mouse interrupt); the hardware draws it automatically. A software cursor is manually drawn by the graphics stack, which saves what was under it, draws the cursor, and then whenever it needs to be moved, writes the original data back, saves the data at the new position, and then draws the cursor there.

Flushing the command queue is necessary when switching to a software cursor, or otherwise doing software writes to the framebuffer, because you need to wait for the GPU to finish drawing what it has queued, or it may end up drawing over what software wants to draw, including the cursor. Or worse, the command is a blit (e.g. scrolling a window) and you end up with remnants of the cursor at its previous position.
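The save, draw, restore cycle described for a software cursor can be sketched in a few lines. This is a toy model, not code from any real driver: the `SoftwareCursor` class, its method names, and the list-of-lists framebuffer are all hypothetical and chosen only for illustration.

```python
# Toy model of a save-under software cursor; not taken from any real driver.

class SoftwareCursor:
    def __init__(self, framebuffer, sprite):
        self.fb = framebuffer   # list of rows, each a list of pixel values
        self.sprite = sprite    # small 2D list; None marks a transparent pixel
        self.pos = None         # (x, y) of the top-left corner, None if hidden
        self.saved = None       # framebuffer pixels that were under the cursor

    def _region(self, x, y):
        """Yield (fb_x, fb_y, sprite_x, sprite_y) for the on-screen part."""
        for sy, row in enumerate(self.sprite):
            for sx in range(len(row)):
                fx, fy = x + sx, y + sy
                if 0 <= fy < len(self.fb) and 0 <= fx < len(self.fb[0]):
                    yield fx, fy, sx, sy

    def hide(self):
        """Write the saved pixels back, erasing the cursor."""
        if self.pos is None:
            return
        for (fx, fy), value in self.saved.items():
            self.fb[fy][fx] = value
        self.pos, self.saved = None, None

    def move_to(self, x, y):
        """Restore the old position, save under the new one, then draw."""
        self.hide()
        self.saved = {}
        for fx, fy, sx, sy in self._region(x, y):
            self.saved[(fx, fy)] = self.fb[fy][fx]
            pixel = self.sprite[sy][sx]
            if pixel is not None:
                self.fb[fy][fx] = pixel
        self.pos = (x, y)


fb = [[0] * 8 for _ in range(4)]
cursor = SoftwareCursor(fb, [[9, 9], [9, None]])
cursor.move_to(1, 1)   # draws the sprite, remembering the pixels underneath
cursor.move_to(5, 2)   # restores (1, 1), then saves and draws at (5, 2)
```

In this model, anything else that writes into the region under the cursor between the save and the restore is overwritten with stale pixels on the next move. That is the kind of hazard the flush is meant to avoid.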
arghwhat 4 hours ago
The display controller and render device are completely distinct logical devices, even though they are often grouped in a "GPU". On mobile architectures they are quite far separated, leading to annoying problems surrounding what we on Linux call "split drm devices".

Updating plane properties, such as to move the cursor plane around or disable it, would by itself not block on render activities, as they are completely distinct blocks. The render hardware could be powered down, but I doubt powering it up and compositing the cursor would take long enough to cause any noticeable lag.

Under the Linux APIs, updates to the display controller are done through KMS atomic commits, and one mistake you could make display-server side would be to provide a fence in this atomic commit that the scheduler will use to wait on long-running GPU work before using the provided graphics buffers. Under this API, none of the changes - including mouse movements - would then be applied until that fence is signalled. Changing plane associations can lead to resource reallocations that can be a bit heavy.

Not sure if the kernel driver in macOS works in a way remotely similar to this, and the driver could also just be dumb and block on unrelated things ("let's just wait another vblank to see this apply....", "as we only need one plane now let's power down hardware and wait for that to settle..."). It could also just be windowserver that waits for work to finish on its own, not providing any cursor updates in the meantime.

The reality is that it will take reverse engineering or looking at actual code to know what's going on.
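The fence behaviour described for atomic commits can be illustrated with a small simulation. This is a toy model only: `Fence`, `DisplayController`, `atomic_commit`, and `vblank` are hypothetical names that do not mirror the real KMS interface, and the single pending slot is a simplifying assumption.

```python
# Toy model only: these classes are hypothetical and do not mirror the real KMS API.

class Fence:
    """Stands in for a GPU fence that signals when rendering has finished."""

    def __init__(self):
        self.signaled = False

    def signal(self):
        self.signaled = True


class DisplayController:
    def __init__(self):
        # State currently being scanned out.
        self.state = {"cursor_pos": (0, 0), "primary_fb": "frame0"}
        self.pending = None  # at most one commit in flight (an assumption)

    def atomic_commit(self, changes, in_fence=None):
        """Queue a bundle of changes that apply together or not at all."""
        if self.pending is not None:
            raise RuntimeError("previous commit still pending")
        self.pending = (dict(changes), in_fence)

    def vblank(self):
        """Apply the pending commit if its fence has signaled; return True if applied."""
        if self.pending is None:
            return False
        changes, fence = self.pending
        if fence is not None and not fence.signaled:
            return False  # the whole bundle, cursor move included, keeps waiting
        self.state.update(changes)
        self.pending = None
        return True


dc = DisplayController()
render_done = Fence()
dc.atomic_commit({"cursor_pos": (10, 10), "primary_fb": "frame1"}, in_fence=render_done)
dc.vblank()            # fence not signaled: the cursor stays at (0, 0)
render_done.signal()   # long-running GPU work finishes
dc.vblank()            # now both the new buffer and the cursor move apply
```

In this sketch the cursor move is cheap on its own, but because it shares a commit with a buffer that waits on the fence, it is delayed by however long the render work takes.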
bloqs 6 hours ago
This was a really informative and interesting reply articulated in simple enough terms that I am now interested in GPUs, thanks
Someone 2 hours ago
> A hardware cursor is basically a small bitmap that can be positioned anywhere on the screen and moved around by simply updating position registers (which is normally done per mouse interrupt); the hardware draws it automatically

Do modern machines still have custom hardware for cursors? That would surprise me, as a GPU can easily blit a small cursor on top of whatever gets drawn.
jstanley 6 hours ago
But wouldn't the software cursor operations also go in the queue? I don't see the problem.
raphlinus 5 hours ago
This is plausible to me as well. A couple years ago we were trying to make dynamic memory allocation in Vello more robust and explored using async readback of a status buffer. In that case, the async task doesn't wake until the command buffer completes and signals a fence back to the CPU. Long story short, performance was disappointing and we abandoned the approach.

It's easy to believe it's a real problem, especially when there are other factors including GPU being clocked down to save power. Same caveat as parent, I have no direct knowledge of MacBook Neo or this specific issue.
nok22kon 6 hours ago
how do hardware cursors work in a composited desktop? the cursor could just be another small rectangle texture you position on top of the other surfaces. there is no need to read the framebuffer/write into it, it's just a z-stack of 3d surfaces now
charcircuit 6 hours ago
> A software cursor is manually drawn by the graphics stack, which saves what was under it, draws the cursor, and then whenever it needs to be moved, writes the original data back, saves the data at the new position, and then draws the cursor there.

If a hardware layer is not being used, the cursor layer will be treated like any other layer in the compositor. Modern compositors don't try to save and write pixels like that. They just rerender it.

> (which is normally done per mouse interrupt);

It's normally done every frame the compositor makes.

> or it may end up drawing over what software wants to draw

The compositor composites everything that will be shown on the next refresh of the display. Things don't independently step on each other's toes, since it's just the compositor rendering and synchronizing all hardware layers (planes).
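The "just rerender it" approach can be sketched as a compositor that rebuilds each frame from a stack of layers, with the cursor as the topmost one. This is a minimal illustration, assuming a hypothetical `composite` function and a dict-based layer format; real compositors do this on the GPU or with hardware planes.

```python
# Toy compositor: every frame is rebuilt from the layer stack, so no save-under is needed.

def composite(width, height, layers):
    """Render layers bottom-to-top into a fresh frame.

    Each layer is a dict with 'x', 'y', and 'pixels' (2D list, None = transparent).
    """
    frame = [[0] * width for _ in range(height)]
    for layer in layers:  # later layers are drawn on top of earlier ones
        for dy, row in enumerate(layer["pixels"]):
            for dx, pixel in enumerate(row):
                x, y = layer["x"] + dx, layer["y"] + dy
                if pixel is not None and 0 <= x < width and 0 <= y < height:
                    frame[y][x] = pixel
    return frame


window = {"x": 0, "y": 0, "pixels": [[1] * 4 for _ in range(3)]}
cursor = {"x": 1, "y": 1, "pixels": [[9]]}

frame_a = composite(4, 3, [window, cursor])  # cursor drawn over the window at (1, 1)
cursor["x"] = 2                              # moving the cursor only changes layer state
frame_b = composite(4, 3, [window, cursor])  # the next frame shows it at (2, 1)
```

Because each frame starts from the layers rather than from the previous frame's pixels, the old cursor position needs no explicit erasing: the window layer simply shows through again.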