I’m pretty sure people are using them for local inference. Token rates can be acceptable if you max out the specs. If it were just for running the harness, they’d use a $20 Raspberry Pi instead.