geuis 5 hours ago
Was evaluating YOLO26 within the last month for its on-device (iPhone 16 Pro) segmentation capabilities. It's decent, but its biggest limitation is that it's only trained on the 80 COCO classes (meaning pre-labeled images). If whatever is in your images isn't in those 80 classes, it's invisible to YOLO26. By contrast, I have SAM2 running on-device and it's my current workhorse. The biggest benefit of SAM2 for me is that it produces fine-grained segmentation masks but isn't trained on class-labeled images. This was a specific requirement for the app I'm building. SAM2 isn't anywhere near as speedy as the native Vision framework APIs, but it is more capable across a vastly wider array of potential image targets.
larodi 5 hours ago
I would prefer Grounding DINO, which does open-vocabulary detection from text prompts; paired with SAM (the Grounded-SAM combo) it gives you segmentation masks too.