fidotron a day ago
I mean the accuracy with which it locates the bounds. What's extra curious is that it obviously supports rotated cubes, yet it often doesn't use them when it should, which overstates the bounds, as if it's overenthusiastically trying to align things to some inferred axis. This is obviously an attempt at the general case of fitting cubes to anything, but what's disappointing is that performance on boxy objects is lower than I've seen for years from private NNs used for AR and CV (ironically enough on iPads), using just RGB and no depth. I half think the exercise here was to establish whether transformers are the way to go for this, and on the strength of this the answer would be: probably not.
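To put a rough number on the overstatement, here is a minimal sketch, assuming a box rotated only about the vertical axis (yaw). The function names are made up for illustration and have nothing to do with the model's actual output format.

```python
import math


def aabb_extents(width, depth, height, yaw_degrees):
    """Extents of the axis-aligned box enclosing a box rotated about the vertical axis."""
    yaw = math.radians(yaw_degrees)
    c = abs(math.cos(yaw))
    s = abs(math.sin(yaw))
    # Project the rotated footprint onto the fixed x and y axes.
    # Height is unchanged by a pure yaw rotation.
    return (width * c + depth * s, width * s + depth * c, height)


def overstatement_ratio(width, depth, height, yaw_degrees):
    """Volume of the enclosing axis-aligned box divided by the true box volume."""
    ax, ay, az = aabb_extents(width, depth, height, yaw_degrees)
    return (ax * ay * az) / (width * depth * height)


if __name__ == "__main__":
    for yaw in (0, 15, 30, 45):
        ratio = overstatement_ratio(1.0, 1.0, 1.0, yaw)
        print(f"yaw={yaw:>2} deg  volume ratio={ratio:.2f}")
```

Under these assumptions a unit cube turned 45 degrees gets an axis-aligned box with twice its volume, which is the kind of gap you see when the rotated form isn't used.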