danpalmer 4 days ago
I have looked at MTurk many times throughout my career. In particular, my previous company had a lot of data cleaning, scraping, product tagging, and image description, with machine learning built on top of these. This was all pre-LLM.

MTurk always felt like it would be a great solution, but every time I looked at it I talked myself out of it. The docs really downplayed the level of critical thinking that we could expect. They made it clear that you couldn't trust any single result to even human-error levels, so you needed to run each task 3-5 times and "vote". You couldn't really get good results for unstructured outputs; instead, it was designed around classification across a small number of options. The bidding also made pricing hard to estimate.

In the end we hired a company that sat somewhere between MTurk and fully skilled outsourcing. We trained the team in our specific needs, and they would work through data processing when available, ask clarifying questions on Slack, and reference a huge Google doc in which we had documented various disambiguations and edge cases. They were excellent. More expensive than MTurk on the surface, but likely cheaper in the long run, because the results were essentially as correct as anyone could get them and we didn't need to check their work much.

In this way I wonder if MTurk never found great product-market fit. It languished in AWS's portfolio for most of 20 years. Maybe it was just too limited?