Remix clone Hacker News

	▲	juliangoldsmith 10 hours ago
		I've been using Azure AI Foundry for an ongoing project, and have been extremely dissatisfied. The first issue I ran into was that they don't support tool calls with LLaMA. Microsoft stated in February that they were working on it [0], and that they were closing the ticket only because they were tracking it internally. I'm not sure why, in over six months, they've been unable to do what took me two hours, but I am sure they wouldn't be upset by me using the much more expensive OpenAI models. There are also consistent performance issues, even on small models, as mentioned elsewhere. This is at a request rate on the order of one per minute. You can solve that with provisioned throughput units. The cheapest option is one of the GPT models, at a minimum of $10k/month (a bit under half the cost of just renting an A100 server). DeepSeek was a minimum of around $72k/month. I don't remember there being any other non-OpenAI models with a provisioned option. Given that current usage without provisioning costs single-digit dollars per month, I have some doubts as to whether we'd be getting our money's worth by provisioning capacity.
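		To put rough numbers on that gap, here is a minimal sketch of the arithmetic. The two provisioned minimums are the figures quoted above; the $5/month pay-as-you-go figure is purely an assumption standing in for "single-digit dollars", and the function names are made up for illustration.

```python
# Provisioned-throughput minimums quoted in the comment (USD per month).
PROVISIONED_MINIMUMS = {
    "GPT (cheapest option)": 10_000,
    "DeepSeek": 72_000,
}

# ASSUMPTION: stand-in for "single-digit dollars per month"; no exact figure is given.
ASSUMED_PAYG_MONTHLY = 5.0


def cost_multiple(provisioned_monthly: float, payg_monthly: float) -> float:
    """Return how many times more the provisioned minimum costs than current usage."""
    if payg_monthly <= 0:
        raise ValueError("pay-as-you-go cost must be positive")
    return provisioned_monthly / payg_monthly


def summarize(payg_monthly: float = ASSUMED_PAYG_MONTHLY) -> dict[str, float]:
    """Map each provisioned option to its cost multiple over pay-as-you-go usage."""
    return {
        name: cost_multiple(minimum, payg_monthly)
        for name, minimum in PROVISIONED_MINIMUMS.items()
    }


if __name__ == "__main__":
    for name, multiple in summarize().items():
        print(f"{name}: about {multiple:,.0f}x current monthly spend")
```

		Under that assumed $5/month, the cheapest provisioned option comes out at roughly 2,000 times current spend. Changing the assumed figure anywhere within single digits still leaves the multiple above 1,000.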