GeekyBear 4 days ago

> Hopefully Apple optimizes Core ML to map transformer workloads to the ANE.

If you want to convert models to run on the ANE there are tools provided:

> Convert models from TensorFlow, PyTorch, and other libraries to Core ML.

https://apple.github.io/coremltools/docs-guides/index.html

ls-a 4 days ago | parent | next [-]

I thought Apple MLX can do that if you convert your model using it https://mlx-framework.org/

woadwarrior01 4 days ago | parent | next [-]

MLX does not support the ANE.

https://github.com/ml-explore/mlx/issues/18

elpakal 4 days ago | parent [-]

Yes it does.

That’s just an issue with stale and incorrect information. Here are the docs https://opensource.apple.com/projects/mlx/

woadwarrior01 3 days ago | parent | next [-]

No, it categorically doesn't. Beyond that, its CPU support is quite limited (fp32 only). Currently there are two ways to target the ANE: Core ML and MPSGraph.

y1n0 4 days ago | parent | prev | next [-]

Nothing in that documentation says anything about the Apple Neural Engine. MLX runs on the GPU.

jychang 4 days ago | parent | prev [-]

None of that uses the ANE.

GeekyBear 4 days ago | parent | prev [-]

It does indeed, and is more modern than Core ML.

coffeecoders 4 days ago | parent | prev [-]

It is less about conversion and more about extending ANE support for transformer-style models or giving developers more control.

The issue is targeting specific hardware blocks. When you convert with coremltools, Core ML takes over scheduling and doesn't give you fine-grained control over whether a given layer runs on the GPU, CPU, or ANE. Also, the ANE wasn't really designed with transformers in mind, so most LLM inference ends up defaulting to the GPU.

aurareturn 4 days ago | parent [-]

Neural Engine is optimized for power efficiency, not performance.

Look for Apple to add matmul acceleration to the GPU instead. That's how to truly speed up local LLMs.