GeekyBear 4 days ago

> Hopefully Apple optimizes Core ML to map transformer workloads to the ANE.

If you want to convert models to run on the ANE there are tools provided:

> Convert models from TensorFlow, PyTorch, and other libraries to Core ML.

https://apple.github.io/coremltools/docs-guides/index.html

ls-a 4 days ago | parent | next [-]

I thought Apple MLX can do that if you convert your model using it https://mlx-framework.org/

woadwarrior01 4 days ago | parent | next [-]

MLX does not support the ANE.

https://github.com/ml-explore/mlx/issues/18

elpakal 4 days ago | parent [-]

Yes it does.

That’s just an issue with stale and incorrect information. Here are the docs https://opensource.apple.com/projects/mlx/

woadwarrior01 3 days ago | parent | next [-]

No, it categorically doesn't. Beyond that, its CPU support is quite limited (fp32 only). Currently there are two ways to target the ANE: Core ML and MPSGraph.

y1n0 4 days ago | parent | prev | next [-]

Nothing in that documentation says anything about the Apple Neural Engine. MLX runs on the GPU.

jychang 4 days ago | parent | prev [-]

None of that uses the ANE.

GeekyBear 4 days ago | parent | prev [-]

It does indeed, and is more modern than Core ML.

coffeecoders 4 days ago | parent | prev [-]

It is less about conversion and more about extending ANE support for transformer-style models or giving developers more control.

The issue is targeting specific hardware blocks. When you convert with coremltools, Core ML takes over scheduling and doesn't give you fine-grained control over whether a given layer runs on the GPU, CPU, or ANE. Also, the ANE wasn't really designed with transformers in mind, so most LLM inference ends up defaulting to the GPU.

aurareturn 4 days ago | parent [-]

Neural Engine is optimized for power efficiency, not performance.

Look for Apple to add matmul acceleration to the GPU instead. That's how to truly speed up local LLMs.