refulgentis | 4 days ago
[flagged]
HenryNdubuaku | 4 days ago | parent
We are following Ollama's design, but not verbatim, since mobile apps are sandboxed. Phones are resource-constrained and we saw significant battery overhead with in-process HTTP listeners, so we stuck with simple stateful isolates in Flutter; for React we are exploring a standalone server app that others can talk to.

For model sharing with the current setup:

iOS - we are working towards writing the model into an App Group container. It is tricky, but we are working around it.

Android - we are working towards prompting the user once for a SAF directory (e.g., /Download/llm_models), saving the model there, then publishing a ContentProvider URI for zero-copy reads (rough sketches of this flow follow the comment).

We are already writing more mobile-friendly kernels and tensors, but GGML/GGUF is widely supported, so porting it is an easy way to get started and collect feedback. We will move away from it completely in under 2 months.

Anything else you would like to know?
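[Editor's note: the Android flow described above is concrete enough to sketch. The following is not from the Cactus codebase; it is a minimal illustration of the standard SAF pattern (prompt once for a directory tree, persist the grant, save the model there). `ModelStorage`, `PICK_MODEL_DIR`, and the file names are hypothetical.]

```kotlin
import android.app.Activity
import android.content.Intent
import android.net.Uri
import androidx.documentfile.provider.DocumentFile

// Hypothetical helper; names are illustrative, not Cactus APIs.
class ModelStorage(private val activity: Activity) {
    companion object { const val PICK_MODEL_DIR = 1001 }

    // Step 1: prompt the user once for a directory (e.g. Downloads/llm_models).
    fun requestModelDirectory() {
        activity.startActivityForResult(
            Intent(Intent.ACTION_OPEN_DOCUMENT_TREE), PICK_MODEL_DIR)
    }

    // Step 2: with the tree URI from onActivityResult, persist the grant so
    // access survives app restarts, then create the model file in the tree.
    fun saveModel(treeUri: Uri, modelBytes: ByteArray) {
        activity.contentResolver.takePersistableUriPermission(
            treeUri,
            Intent.FLAG_GRANT_READ_URI_PERMISSION or
                Intent.FLAG_GRANT_WRITE_URI_PERMISSION)
        val dir = DocumentFile.fromTreeUri(activity, treeUri) ?: return
        val file = dir.createFile("application/octet-stream", "model.gguf") ?: return
        activity.contentResolver.openOutputStream(file.uri)?.use { out ->
            out.write(modelBytes) // in practice the download would be streamed
        }
    }
}
```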
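[Editor's note: a companion sketch of the "ContentProvider URI for zero-copy reads" half, again hypothetical. The key idea is that `openFile` hands the client a file descriptor, so the client reads or mmaps the model directly and no bytes are proxied through the provider's process. `ModelProvider` and the file location are assumptions.]

```kotlin
import android.content.ContentProvider
import android.content.ContentValues
import android.database.Cursor
import android.net.Uri
import android.os.ParcelFileDescriptor
import java.io.File

// Hypothetical read-only provider exposing the saved model to other apps.
class ModelProvider : ContentProvider() {
    override fun onCreate(): Boolean = true

    // Return a file descriptor instead of streaming bytes: the caller can
    // read or mmap the model directly, which is what makes this "zero-copy"
    // from the provider's point of view.
    override fun openFile(uri: Uri, mode: String): ParcelFileDescriptor {
        val file = File(context!!.getExternalFilesDir(null), "llm_models/model.gguf")
        return ParcelFileDescriptor.open(file, ParcelFileDescriptor.MODE_READ_ONLY)
    }

    // Unused for a read-only, single-file provider.
    override fun query(uri: Uri, projection: Array<String>?, selection: String?,
                       selectionArgs: Array<String>?, sortOrder: String?): Cursor? = null
    override fun getType(uri: Uri): String = "application/octet-stream"
    override fun insert(uri: Uri, values: ContentValues?): Uri? = null
    override fun delete(uri: Uri, selection: String?, selectionArgs: Array<String>?): Int = 0
    override fun update(uri: Uri, values: ContentValues?, selection: String?,
                        selectionArgs: Array<String>?): Int = 0
}
```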