Show HN: WebGPU enables local LLM in the browser – demo site with AI chat (andreinwald.github.io)
137 points by andreinwald 2 days ago | 50 comments
Browser LLM demo running on JavaScript and WebGPU. WebGPU is already supported in Chrome, Safari, Firefox, iOS (v26) and Android.
Demo, similar to ChatGPT: https://andreinwald.github.io/browser-llm/
Code: https://github.com/andreinwald/browser-llm
- No need to use your OPENAI_API_KEY: it's a local model that runs on your device
- No network requests to any API
- No need to install any program
- No need to download files to your device (the model is cached in the browser)
- The site will ask before downloading large files (the LLM model) to the browser cache
- Hosted on GitHub Pages from this repo: secure, because you can see what you are running
maxmcd 2 days ago
Looks like this is a wrapper around: https://github.com/mlc-ai/web-llm
Which has a full web demo: https://chat.webllm.ai/
echoangle a day ago
Fun demo, but the model that's used seems to be pretty stupid:
> What's the best way to get to space?
>> Unfortunately, it's not currently possible for humans to travel to space in the same way that astronauts do. While there have been several manned missions to space, such as those to the International Space Station, the technology and resources required to make interstellar travel feasible are still in the early stages of development.
>> Currently, the fastest spacecraft is the Apollo missions, which took around 77 hours to reach the Moon. From there, they traveled to other planets, but the distances between them are so great that even at high speeds, it would take many years to reach them.
>> As for traveling to space itself, there are no commercial or private companies that offer space travel services, and the cost of such a journey is estimated to be in the tens of millions of dollars per person.
>> However, there are some concepts and technologies being explored that might one day make space travel more accessible and affordable. For example, private companies like SpaceX and Blue Origin are working on developing reusable rockets that could potentially take people to space more quickly and reliably.
>> But for now, let's just say that space travel is still in the future, and we'll have to wait a bit longer for humans to venture into the cosmos!
scottfr a day ago
There is a Prompt API in development that's available in both Chrome and Edge to give access to a local LLM. Chrome extensions have access to it and I believe websites can request access as part of an origin trial. The model is fully managed by the browser. It's currently the Gemini Nano model on Chrome, and they are testing a version of the Gemma 3n model in beta channels. Edge uses phi-4-mini. More information is available here: https://github.com/webmachinelearning/prompt-api
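For illustration, a minimal sketch of what calling such a browser-managed model might look like. The `LanguageModel` global with `availability()`, `create()` and `prompt()` follows the linked explainer as far as I know, but the API is still in development and its shape has changed between browser versions, so treat every name here as an assumption. The `LanguageModelLike` and `PromptSession` types and the `askLocalModel` helper are hypothetical, introduced only so the sketch compiles on its own.

```typescript
// Hypothetical structural types that mirror the explainer's `LanguageModel`
// global; they are assumptions, not official typings.
interface PromptSession {
  prompt(input: string): Promise<string>;
}
interface LanguageModelLike {
  availability(): Promise<string>;
  create(): Promise<PromptSession>;
}

// Returns the model's answer, or null when no browser-managed model is usable.
async function askLocalModel(
  lm: LanguageModelLike | undefined,
  question: string
): Promise<string | null> {
  if (!lm) return null; // The browser does not expose the API at all.
  if ((await lm.availability()) === "unavailable") return null;
  const session = await lm.create(); // May trigger a model download.
  return session.prompt(question);
}
```

In a supporting browser this might be called as `askLocalModel((globalThis as any).LanguageModel, "Why is the sky blue?")`.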
petermcneeley a day ago
This demo only works if you have the WebGPU feature "f16". You can find out if you have this by checking the feature list at https://webgpureport.org/. The page itself could of course check for this, but since f16 support is common they probably just didn't bother.
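A page-side check along these lines could be sketched as follows. In the WebGPU API the feature is named `"shader-f16"` and is exposed through `adapter.features`. The `AdapterLike` and `GpuLike` types and both function names are hypothetical, defined here only so the sketch compiles without the `@webgpu/types` package; this is not how the demo itself is written.

```typescript
// Minimal structural types so the sketch compiles without @webgpu/types.
interface AdapterLike {
  features: { has(name: string): boolean };
}
interface GpuLike {
  requestAdapter(): Promise<AdapterLike | null>;
}

// WebGPU reports 16-bit float shader support as the "shader-f16" feature.
function hasShaderF16(adapter: AdapterLike | null): boolean {
  return adapter !== null && adapter.features.has("shader-f16");
}

// Returns a short status string a page could show before loading a model.
async function describeF16Support(gpu: GpuLike | undefined): Promise<string> {
  if (!gpu) return "WebGPU is not available in this browser";
  const adapter = await gpu.requestAdapter();
  if (!adapter) return "No WebGPU adapter found";
  return hasShaderF16(adapter)
    ? "shader-f16 supported"
    : "shader-f16 missing: f16 model builds will not run";
}
```

In a browser this would presumably be called as `describeF16Support(navigator.gpu)` before starting the model download.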
AndrewDucker a day ago
I asked "Why is the sky blue?" and got back a response of "coppia RR TalentDevExpressincer+'.//////////////////////////////////////////////////////////////////////// cha ولا.AutoSizesaving proleงคicate Like"/> infos эти za cornerback economical (%]\ enumligne.execRELEASEPropagation_station Bucks проHEME seas GASPOST[Unit(suffix Gloves" (and so on, for a few more paragraphs). Am I missing something?
201984 10 hours ago
Any way to make this use my actual GPU instead of the terrible integrated one?
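On the page side, WebGPU lets a site pass `powerPreference: "high-performance"` to `requestAdapter()`. This is only a hint: the browser or OS may ignore it, so a per-application GPU setting in the OS may still be needed. The sketch below assumes that option; the `PowerPreference` and `AdapterSource` types and the `requestPreferredAdapter` helper are hypothetical names, and nothing here reflects what the demo actually does.

```typescript
type PowerPreference = "low-power" | "high-performance";
interface AdapterOptions {
  powerPreference?: PowerPreference;
}
// Structural stand-in for `navigator.gpu`, generic over the adapter type.
interface AdapterSource<A> {
  requestAdapter(options?: AdapterOptions): Promise<A | null>;
}

// Ask for the high-performance (typically discrete) GPU first, then fall
// back to whatever adapter the browser offers by default.
async function requestPreferredAdapter<A>(
  gpu: AdapterSource<A>
): Promise<A | null> {
  const preferred = await gpu.requestAdapter({
    powerPreference: "high-performance",
  });
  if (preferred !== null) return preferred;
  return gpu.requestAdapter();
}
```

Which adapter was actually selected can be checked on https://webgpureport.org/ in the same browser.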
dataviz1000 a day ago
Very cool! For some awesome examples of running different types of models in WebGPU, check out the examples folder here: https://github.com/huggingface/transformers.js
andreinwald a day ago
Model used: Llama 3.2 1B (small). Quality should be similar to running the Ollama app with the same small model.
gulan28 a day ago
I did this with mlc @ https://wiz.chat some time ago. Warning: it has a llama 3.1 7b model and is around 4 GB. It needs either a GPU or a Mac and works only on Chrome.
asim a day ago
What's the performance of a model like this vs an OpenAI API? What's the comparable here? Edit: I see it's the same models that you'd run locally using Ollama or something else. So basically just constrained by the size of the model, GPU and perf of the machine.
apitman a day ago
Does anyone know why all of these WebGPU LLM demos have you download the models to browser storage rather than letting you open a gguf already on your local drive? I have several models downloaded already that I would be interested in trying.
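Reading a user-selected file is possible in principle through the browser's File API. The sketch below only validates the start of a GGUF file, which begins with the ASCII bytes `GGUF` followed by a little-endian 32-bit version number. It does not answer why the demos avoid this, and whether a given in-browser runtime can load GGUF weights at all is a separate question. The `readGgufHeader` and `inspectLocalModel` helpers and the `SliceableFile` type are hypothetical names.

```typescript
interface GgufHeader {
  version: number;
}

// Parses the first 8 bytes of a GGUF file: "GGUF" magic + uint32 LE version.
// Returns null when the buffer is too short or the magic does not match.
function readGgufHeader(buffer: ArrayBuffer): GgufHeader | null {
  if (buffer.byteLength < 8) return null;
  const view = new DataView(buffer);
  const magic = String.fromCharCode(
    view.getUint8(0),
    view.getUint8(1),
    view.getUint8(2),
    view.getUint8(3)
  );
  if (magic !== "GGUF") return null;
  return { version: view.getUint32(4, true) };
}

// Structural stand-in for a browser `File` chosen via <input type="file">.
interface SliceableFile {
  slice(start: number, end: number): { arrayBuffer(): Promise<ArrayBuffer> };
}

// Reads only the header bytes, so a multi-gigabyte model is not loaded into
// memory just to check its type.
async function inspectLocalModel(file: SliceableFile): Promise<GgufHeader | null> {
  const head = await file.slice(0, 8).arrayBuffer();
  return readGgufHeader(head);
}
```

A page could pass `input.files[0]` from a file picker to `inspectLocalModel` and only proceed when the result is not null.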
om8 a day ago
To have GPU inference, you need a GPU. I have a demo that runs 8B Llama on any computer with 4 gigs of RAM.
cat-whisperer a day ago
I've been following the development of WebGPU and its potential applications, and this demo is a great example of what's possible.
cgdl a day ago
Which model does the demo use?
RagnarD a day ago
Cool idea but badly broken from a little testing.
andsoitis 2 days ago
Very cool. An improvement would be if the input text box were always on screen, rather than having to manually scroll down as the screen fills.
pjmlp a day ago
Beware of opening this on mobile Internet.