visarga 4 days ago

I'd love to see a GraphRAG browser that collects the pages I visit automatically.

_flux 4 days ago | parent | next [-]

Many years ago there used to be a Firefox extension (..or might have even been a Mozilla one..) that would store all the pages I visit. I recall its name was Breadcrumbs but I could be misremembering. Space is cheap, or at least affordable if one would exclude videos, which are probably technically more difficult to archive anyway, but sometimes one remembers having seen content that is never to be found again.

I think it would be useful to have just a basic personal search engine over that kind of content, but possibly a RAG or even a fine-tuned LLM would be even cooler.
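For illustration, a minimal sketch of what such a basic personal search engine could look like: an in-memory inverted index over the text of saved pages. The `PageIndex` class and its method names are hypothetical, invented for this example, and it assumes some other component (an extension, a proxy) has already captured each page's URL, title, and text.

```python
import re
from collections import Counter, defaultdict


class PageIndex:
    """Tiny in-memory inverted index over the text of visited pages."""

    def __init__(self):
        self.postings = defaultdict(Counter)  # term -> {url: occurrences}
        self.titles = {}                      # url -> page title

    @staticmethod
    def tokenize(text):
        # Lowercase and split on anything that is not a letter or digit.
        return re.findall(r"[a-z0-9]+", text.lower())

    def add(self, url, title, text):
        # Note: re-adding the same URL is not handled and would double count.
        self.titles[url] = title
        for term in self.tokenize(title + " " + text):
            self.postings[term][url] += 1

    def search(self, query, limit=5):
        terms = self.tokenize(query)
        if not terms:
            return []
        hits = [self.postings.get(term, Counter()) for term in terms]
        # AND semantics: keep only pages that contain every query term.
        urls = set(hits[0]).intersection(*hits[1:])
        # Rank by total term occurrences, breaking ties by URL.
        ranked = sorted(urls, key=lambda u: (-sum(h[u] for h in hits), u))
        return [(u, self.titles[u]) for u in ranked[:limit]]
```

Calling `index.add(url, title, text)` for each visited page and then `index.search("some words")` returns matching `(url, title)` pairs. A real tool would more likely persist the index on disk and use a proper ranking function; this only shows the shape of the idea.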

Actually, e.g. Firefox could do that at least for its bookmarks and tabs, though it already does provide the function for tagging bookmarks. And I think there's probably an extension for searching tabs' contents..

irthomasthomas 4 days ago | parent | next [-]

Not identical, but I started building a smart bookmark tool that stores the content in vectors and a SQLite DB and hosts them in GitHub issues, with labels managed by the AI. Check it out: https://undecidability.com. The code lives at https://github.com/irthomasthomas/label-maker. It's a bit rough, but there is a working CLI. It uses a local Jina embeddings model, but OpenAI logprobs to determine when to create new labels.
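To make the "vectors plus SQLite" idea concrete, here is a rough sketch that is not taken from the linked repository. It stores one embedding per bookmark as JSON text in SQLite and ranks bookmarks by cosine similarity. Everything here is an assumption for illustration: `toy_embed` is a hashed bag-of-words stand-in for a real embedding model, and the table layout and function names are hypothetical.

```python
import json
import math
import sqlite3
import zlib

DIM = 256  # size of the toy embedding vector


def toy_embed(text):
    """Hypothetical stand-in for a real embedding model: hashed bag of words."""
    vec = [0.0] * DIM
    for word in text.lower().split():
        vec[zlib.crc32(word.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]  # unit length, so dot product = cosine


def open_store(path=":memory:"):
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS bookmarks "
        "(url TEXT PRIMARY KEY, text TEXT, vec TEXT)"
    )
    return db


def add_bookmark(db, url, text):
    # The vector is serialized as JSON so it fits in a plain TEXT column.
    db.execute(
        "INSERT OR REPLACE INTO bookmarks VALUES (?, ?, ?)",
        (url, text, json.dumps(toy_embed(text))),
    )
    db.commit()


def nearest(db, query, k=3):
    """Return the k bookmark URLs most similar to the query text."""
    q = toy_embed(query)
    scored = []
    for url, vec in db.execute("SELECT url, vec FROM bookmarks"):
        score = sum(a * b for a, b in zip(q, json.loads(vec)))
        scored.append((score, url))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [url for _, url in scored[:k]]
```

This scans every row per query, which is fine for a few thousand bookmarks; swapping `toy_embed` for a real model is the only change needed to get meaningful similarity.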

fire_lake 4 days ago | parent | prev | next [-]

Given how personal browsing history can be, this is a great use case for local LLMs. I would love for Mozilla to deliver on this.

jumping_frog 3 days ago | parent [-]

Building a personal assistant could be beneficial to Mozilla, given how much we do online. I would like to track changes to my beliefs based on how I came across new information. In the future, the AI could automatically shorten paragraphs in essays about topics or terms I am already aware of, while keeping newly introduced concepts fully expanded so that I grok them better.

TiredOfLife 4 days ago | parent | prev | next [-]

The original version of Read It Later (now the Mozilla-owned Pocket) had that option, but they later removed it because it went against their commercial interests.

monkeydust 4 days ago | parent [-]

Pocket is good. I use it across all my devices; it's simple and works for me. But I do wonder if they could or should do more with the data they collect from me, which is all the things I really care about.

3abiton 4 days ago | parent [-]

What's the selling point for it though? I don't get it?

gazreese 4 days ago | parent | prev [-]

I need this so much, someone please build it ASAP. This would be so useful!

mehh 3 days ago | parent [-]

Working on it https://ont.fyi

The approach is not to capture every page you view; rather, you add the pages etc. you want, in order to reduce the amount of noise/rubbish. It constructs a knowledge graph from these documents, with a GraphRAG approach on top to enable chat.
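As a rough illustration of the general graph-RAG pattern described here (not the actual implementation behind the site), the sketch below stores facts as subject/relation/object triples, finds the entities mentioned in a question, and collects nearby triples as context text that could be handed to an LLM. The class, the function names, and the naive substring entity matching are all assumptions made for the example.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Facts stored as (subject, relation, object) triples, indexed by entity."""

    def __init__(self):
        self.by_entity = defaultdict(set)  # entity -> triples mentioning it

    def add_triple(self, subject, relation, obj):
        triple = (subject, relation, obj)
        self.by_entity[subject].add(triple)
        self.by_entity[obj].add(triple)

    def neighborhood(self, seeds, hops=1):
        """Collect all triples within `hops` edges of the seed entities."""
        frontier, seen, facts = set(seeds), set(seeds), set()
        for _ in range(hops):
            next_frontier = set()
            for entity in frontier:
                for triple in self.by_entity.get(entity, ()):
                    facts.add(triple)
                    for neighbor in (triple[0], triple[2]):
                        if neighbor not in seen:
                            seen.add(neighbor)
                            next_frontier.add(neighbor)
            frontier = next_frontier
        return facts


def build_context(graph, question, hops=1):
    """Naive retrieval: match entity names in the question, expand the graph."""
    lowered = question.lower()
    seeds = [e for e in list(graph.by_entity) if e.lower() in lowered]
    facts = sorted(graph.neighborhood(seeds, hops))
    return "\n".join(f"{s} {r} {o}." for s, r, o in facts)
```

The string returned by `build_context` would be prepended to the user's question in the prompt. A production system would use proper entity linking and extract the triples from documents with a model; both steps are left out here.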

The core graph is based on Wikidata. You can keep your graphs either private or public; if public, they are published like those you can see on the site now.

Lots to do, but making good progress. If this sounds like something you might want to use, please sign up.

m-s-y 4 days ago | parent | prev | next [-]

I’d love to see a brain interface so that all these pages we visit can instantly become available to our own non-ai in-brain all-human reasoning.

jpt4 3 days ago | parent | prev | next [-]

Local archiving tool I've been testing: webchiver.com

TiredOfLife 4 days ago | parent | prev [-]

According to HN and Reddit, that would be spyware and you are wrong for wanting that.

stogot 4 days ago | parent [-]

Only if it’s turned on by default and uploaded to the cloud. Privacy and user choice are what these readers want.

TiredOfLife 4 days ago | parent [-]

That's exactly what Recall is: offline and fully customizable, but HN/Reddit went mad over it.

ubertaco 3 days ago | parent | next [-]

> offline and fully customizable, but HN/Reddit went mad over it.

...until it isn't.

A self-hosted open-source project you can download and run (or compile yourself and then run) is very different from a closed-source OS-level component that's developed by a for-profit company that makes at least some portion of its revenue on ads.

Twitter was "the public square of the web", until it wasn't. Google Reader was a best-in-class easy RSS reader, until it wasn't.

If you don't have the source code, you don't own or control the software. And when you don't own or control the software, it's reasonable to have more-guarded views on what data you're willing to give to that software.

If that software suddenly appears installed on your machine, constantly recording your screen and running entirely-opaque "AI processing" on it, unless you go through a series of steps to opt out...it's reasonable to be upset, because the opportunity to choose what you're willing to share has been denied to you.

And since it's a closed-source OS component, it's only something you can opt out from....until it isn't.

woodson 3 days ago | parent | prev [-]

They got mad because you got Recall in an update, no matter whether you wanted it or not, and after another update you couldn’t uninstall it anymore. No choice.

TiredOfLife 3 days ago | parent [-]

Recall isn't even released yet.