otsaloma 7 hours ago

I've been experimenting with using LLMs for a content recommender system. Specifically, I've built a news reader app that fetches articles from multiple RSS feeds and uses an LLM to first deduplicate and then score them. The user can then rate the articles, and those ratings are used as few-shot examples in the LLM scoring prompt. Any resulting low-score articles (uninteresting to the user) are hidden by default, and the visible ones are scaled by their score on a dynamic CSS grid, like a traditional newspaper front page. Looking good so far, but still testing and tweaking.

https://github.com/otsaloma/news-rss
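
Roughly, the scoring step works like the sketch below (a simplified example using the Anthropic Python SDK, not the actual code in the repo; the prompt wording, field names and model ID are just placeholders):

    # Hypothetical sketch of the scoring step: user ratings go in as few-shot
    # examples, candidates come back as [id, score] pairs on the final line.
    import json
    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    def score_articles(rated, unseen):
        """Return {article_id: score} for unseen articles (0 = boring, 10 = great)."""
        examples = "\n".join(f"- {a['title']} -> rated {a['rating']}/10" for a in rated)
        candidates = "\n".join(f"- [{a['id']}] {a['title']}" for a in unseen)
        prompt = (
            "You score news articles for one particular user.\n"
            "Articles the user has already rated (0 = boring, 10 = great):\n"
            f"{examples}\n\n"
            "Score each candidate below the same way. Reason briefly, then on "
            "the final line output only a JSON array of [id, score] pairs.\n"
            f"{candidates}"
        )
        reply = client.messages.create(
            model="claude-sonnet-4-20250514",  # illustrative model ID
            max_tokens=1024,
            messages=[{"role": "user", "content": prompt}],
        )
        last_line = reply.content[0].text.strip().splitlines()[-1]
        return dict(json.loads(last_line.strip("`")))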

SamDc73 6 hours ago | parent | next [-]

Nice project.

One thing I've always disliked about RSS (and this could actually fix it) is duplicates. When a new LLM model drops, for example, there are ~5 blog posts about it in my RSS feed saying basically the same thing, and I really only need to read one. Maybe you could collapse similar articles by topic?

Also, it would be nice to let users provide a list of feed URLs as a variable instead of hardcoding them.

otsaloma 6 hours ago | parent [-]

Looking at the console messages with the LLM's reasoning, it does seem to work quite nicely for deduplication. Your example is probably even easier than news articles, where you can have many articles covering the same event from different viewpoints.
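
As a toy example of the collapsing you suggested, assuming the model returns its groups as lists of article ids (an invented format, just for illustration):

    # Hedged sketch: collapse duplicates once the LLM has grouped the articles.
    # The [[id, id, ...], ...] group format is an assumption for illustration.

    def collapse_duplicates(articles, groups):
        """Keep one article per LLM-identified group, preserving feed order."""
        keep = {group[0] for group in groups}              # first id in a group wins
        grouped = {i for group in groups for i in group}
        # Articles the model did not mention pass through unchanged.
        return [a for a in articles if a["id"] in keep or a["id"] not in grouped]

    articles = [
        {"id": 1, "title": "Anthropic releases new model"},
        {"id": 2, "title": "New Claude model announced"},
        {"id": 3, "title": "Unrelated local news"},
    ]
    print(collapse_duplicates(articles, groups=[[1, 2], [3]]))
    # -> keeps articles 1 and 3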

I don't actually plan to run this as a service, so some things are hard-coded and the setup is a bit involved: you need an API key and a proxy. Currently it's just experimentation, although if it works well, I'll probably keep using it personally.

andoando 6 hours ago | parent | prev [-]

I've been working on something similar. Have you had any issues with the LLM not giving you back a full response for all the input? I've been using ChatGPT, but sometimes even on the same request I'd give it 20 things to rank and only get back 3 results.

otsaloma 5 hours ago | parent [-]

No, it's been working without problems so far. I'm using Anthropic, for what it's worth. I ask the LLM to first do some reasoning and then return a JSON array on the final line. Sometimes I've seen Markdown backticks there, but no irregularities beyond that.
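
In case it helps, a minimal sketch of how that final line could be parsed, assuming reasoning comes first and the JSON array is at the end, with the occasional Markdown fence stripped off:

    # Hedged sketch of parsing the reply: reasoning first, then a JSON array on
    # the final line, occasionally wrapped in Markdown backticks.
    import json

    def parse_scores(reply_text):
        """Extract the JSON array from the last non-empty line of the reply."""
        lines = [line.strip() for line in reply_text.strip().splitlines() if line.strip()]
        last = lines[-1]
        if last.strip("`") == "" and len(lines) >= 2:  # bare closing fence like ```
            last = lines[-2]
        return json.loads(last.strip("`"))

    print(parse_scores("Reasoning about the articles...\n```\n[[1, 8], [2, 3]]\n```"))
    # -> [[1, 8], [2, 3]]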