magno 3 days ago

I built Wikli (https://www.wikli.com/) - an AI-powered news aggregator that clusters articles semantically and generates daily digests with editorial oversight.

The Problem: News fatigue is real. Reading 50+ articles daily (from hundreds of different sources) to stay informed is unsustainable, but traditional aggregators just dump links without context.

Wikli uses a three-stage pipeline:

1. Scraping & Processing (Cloudflare Workers): RSS feeds → content extraction → AI classification
2. Semantic Clustering (Python): Claude groups related articles across sources into coherent stories
3. Digest Generation: AI synthesizes clusters into readable reports with context and TLDR

Technical Highlights:

- Cloudflare Workers + PostgreSQL for scraping infrastructure
- Hybrid content extraction (Readability + Puppeteer fallback for tricky sites)
- Claude Sonnet 4 for clustering and synthesis (outperformed embedding-based approaches)
- Theme-based filtering with relevance scoring (0-10 scale per article)
- Telegram bot with stateless approval workflow for editorial control
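
To make the hybrid extraction idea concrete, here is a rough sketch of a "cheap extractor first, expensive fallback second" wrapper. It is not Wikli's actual code: the function name `extract_with_fallback`, the `min_chars` threshold, and the two injected callables (standing in for a Readability-style parser and a headless-browser render) are assumptions made for illustration.

```python
from typing import Callable, Optional

# Hypothetical extractor signature: takes a URL, returns article text or None.
Extractor = Callable[[str], Optional[str]]


def extract_with_fallback(
    url: str,
    fast: Extractor,
    slow: Extractor,
    min_chars: int = 200,  # assumed cutoff for "this looks like a real article"
) -> Optional[str]:
    """Try the cheap extractor first; use the expensive one only when needed."""
    try:
        text = fast(url)
    except Exception:
        # A failure in the fast path is treated like an empty result.
        text = None

    if text and len(text.strip()) >= min_chars:
        return text

    # Fast path failed or returned too little text: pay for the slow path.
    return slow(url)
```

The point of the threshold is that a page can "succeed" in the fast path and still return only a cookie banner or a teaser, so length is used here as a crude quality signal.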

What's Different:

- Semantic clustering beats chronological or source-based grouping
- Context from previous digests prevents repetition
- Human-in-the-loop via Telegram for quality control (can edit title/approve digest)
- Open architecture: separate Brief Generator (Python) and Scraper API (TypeScript)
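
One way a stateless approval workflow can work is to pack the action and the digest ID into the button's callback payload, so the bot needs no session storage between messages. The sketch below is an assumption about the general pattern, not a description of Wikli's bot: the action names, the `action:id` format, and both function names are hypothetical. The 64-byte cap reflects Telegram's limit on inline-button callback data.

```python
from typing import Tuple

# Hypothetical set of editorial actions exposed as inline buttons.
ALLOWED_ACTIONS = {"approve", "reject", "edit_title"}
MAX_CALLBACK_BYTES = 64  # Telegram caps callback data at 64 bytes


def encode_callback(action: str, digest_id: int) -> str:
    """Build the payload attached to a button, e.g. 'approve:42'."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    if digest_id < 0:
        raise ValueError("digest_id must be non-negative")
    data = f"{action}:{digest_id}"
    if len(data.encode("utf-8")) > MAX_CALLBACK_BYTES:
        raise ValueError("callback payload too long")
    return data


def decode_callback(data: str) -> Tuple[str, int]:
    """Recover (action, digest_id) from the payload alone, with no stored state."""
    action, sep, raw_id = data.partition(":")
    if not sep or action not in ALLOWED_ACTIONS:
        raise ValueError(f"malformed callback data: {data!r}")
    if not (raw_id.isascii() and raw_id.isdigit()):
        raise ValueError(f"malformed digest id: {raw_id!r}")
    return action, int(raw_id)
```

Since everything needed to act is in the payload, any worker instance can handle the button press; the database row for the digest stays the single source of truth.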

Stack: TypeScript, Python, PostgreSQL, Drizzle ORM, Claude/Gemini APIs

The system handles rate limiting across domains, AI API throttling, and includes a DataManager abstraction for centralized data operations. Currently live in Italian at wikli.com - language-agnostic by design but focused on the Italian market for now. At the moment it is running with two topics (AI innovation and Inter Milano Football Club) via Telegram and the wikli.com website.
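
For anyone curious what per-domain rate limiting might look like, here is a minimal sketch. It assumes a fixed minimum interval between requests to the same host; the class name `DomainRateLimiter`, the `min_interval` parameter, and the injectable `clock` are illustrative choices, not the actual implementation.

```python
import time
from typing import Callable, Dict
from urllib.parse import urlparse


class DomainRateLimiter:
    """Spaces out requests to the same host by at least min_interval seconds."""

    def __init__(
        self,
        min_interval: float = 1.0,  # assumed politeness delay per domain
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._next_allowed: Dict[str, float] = {}  # host -> earliest next slot

    def wait_time(self, url: str) -> float:
        """Reserve the next slot for this URL's host; return seconds to wait."""
        domain = urlparse(url).netloc.lower()
        now = self._clock()
        slot = max(now, self._next_allowed.get(domain, now))
        # Book the slot so the following request to this host queues behind it.
        self._next_allowed[domain] = slot + self.min_interval
        return slot - now
```

A caller would do something like `time.sleep(limiter.wait_time(url))` before each fetch. Different hosts never block each other, which is what lets one scraper cover many sources without hammering any single site.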

Happy to get any feedback.

erezsh 2 days ago

It's only in Italian lol