| ▲ | schrodinger 3 days ago |
| I've always thought that a problem with sites like Reddit and Hacker News is that a very small percentage of users engage with "new"; most people only see the posts that have been curated by that small minority which creates _some_ sort of bias (arguably, a positive one). I've wanted to try something like a Hacker News where your homepage shows a random smattering of posts where the probability you'll see any particular one depends on its number of likes. In other words, rather than having a firehose of "new" posts from which a few are elevated to the home page (masses), give everyone a dynamic home page which is mostly items that have been liked by many, but includes a mix of some that haven't made that threshold yet. Maybe instead of pure likes it could be a ratio of likes to views. But the point is some way to engage everyone in the selection of what makes the homepage. It could even be as simple as "keep HN as is, but include 5 posts randomly chosen from recent submissions and tag them as such." Dang, has anything like this been considered? |
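A minimal sketch of the like-weighted homepage described above, assuming each post is a dict with hypothetical `id` and `likes` fields. The `smoothing` parameter is an added assumption so that posts with zero likes still get a small chance of being shown; nothing here reflects how HN actually ranks.

```python
import random

def sample_homepage(posts, k, smoothing=1.0, rng=None):
    """Pick up to k distinct posts; each draw is weighted by likes + smoothing."""
    rng = rng or random.Random()
    pool = list(posts)  # copy so the caller's list is left untouched
    chosen = []
    while pool and len(chosen) < k:
        # Hypothetical weighting: raw like count plus a constant floor.
        weights = [p["likes"] + smoothing for p in pool]
        idx = rng.choices(range(len(pool)), weights=weights, k=1)[0]
        chosen.append(pool.pop(idx))  # remove the pick so it cannot repeat
    return chosen
```

Swapping the weight for a likes-to-views ratio, as suggested in the comment, would only change the `weights` line.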
|
| ▲ | senko 3 days ago | parent | next [-] |
| > dynamic home page which is mostly items that have been liked by many, but includes a mix of some that haven't made that threshold yet. Maybe instead of pure likes it could be a ratio of likes to views. You've just invented much-hated engagement-maxxing The Algorithm. |
| |
| ▲ | sdwr 2 days ago | parent [-] | | If you want to go one step further, how about tracking users' engagement with the randomized posts, and using it to personalize what they see? Wow, I just invented TikTok and Instagram. | | |
| ▲ | schrodinger 7 hours ago | parent [-] | | I had the exact same realization after I posted, if you see below (or click this https://news.ycombinator.com/item?id=44998664). However, I wonder why it hasn't been done by text-based platforms like Reddit and Hacker News? Reddit has the drop down with "New" items (for the "curators" to watch) and "Best" (presumably for once the items have been elevated to the front page). HN has the same: the "New" link versus the home page. Twitter is slightly different, and maybe more of what I'm thinking of, but the content is very different: short form, and original content rather than links to finds with an accompanying discussion. Similar, but not the same. I wonder if that tells me that it only works with "doomscroll" content: the kind of content that's plentiful, short, and takes very little commitment to read each piece (and therefore very little time lost for a "poor" suggestion)? Or if there's something fundamentally different I'm missing? |
|
|
|
| ▲ | Induane 3 days ago | parent | prev | next [-] |
| Curation IS arguably one of the most critical factors. A library is useful because of what it doesn't contain, not because of what it does contain. The hypothetical "library of all possible books" isn't useful to anyone. That's a long way of agreeing with you that there is a positive in the curation bias of Hacker News and other sites. Of course anything can be hijacked, and metrics proverbially tend towards becoming targets (and hence a dumb arms race), but the general concept of the value of curation is sound. |
| |
| ▲ | progval 3 days ago | parent | next [-] | | > The hypothetical "library of all possible books" isn't useful to anyone. That's an archive, and it has its own uses for researchers, especially historians. | | |
| ▲ | hexaga 3 days ago | parent | next [-] | | In the spirit of the library, which contains both your comment and mine: > The hypothetical "library of all possible books" isn't useful to anyone. That's not an archive, and has no uses even for researchers, especially not for historians. | |
| ▲ | 400thecat 2 days ago | parent | prev [-] | | a library containing all possible books is no more useful than having a random number generator |
| |
| ▲ | schrodinger 3 days ago | parent | prev [-] | | Thanks -- totally agree that curation is essential, and I suspect my original point may have come across as advocating against curation, which I wasn’t. My goal isn’t to randomize the homepage or flatten quality, but to involve a broader swath of users in the curation process. It’s currently dominated by the few who browse “new”, essentially a self-selected minority of curators. Concretely, I was imagining something like:
* Every new post is shown to a small % of users as part of their regular homepage (not in a “new” tab they’d have to seek out).
* Posts that get engagement from that slice are shown to more users, and so on — a gradual ramp-up based on actual interest rather than early-bird luck. So it’s not removing filtering; it’s just moving from a binary gate (past the goalpost = homepage) to a more continuous, probabilistic exposure curve. Curation still happens, but more people get to participate in it, and the system becomes more robust to time-of-day luck or early vote pile-ons. Anyway, I mostly wanted to clarify that I’m not against filtering -- just having a thought experiment about how we might make it more adaptive and inclusive. Does that clarify my point? Any thoughts? I appreciate your engagement! | | |
| ▲ | schrodinger 3 days ago | parent [-] | | Apologies for the poor formatting in the parent post; I can't edit it now, but here it is as intended: Thanks -- totally agree that curation is essential, and I suspect my original point may have come across as advocating against curation, which I wasn’t.
My goal isn’t to randomize the homepage or flatten quality, but to involve a broader swath of users in the curation process. It’s currently dominated by the few who browse “new”, essentially a self-selected minority of curators. Concretely, I was thinking something like:
* Every new post is shown to a small % of users as part of their regular homepage (not in a “new” tab they’d have to seek out).
* Posts that get engagement from that slice are shown to more users, and so on: a gradual ramp-up based on actual interest rather than early-bird luck.
So it’s not removing filtering; it’s just moving from a binary gate (past the goalpost = homepage) to a more continuous, probabilistic exposure curve. Curation still happens, but more people get to participate in it, and the system becomes more robust to time-of-day luck or early vote pile-ons. Anyway, I mostly wanted to clarify that I’m not against filtering -- just having a thought experiment about how we might make it more adaptive and inclusive. Does that clarify my point? Any thoughts? I appreciate your engagement! |
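The two-step ramp-up in this comment could be sketched roughly as follows. Everything here is an assumption made for illustration: the function names, the hash-based bucketing, and the `min_views`, `target_rate`, and `growth` thresholds are hypothetical, and the sketch only grows or holds a post's exposure rather than modelling decay.

```python
import hashlib

def in_exposure_slice(user_id: str, post_id: str, fraction: float) -> bool:
    """Deterministically decide whether this user is in the post's current slice."""
    digest = hashlib.sha256(f"{post_id}:{user_id}".encode()).digest()
    # Map the first 8 bytes of the hash to a stable value in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < fraction

def next_fraction(fraction, views, likes, min_views=50, target_rate=0.05, growth=2.0):
    """Widen the slice once enough viewers have seen the post and engaged with it."""
    if views < min_views:
        return fraction  # not enough data yet, keep the slice as is
    if likes / views >= target_rate:
        return min(1.0, fraction * growth)  # ramp up, capped at everyone
    return fraction  # below the (hypothetical) target rate: hold
```

Because each user's bucket is fixed per post, anyone who saw the post at a small fraction keeps seeing it as the fraction grows, which gives the continuous exposure curve rather than a binary gate.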
|
|
|
| ▲ | positron26 3 days ago | parent | prev | next [-] |
| This is a common idea, and it is a super-hard problem. Sites like Reddit are increasingly sampling by throwing more new posts into hot. That is a solution that may improve yet not really change HN. Ultimately, we don't all want the same feeds and user majorities tend to stifle key minorities such as early adopters. I'm building such a system for PrizeForge and am several more steps ahead on this solution. You need a decent background in probability and a more robust understanding of what a "good" outcome is to work on this kind of thing.
https://positron.solutions/careers if you do Rust and care a whole lot about this problem and can deal with the challenges we're going to have with PrizeForge. |
|
| ▲ | jacobobryant 3 days ago | parent | prev | next [-] |
| I've been working on this kind of thing over the past several years (for a while full time as an attempted entrepreneur, now on the side for the past couple years). The latest iteration is https://yakread.com -- hit "take a look around" and you can see the "home page"/a list of recommendations without signing up. The recommendations are personalized, i.e. the probability you'll see any particular post depends on your individual interactions with past posts, if you've signed up (it does collaborative filtering with Spark MLlib). So that may be a bit different from what you had in mind, since your comment sounds more like an unpersonalized system, but with some extra exploration thrown in. However, in practice I suspect the biggest thing the collaborative filtering is doing at Yakread's current scale (not much) is learning which items are good/bad in general. I also do have some methods baked in for doing exploration. "Epsilon greedy" is a common simple approach where x% of the recommendations are purely random. I do a bit more of a linear thing where I rank all the posts by how many times they've been recommended, then I pick a percentage x between 0 and 100, then I throw out the top x% most popular (previously recommended) items. That also gives you some flexibility to try out different distributions for the x% variable. The source is at https://github.com/jacobobryant/yakread |
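For readers unfamiliar with the two exploration strategies mentioned here, a rough sketch follows. It is not taken from the Yakread source: the function names, the `score` and `times_recommended` callables, and the uniform draw for x are assumptions made for illustration.

```python
import random

def epsilon_greedy(items, score, k, epsilon=0.1, rng=None):
    """Fill k slots: with probability epsilon pick at random, else take the best left."""
    rng = rng or random.Random()
    ranked = sorted(items, key=score, reverse=True)
    picks = []
    while ranked and len(picks) < k:
        explore = rng.random() < epsilon
        idx = rng.randrange(len(ranked)) if explore else 0
        picks.append(ranked.pop(idx))
    return picks

def drop_most_recommended(items, times_recommended, rng=None):
    """Draw x in [0, 100] and discard the top x% most-recommended items."""
    rng = rng or random.Random()
    x = rng.uniform(0, 100)  # assumed distribution; others could be tried here
    by_exposure = sorted(items, key=times_recommended, reverse=True)
    # Always keep at least one item, even when x lands on 100.
    cut = min(int(len(by_exposure) * x / 100), max(len(by_exposure) - 1, 0))
    return by_exposure[cut:]
```

One plausible way to combine them is to run `drop_most_recommended` first and then rank whatever survives, so that less-exposed items get a turn at the top.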
| |
| ▲ | schrodinger 3 days ago | parent | next [-] | | Thank you so much! "Epsilon greedy" sounds like a great approach for the general idea I had in mind. I only glanced at it but will read it more deeply. I'll definitely try out your product, but I have to say, an "enter your email" box is surprisingly high-friction, and if you weren't a considerate person I'd met on Hacker News I'd probably close the tab when I saw that. I'll try it out and see if there's a particular reason why you need to capture an email address so early on, but I'd bet if you simplified it you'd get more traffic! | | |
| ▲ | jacobobryant 2 days ago | parent [-] | | Thanks for the feedback. I've structured Yakread (and its predecessors) as a daily email newsletter because it increases user retention tremendously. It's much less work for users if Yakread can show up in a place they already check regularly (their email inbox) rather than trying to get users right away to build a habit of visiting a new website regularly. The most common approach to this problem for consumer products is to make a mobile app so you can send push notifications; I like email a lot more since it's a bit more decentralized and is/can be less pushy (no pun intended). But yeah, I wouldn't be opposed to trying out an alternate landing page that shows you article recommendations up front with a signup box somewhere. Could be interesting to see how both approaches perform in an A/B test. Especially if I ever made a concerted effort to get traffic from HN; then structuring the site a bit more like HN would probably be great. Maybe even aggregate comments from bluesky/mastodon? Once I get through the mountain of other TODO items that's been piling up :). |
| |
| ▲ | sydbarrett74 2 days ago | parent | prev [-] | | Interesting project! I also appreciate being introduced to the digital public infra initiative. |
|
|
| ▲ | rendaw 3 days ago | parent | prev | next [-] |
| I think this is an issue for any curation. It's especially an issue for music: Bandcamp shows you music (top sellers, recently sold, liked by creators, etc.) in ways that all require someone else to do the curation. They do have a random listing, but the signal-to-noise ratio is so low that I doubt anyone listens to it. So the most reliable way to get your music known is to have your personal network buy/comment on your music or share it elsewhere. There's _lots_ of music on there that is fantastic that nearly nobody has purchased. The same goes for indie games on Steam. Other people will buy your game if it has enough good reviews (past some threshold) to get a thumbs up icon. So whenever anyone launches a game they need to do networking; there's a desperate scramble to get their followers to review the game so that it reaches the critical mass where non-network people start paying attention to it. Algorithms can (I think) detect similar styles, but style is not quality, and what people look for in new works is the parts that they do differently and therefore cannot be correlated by algorithm. I don't think there's any way around it other than getting people to try some things purely randomly, even if most of those are awful. Maybe some way to reward people for taking a look at random selections? |
|
| ▲ | schrodinger 3 days ago | parent | prev | next [-] |
| Actually, isn't this a bit like TikTok and why it allows low-follower profiles to rise to the top on occasion? |
| |
| ▲ | lossolo 3 days ago | parent [-] | | Yes, TikTok uses a similar algorithm to push new content; that's why anyone can go viral there. |
|
|
| ▲ | sixtyj 3 days ago | parent | prev | next [-] |
| Randomness is an underdog. The serious question is how often you would tolerate those randomly displayed posts being completely outside your interests. Would you click or skip? Plenty of Fish (a Canada-based dating site), programmed by Markus Frind, had a function: during onboarding you could choose the types of people you think you prefer (e.g. brunette/blond etc.), and if you didn't click on them later, the algorithm started to show different results… |
| |
| ▲ | schrodinger 3 days ago | parent [-] | | Interesting anecdote on Plenty of Fish. It’s definitely interesting how people aren’t really good at telling you what they like; empirical evidence is far better. I believe Paul Graham has an essay on a similar topic where if you ask people if they like an idea you have for a product, they are likely to say yes even if they wouldn’t actually use it. But if you ask them how much they would pay for access, or if they’d pay a certain amount, you’d get a more accurate response. FWIW, I wasn’t suggesting pure randomness though, it’s more like probabilistic randomness. Rather than a binary threshold a post must pass to make the homepage that divides the community into curators and consumers, this would show you posts with a degree of randomness with a probability proportional to the likes it’s garnered. Btw, I’m not sure what you meant by randomness is an underdog? Are you implying it’s a nice goal but it rarely works out in practice, perhaps because people actually do fall into natural curator / consumer buckets? | | |
| ▲ | sixtyj 2 days ago | parent [-] | | I meant “underrated” by the word “underdog”. Like, we don’t appreciate randomness much, or think about it. But life is full of randomness, like a strange attractor. We try to predict the future… And in retrospect, we say that we said so. After the battle, everyone's a general. :) But it is pure randomness… |
|
|
|
| ▲ | ThrowawayR2 2 days ago | parent | prev | next [-] |
| > "...but includes a mix of some that haven't made that threshold yet" Have you ever browsed /new with showdead turned on? A large fraction of the submissions are either spam/SEO, self-promotion, or just plain off topic. |
|
| ▲ | jokoon 3 days ago | parent | prev | next [-] |
| I see what you mean, and yes, that's a good idea. Personally, I would just mix posts with a hot sorting, with different time spans, like 24h, 4h, 1h, one third of each. Or maybe display two time spans on the front page. Personally, I use many multireddits, like 15 of them; I don't see other platforms offering that. |
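The "one third of each" mix could look something like the sketch below. It assumes posts are dicts with hypothetical `id`, `points`, and `submitted_at` (Unix seconds) fields, and it stands in plain point count for a real "hot" score, which is a simplification.

```python
def mixed_front_page(posts, now, size=30, windows_hours=(24, 4, 1)):
    """Fill the page with an equal share of top posts from each time window."""
    per_window = size // len(windows_hours)
    seen, page = set(), []
    for hours in windows_hours:
        cutoff = now - hours * 3600
        # Posts submitted inside this window that earlier windows did not take.
        recent = [p for p in posts
                  if p["submitted_at"] >= cutoff and p["id"] not in seen]
        recent.sort(key=lambda p: p["points"], reverse=True)
        for p in recent[:per_window]:
            seen.add(p["id"])
            page.append(p)
    return page
```

The shorter windows give very recent posts a guaranteed share of slots even when older posts have far more points.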
|
| ▲ | altairprime 3 days ago | parent | prev [-] |
| You could email the mods and ask them to reply. Otherwise they may not see your question. |