Remix.run Logo
amazingamazing 2 hours ago

I've been toying around an architecture that sets things up such that the data for each user is actually stored with each user and only materialized on demand, such that many data leaks would yield little since the server doesn't actually store most of the user data. I mention this since this sorts of leaks are inevitable as long as people are fallible. I feel the correct solution is to not store user data to begin with.

some problems I've identified:

1. suppose you have x users and y groups, of which require some subset of x. joining the data on demand can become expensive, O(x*y).

2. the main usefulness of such an architecture is if the data itself is stored with the user, but as group sizes y increase, a single user's data being offline makes aggregate usecases more difficult. this would lend itself to replicating the data server side, but that would defeat the purpose

3. assuming the previous two are solved, which is very difficult to say the least, how do you secure the data for the user such that someone who knows about this architecture can't just go to the clients and trivially scrape all of the data (per user)?

4. how do you allow for these features without allowing people to modify their data in ways you don't want to allow? encryption?

a concrete example of this would be if HN had it so that each user had a sqlite database that stored all of the posts made per user. then, HN server would actually go and fetch the data for each of the posters to then show the regular page. presumably here if a data of a given user is inaccessible then their data would be omitted.

yellow_postit 2 hours ago | parent [-]

I’ve always liked this idea but I think it eventually ends back up with essentially our current system. Users have multiple devices so you quickly get to needing a sync service. Once that gets complex enough, then people will outsource to a third party and then we are back to a FB/Google/Apple sign in and data mgmt world.