| ▲ | marginalia_nu 4 days ago |
| > There's no easy way to keep sync, either. Look at CAP theorem. You can decide which leg you can do without, but you can't solve the distributed computing "problem". Best is just be aware of what tradeoff you're making. Git has largely solved asynchronous decentralized collaboration, but it requires file formats that are ideally as human understandable as machine-readable, or at least diffable/mergable in a way where both humans and machines can understand the process and results. Admittedly git's ergonomics aren't the best or most user friendly, but it at least shows a different approach to this that undeniably works. |
|
| ▲ | jordanb 4 days ago | parent | next [-] |
| I feel like git set back mainstream acceptance of copy-and-merge workflows possibly forever. The merge workflow is not inherently complicated or convoluted. It's just that git is. When dvcses came out there were three contenders: darcs, mercurial and git. I evaluated all three and found darcs was the most intuitive but it was very slow. Git was a confused mess, and hg was a great compromise between fast and having a simple and intuitive merge model. I became a big hg advocate but I eventually lost that battle and had to become a git expert. I spent a few years being the guy who could untangle the mess when a junior messed up a rebase merge then did a push --force to upstream. Now I think I'm too git-brained to think about the problem with a clear head anymore, but I think it's a failure mostly attributable to git that dvcs has never found any uptake outside of software development and the fact that we as developers see dvcs as a "solved problem" outside more tooling around git is a failure of imagination. |
| |
| ▲ | robenkleene 4 days ago | parent | next [-] | | > The merge workflow is not inherently complicated or convoluted. It's just that git is. What makes merging in git complicated? And what's better about darcs and mercurial? (PS Not disagreeing just curious, I've worked in Mercurial and git and personally I've never noticed a difference, but that doesn't mean there isn't one.) | | |
| ▲ | WorldMaker 4 days ago | parent [-] | | Darcs is a special case because it coevolved a predecessor/fork/alternative to CRDTs [0] (called "Patch Theory"). Darcs was slow because darcs supported a lot of auto-merging operations git or mercurial can't because they don't have the data structures for it. Darcs had a lot of smarts in its patch-oriented data structures, but sadly a lot of those smarts in worst cases (which were too common) led to exponential blowouts in performance. The lovely thing was that often when Darcs came out of that slow down it had a great, smart answer. But a lot of people's source control workflows don't have time to wait on their source control system to reason through an O(n ^ 2) or worse O(n ^ n) problem space to find a CRDT-like "no conflict" solution, or even a minimal conflict that is a smaller diff than a cheap three-way diff. [0] Where CRDTs spent most of a couple of decades shooting for the stars and assuming "Conflict-Free" was manifest destiny/fate rather than a dream in a cruel pragmatic world of conflicts, Darcs was built for source control so knew emphatically that conflicts weren't avoidable. We're finally at the point where CRDTs are starting to take seriously that conflicts are unavoidable in real life data and trying new pragmatic approaches to "Conflict-Infrequent" rather than "Conflict-Free". | | |
| ▲ | account42 3 days ago | parent [-] | | At the end of the day all of these have the user start with state A, turn that into state B and then commit that. How that operation is stored internally (as a snapshot of the state or as a patch generated at commit time) is really irrelevant to the options that are available for resolving conflicts at merge time. Auto-merging code is also a double-edged sword - just because you can merge something at the VCS-level does not mean that the result is sensible at the format (programming language) or conceptual (user expectation) levels. | | |
| ▲ | WorldMaker 3 days ago | parent [-] | | Having used darcs for a while and still being a fan of it despite having followed everyone to git, the data storage is not irrelevant and does affect the number of conflicts to resolve and the information to resolve it. It wasn't just "auto-merging" that is darcs' superpower, it's in how many things that today in git would need to be handled in merges that darcs wouldn't even consider a merge, because its data structure doesn't. Darcs is much better than git at cherry picking, for instance, where you take just one patch (commit) from the middle of another branch. Darcs could do that without "history rewriting" in that the patch (commit) would stay the same even though its "place in line" was drastically moved. That patch's ID would stay the same, any signatures it might have would stay the same, etc, just its order in "commit log" would be different. If you later pulled the rest of that branch, that also wouldn't be a "merge" as darcs would already understand the relative order of those patches and "just" reorder them (if necessary), again without changing any of the patch contents (ID, signatures, etc). Darcs also has a few higher level patch concepts than just "line-by-line diffs", such as one that tracks variable renames. If you changed files in another branch making use of an older name of a variable and eventually merge it into a branch with the variable rename, the combination of the two patches (commits) would use the new name consistently, without a manual merge of the conflicting lines changed between the two, because darcs understands the higher level intent a little better there (sort of), and encodes it in its data structures as a different thing. Darcs absolutely won't (and knows that it can't) save you from conflicts and manual merge resolution, there are still plenty of opportunities for those in any normal, healthy codebase, but it gives you tools to focus on the ones that matter most. 
Also yes, a merge tool can't always verify that the final output is correct or builds (the high level rename tool, for instance, is still basically a find-and-replace and can over-correct with false positives and miss cases as false negatives). But the data structure is still quite relevant to the types of merges you need to resolve in the first place, and how often they occur, and what qualifies as a merge operation in the first place. Though maybe you also are trying to argue the semantics of what constitutes a "merge", "conflicts", and an "integration"? Darcs won't save you from "continuous integration" tools either, but it will work to save your continuous integration tools from certain types of history rewriting. "At the end of the day" the state-of-the-art of VCS on-disk representation and integration models and merge algorithms isn't a solved problem and there are lots of data structures and higher level constructs that tools like git haven't applied yet and/or that have yet to be invented. Innovation is still possible. Darcs does some cool things. Pijul does some cool things. git was somewhat intentionally designed to be the "dumb" in comparison to darcs' "smart", it is even encoded in the self-deprecating name (from Britishisms such as "you stupid git"). It's nice to remind ourselves that while git is a welcome status quo (it is better than a lot of things it replaced like CVS and SVN), it is not the final form of VCS nor some sort of ur-VCS from which all future others will derive and which resembles all its predecessors (Darcs predates git and was an influence in several ways, though most of those ways are convenience flags that are easy to miss like `git add -p` or tools that do similar jobs in an underwhelming fashion by comparison like `git cherry-pick`). |
|
|
| |
| ▲ | TylerE 4 days ago | parent | prev | next [-] | | That git won over hg is a true tragedy. The hg ux/ui is so much better. | |
| ▲ | marginalia_nu 4 days ago | parent | prev | next [-] | | Yeah I mostly agree with this. I'm mostly talking about git the model, rather than git the tool when I say git has solved the problem of asynchronous decentralized collaboration. For local-first async collaboration on something that isn't software development, you'd likely want something that is a lot more polished, and has a much more streamlined feature set. I think ultimately very few of git's chafing points are due to its model of async decentralized collaboration. | |
| ▲ | jorvi 4 days ago | parent | prev | next [-] | | Hey, another hg enjoyer! I miss it too. So much simpler. Apparently 'jujutsu' makes the git workflow a bit more intuitive. It's something that runs atop git, although I don't know how much it messes up the history if you read it out with plain git. All in all I'm pretty happy with git compared to the olden days of subversion. TortoiseSVN was a struggle haha. | |
| ▲ | rpdillon 4 days ago | parent | prev [-] | | Ah, I miss hg. Another cool aspect is that because it was written in Python and available as a library, I was able to write a straightforward distributed wiki based on hg in a single Python script. So much fun. |
|
|
| ▲ | jjcob 4 days ago | parent | prev | next [-] |
| Git works, but it leaves conflict resolution up to the user. It's good for a tool for professional users, but I don't see it being adopted for mainstream use. |
| |
| ▲ | PaulHoule 4 days ago | parent | next [-] | | The funny thing about it is I see git being used in enterprise situations for non-dev users to manage files, often with a web back end. For instance you can tell the average person to try editing a file with the web interface in git and they're likely to succeed. People say git is too "complex" or "complicated" but I never saw end users succeeding with CVS or Mercurial or SVN or Visual SourceSafe the way they do with Git. "Enterprise" tools (such as business rules engines) frequently prove themselves "not ready for the enterprise" because they don't have proper answers to version control, something essential when you have more than one person working on something. People say "do you really need (the index)" or other things git has but git seemed to get over the Ashby's law threshold and have enough internal complexity to confront the essential complexity of enterprise version control. | | |
| ▲ | jjcob 4 days ago | parent [-] | | > you can tell the average person to try editing a file with the web interface Yes, but then you are not using a "local first" tool but a typical server based workflow. |
| |
| ▲ | criddell 4 days ago | parent | prev [-] | | How can you avoid leaving conflict resolution up to the user? |
|
|
| ▲ | robenkleene 4 days ago | parent | prev | next [-] |
| The problem with "human understandable" with respect to resolving syncing conflicts, is that's not an achievable goal for anything that's not text first. E.g., visual and audio content will never fit well into that model. |
| |
| ▲ | marginalia_nu 4 days ago | parent [-] | | I can undo and redo edits in these mediums. Why can't these edits be saved and reapplied? Not saying this would be in any way easy, but I'm also not seeing any inherent obstacles. | | |
| ▲ | robenkleene 4 days ago | parent [-] | | Nothing. But that's not what the comment I was replying to was suggesting: > It requires file formats that are ideally as human understandable as machine-readable, or at least diffable/mergable in a way where both humans and machines can understand the process and results. What you're proposing is tracking and merging operations rather than the result of those operations (which is roughly the basis of CRDTs as well). I do think there are some problems with that approach as well though (e.g., what do you do about computationally expensive changes like 3D renders?). But for the parts of the app that fit well into this model, we're already seeing collaborative editing implemented this way, e.g., both Lightroom and Photoshop support it. To be clear though, I think the only sensible way to process merges in this world is via a GUI application that can represent the artifact being merged (e.g., visual/audio content). So you still wouldn't use Git to merge conflicts with this approach (e.g., a simple reason why is that what's to stop an underlying binary asset that a stack of operations is being applied to from having conflicting changes if you're just using Git?). Even some non-binary edits can't be represented as "human readable" text, e.g., say adding a layer of a vector drawing of a rabbit. |
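To make "tracking and merging operations rather than the result" concrete, here is a minimal sketch in TypeScript. Everything in it is a hypothetical illustration: the `Op` and `ImageState` types, the two operation kinds, and the `replay` helper are invented for this example and do not describe how Lightroom, Photoshop, or any CRDT library stores edits.

```typescript
// Hypothetical edit operations for an image editor (illustrative only).
type Op =
  | { kind: "setBrightness"; value: number }
  | { kind: "crop"; width: number; height: number };

// Hypothetical document state; a real editor would also hold pixel data.
interface ImageState {
  width: number;
  height: number;
  brightness: number;
}

// Apply one operation, returning a new state instead of mutating the old one.
function applyOp(state: ImageState, op: Op): ImageState {
  switch (op.kind) {
    case "setBrightness":
      return { ...state, brightness: op.value };
    case "crop":
      // A crop can only shrink the image, never enlarge it.
      return {
        ...state,
        width: Math.min(state.width, op.width),
        height: Math.min(state.height, op.height),
      };
  }
}

// Replaying a saved operation log rebuilds the result from the original asset.
function replay(initial: ImageState, ops: Op[]): ImageState {
  return ops.reduce(applyOp, initial);
}

const original: ImageState = { width: 100, height: 80, brightness: 0 };
const editLog: Op[] = [
  { kind: "setBrightness", value: 0.5 },
  { kind: "crop", width: 50, height: 200 },
];
console.log(replay(original, editLog));
```

The log is what gets stored and synced; the rendered result is derived from it. The sketch does not address the harder questions raised above, such as expensive operations or conflicting changes to the underlying binary asset.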
|
|
|
| ▲ | WorldMaker 4 days ago | parent | prev | next [-] |
| Git's original merge algorithm was intentionally dumb, it was mostly just a basic three-way diff/merge. (Git's merge algorithms have gotten smarter since then.) Three-way merges in general are easier to write than the CRDTs as the article suggests. They are also far more useful than just the file formats you would think to source control in git; it's a relatively easy algorithm to apply to any data structure you might want to try. For a hobby project I took a local-first-like approach even though the app is an MPA, partly just because I could. It uses a real simple three-way merge technique of storing the user's active "document" (JSON document) and the last known saved document. When it pulls an updated remote "document" it can very simply "replay" the changes between the last known saved document and the active document onto the remote document to create a new active document. This "app" currently only has user-owned documents so I don't generally compute the difference between the remote update and the last saved to mark conflicted fields to the user, but that would be the easy next step. In this case the "documents" are in the JSON sense of complex schemas (including Zod schemas) and the diff operation is a lot of very simple `===` checks. It's an easy to implement pattern and feels smarter than it should with good JSON schemas. The complicated parts, as always, are the User Experience of it, more than anything. How do you try to make it obvious that there are unsaved changes? (In this app: big Save buttons that go from disabled states to brightly colored ones.) If you allow users to create drafts that have never been saved next to items that have at least one save, how do you visualize that? (For one document type, I had to iterate on Draft markers a few times to make it clearer something wasn't yet saved remotely.) Do you need a "revert changes" button to toss a draft? 
I think sometimes using a complicated sync tool like CRDTs makes you think you can escape the equally complicated User Experience problems, but in the end the User Experience matters more than whatever your data structure is and no matter how complicated your merge algorithm is. I think it's also easy to see all the recommendations for complex merge algorithms like CRDTs (which absolutely have their place and are very cool for what they can accomplish) and miss that some of the ancient merge algorithms are simple and dumb and easy to write patterns. |
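The three-way merge described in this comment can be sketched roughly as follows. This is not the commenter's actual code: the `Doc` type, the `threeWayMerge` function, and the choice to keep the local value on conflict are assumptions made for illustration, and documents are flattened to one level of primitive fields so that `===` is enough for the diff.

```typescript
// Assumption: a flat JSON-like document of primitive fields.
type Value = string | number | boolean | null;
type Doc = { [field: string]: Value | undefined };

interface MergeResult {
  merged: Doc;
  conflicts: string[]; // fields changed to different values on both sides
}

// base   = last known saved document
// local  = the user's active document
// remote = the freshly pulled document
function threeWayMerge(base: Doc, local: Doc, remote: Doc): MergeResult {
  const merged: Doc = {};
  const conflicts: string[] = [];

  // Union of all field names seen in any of the three documents.
  const fields = Object.keys({ ...base, ...local, ...remote });

  for (const field of fields) {
    const b = base[field];
    const l = local[field];
    const r = remote[field];
    const localChanged = l !== b;
    const remoteChanged = r !== b;

    let value: Value | undefined;
    if (localChanged && remoteChanged && l !== r) {
      // Both sides edited the field differently: flag it for the user.
      conflicts.push(field);
      value = l; // assumption: keep the local edit until the user decides
    } else if (localChanged) {
      value = l; // "replay" the local edit onto the remote document
    } else {
      value = r; // untouched locally, so take whatever the remote has
    }

    // A field that ends up undefined was deleted, so leave it out.
    if (value !== undefined) merged[field] = value;
  }
  return { merged, conflicts };
}

const saved: Doc = { title: "Draft", body: "hello", done: false };
const active: Doc = { title: "Final", body: "hello", done: false };
const pulled: Doc = { title: "Draft", body: "hello world", done: false };
console.log(threeWayMerge(saved, active, pulled));
```

Nested objects and arrays would need a recursive version of the same comparison. The `conflicts` list is the "easy next step" mentioned above: it is what a UI would use to mark conflicted fields.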
|
| ▲ | taeric 4 days ago | parent | prev | next [-] |
| Git does no such thing. Plain text files with free form merging capabilities somewhat solve the idea that you can merge things. But, to make that work, the heavy lifting has to be done by the users of the system. So, sure, if you are saying "people trained to use git" there, I agree. And you wind up having all sorts of implicit rules and guidelines that you follow to make it more manageable. This is a lot like saying roads have solved how to get people using dangerous equipment on a regular basis without killing everyone. Only true if you train the drivers on the rules of the road. And there are many rules that people wind up internalizing as they get older and more experienced. |
|
| ▲ | poszlem 4 days ago | parent | prev | next [-] |
| Git solved this by pushing the syncing burden onto people. It’s no surprise, merge conflicts are famously tricky and always cause headaches. But for apps, syncing really ought to be handled by the machine. |
| |
| ▲ | marginalia_nu 4 days ago | parent [-] | | If you want local-first, conflict resolution is something you're unlikely to be able to avoid. The other option is to say "whoops" and arbitrarily throw away a change when there is a conflict due to a spotty wifi or some such. Fortunately, a lot of what chafes with git are UX issues more than anything else. Its abstractions are leaky, and its default settings are outright bad. It's very much a tool built by and for kernel developers with all that entails. The principle itself has a lot of redeemable qualities, and could be applied to other similar syncing problems without most of the sharp edges that come with the particular implementation seen in git. |
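The two options in this comment, silently discarding a change versus surfacing the conflict, can be contrasted in a short TypeScript sketch. The `Versioned` wrapper, the timestamp field, and both function names are hypothetical and exist only for this illustration.

```typescript
// Hypothetical wrapper: a value plus the time it was last written.
interface Versioned<T> {
  value: T;
  updatedAt: number; // e.g. milliseconds since epoch; clock skew is ignored here
}

// Option 1, "whoops": last writer wins, the older change is silently dropped.
function lastWriterWins<T>(a: Versioned<T>, b: Versioned<T>): Versioned<T> {
  return b.updatedAt > a.updatedAt ? b : a;
}

// Option 2: compare against the common ancestor and report real conflicts.
type SyncOutcome<T> =
  | { kind: "clean"; value: T }
  | { kind: "conflict"; local: T; remote: T };

function syncOrFlag<T>(base: T, local: T, remote: T): SyncOutcome<T> {
  if (local === base) return { kind: "clean", value: remote }; // only remote changed
  if (remote === base || local === remote) {
    return { kind: "clean", value: local }; // only local changed, or both agree
  }
  return { kind: "conflict", local, remote }; // someone has to decide
}

console.log(lastWriterWins({ value: "mine", updatedAt: 1 }, { value: "theirs", updatedAt: 2 }));
console.log(syncOrFlag("v1", "mine", "theirs"));
```

The first function never bothers the user but can lose work; the second never loses work but needs some interface for the conflict case, which is where the UX problems discussed in this thread come in.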
|
|
|
| ▲ | recursivedoubts 4 days ago | parent | prev | next [-] |
| “solved” imagine asking a normie to deal with a merge conflict |
| |
| ▲ | marginalia_nu 4 days ago | parent [-] | | That's a UX issue with git, not really what's being discussed. | | |
| ▲ | recursivedoubts 4 days ago | parent [-] | | I don’t agree at all. Merging conflicts correctly is often incredibly hard and requires judgement and understanding of semantics and ramifications that are difficult for even skilled developers. | | |
| ▲ | balamatom 4 days ago | parent | next [-] | | Who did what when why? Everyone has an understanding of those semantics. It's literally entirely on a computer. If that somehow makes it harder to answer basic human questions about the complex things we're using it for, well that means we've got a problem folks. The problem is with comprehensibility, and it's entrenched (because the only way for a piece of software to outlive its 50 incompatible analogs and reach mass recognition is to become entrenched; not to represent its domain perfectly). The issue lies in how the tools that we've currently converged on (e.g. Git) represent the semantics of our activity: what information is retained at what granularity determines what workflows are required of the user; and thence what operations the user comes to expect to be "easy" or "hard", "complex" or "simple". (Every interactive program is a teaching aid of itself, like how when you grok a system you can whip together a poor copy of it in a couple hours out of shit and sticks) Consider Git's second cousin the CRDT, where "merges" are just a few tokens long, so they happen automatically all the time with good results. Helped in application context by how a "shared editor" interface is considerably more interactive than the "manually versioned folder" approach of Git. There's shared backspace. Git was designed for emailing patches over dialup, there it obviously pays to be precise; and it's also awesome at enabling endless bikeshedding on projects far less essential than the kernel, thanks to the proprietary extension that is Pull Requests. Probably nobody has any real incentive to pull off anything better, if the value proposition of the existing solution starts with "it has come to be expected". But it's not right to say it's inherently hard, some of us have just become used to making it needlessly hard on ourselves, and that's whose breakfast the bots are now eating (shoo, bots! scram) | | | |
| ▲ | account42 3 days ago | parent | prev [-] | | So in other words it requires the skills you need to make an edit in the first place. |
|
|
|
|
| ▲ | threatofrain 4 days ago | parent | prev | next [-] |
| Git's approach is to make people solve it with organic intelligence. |
|
| ▲ | JustExAWS 4 days ago | parent | prev [-] |
| Git hasn’t solved it in a way that any normal person would want to deal with. |