adamwk 9 hours ago

As someone who used phabricator and mercurial, using GitHub and git again feels like going back to the stone ages. Hopefully this and jujutsu can recreate stacked-diff flow of phabricator.

It’s not just nice for monorepos. It makes both reviewing and working on long-running feature projects so much nicer. It encourages smaller PRs or diffs so that reviews are quick and easy to do in between builds (whereas long pull requests take a big chunk of time).

smallmancontrov 8 hours ago | parent | next [-]

I'm so glad git won the dvcs war. There was a solid decade where mercurial kept promoting itself as "faster than git*†‡" and every time I tried it wound up being dog slow (always) or broken (some of the time). Git is fugly but it's fast, reliable, and fugly, and I can work with that.

alwillis a minute ago | parent | next [-]

> I'm so glad git won the dvcs war. There was a solid decade where mercurial kept promoting itself as "faster than git†‡"

It wasn't the Mercurial team saying it was faster than Git; that was Facebook, after contributing a bunch of patches and testing Mercurial on their very large monorepo in 2014 [1]:

> For our repository, enabling Watchman integration has made Mercurial’s status command more than 5x faster than Git’s status command. Other commands that look for changed files (like diff, update, and commit) also became faster.

In fact they liked Mercurial so much they essentially cloned it to create their own dvcs, Sapling [2].

Today, most of the core of Mercurial has been rewritten in Rust; when Facebook did their testing, Mercurial was nearly 100% Python. That's where the "Mercurial is slow" thing came from; launching a large Python 2.x app took a while back in the day.

I was messing with an old Mercurial repo recently… it was like a breath of fresh air. If I can push to GitHub using Mercurial… sign me up.

[1]: https://engineering.fb.com/2014/01/07/core-infra/scaling-mer...

[2]: https://sapling-scm.com/

steveklabnik 8 hours ago | parent | prev | next [-]

What is kind of funny here is that you're right locally. At the same time, the larger tech companies (Meta and Google, specifically) ended up building off of hg and not git because, at the time especially, git could not scale up to their use cases. So while the git CLI was super fast, and the hg CLI was slow, "performance" means more than just CLI speed.

I was never a fan of hg either, but now I can use jj, and get some of those benefits without actually using it directly.

landr0id 3 hours ago | parent | next [-]

> At the same time, the larger tech companies (Meta and Google, specifically) ended up building off of hg and not git because, at the time especially, git could not scale up to their use cases.

Fun story: I don't really know what Microsoft's server-side infra looked like when they migrated the OS repo to git (which, contrary to the name, contains more than just stuff related to the Windows OS), but after a few years they started to hit some object scaling limitations where the easiest solution was to just freeze the "os" repo and roll everyone over to "os2".

MASNeo 13 minutes ago | parent | next [-]

“roll everyone over to os2”

The IBM crowd may feel vindicated at last.

kqr an hour ago | parent | prev | next [-]

I have heard that the Google monorepo is called google3 but I don't know why. Maybe those things are common...

ongy 5 minutes ago | parent | next [-]

It's the third attempt of building the mono repo.

But it's not a 3rd monorepo on the same technology, built to avoid some scaling limit.

roca 28 minutes ago | parent | prev [-]

It's not that.

vasco 20 minutes ago | parent [-]

Thanks for explaining!

w0m an hour ago | parent | prev [-]

didn't msft write an ~entire new file system specifically to scale git to the windows code base?

I have fuzzy memories on reading about it.

jamesfinlayson 10 minutes ago | parent | next [-]

I thought Microsoft made a number of improvements to git to allow it work with all of their internal repos.

landr0id 33 minutes ago | parent | prev [-]

They wrote something that allowed them to virtualize Git -- can't remember the name of that. But it basically hydrated files on-demand when accessed in the filesystem.

The problem was, I think, something to do with the number of git objects it was scaling to causing crazy server load. I don't remember the technical details, but it definitely involved the scale of git objects.

dijit 34 minutes ago | parent | prev | next [-]

Small nit: Google's monorepo is based on Perforce.

I think what happened is Google bought a source-code license and customised it.

steveklabnik 31 minutes ago | parent [-]

Yes, the server is based on Perforce, called Piper, but the CLI is based on mercurial. So locally you're doing hg and then when you create a CL, it translates it into what p4 needs.

surajrmal 8 minutes ago | parent [-]

Depends on what frontend tool you use. You can use either. These days you can also use jj. I'm not sure the backend resembles Perforce any longer.

smallmancontrov 7 hours ago | parent | prev [-]

Right, and I'm glad there are projects serving The Cathedral, but I live in The Bazaar so I'm glad The Bazaar won.

The efforts to sell priest robes to fruit vendors were a little silly, but I'm glad they didn't catch on because if they had caught on they no longer would have been silly.

eqvinox 8 hours ago | parent | prev | next [-]

I remember using darcs, but the repos I was using it with were so small that performance really didn't matter…

riffraff 33 minutes ago | parent [-]

I remember darcs fondly but even with tiny repos (maybe 5-6 people working on it) we hit the "exponential merge" issues.

It worked just fine 99% of the time and then 1% it became completely unusable.

raincole 8 hours ago | parent | prev | next [-]

This matches my experience 100%. I was about to write a similar comment before I saw yours.

Leynos 8 hours ago | parent | prev | next [-]

I just used it because I preferred the UX.

forrestthewoods 8 hours ago | parent | prev | next [-]

Mercurial has a strictly superior API. The issue is solely that OG Mercurial was written in Python.

Git is super mid. It’s a shame that Git and GitHub are so dominant that VCS tooling has stagnated. It could be so so so much better!

awesome_dude 8 hours ago | parent | next [-]

Whatever your opinion on one tool or another might be - it does seem weird that the "market" has been captured by what you are saying is a lesser product.

IOW, what do you know that nobody else does?

forrestthewoods 8 hours ago | parent | next [-]

Worse products win all the time. Inertia is almost impossible to overcome. VHS vs Betamax is a classic. iPod wasn’t the best mp3 player but being a better mp3 player wasn’t enough to claw market share.

Google and Meta don’t use Git and GitHub. Sapling and Phabricator are much, much better (when supported by a massive internal team).

aaronbrethorst 7 hours ago | parent [-]

What was the better mp3 player than the iPod?

corndoge 2 hours ago | parent [-]

sansa clip+

guelo 8 hours ago | parent | prev | next [-]

Network effects and marketing can easily prevent better tools from winning.

awesome_dude 7 hours ago | parent [-]

I mean, in the fickle world that is TECH, I am struggling to believe that that's what's happened.

I personally went from .latest.latest.latest.use.this (naming versions as "latest") to TortoiseSVN (which I struggled with), to Git (where I was one of those "walk around with a few memorised commands" people who don't actually know how to use it), to reading the fine manual (well, 2.5 chapters of it), to being an evangelist.

I've tried Mercurial, and, frankly, it was just as black magic as Git was to me.

That's network effects.

But my counter is: I've not found Mercurial to be any better, not at all.

I have made multiple attempts to use it, but it's just not doing what I want.

And that's why I'm asking: is it any better, or not?

WolfeReader 7 hours ago | parent [-]

Mercurial has a more consistent CLI, a really good default GUI (TortoiseHg), and the ability to remember what branch a commit was made on. It's a much easier tool to teach to new developers.

awesome_dude 7 hours ago | parent [-]

Hmm, that feels a bit subjective - I'm not going to say X is easier than Y when I've just finished saying that I found both tools to have a lot of black magic happening.

But what I will point out is that, for better or worse, people are now treating LLMs as Git masters, which effectively makes the LLM the UI, and that is going to erase any assumed advantage of whichever tool has the "superior" UX.

I do wish to make absolutely clear that I personally am not yet ready to completely delegate VCS work to LLMs - as I have pointed out, I have what I like to think of as an advanced understanding of the tools, which affords me the luxury of not having an LLM shoot me in the foot; that is reserved solely as my own doing :)

esafak 8 hours ago | parent | prev | next [-]

That worse is better, and some people don't know better or care.

dwattttt 7 hours ago | parent [-]

"better" in that sentence is very specific. Worse is also worse, and if you're one of the people for whom the "better" side of a solution doesn't apply, you're left with a mess that people celebrate.

jrochkind1 8 hours ago | parent | prev [-]

Welcome to VHS and Betamax. The superior product does not always win the market.

Per_Bothner 4 hours ago | parent [-]

Not always, but in this case the superior product (i.e. VHS) won. At initial release, Beta could only record an hour of content, while VHS could record 2 hours. Huge difference in functionality. The quality difference was there, but pretty modest.

worldsayshi 8 hours ago | parent | prev | next [-]

Maybe forgejo has a shot?

outworlder 8 hours ago | parent | prev [-]

> The issue is solely that OG Mercurial was written in Python.

Are we back to "programming language X is slow" assertions? I thought those had died long ago.

Better algorithms win over 'better' programming languages every single time. Git is really simple and efficient. You could reimplement it in Python and I doubt it would see any significant slowness. Heck, git was originally implemented as a handful of low level binaries stitched together with shell scripts.
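Git's object model really is that simple; here is a sketch in Python of how a blob's object ID is computed, mirroring `git hash-object` (the helper names are mine, not git's):

```python
# Sketch of git's blob object format: the payload is prefixed with
# "blob <size>\0" and SHA-1 hashed; on disk it is zlib-compressed.
# Helper names are mine, not git's.
import hashlib
import zlib

def git_blob_oid(content: bytes) -> str:
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()

def git_blob_store(content: bytes) -> bytes:
    header = b"blob %d\x00" % len(content)
    return zlib.compress(header + content)

# Should match: printf 'test' | git hash-object --stdin
print(git_blob_oid(b"test"))
```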

jmalicki 3 hours ago | parent | next [-]

Every time I've rewritten something from Python into Java, Scala, or Rust it has gotten around ~30x faster. Plus, now I can multithread too for even more speedups.

Python is absurdly slow: every method call is a string dict lookup (`__slots__` is way underused), everything is dicts all the time, and the bytecode doesn't specialize to observed types at all. It is a uniquely, horribly slow language.

I love it, but python is almost uniquely a slow language.

Algorithms matter, but when you already have good algorithms, or you're already linear time and just have a ton of data, I've seen 500x speedups from rewriting a single-threaded Python program as a multithreaded Rust program, with the algorithms not improved at all.

It's the difference between a program running overnight vs. in 30 seconds. And if there are problems, the iteration speed from that is huge.
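The "every method call is a string dict lookup" point is easy to poke at from Python itself: plain instances keep attributes in a per-object `__dict__`, while `__slots__` trades that dict away for fixed storage. A small sketch (class names are invented; absolute timings will vary wildly by interpreter and version):

```python
# Plain instances store attributes in a per-object dict; __slots__
# replaces that with fixed offsets. Class names are illustrative.
import timeit

class WithDict:
    def __init__(self):
        self.x = 1

class WithSlots:
    __slots__ = ("x",)
    def __init__(self):
        self.x = 1

d, s = WithDict(), WithSlots()
assert d.__dict__ == {"x": 1}      # the attribute really lives in a dict
assert not hasattr(s, "__dict__")  # slots drop the per-instance dict

# Relative cost only; treat the numbers as a rough comparison.
t_dict = timeit.timeit("d.x", globals={"d": d}, number=1_000_000)
t_slots = timeit.timeit("s.x", globals={"s": s}, number=1_000_000)
print(f"dict attr: {t_dict:.3f}s  slots attr: {t_slots:.3f}s")
```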

eru 2 hours ago | parent | next [-]

> [...], it is a uniquely horrible slow language.

To be fair, Python as implemented today is horribly slow. You could leave the language the same but apply all the tricks and heroic efforts they used to make JavaScript fast. The language would be the same, but the implementations would be faster.

Of course, in practice the available implementations are very much part of the language and its ecosystems; especially for a language like Python which is so defined by its dominant implementation of CPython.

jmalicki 2 hours ago | parent [-]

Fair! I guess I didn't mean language as such, but as used.

But a lot of the monkey-patching kind of things and dynamism of python also means a lot of those sorts of things have to be re-checked often for correctness, so it does take a ton of optimizations off the table. (Of course, those are rare corner cases, so compilers like pypy have been able to optimize for the "happy case" and have a slow fall-back path - but pypy had a ton of incompatibility issues and now seems to be dying).

dtech 2 hours ago | parent [-]

JavaScript has a lot of the same theoretical dynamism, yet V8 and JavaScriptCore were able to make it fast

eru 31 minutes ago | parent [-]

Yes, with heroic effort. It's really a triumph of compiler / vm engineers over language designers.

byroot 2 hours ago | parent | prev [-]

> every method call is a string dict lookup

Doesn't the Python VM have inline caches? [0]

https://en.wikipedia.org/wiki/Inline_caching

jmalicki 2 hours ago | parent [-]

I think that's a new thing from like python 3.12+ or something after I stopped using Python as much.

It didn't used to.

EDIT: python 3.11+: https://peps.python.org/pep-0659/
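For the curious, the specializing interpreter from PEP 659 can be observed with `dis` on 3.11+; the warm-up loop count below is a guess at how much execution is needed before quickening kicks in:

```python
# Observe PEP 659's specializing adaptive interpreter (CPython 3.11+).
import dis
import sys

class Point:
    def __init__(self):
        self.x = 1

def read_x(p):
    return p.x  # monomorphic attribute load: a prime specialization target

p = Point()
for _ in range(10_000):  # warm up so the adaptive interpreter can quicken
    read_x(p)

if sys.version_info >= (3, 11):
    # adaptive=True may show specialized opcodes such as
    # LOAD_ATTR_INSTANCE_VALUE in place of the generic LOAD_ATTR.
    dis.dis(read_x, adaptive=True)
else:
    dis.dis(read_x)  # older interpreters have no specializing tier
```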

surajrmal 13 minutes ago | parent | prev | next [-]

You must belong to the club of folks who use hashmaps to store 100 objects. It's amazing how much we've brainwashed folks to focus on algorithms and lose sight of how to actually properly optimize code. Being aware of how your code interacts with cache is incredibly important. There are many cases of using slower algorithms to do work faster purely because it's more hardware friendly.

The reason that some more modern tools, like jj, really blow git out of the water in terms of performance is that they make good choices, such as doing a lot of transformations entirely in memory rather than via the filesystem. It's also because it's written in a language that can execute efficiently. Luckily, it's clear that modern tools like jj are heavily inspired by Mercurial, so we're not doomed to the UX and performance git binds us to.
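To make the small-N point concrete, here is a sketch of the two lookup strategies; both return the same answers, and which one is faster for a handful of entries is a measurement question, not a big-O one (data and names are made up):

```python
# Two lookup strategies for a tiny collection. Data/names are invented.
pairs = [("alpha", 1), ("beta", 2), ("gamma", 3)]  # compact list, scanned
table = dict(pairs)                                 # hash table

def scan_lookup(key):
    # Branch-predictable, cache-friendly walk over contiguous entries.
    for k, v in pairs:
        if k == key:
            return v
    return None

def hash_lookup(key):
    # Hash, probe, compare: more machinery than a 3-element scan needs.
    return table.get(key)

# Same answers either way; which wins at N=3 is a benchmark question.
for k in ("alpha", "beta", "gamma", "missing"):
    assert scan_lookup(k) == hash_lookup(k)
```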

kuschku 7 hours ago | parent | prev | next [-]

I've rewritten a python tool in go, 1:1. And that turned something that was so slow that it was basically a toy, into something so fast that it became not just usable, but an essential asset.

Later on I also changed some of the algorithms to faster ones, but their impact was much lower than the language change.

Diggsey 6 hours ago | parent | prev | next [-]

> git was originally implemented as a handful of low level binaries stitched together with shell scripts.

A bunch of low level binaries stitched together with shell scripts is a lot faster than python, so not really sure what the point of this comparison is.

Python is an extremely versatile language, but if what you're doing is computing hashes and diffs, and generally doing entirely CPU-bound work, then it's objectively the wrong tool, unless you can delegate that to a fast, native kernel, in which case you're not actually using Python anymore.

eru 2 hours ago | parent [-]

Well, you can and people do use Python to stitch together low level C code. In that sense, you could go the early git approach, but use Python instead of shell as the glue.

eru 2 hours ago | parent | prev | next [-]

> Better algorithms win over 'better' programming languages every single time.

That's often true, but not "every single time".

20k 4 hours ago | parent | prev | next [-]

Python is among the slowest mainstream programming languages, often an order of magnitude slower than other languages

One of the reasons Mercurial lost the DVCS battle is its performance; even the Mercurial folks admitted that was at least in part because of Python

ragall 3 hours ago | parent | prev | next [-]

> I thought those had died long ago.

No, it's always been true. It's just that at some point people got bored and tired of pointing it out.

bmitc 6 hours ago | parent | prev | next [-]

You barely have to try to have Python be noticeably slow. It's the only language I have ever used where I was even aware that a programming language could be slow.

jstimpfle 6 hours ago | parent | prev | next [-]

[flagged]

forrestthewoods 4 hours ago | parent [-]

[flagged]

forrestthewoods 8 hours ago | parent | prev [-]

They died because everyone knows that Python is in fact very, very slow. And that’s just totally fine for a vast number of glue operations.

It’s amusing you call Git fast. It’s notoriously problematic for large repos, such that virtually every BigTech company has made a custom rewrite at some point or another!

jstimpfle 6 hours ago | parent [-]

Now that is interesting too, because git is very fast for everything I have ever done. It may not scale to Google-monorepo size; it would be the wrong tool for that. But if you are talking Linux kernel source scale, it absolutely is fast enough, even for that.

For everything I've ever done, git was practically instant (except network IO of course). It's one of the fastest and most reliable tools I know. If it isn't fast for you, chances are you are on a slow Windows filesystem, additionally impeded by a virus scanner.

forrestthewoods 6 hours ago | parent [-]

The fact that Git has an extremely strong preference for storing full and complete history on every machine is a major annoyance! “Except for network IO” is not a valid excuse imho. Cloning the Linux kernel should take only a few seconds. It does not. This is slow and bad.

The mere fact that Git is unable to handle large binary files makes it an unusable tool for literally every project I have ever worked on in my entire career.
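For what it's worth, modern git can relax the full-history default with shallow and partial clones; whether that answers the complaint is another matter. A sketch exercised against a throwaway local repo (no network; paths are scratch directories):

```shell
# Shallow and partial clones against a scratch repo (no network needed).
set -e
tmp=$(mktemp -d)
git init -q "$tmp/origin"
git -C "$tmp/origin" config uploadpack.allowFilter true
git -C "$tmp/origin" -c user.email=a@example.com -c user.name=a \
    commit -q --allow-empty -m "first"
git -C "$tmp/origin" -c user.email=a@example.com -c user.name=a \
    commit -q --allow-empty -m "second"

# --depth 1 truncates history to the most recent commit.
git clone -q --depth 1 "file://$tmp/origin" "$tmp/shallow"
git -C "$tmp/shallow" rev-list --count HEAD   # prints 1, not 2

# --filter=blob:none keeps full history but fetches file contents lazily.
git clone -q --filter=blob:none "file://$tmp/origin" "$tmp/partial"
git -C "$tmp/partial" rev-list --count HEAD   # prints 2
```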

spockz 42 minutes ago | parent | next [-]

Git LFS has existed for a while now. Does that fix your issue? Or do you mean that it doesn’t support binary diffs?

pabs3 an hour ago | parent | prev [-]

Git handles large text files and large directories fairly poorly too.

bmitc 6 hours ago | parent | prev [-]

Git is not remotely fast for large projects.

kardianos 9 hours ago | parent | prev | next [-]

I continue to use gerrit explicitly because I cannot stand github reviews. Yes, in theory, make changes small. But if I'm doing larger work (like updating a vendored dep, that I still review), reviewing files is... not great... in github.

adityaathalye 14 minutes ago | parent | next [-]

Same team, and a rare hill I'm willing to die on.

Rant incoming...

Boy do I hate Github/Lab/Bucket style code reviews with a burning passion. Who the hell loses code review history? A record of the very thing that made my code better? The "why" of it all, that I am guaranteed to forget tomorrow morning.

Nobody would be using `--force` or `--force-with-lease` as a normal part of development workflow, of their own volition, if they had read that part of the git-push manpage and been horrified (as one should be).

The magit key sequence for this abominable operation is `P "f-u"`. And every single time I am forced to do it, I read "f-u" as it ought to be read.

Rebase-push is the way to do it (patch sets in Gerrit).

Rebase-force-push is absolutely not.

You see, any development workflow inevitably has to integrate changes from at least one other branch (typically latest develop or master), without destroying change history, nor review history. Gerrit makes this trivial.

It's a bit difficult to convey exactly why I'm so rah-rah Gerrit, because it is a matter of day-to-day experience of

  - Trivial for committer to send up reviews-preserving rebase-push responses to commit reviews (NO force-push, ever --- that's an "admin" action to *evict* / permanently wipe out disaster scenarios, such as when someone accidentally commits and pushes out a plaintext secret or a giant compiled binary, etc.).

  - Fast-for-the-reviewer, per-commit, diff-based, inline-commenting code reviews.

  - The years-apart experience of being able to dig into any part of one's (immutable) software change history to offer a teaching moment to someone new to the team.
... to name a few key ones.
adityaathalye 8 minutes ago | parent [-]

Slapping this "stacked diff" business on top of something so broken as Github/lab/bucket is a concrete example of... https://en.wikipedia.org/wiki/Lipstick_on_a_pig

tcoff91 9 hours ago | parent | prev [-]

Most editors have some kind of way to review github PRs in your editor. VSCode has a great one. I use octo.nvim since I use neovim.

nine_k 8 hours ago | parent [-]

Can these tools e.g. do per-commit review? I mean, it's not the UI that's the problem (though it's not ideal); it's the whole idea of commenting on the entire PR at once, partly ignoring the fact that the code in it changes as more commits are pushed.

Phabricator and even Gerrit are significantly nicer.

dathanb82 5 hours ago | parent [-]

Unless you have an “every commit must build” rule, why would you review commits independently? The entire PR is the change set - what’s problematic about reviewing it as such?

riffraff 24 minutes ago | parent | next [-]

There's a certain set of changes which are just easier to review as stacked independent commits.

Like, you can do a change that introduces a new API and one that updates all usages.

It's just easier to review those independently.

Or, you may have workflows where you have different versions of schemas and you always keep the old ones. Then you can do two commits (copy X to X+1; update X+1) where the change is obvious, rather than seeing a single diff which is just a huge new file.

I'm sure there's more cases. It's not super common but it is convenient.

steveklabnik 5 hours ago | parent | prev [-]

In stacked diffs system, each commit is expected to land cleanly, yes.

verst 2 hours ago | parent [-]

But isn't that why you would squash before merging your PR? If you define a rule that PRs must be squashed, you would still get the per-commit build.

steveklabnik 34 minutes ago | parent [-]

Squash merge is an artifact of PRs encouraging you to add commits instead of amending them, since GitHub can't show you proper interdiffs and makes comments disappear when you change a diff at that line. In that context, when you add fixup commits, sure, squashing makes sense; but the stacked-diffs approach encourages you to create commits that look the way you want directly, instead of requiring you to roll them up at the end.
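The amend-centric flow described above looks roughly like this in plain git (scratch repo; the branch and commit names are invented):

```shell
# Revise a commit in place rather than stacking fixup commits.
set -e
wd=$(mktemp -d)
git init -q "$wd/repo"
cd "$wd/repo"
git -c user.email=a@example.com -c user.name=a \
    commit -q --allow-empty -m "feature: first draft"

# Review feedback arrives: amend the commit instead of adding another.
git -c user.email=a@example.com -c user.name=a \
    commit -q --amend --allow-empty -m "feature: final shape"

# Then update the remote; --force-with-lease refuses to clobber commits
# you haven't seen, unlike a bare --force. (Needs a real remote, so it
# is left commented out here.)
# git push --force-with-lease origin feature

git log --oneline   # one clean commit, no fixup noise
```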

calebio 8 hours ago | parent | prev | next [-]

I miss the Phabricator review UI so much.

treefry 2 hours ago | parent | next [-]

Same here. I don't understand why GitHub didn't support this until now. I'm tired of reviewing PRs with thousands of lines of changes, which are only getting worse these days with vibe coding.

sam_bristow 4 hours ago | parent | prev | next [-]

What does Facebook use internally these days? I'm amazed that the state of review tools is still at, or behind, what we had a decade ago for the most part.

ivantop 3 hours ago | parent [-]

It’s still phabricator

sam_bristow 3 hours ago | parent [-]

Any idea if their internal version has improved dramatically since they stopped maintaining the public version?

kqr 40 minutes ago | parent [-]

I don't think they ever maintained the public project. Priestley spun off a company to do that.

montag 8 hours ago | parent | prev [-]

Me too. And I'm speaking from using it at Rdio 15 years ago.

Nothing since (Gerrit, Reviewboard, Github, Critique) has measured up...

Rodeoclash an hour ago | parent [-]

Thanks for your work on Rdio. I miss it. Were you around when that guy managed to spam plays to get fake albums to the top of the charts?

nerdypepper 2 hours ago | parent | prev | next [-]

tangled.org supports native stacking with jujutsu; unlike github's implementation, you don't need to create a new branch per change: https://blog.tangled.org/stacking/

eru 2 hours ago | parent | prev [-]

Oh, phabricator. I hated that tool with a passion. It always destroyed my carefully curated PR branch history.

See https://stackoverflow.com/questions/20756320/how-to-prevent-...

illamint 2 hours ago | parent [-]

Good. That's the point.

eru 2 hours ago | parent [-]

The point of what?

I hope they fixed phabricator in the meantime.

dbetteridge an hour ago | parent [-]

The point is the main branch reflects the "units" of change, not the individual commits to get there.

One merged PR is a unit of change; at the end of the day, the steps you took to produce it aren't relevant to others.

My opinion of course; I'm open to understanding why preserving individual commits is beneficial.

eru 29 minutes ago | parent [-]

You can get what you want from `git log --first-parent` without having to toss out information.

See how the Linux kernel handles git history to see a good example of non-linear history and where it helps. They use merge commits, ie commits with more than one ancestor, all the time.
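The kernel-style history described above is easy to reproduce: build a tiny history with one `--no-ff` merge, then compare the full log to the first-parent view (scratch repo; commit messages invented):

```shell
# A tiny non-linear history: one merge commit on top of a base commit.
set -e
r=$(mktemp -d)
git init -q "$r"
cd "$r"
gc() { git -c user.email=a@example.com -c user.name=a "$@"; }

gc commit -q --allow-empty -m "base"
git checkout -q -b topic
gc commit -q --allow-empty -m "topic: step 1"
gc commit -q --allow-empty -m "topic: step 2"
git checkout -q -            # back to the default branch
gc merge -q --no-ff -m "merge topic" topic

git rev-list --count HEAD                  # 4: every commit
git rev-list --count --first-parent HEAD   # 2: the "units of change" view
```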