> a chain of small, focused pull requests that build on each other — each one independently reviewable.

I have never understood what this even means.

Either changes are orthogonal (and can be merged independently), or they’re not. If they are, they can each be their own PR. If they’re not, why do you want to review them independently?

If you reject change A and approve change B, nothing can merge, because B needs A to proceed. If you approve change A and reject change B, then the feature is only half done.

Is it just about people wanting to separate logical chunks of a change so they can avoid get distracted by other changes? Because that seems like something you can already do by just breaking a PR into commits and letting people look at one of those at a time.

I’ve tried my best to give stacked-diff proponents the benefit of the doubt but none of it actually makes sense to me.

▲

steveklabnik 8 hours ago | parent | next [-]

The canonical example here is a feature for a website that requires both backend and frontend work. The frontend depends on the backend, but the backend does not depend on the frontend. This means that the first commit is "independent" in the sense that it can land without the second, but the second is not, hence, a stack. The root of the stack can always be landed independently of what is on top of it, while the rest of the stack is dependent.

> If they’re not, why do you want to review them independently?

For this example, you may want review from both a backend engineer and a frontend engineer. That said, see this too though:

> that seems like something you can already do by just breaking a PR into commits and letting people look at one of those at a time.

If you do this in a PR, both get assigned to review the whole thing. Each person sees the code that they don't care about, because they're grouped together. Notifications go to all parties instead of the parties who care about each section. Both reviews can proceed independently in a stack, whereas they happen concurrently in a PR.

> If you approve change A and reject change B, then the feature is only half done.

It depends on what you mean by "the feature." Seen as one huge feature, then yes, it's true that it's not finished until both land. But seen as two separate but related features, it's fine to land the independent change before the dependent one: one feature is finished, but the other is not.

▲

Phelinofist 7 hours ago | parent | next [-]

If the layers of a stack have a disjoint set of reviewers things are viewed in separation which might lead to issues if there is no one reviewing the full picture.

	▲	steveklabnik 7 hours ago \| parent [-]
		That is why your forge will show that these two things are related to each other, and you may have the same person assigned to review both. It can show you this particular change in the context of the rest of them. But not every reviewer will always want to see all of the full context at all times.

▲

ninkendo 6 hours ago | parent | prev [-]

> If you do this in a PR, both get assigned to review the whole thing. Each person sees the code that they don't care about, because they're grouped together.

There are two separate issues you’re bringing up:

- Both groups being “assigned” the PR: fixable with code owners files. It’s more elegant than assigning diffs to people: groups of people have ownership over segments of the codebase and are responsible for approving changes to it. Solves the problem way better IMO.

- Both groups “seeing” all the changes: I already said GitHub lets you view single commits during PR review. That is already a solved problem.

And I didn’t even bring up the fact that you can just open a second PR for the frontend change that has the backend commit as the parent. Yes, the second PR is a superset of the first, but we’ve already established that (1) the second change isn’t orthogonal to the first one and can’t be merged independently anyway, and (2) reviewers can select only the commits that are in the frontend range. Generally you just mark the second PR as draft until the first one merges (or do what Gitlab does and mark it as “depends on” the first, which prevents it from merging until the first one is done.) The first PR being merged will instantly make the second PR’s diff collapse to just the unique changes once you rebase/merge in the latest main, too.

All of this is to explain how we can already do pretty much all of this. But in reality, it’s silly to have people review change B if change A hasn’t landed yet. A reviewer from A may completely throw the whole thing out and tell you to start over, or everything could otherwise go back to the drawing board. Making reviewers look at change B before this is done, is a potential for a huge waste of time. But then you may think reviewers from change B may opt to make the whole plan go back to the drawing board too, so what makes A so special? And the answer is it’s both a bad approach: just make the whole thing in one PR, and discuss it holistically. Code owners files are for assigning ownership, and breaking things into separate commits is to help people look at a subset of the changes. (Or just, like, have them click on the folder in the source tree they care about. This is not a problem that needs a whole new code review paradigm.)

▲

steveklabnik 6 hours ago | parent [-]

> fixable with code owners files.

Code owners automatically assigns reviewers. You still end up in the state where many groups are assigned to the same PR, rather than having independent reviews.

> I already said GitHub lets you view single commits during PR review.

Yes, you can look at them, but your review is still in the context of the full PR.

> And I didn’t even bring up the fact that you can just open a second PR for the frontend change that has the backend commit as the parent.

The feature being discussed here is making this a first-class feature of the platform, much nicer to use. The second PR is "stacked" on top of the first.

▲

ninkendo 6 hours ago | parent [-]

> You still end up in the state where many groups are assigned to the same PR

> Yes, you can look at them, but your review is still in the context of the full PR.

Why is this a bad thing? I don’t get it. This has literally never been a problem once in my career. Is the issue that people can’t possibly scroll past another discussion? Or… I seriously am racking my brain trying to imagine why it’s a bad thing to have more than one stakeholder in a discussion.

I can think of a lot of reasons why doing the opposite, and siloing off discussions, leads to disaster. That is something I’ve encountered constantly in my career. We start out running an idea past group A, they iterate, then once we reach a consensus we bring the conclusion to group B and they have concerns. But oh, group A already agreed to this so you need to get on board. So group B feels railroaded. Then more meetings are called and we finally bring all the stakeholders together to discuss, and suddenly hey, group A and B both only had a partial view of the big picture, and why didn’t we all discuss this together in the first place? That’s happened more times in my career than I can count. The number of times group B is mad that they have to move their finger to scroll past what group A is talking about? Exactly zero.

▲

steveklabnik 5 hours ago | parent [-]

It's totally possible that you aren't the target audience for this sort of feature. It tends to be more useful in very large team and/or monorepo contexts.

This isn't about siloing discussions: it's about focus. You can always see the full stack if you want to go look at the other parts, the key is that you don't have to.

The goal is to get thoroughly reviewed changes. It's much easier to review five 100 line changes than one 500 line one, and it's easier to review five 500 line changes than it is a 2500 line one. Keeping commits small and tightly reviewed leads to better outcomes in the end. Massive PRs lead to rubber stamps of +1.

I agree that that scenario sounds like a nightmare. But I don't think that a PR is the right place to solve that problem: it sounds like something that should have been sorted before any of the code was written in the first place.

▲

ninkendo 5 hours ago | parent [-]

> It's much easier to review five 100 line changes than one 500 line one, and it's easier to review five 500 line changes than it is a 2500 line one.

This is true if the changes are orthogonal and are truly independent. One should always favor small independent changes if one can.

But when changes are all actually part of the same unit, and aren’t separable (apart from maybe the first of N of them which may be mergeable independently), proponents always seem to advocate that stacked diffs can somehow change this fact. “Oh if only we had stacked diffs we could break this into smaller changes”, ignoring the fact that no, they’d still be ordered and dependent on one another.

Stacked diffs seem like a UI convenience for reviewers… that’s fine I guess. GitHub is basically what you get when you ask the question “how can we make code review as tedious and unhelpful as possible”, and literally anything would be better than what we have (seriously I could fill a book with how bad GitHub is. I don’t think I could design a worse experience if I tried.) So, maybe I should just be happy they’re trying anything.

▲

steveklabnik 5 hours ago | parent [-]

In stacked diffs systems, the idea is that the base of the stack (once reviewed) can always be merged independently, so you're totally right that like, if you just purely think you can split things up when they shouldn't be split up, that would be bad.

This is the model that the kernel uses, as well as tons of other projects (any Gerrit user, for example), and so it has gotten real-world use and at scale. That said, everyone is also entitled to their preferences :)

▲

ninkendo 4 hours ago | parent [-]

> This is the model that the kernel uses

Nah.

The kernel uses a mailing list, and a “review” means a mailing list thread. With some nice CLI tools to integrate with git when you want to actually apply the patch (or start a review thread.)

In that world, “[PATCH 2/5]” (or whatever) in the subject title, and a different CC list for each patch, is a nice way to be able to ensure different subsets of the patch series have different discussions. That’s great.

But if you’re going to compare this to a GitHub UI, you have to choose the basis for comparison, because the two are so utterly different. Choosing one aspect (can we make sure discussions are kept separate), and saying “therefore the kernel uses stacked diffs” is a huge misrepresentation of how different GitHub’s approach is.

Because the kernel approach is the platonic ideal of a code review: it’s a simple threaded discussion between stakeholders, centered around a topic (the patch, which is inlined right in the email.) I would wager close zero kernel maintainer actually look at the diffs exclusively via their email client. They probably just check out the changes locally and look at them, and the purpose of the mailing list is to facilitate focused discussion on parts of the change (which is all we really want, in the end.)

GitHub has so thoroughly shit the bed on actually developing a good model of “threaded discussion about a change”, that you have to change the way you think about git’s model to fix how awful GitHub is at allowing review discussion to stay focused. You shouldn’t need to think about stacked diffs and multiple PR’s. You should use git branches as intended, multiple commits representing changes, and a merge meaning “this branch makes it or not.” That GitHub’s UI for discussing subsets of a change is so abysmal, does not mean the model is wrong. It means their discussion system is so abysmal that a mailing list TUI can run circles around it. Fixing this is GitHub’s problem, and doesn’t require any changes to how PR’s should be split up.

If you have a 2500-line PR with 5 500-line commits, GitHub should not require you to split things up further in any way, just to unfuck their discussion system.

Random idea I spent 10 seconds thinking about: let me start a “here’s a thread discussing the UI changes” and add folks to it, and “here’s a thread discussing the backend changes”, and add folks to that. I can then say “let’s not merge this until both threads are green”. You still see the whole change in the UI. (You can click directories to drill into the changes, that solves the “but the diff is too big” issue.) Discussion on a chunk of the diff is scoped to a discussion thread, which you select when sending the message. Thus, all discussion on any part of the diff is still scoped to a “discussion thread” of arbitrary subsets of stakeholders.

None of this needs me to change how I split up my git branches, an entire logical change is still either “merged” or “not-merged” (seriously who cares about the Pyrrhic victory of merging only change 1/N), and if we want to limit scopes of discussion to subsets of a change, we can just… do that.

▲

steveklabnik 4 hours ago | parent [-]

Sorry, I am talking about stacked diffs in general, not this specific implementation on GitHub. That "Patch 2/5" is five stacked diffs, on top of each other. Forges that are stacked-diff native do that same kernel flow, just on the web instead of over email. You can also see this corroborated over here: https://news.ycombinator.com/item?id=47758251

All of the advantages, like "it’s a simple threaded discussion between stakeholders, centered around a topic", is exactly why people like stacked diffs over PRs.

GitHub is doing "stacked PRs", which is like stacked diffs but more like PRs in the sense that they're stacked branches rather than stacked diffs. I agree that this seems less ideal, but they also are putting it into an existing project, rather than rebuilding everything around it. There's pros and cons to both approaches, but I agree that I'd prefer a native system built for this, personally. I'm still glad they're going to be popularizing the general concept.

▲

ninkendo 3 hours ago | parent [-]

> Sorry, I am talking about stacked diffs in general, not this specific implementation on GitHub

My point is that the LKML and what GitHub do is so different that the definition of “stacked diffs in general” can only describe a tiny aspect of each, if you want to call both of their approaches by the same name. From where I sit, the only common element between them is “they offer a way to keep discussion separated.”

If that’s all people are actually complaining about, there are a thousand better ways to “keep discussion separated” that don’t require me to pretend that it’s ok that only a subset of my branch is ok to merge.

In git, a branch is the thing you either merge or don’t. You merge multiple commits at once, or you don’t. It’s a great model. Breaking up the branch into smaller pieces, and giving people the impression it’s ok to merge the first commit but not the rest, just to unfuck the discussion UX, is putting the cart before the horse. I make a branch strictly because I want it to either all merge or none of it merge. It’s the only sensible approach in my book. If a discussion system is so bad that this is unworkable, it means the discussion system is bad, it doesn’t mean the conceptual model of a merge is bad.

▲

steveklabnik 3 hours ago | parent [-]

> My point is that the LKML and what GitHub do is so different that the definition of “stacked diffs in general” can only describe a tiny aspect of each

That's fine, what I mean is, when we started this convo, I thought you were asking about the general concept of stacked diffs, not the specifics of what GitHub is releasing here. That's my mistake for misunderstanding, sorry about that.

This is also (assumedly, anyway) why they're calling this "stacked PRs" and not "stacked diffs," because what they're doing is slightly different than Gerrit, Phabricator, Critique, etc.

	▲	ninkendo 2 hours ago \| parent [-]
		Thanks for indulging me so far, by the way, I really appreciate this discussion, it's very stimulating. After thinking about the whole thing I think I can summarize my opinion a lot better now: Stacked diffs are a category error. Units of discussion, and units of integration, should not be conflated. A branch is my unit of intended integration: merge all of it or none of it. The fact that reviewers need smaller slices to discuss does not imply those slices should become independently landable history objects. That’s a UX concern for the review tool, not something I should have to encode into Git history. The ideal system would let me seed discussion however I want (by commit, by path, by subsystem, by semantic region of the diff, etc) without forcing me to pretend those are separate merge units. Github nails the "merge unit" (CI runs against the whole branch, the branch either merges or doesn't, etc), but absolutely fumbles in the discussion part. I hate that I'd have to change the merge unit just to fix their discussion UX.

▲

nerdypepper 2 hours ago | parent | prev | next [-]

we have been stacking on tangled.org for a while now, you can see a few examples of stacks we have made here: https://tangled.org/tangled.org/core/pulls?state=merged&q=st...

for example, this stack adds a search bar: https://tangled.org/tangled.org/core/pulls/1287

- the first PR in the stack creates a search index.

- the second one adds a search API handler.

- the last few do the UI.

these are all related. you are right that you can do this by breaking a change into commits, and effectively that is what i do with jujutsu. when i submit my commits to the UI, they form a PR stack. the commits are individually reviewable and updatable in this stacking model.

gh's model is inherently different in that they want you to create a new branch for every new change, which can be quite a nuisance.

have written more about the model here: https://blog.tangled.org/stacking/

▲

ninkendo 2 hours ago | parent [-]

> - the first PR in the stack creates a search index.

> - the second one adds a search API handler.

> - the last few do the UI.

So you're saying you're going to merge (and continuously integrate, perhaps to production) a dangling, unused search index, consuming resources with no code using it, just to make your review process easier?

It's very depressing that review UX is so abysmal that you have to merge features before they're done just to un-fuck it.

Why can't the change still be a big branch that is either all merged or not... and people can review it in chunks? Why do we require that the unit of integration equals the unit of review?

The perverse logic always goes something like this:

"This PR is too big, break it up into several"

Why?

"It's easier to review small, focused changes"

Why can't we do that in one PR?

"Because... well, you see GitHub's UI makes it really hard to ..."

And that ends up being the root-cause answer. I should be able to make a 10,000 line change in a single commit if I want, and reviewers should be able to view subsets of it however they want: A thread of discussion for the diffs within the `backend` folder. A thread of discussion for the diffs within the `frontend` folder, etc etc. Or at the very least I should be able to make a single branch with multiple commits based on topic (and under no obligation for any of them to even compile, let alone be merge-able) and it should feel natural to review each commit independently. None of this should require me to contort the change into allowing integration partially-completed work, just to allow the review UX to be manageable.

	▲	matharmin an hour ago \| parent [-]
		This is not just about the UI, it's about the mental model and management of the changes. Just covering the review process: Yes, you can structure your PR into 3 commits to be reviewed separately. I occasionally structure my PRs like this - it does help in some cases. But if those separate parts are large, you really want more structure around it than just a commit. For example, let's say you have parts A, B and C, with B depending on A, and C depending on B. 1. I may want to open a PR for A while still working on B. Someone may review A soon, in which case I can merge immediately. Or perhaps it will only be reviewed after I finished C, in which case I'll use a stacked PR. 2. The PR(s) may need follow up changes after initial review. By using stacked PRs instead of just separate commits, I can add more commits to the individual PRs. That makes it clear what parts those commits are relevant to, and makes it easy to re-review the individual parts with updated changes. Separate commits don't give you that. Stacked PRs is not a workflow I'd use often, but there are cases where it's a valuable tool. Then apart from the review process, there are lots of advantages to keeping changes small. Typically, the larger a change, the longer it lives in a separate branch. That gives more time for merge conflicts to build up. That gives more time for underlying assumptions to change. That makes it more difficult to keep a mental map of all the changes that will be merged. There are also advantages to deploying small changes at a time, that I won't go into here. But the parent's process of potentially merging and deploying the search index first makes a lot of sense. The extra overhead of managing the index while it's "unused" for a couple of days is not going to hurt you. It allows early testing of the index maintenance in production, seeing the performance overhead and other effects. If there's an issue, it's easy to revert without affecting users. The overall point is that as features become large, the entire lifecycle becomes easier to manage if you can split it into smaller parts. Sometimes the smaller parts may be user-visible, sometimes not. For features developed in a day or two, there's no need to split it further. But if it will span multiple weeks, in a project with many other developers working on, then splitting into smaller changes helps a lot. Stacked PRs is not some magical solution here, but it is one tool that helps manage this.

▲

mh2266 5 hours ago | parent | prev | next [-]

you're upgrading the repository from language version 1 to 2, version 2 adds new compiler errors that rejects some old code, or the library has removed some old deprecated API the repository was still using in some places—the key here being that it can't be something that needs to be completely atomic.

you have hundreds or thousands of files to fix. that is unreviewable as a single commit, but as a per-file, per-library, per-oncall, etc. commit it is not that bad.

	▲	ninkendo 2 hours ago \| parent [-]
		> you have hundreds or thousands of files to fix. that is unreviewable as a single commit, but as a per-file, per-library, per-oncall, etc. commit it is not that bad Why is it intrinsically unreviewable as a single commit? Why can't the discussion/review system allow scoping discussions to a single folder of the change, or a single library, or a particular code-owner's "slice" of the repo, etc? The answer to this question is always unsatisfactory to me. It always ends up being "because GitHub's UI makes it hard to <foo>" and it's just taken as an immutable law of the universe that we're stuck with that UI's limitations. If a change is huge, find some basis by which to discuss it in smaller chunks. That basis doesn't have to be the PR itself (such that you have to make smaller PR's to make discussion manageable.) It can be a subdirectory of the diff. A wildcard-match over the source files. Whatever the case needs to be, the idea is still that the discussion UX shouldn't make reviewing large changes painful. Why do we tolerate the fact that GitHub doesn't let you say "approved for changes in `frontend/*`" or "approved for the changes I'm a code-owner of", and have the PR check system mark the PR as approved once all slices have been approved? Why do we tolerate that a thousand-file change is "unreviewable"? Instead we have to change our unit of integration, allowing partially-complete work to be merged, just because the review UX sucks.

▲

esafak 7 hours ago | parent | prev | next [-]

Feature B depends on feature A, but you don't need B to understand A. Why wouldn't you create separate PRs?? It is faster to review and deploy.

	▲	fmbb 7 hours ago \| parent [-]
		Of course you would create separate PRs. Why would you waste time faffing about building B on top of a fantasy version of A? Your time is probably better spent reviewing your colleague’s feature X so they can look at your A.

▲

charcircuit 8 hours ago | parent | prev [-]

>If you reject change A and approve change B, nothing can merge

The feature is also half done in this case. The author can fix up the concerns the reviewer had in A and then both can be merged at the same time.

	▲	ninkendo an hour ago \| parent [-]
		Couldn’t they do that in one PR? Seriously, couldn’t you just say “hey Alice, could you review the A parts of this PR” and “hey Bob, could you review the B parts”, then only merge once both of them approve? Even GitHub, for all its faults, supports code owners files, such that this can even be policy.