MathMonkeyMan 6 days ago

I remember a Rich Hickey talk where he described Datomic, his database. He said "the problem with a database is that it's over there." By modeling data with immutable "facts" (a la Prolog), much of the database logic can be moved closer to the application. In his case, with Clojure's data structures.

Maybe the problem with CI is that it's over there. As soon as it stops being something that I could set up and run quickly on my laptop over and over, the frog is already boiled.

The comparison to build systems is apt. I can and occasionally do build the database that I work on locally on my laptop without any remote caching. It takes a very long time, but not too long, and it doesn't fail with the error "people who maintain this system haven't tried this."

The CI system, forget it.

Part of the problem, maybe the whole problem, is that we could get it all working and portable and optimized for non-blessed environments, but it still will only be expected to work over there, and so the frog keeps boiling.

I bet it's not an easy problem to solve. Today's grand unified solution might be tomorrow's legacy tar pit. But that's just software.

reactordev 5 days ago | parent | next [-]

The rule for CI/CD and DevOps in general is boil your entire build process down to one line:

    ./build.sh
If you want to ship containers somewhere, do it in your build script where you check to see if you’re running in “CI”. No fancy pants workflow yamls to vendor lock yourself into whatever CI platform you’re using today, or tomorrow. Just checkout, build w/ params, point your coverage checker at it.
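A sketch of the idea (registry and build command are placeholders; most hosted CI platforms export CI=true, but check yours):

    #!/bin/sh
    set -eu

    make release    # or cmake/bazel/gradle -- whatever actually builds the thing

    # only ship containers when running in CI
    if [ "${CI:-}" = "true" ]; then
      tag="$REGISTRY/myapp:$(git rev-parse --short HEAD)"
      docker build -t "$tag" .
      docker push "$tag"
    fi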

This is also the rule for onboarding new hires. They should be able to check out and build with no issues or caveats beyond local environment setup. This ensures they are ready to PR by end of the day.

(Fmr Director of DevOps for a Fortune 500)

maratc 5 days ago | parent | next [-]

Yeah, that's a good rule. Except, do you want to build Debug or Release? Or maybe RelWithDebugInfo? And do you want that with sanitizers maybe? And what should the sanitizers' options be? Do you want to compile your tests too, if you want to run them later on a different machine? And what about that dependency that takes two hours to compile, maybe you just want to reuse the previous compilation of it? And if so, where to take that from? Etc. etc.

Before long, you need another script that will output the train of options to your `build.sh`.
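You end up with something like this (every flag below is made up):

    ./build.sh --config RelWithDebInfo \
               --sanitizers address,undefined \
               --build-tests \
               --prebuilt-dep bigdep=/opt/cache/bigdep-2024-07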

(If Fortune 500 companies can do a one-line build with zero parameters, I suspect I'd be very bored there.)

reactordev 5 days ago | parent [-]

Of course we had parameters but we never ship debug builds. Treat everything like production.

If you want to debug, spin it up with docker compose, or add logs and metrics and seek what you find.

pxc 5 days ago | parent | prev | next [-]

You still inevitably need a bunch of CI platform-specific bullshit for determining "is this a pull request? which branch am I running on?", etc. Depending on what you're trying to do and what tools you're working with, you may need such logic both in an accursed YAML DSL and in your build script.

And if you want your CI jobs to do things like report cute little statuses, integrate with your source forge's static analysis results viewer, or block PRs, you have to integrate with the forge at a deeper level.

There aren't good tools today for translating between the environment variables and other things that various CI platforms expose, for managing secrets (if you use CI to deploy things) that are exposed in platform-specific ways, etc.

If all you're doing with CI is spitting out some binaries, sure, I guess. But if you actually ask developers what they want out of CI, it's typically more than that.

michaelmior 5 days ago | parent | next [-]

A lot of CI platforms (such as GitHub) spit out a lot of environment variables automatically that can help you with the logic in your build script. If they don't, they should give you a way to set them. One approach is to keep the majority of the logic in your build script and just use the platform-specific stuff to configure the environment for the build script.
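As a rough sketch of that normalization (variable names are from memory of each platform's docs; treat this as illustrative, not exhaustive):

    # figure out the branch name regardless of which CI we're on
    if   [ -n "${GITHUB_REF_NAME:-}" ];        then branch="$GITHUB_REF_NAME"         # GitHub Actions
    elif [ -n "${CI_COMMIT_REF_NAME:-}" ];     then branch="$CI_COMMIT_REF_NAME"      # GitLab CI
    elif [ -n "${BUILD_SOURCEBRANCHNAME:-}" ]; then branch="$BUILD_SOURCEBRANCHNAME"  # Azure DevOps
    else branch="$(git rev-parse --abbrev-ref HEAD)"                                  # local fallback
    fi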

Of course, as you mention, if you want to do things like comment on PRs or report detailed status information, you have to dig deeper.

pxc 5 days ago | parent | next [-]

Yes, and real portability for working with the environment variables is doable but there's nothing out there that provides it for you afaik. You just have to read a lot carefully.

My team offers integrations of static analysis tools and inventorying tools (SBOM generation + CVE scanning) to other teams at my organization, primarily for appsec purposes. Our organization's departments have a high degree of autonomy, and tooling varies a lot. We have code hosted in GitLab, GitHub, Azure DevOps, and in distant corners my team has not yet worked with, elsewhere. Teams we've worked with run their CI in GitLab, GitHub, Azure DevOps, AWS CodeBuild, and Jenkins. Actual runners teams use may be SaaS-provided by the CI platform, or self-hosted on AWS or Azure. In addition to running in CI, we provide the same tools locally, for use on macOS as well as Linux via WSL.

The tools my team uses for these scans are common open-source tools, and we distribute them via Nix (and sometimes Docker). That saves us a lot of headaches. But every team has their own workflow preferences and UI needs, and we have to meet them on the platforms they already use. For now we manage it ourselves, and it's not too terrible. But if there were something that actually abstracted away boring but occasionally messy differences like what the equivalent environment variables are in different CI systems, that would be really valuable for us. (The same goes even for comment bots and PR management tools. GitHub and GitLab are popular, but Azure DevOps is deservedly marginal, so even general-purpose tools rarely support both Azure DevOps and other forges.)

If your concern is that one day, a few years from now, you'll need to migrate from one forge to another, maybe you can say "my bash script handles all the real build logic" and get away with writing off all the things it doesn't cover. Maybe you spend a few days or even a few weeks rewriting some platform-specific logic when that time comes and forget about it. But when you're actually contending with many such systems at once, you end up wishing for sane abstractions or crafting them yourself.

merb 5 days ago | parent | prev [-]

how can you build your containers in parallel?

over multiple machines? I'm not sure that a sh script can do that with github

pxc 5 days ago | parent | next [-]

If you build them with Nix, you can. Just call `nix build` with a trailing `&` a bunch of times.

But it's kind of cheating, because the Nix daemon actually handles per-machine scheduling and cross-machine orchestration for you.

Just set up some self-hosted runners with Nix and an appropriately configured set of remote builders to get started.
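Roughly like this (flake attribute names and builder hosts are made up):

    # kick off several image builds at once; the daemon schedules them
    nix build .#serviceA-image &
    nix build .#serviceB-image &
    wait

    # with remote builders configured, the same invocations fan out across machines
    nix build .#serviceC-image \
      --builders 'ssh://builder1 x86_64-linux ; ssh://builder2 aarch64-linux'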

If you really want to, you can graduate after that to a Kubernetes cluster where Nix is available on the nodes. Pass the Nix daemon socket through to your rootless containers, and you'll get caching in the Nix store for free even with your ephemeral containers. But you probably don't need all that anyway. Just buy or rent a big build server. Nix will use as many cores as you have by default. It will be a long time before you can't easily buy or rent a build server big enough.

reactordev 4 days ago | parent | prev [-]

These problems are general ones, and the solution is the same as for running programs in parallel or across machines. When you need to build different architectures (and need a host to provide the toolchains), what's stopping you from issuing more than one command in your CI/CD pipeline? Most pipelines have a way of running something on a specific host. So does k8s, ecs, <pick your provider>, and probably your IT team.

In my experience, when it comes time to actually build the thing, a one-liner (with args if you need them) is the best approach. If you really REALLY need to, you can have more than one script for doing it - depending on what path down the pipeline you take. Maybe it's

    1) ./build.sh -config Release
    2) ./deploy.sh -docker -registry=<$REGISTRY> --kick

Just try not to go too crazy. The larger the org, the larger this wrangling task can be. Look at Google and gclient/gn. Not saying it's bad, just saying it's complicated for a reason. You don't need that (you'll know if you do).

The point I made is that I hate when I see 42 lines in a build workflow yaml that isn't syntax highlighted because it's been |'d in there. I think the yamls of your pipelines, etc., should be configuration for the pipeline, and the actual execution should be outsourced to a script you provide.

oftenwrong 5 days ago | parent | prev [-]

There are some very basic tools that can help with portability, such as https://github.com/milesj/rust-cicd-env , but I agree that there is a lot of proprietary, vendor-specific, valuable functionality available in the average "CI" system that you cannot make effective use of with this approach. Still, it's the approach I generally favor for a number of reasons.

XorNot 5 days ago | parent | prev | next [-]

The other rule is that the script should run as an ordinary user, solely on that working directory.

There are too many scripts like that which start, ask for sudo, and then it's off to implementing someone's "great idea" about your system's network interfaces.

reactordev 5 days ago | parent [-]

sudo should not be required to build software.

If there’s something you need that requires sudo, it’s pre-build environment setup on your machine. On the host. Or wherever. It’s not part of the build. If you need credentials, get them from secrets or environment variables.
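In other words, something like this (package and variable names are just examples):

    # host/runner provisioning -- done once, outside the build:
    sudo apt-get install -y build-essential

    # build.sh itself never escalates; it just consumes the environment:
    : "${REGISTRY_TOKEN:?expected from CI secrets or your local environment}"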

immibis 5 days ago | parent [-]

For use cases like making tar files with contents owned by root, Debian developed the tool "fakeroot", which intercepts standard library functions so that when the build script sets a file to be owned by root and then reads the ownership later, it sees it's owned by root, so it records that in the tar file.
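Something like this, roughly (paths made up):

    # inside fakeroot, the chown appears to succeed, so tar records root ownership
    fakeroot sh -c '
      chown -R root:root staging/
      tar -czf rootfs.tar.gz -C staging .
    '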

reactordev 4 days ago | parent [-]

Debian takes the “you can’t touch this” approach to solving their issues. Instead of workarounds, they just hook in at a lower level and trace everything you do. It’s a flex. fakeroot isn’t the only tool like this. I love me some Debian.

BobbyTables2 5 days ago | parent | prev | next [-]

You’re not wrong, but your suggestion also throws away a lot of major benefits of CI. I agree jobs should be one-liners, but we still need more than one…

The single job pipeline doesn’t tell you what failed. It doesn’t parallelize unit and integration test suites while dealing with the combinatorial matrix of build type, target device, etc.

At some point, a few CI runners become more powerful than a developer’s workstation. Parallelization can really matter for reducing CI times.

I’d argue the root of the problem is that we are stuck on using “make” and scripts for local build automation.

We need something expressive enough to describe a meaningful CI pipeline but that also allows local execution.

Sure, one can develop a bespoke solution, but reinventing the wheel each time gets tiring and eventually becomes a sizable time sink.

In principle, we should be able to execute pieces of .gitlab-ci.yml locally, but even that becomes non-trivial with all the nonstandard YAML behaviors GitLab has, not to mention the varied executor types.

Instead we have a CI workflow and a local workflow and hope the two are manually kept in sync.

In some sense, the current CI-only automation tools shouldn’t even need to exist (gitlab, Jenkins, etc) — why didn’t we just use a cron job running “build.sh” ?

I argue these tools should mainly focus on the “reporting/artifacts” part, with the pipeline execution handled elsewhere (or also locally by a developer).

Shame on you GitLab!

reactordev 4 days ago | parent [-]

You are mistaking a build for a pipeline. I still believe in pipelines and configuring the right hosts/runners to produce your artifacts. Your actual build on that host/runner should be a one-liner.

mrbombastic 5 days ago | parent | prev | next [-]

How do you get caching of build steps with this approach? Or do you just not?

arianvanp 5 days ago | parent | next [-]

Use a modern hermetic build system with remote caching or remote execution. Nix, Bazel, Buck, Pants. Many options.

pwnna 5 days ago | parent [-]

This is like fighting complexity with even more complexity. Nix and Bazel are definitely not close to actually achieving hermetic builds at scale. And when they break, the complexity of fixing them increases exponentially.

pxc 5 days ago | parent [-]

What's not hermetic with Nix? Are you talking about running with the sandbox disabled, or about macOS quirks? It's pretty damn hard to accidentally depend on the underlying system in an unexpected way with Nix.

wredcoll 5 days ago | parent [-]

My experience with nix, at a smaller scale than what you're talking about, is that it only worked as long as every. single. thing. was reimplemented inside nix. Once one thing was outside of nix, everything exploded and writing a workaround was miserable because the nix configuration did not make it easy.

pxc 5 days ago | parent [-]

> every. single. thing. was reimplemented inside nix

That's kinda what hermetic means, though, isn't it? Whether that's painful or not, that's pretty much exactly what GGP was asking for!

> Once one thing was outside of nix, everything exploded and writing a workaround was miserable because the nix configuration did not make it easy.

Nix doesn't make it easy to have Nix builds depend on non-Nix things (this is required for hermeticity), but the other way around is usually less troublesome.

Still, I know what you mean. What languages were you working in?

wredcoll 4 days ago | parent [-]

It was the dev environment for a bunch of wannabe microservices running across node/java/python

And like, I'm getting to the point of being old enough that I've "seen this before"; I feel like I've seen other projects that went "this really hard problem will be solved once we just re-implement everything inside our new system" and it rarely works; you really need a degree of pragmatism to interact with the real world. Systemd and Kubernetes are examples of things that do a lot of re-implementation but are mostly better than the previous.

pxc 4 days ago | parent [-]

> Systemd and Kubernetes are examples of things that do a lot of re-implementation but are mostly better than the previous.

I feel the same way about systemd, and I'll take your word for it with respect to Kubernetes. :)

> "this really hard problem will be solved once we just re-implement everything inside our new system" [...] rarely works

Yes. 100%. And this is definitely characteristic of Nix's ambition in some ways as well as some of the most painful experiences users have with it.

> you really need a degree of pragmatism to interact with the real world

Nix is in fact founded on a huge pragmatic compromise: it doesn't begin with a new operating system, or a new executable format with a new linker, or even a new basic build system (a la autotools or make). Instead of doing any of those things, Nix's design manages to bring insights and features from programming language design (various functional programming principles and, crucially, memoization and garbage collection) to build systems and package management tools, on top of existing (even aging) operating systems and toolchains.

I would also contend that the Nixpkgs codebase is a treasure, encoding how to build, run, and manage an astonishing number of apps (over 120,000 packages now) and services (I'd guess at least 1,000; there are some 20,000 configuration options built into NixOS). I think this does to some extent demonstrate the viability of getting a wide variety of software to play nice with Nix's commitments.

Finally, and it seems you might not be aware of this, but there are ways within Nix to relax the normal constraints! And of course you can also use Nix in various ways without letting Nix run the show.[0] (I'm happy to chat about this. My team, for instance, uses Nix to power Python development environments for AWS Lambdas without putting Nix in charge of the entire build process.)

However:

  - fully leveraging Nix's benefits requires fitting within certain constraints
  - the Nix community, culturally, does not show much interest in relaxing those constraints even when possible[1], but there is more and more work going on in this area in recent years[2][3] and some high-profile examples/guides of successful gradual adoption[4]
  - the Node ecosystem's habit of expecting arbitrary network access at build time goes against one of the main constraints that Nix commits to by default, and *this indeed often makes packaging Node projects "properly" with Nix very painful*
  - Python packaging is a mess and Nix does help IME, but getting there can be painful
Maybe if you decide to play with Nix again, or you encounter it on a future personal or professional project, you can remember this and look for ways to embrace the "heretical" approach. It's more viable and more popular than ever :)

--

0: https://zimbatm.com/notes/nix-packaging-the-heretic-way ; see also the community discussion of the post here: https://discourse.nixos.org/t/nix-packaging-the-heretic-way/...

1: See Graham Christensen's 2022 NixCon talk about this here. One such constraint he discusses relaxing, build-time sandboxing, is especially useful for troublesome cases like some Node projects: https://av.tib.eu/media/61011

2: See also Tom Bereknyei's NixCon talk from the same year; the last segment of it is representative of increasing interest among technical leaders in the Nix community on better enabling and guiding gradual adoption: https://youtu.be/2iugHjtWqIY?t=830

3: Towards enabling gradual adoption for the most all-or-nothing part of the Nix ecosystem, NixOS, a talk by Pierre Penninckx from 2024: https://youtu.be/CP0hR6w1csc

4: One good example of this is Mitchell Hashimoto's blog posts on using Nix with Dockerfiles, as opposed to the purist's approach of packaging your whole environment via Nix and then streaming the Nix packages to a Docker image using a Nix library like `dockerTools` from Nixpkgs: https://mitchellh.com/writing/nix-with-dockerfiles

fireflash38 5 days ago | parent | prev | next [-]

Even just makefiles have 'caching', provided you set dependencies and outputs correctly.

A good makefile is really nice to use. Not nice to read or trace unfortunately though.

reactordev 4 days ago | parent | prev [-]

We get them with docker.

Everything becomes a container, so why not use the container engine for it? If you know how layers work…
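Something along these lines, for example (registry and tag variables are placeholders):

    # reuse layers from the previously pushed image as a cache source,
    # and embed cache metadata so the next run can do the same
    docker build \
      --cache-from "$REGISTRY/myapp:latest" \
      --build-arg BUILDKIT_INLINE_CACHE=1 \
      -t "$REGISTRY/myapp:$GIT_SHA" .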

HPsquared 5 days ago | parent | prev | next [-]

Sounds like the Lotus philosophy, "simplify and add lightness".

marsven_422 5 days ago | parent | prev [-]

[dead]

DrBazza 6 days ago | parent | prev | next [-]

Your build should be this:

    build.bash <debug|release>
and that's it (and that can even trigger a container build).

I've spent far too much time debugging CI builds that work differently to a local build, and it's always because of extra nonsense added to the CI server somehow. I've yet to find a build in my industry that doesn't yield to this 'pattern'.

Your environment setup should work equally on a local machine or a CI/CD server, or your devops team has set it up identically on bare metal using Ansible or something.

nrclark 6 days ago | parent | next [-]

Agreed with this sentiment, but with one minor modification: use a Makefile instead. Recipes are still chunks of shell, and they don’t need to produce or consume any files if you want to keep it all task-based. You get tab-completion, parallelism, a DAG, and the ability to start anywhere on the task graph that you want.

It’s possible to do all of this with a pure shell script, but then you’re probably reimplementing some or all of the list above.

gchamonlive 6 days ago | parent | next [-]

Just be aware of the "Makefile effect"[1] which can easily devolve into the Makefile also being "over there", far from the application, just because it's actually a patchwork of copy-paste targets stitched together.

[1] https://news.ycombinator.com/item?id=42663231

DrBazza 5 days ago | parent | prev | next [-]

> use a Makefile instead

I was making a general comment that your build should be a single 'command'. Personally, I don't care what the command is, only that it should be a) one command, and b) 100% runnable on a dev box or a server. If you use make, you'll soon end up writing... shell scripts, so just use a shell script.

In an ideal world your topmost command would be a build tool:

     ./gradlew build
     bazel build //...
     make debug
     cmake --workflow --preset
Unfortunately, the second you do that ^^^, someone edits your CI/CD to add a step before the build starts. It's what people do :(

All the cruft that ends up *in CI config* should be under version control, and inside your single command, so you can debug locally.

chubot 5 days ago | parent [-]

That's exactly why the "main" should be shell, not make (see my sibling reply). So when someone needs to add that step, it becomes:

    #!/bin/sh

    step-I-added-to-shell-rather-than-CI-yaml
    make debug  # or cmake, bazel
This is better so you can run the whole thing locally, and on different CI providers

In general, a CI is not a DAG, and not completely parallel -- but it often contains DAGs

chubot 6 days ago | parent | prev | next [-]

Make is not a general purpose parallel DAG engine. It works well enough for small C projects and similar, but for problems of even medium complexity, it falls down HARD

Many years ago, I wrote 3 makefiles from scratch as an exploration of this (and I still use them). I described the issues here: https://lobste.rs/s/yd7mzj/developing_our_position_on_ai#c_s...

---

The better style is in a sibling reply -- invoke Make from shell, WHEN you have a problem that fits Make.

That is, the "main" should be shell, not Make. (And it's easy to write a dispatcher to different shell functions, with "$@", sometimes called a "task file" )
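A minimal sketch of that dispatcher pattern (task names are arbitrary):

    #!/usr/bin/env bash
    set -euo pipefail

    build()      { make debug; }         # or cmake, bazel, ninja
    unit_tests() { ./run-tests.sh; }     # hypothetical test entry point
    deploy()     { ./deploy.sh "$@"; }

    "$@"    # usage: ./run.sh build, ./run.sh deploy --docker, etc.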

In general, a project's CI does not fit entirely into Make. For example, the CI for https://oils.pub/ is 4K lines of shell, and minimal YAML (portable to Github Actions and sourcehut).

https://oils.pub/release/latest/pub/metrics.wwz/line-counts/...

It invokes Make in a couple places, but I plan to get rid of all the Make in favor of Python/Ninja.

DrBazza 5 days ago | parent [-]

Portability to other CI/CD systems is an understated reason to use a single build command.

dgfitz 6 days ago | parent | prev [-]

You invoke CMake/qmake/configure/whatever from the bash script.

I hate committing makefiles directly if it can be helped.

You can still call make in the script after generating the makefile, and even pass the make target as an argument to the bash script if you want. That being said, if you’re passing more than 2-3 arguments to the build.sh you’re probably doing it wrong.

nrclark 6 days ago | parent | next [-]

Yes to calling CMake/etc. No to checking in generated Makefiles. But for your top-level “thing that calls CMake”, try writing a Makefile instead of a shell script. You’ll be surprised at how powerful it is. Make is a dark horse.

dgfitz 5 days ago | parent [-]

I wouldn't be surprised at all, make is great!

My contention is that a build script should ideally be:

    sha-bang
    clone && cd $cloned_folder
    ${generate_makefile_with_tool}
    make $1

Anything much longer than that can (and usually will) quickly spiral out of control.

Make is great. Unless you're code-golfing, your makefile will be longer than a few lines, and a bunch of well-intentioned gremlins will pop in and bugger the whole thing up. I've just seen it too many times.

Edit: in the Jenkins case, in a Jenkins build shell the clone happens outside build.sh:

(in the Jenkins shell):

    clone && cd clone
    ./build.sh $(0-1 args)

(inside build.sh):

    $(generate_makefile_with_tool)
    make $1

lenkite 5 days ago | parent | prev [-]

I have experienced horror build systems where the Makefile delegates to a shell script which then delegates to some sub-module Makefile, which then delegates to a shell script...

The problem is that shell commands are very painful to specify in a Makefile, with its weird syntactical rules. Especially when you need them to run in one shell - a lot of horror quoting is needed.

mikepurvis 6 days ago | parent | prev | next [-]

There are various things that can be a reasonable candidate for the "top level" build entrypoint, including Nix, bazel, docker bake, and probably more I'm not thinking of. They all have an entrypoint that doesn't have a ton of flags or nonsense, and operate in a pretty self contained environment that they set up and manage themselves.

Overall I'm not a fan of wrapping things; if there are flags or options on the top-level build tool, I'd rather my devs explore those and get used to what they are and can do, rather than being reliant on a project-specific script or make target to just magically do the thing.

Anyway, other than calling the build tool, CI config can have other steps in it, but it should be mostly consumed with CI-specific add-ons, like auth (OIDC handshake), capturing logs, uploading artifacts, sending a slack notification, whatever it is.

DrBazza 5 days ago | parent [-]

Fortunately most CI/CD systems expose an environment variable during the build so you can detect most of those situations and still write a script that runs locally on a developer box.

Our wrapping is 'minimal', in that you can still run

    bazel build //...
or

    cmake ...
and get the same build artefacts as running:

    build.bash release
My current company is fanatical about read-only for just about every system we have (a bit like Nix, I suppose), and that includes CI/CD. Once the build is defined to run debug or release, rights are removed so the only things you can edit are the build scripts you have under your control in your repo. This works extremely well for us.

mikepurvis 5 days ago | parent [-]

Interestingly despite being pretty hard-nosed about a lot of things, Nix does not insist on a read-only source directory at build time— the source is pulled into a read-only store path, but from there it is copied into the build sandbox, not bind-mounted.

I expect this is largely a concession to the reality that most autotools projects still expect an in-source build, not to mention Python wanting to spray pyc files and build/dist directories all over the place.

kqr 5 days ago | parent | prev | next [-]

I tried to drive this approach at a previous job but nobody else on the team cared so I ended up always having to mirror all the latest build changes into my bash script.

The reason it didn't catch on? Everyone else was running local builds in a proprietary IDE, so to them the local build was never the same anyway.

germandiago 5 days ago | parent | prev [-]

I always use, no matter what I am using underneath, a bootstrap script, a configure script and a build step.

That keeps the CLI interface simple, predictable, and guessable.

KronisLV 6 days ago | parent | prev | next [-]

> Part of the problem, maybe the whole problem, is that we could get it all working and portable and optimized for non-blessed environments, but it still will only be expected to work over there, and so the frog keeps boiling.

Build the software inside of containers (or VMs, I guess): a fresh environment for every build, any caches or previous build artefacts explicitly mounted.
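For example, roughly (image name and cache path are placeholders; the named volume is what keeps repeat builds fast):

    # fresh container per build; only the source tree and explicit caches are mounted
    docker run --rm \
      -v "$PWD":/src -w /src \
      -v build-cache:/root/.m2 \
      my-build-image ./build.sh release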

Then, have something like this, so those builds can also be done locally: https://docs.drone.io/quickstart/cli/

Then you can stack as many turtles as you need - such as having build scripts that get executed as a part of your container build, having Maven or whatever else you need inside of there.

It can be surprisingly sane: your CI server doing the equivalent of "docker build -t my_image ..." and then doing something with it, whereas during build time there's just a build.sh script inside.

kqr 6 days ago | parent [-]

This sounds a lot like "use Nix".

justinrubek 5 days ago | parent | next [-]

Unfortunately, that's the last thing a lot of people want to hear, despite it saving a whole lot of heartache.

zhengyi13 5 days ago | parent [-]

I mean, sure (also bazel I think), but I feel like that's because the learning curve for these tools to a first approximation looks a bit like the infamous EvE Online learning curve[0].

[0]: https://imgur.com/gallery/eve-online-learning-curve-jj16ThL

KronisLV 5 days ago | parent | prev [-]

I mean, if it's easy enough to actually get your average developer to use it, then sure. In my experience, things that are too hard will just not be done, or at least not properly.

dapperdrake 6 days ago | parent | prev | next [-]

Transactions and a single consistent source of truth, with stuff like observability and temporal ordering, are centralized and therefore "over there" for almost every place you could be in.

As long as communications have bounded speed (speed of light or whatever else) there will be event horizons.

The point of a database is to track changes and therefore time centrally. Not because we want to, but because everything else has failed miserably. Even conflicting CRDT change merges and git merges can get really hairy really quickly.

People reinvent databases about every 10 years. Hardware gets faster. Just enjoy the show.

MathMonkeyMan 5 days ago | parent [-]

I haven't used Datomic, but you're right that the part that requires "over there" is the "single consistent source of truth." There's only ever a single node that is sequencing all writes. Perhaps as a result of this, it provides strong [verified ACID guarantees][1].

What I got from Hickey's talk is that he wanted to design a system that resisted the urge to encode everything in a stored procedure and run it on the database server.

[1]: https://jepsen.io/analyses/datomic-pro-1.0.7075

AtlasBarfed 6 days ago | parent | prev | next [-]

I want my build system to be totally declarative

Oh the DSL doesn't support what I need it to do.

Can I just have some templating, or a few places to put in custom scripts?

Congratulations! You now have a Turing-complete system. And yes, per the article, that means you can mine cryptocurrency.

Ansible, Terraform, Maven, Gradle.

The unfortunate fact is that these IT domains (builds and CI) sit at the junction of two famously slippery slopes:

1) configuration

2) workflows

These two slippery slopes are famous for their demos of how clean and simple they are and how easy it is to do anything you need them to do.

In the demo.

And sure it might stay like that for a little bit.

But inevitably.... Script soup

lelanthran 6 days ago | parent | next [-]

Alternative take: CI is the successful monetization of Make-as-a-Service.

lenkite 4 days ago | parent | prev [-]

No, you keep your build system declarative, but you support a clean plugin API that permits injection into the build lifecycle, and you allow configuring/invoking the plugin with your DSL.

MortyWaves 6 days ago | parent | prev | next [-]

It’s why I’ve started making CI simply a script that I can run locally or on GitHub Actions etc.

Then the CI just becomes a bit of yaml that runs my script.

maccard 6 days ago | parent | next [-]

How does that script handle pushing to ghcr, or pulling an artifact from a previous stage for testing?

In my experience these are the bits that fail all the time, and are the most important parts of CI once you go beyond it taking 20/30 seconds to build.

A clean build of my project in an ephemeral VM would take about 6 hours on a 16-core machine with 64GB RAM.

thechao 6 days ago | parent | next [-]

Sheesh. I've got a multimillion-line modern C++ project that consists of a large number of dylibs and a few hundred delivered apps. A completely cache-free build is only a few minutes. Incremental and clean (cached) builds are seconds, or hundreds of milliseconds.

It sounds like you've got hundreds of millions of lines of code! (Maybe a billion!?) How do you manage that?

maccard 6 days ago | parent | next [-]

It’s a few million lines of C++ combined with content pipelines. Shader compilation is expensive and the tooling is horrible.

Our cached builds on CI are 20 minutes from submit to running on Steam, which is OK. We also build with MSVC, so none of the normal ccache stuff works for us, which is super frustrating.

thechao 6 days ago | parent [-]

Fuck. I write shader compilers.

maccard 5 days ago | parent [-]

Eh, you write them, I (ab)use them.

bluGill 5 days ago | parent | prev [-]

I have 15 million lines of C++, and builds are several hours. We split into multi-repo (for other reasons) and that helps because compiling is memory-bandwidth limited - on the CI system we can split the different repos across different CI nodes.

MortyWaves 6 days ago | parent | prev [-]

To be honest I haven’t really thought about it, and it’s definitely something it can’t do; you’d probably need to call their APIs or something.

I am fortunate in that the only things I want to reuse is package manager caches.

maccard 4 days ago | parent [-]

That’s fair, but surely you must see that’s a very simple build.

The complicated part comes when you have job A that builds and job B that deploys - they run on two different machine specs so you’re not paying for a 16-core machine to sit waiting 5 minutes for helm apply - and they need somewhere secure to shuffle that artifact around. Their access to that service is likely different from your local access, so you run your build locally and it’s fine, but then the build machine doesn’t have write access to the new path you’ve just tested and it fails.

90% of the time these are where I see CI failures

maratc 5 days ago | parent | prev | next [-]

You must be very lucky to be in a position where you know what needs to be done before the run begins. Not everyone is in that position.

At my place, we have ~400 wall hours of testing, and my run begins by figuring out what tests should be running and what can be skipped. This depends on many factors, and the calculation of the plan already involves talking to many external systems. Once we have figured out a plan for the tests, we can understand the plan for the build. Only then we can build, and test afterwards. I haven't been able to express all of that in "a bit of yaml" so far.

j4coh 6 days ago | parent | prev [-]

Are you not worried about parallelisation in your case? Or have you solved that in another way (one big beefy build machine maybe?)

MortyWaves 6 days ago | parent [-]

Honestly not really… sure it might not be as fast, but the ability to know I can debug it and build it exactly the same way locally is worth the performance hit. It probably helps that I don’t write C++, so builds are not a multi-day event!

layer8 5 days ago | parent | prev [-]

Yes, the build system should be independent from the platform that hosts it. Having GitHub or GitLab execute your build is fine, but you should as easily be able to execute it locally on your own infrastructure. The definition of the build or integration should be independent from that, and the software that ingests and executes such definitions shouldn’t be a proprietary SaaS.