danudey 6 hours ago

I've set up PXE booting at two previous companies for very different use cases.

The first was to automate server deployment. We ran bare-metal servers, and even though we had managed hosting in our data centre, the installation, configuration, and deployment of a server could take days, since it was just me doing it and I had other things to do.

So one day I set to work. I installed an Ubuntu server the same way I always did and then captured the debconf configuration to turn it into a preseed file. I set up the disk partitioning, etc., and configured the OS to get its networking from DHCP. Then I configured the DHCP server with the MAC address of every server we got and an associated IP address, so that a given physical server would always get the same IP address.
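Roughly, those two pieces look something like this (I'm showing ISC dhcpd here; the MAC, addresses, and hostname are made up):

    # grab the answers from a machine installed by hand, to seed the preseed
    apt-get install debconf-utils
    debconf-get-selections --installer > preseed.cfg
    debconf-get-selections >> preseed.cfg

    # dhcpd.conf: pin each known server to a fixed address and point it at PXE
    host db02 {
        hardware ethernet 52:54:00:aa:bb:cc;
        fixed-address 10.0.10.12;
        next-server 10.0.10.2;      # TFTP server
        filename "pxelinux.0";
    }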

Then I set up an internal apt repository; that's where I put custom packages, backports I had to recompile, third-party packages (e.g. perconadb) and so on.
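The preseed can point the installer straight at a repository like that during install; something along these lines (URLs invented, release name will vary):

    # preseed snippet: add the internal apt repo while installing
    d-i apt-setup/local0/repository string http://apt.internal.example/ubuntu focal main
    d-i apt-setup/local0/comment string Internal packages and backports
    d-i apt-setup/local0/key string http://apt.internal.example/archive.key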

Lastly, I set up Salt (the configuration management/orchestration tool, like Puppet, Chef, or Ansible) with a nice simple (detailed) configuration.
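The hostname-based roles I describe below are just Salt's normal minion targeting; a stripped-down sketch of that layout (state and package names are illustrative):

    # /srv/salt/top.sls -- map roles onto hostnames
    base:
      '*':
        - common
      'db*':
        - percona

    # /srv/salt/percona/init.sls -- the "database role" state
    percona-server:
      pkg.installed:
        - name: percona-server-server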

The machines would be configured to boot via PXE. They'd load the kernel and initrd, which contained the preseed file that answered all of the installation/configuration questions. At the end of the install, a post-install shell script started Salt and ran the initial configuration, much of which was based on hostname. That step would turn the current DHCP-provided IP address into a static networking configuration so that the server wasn't reliant on DHCP anymore; it would ensure that SSH keys were installed and that the right services were enabled or disabled, and it would install packages based on the hostname, which represented the role (e.g. db02.blah.blah got Percona installed). I also had some custom data sources (custom Salt grains, essentially) so that I could install the right RAID controller software based on which PCI devices were present.

After all that, it would reboot. Once it came back up from the local disk, Salt would pick up again and do the rest of the configuration, now that it wasn't running from a chroot and had all the required systemd services running. What used to take me several days for two servers turned into something one of our co-ops could do in an hour.
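The hand-off from the installer into Salt can be as small as one late_command line in the preseed; a rough sketch (the master hostname is invented, and in reality this pointed at a fuller post-install script):

    # end of the preseed: install the minion and tell it where the master is
    d-i preseed/late_command string \
        in-target apt-get install -y salt-minion ; \
        in-target sh -c "echo 'master: salt.internal.example' > /etc/salt/minion.d/master.conf"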

Second was another company that wanted to standardize the version of Linux its developers were running. Again, I set up an Ubuntu installer, this time booting iPXE and fetching the kernel and the root image via HTTPS. The Ubuntu installer at that point was a Snap, and the default 'source' was a squashfs file that it unpacked onto the new root filesystem before proceeding with package installation. I set up some scripts and configuration to take the default squashfs, unpack it, install new packages via apt in a chroot, and then repack it. That let me ensure Firefox, Thunderbird, and Chrome were installed and configured not from snaps, update to the latest packages, make sure GNOME was installed, and so on. A lot of that is stuff the installer would do anyway, of course, but on gigabit ethernet it was significantly faster to download one 2 GB squashfs than to download a 512 MB squashfs and then fetch new or updated packages on top of it.

Once again, what used to start with "Here's a USB stick, I think it has the latest Ubuntu on it" and take most of a day turned into "Do a one-off boot from the network via UEFI, choose a hostname, username, and password, and then wait twenty minutes while you get a coffee or meet your coworkers." I even found a "bug" (misbehaviour) in the installer where it would mount the squashfs and then rsync the files across, which was significantly slower because the kernel only used one thread for decompression; `unsquashfs` can use all cores and was dramatically faster, so I got to patch that (which I'm not sure ever made it into the installer anyway).
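The repack loop is only a handful of commands; roughly this shape, with the filenames and package list made up and the chroot prep simplified:

    # unpack the stock installer squashfs (multi-threaded, unlike mount+rsync)
    sudo unsquashfs -d rootfs ubuntu-desktop.squashfs

    # make apt usable inside the chroot
    sudo mount --bind /dev  rootfs/dev
    sudo mount --bind /proc rootfs/proc
    sudo cp /etc/resolv.conf rootfs/etc/resolv.conf

    # bake in the packages we want preinstalled
    sudo chroot rootfs apt-get update
    sudo chroot rootfs apt-get install -y thunderbird

    sudo umount rootfs/proc rootfs/dev

    # repack, using every core for compression
    sudo mksquashfs rootfs custom.squashfs -comp xz -processors "$(nproc)"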

The one thing I couldn't make work was the OEM installation, where you put everything down onto the system unattended then put the user through the Ubuntu OOBE process. That would have made it far easier to pre-provision systems for users ahead of time; I did replace the default Plymouth splash screen logo with our company logo though, which was pretty cool.

I also set up network booting of macOS at another job, but that's a very different process because it has all its own tooling for imaging and management, and Apple ended up moving from custom deployment images to static images and MDM for post-install configuration.

TL;DR: network booting is pretty great, actually; it's a very niche use case, but if you're clever you can get a lot done. There are also lots of options for booting into a menu-driven bootloader, letting you choose to netboot Ubuntu Desktop, Ubuntu Server, Windows, RHEL, Gentoo, a rescue image, or anything else you want.
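With iPXE, for example, that menu is just a short script; a bare-bones sketch (URLs are placeholders, and real entries need the right kernel arguments for each OS):

    #!ipxe
    menu Netboot menu
    item ubuntu   Ubuntu Server installer
    item rescue   Rescue image
    choose target && goto ${target}

    :ubuntu
    kernel http://boot.internal.example/ubuntu/vmlinuz
    initrd http://boot.internal.example/ubuntu/initrd
    boot

    :rescue
    chain http://boot.internal.example/rescue.ipxe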

nyrikki 6 hours ago

> The one thing I couldn't make work was the OEM installation, where you put everything down onto the system unattended then put the user through the Ubuntu OOBE process.

Did you try chain booting into iPXE and using SYSLINUX?

I just used nginx try_files, where I could place a preseed for a known provisioning event, and otherwise serve various live and utility images if the per-MAC file didn't exist, for one-off or emergency repairs.
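That pattern is basically just try_files with a per-MAC path and a fallback; roughly this (paths and filenames invented):

    # /etc/nginx/conf.d/netboot.conf
    server {
        listen 80;
        root /srv/netboot;

        # iPXE asks for /preseed/${net0/mac}.cfg; if nothing has been staged
        # for that MAC, fall back to the generic menu/rescue config
        location /preseed/ {
            try_files $uri /preseed/default.cfg;
        }
    }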

I could even serve up Windows instances.

That is also very useful because occasionally you run into PXE firmware that is crippled; that may matter less now, but having only a tiny iPXE image on TFTP helps with speed and security.
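The usual way to wire that up is to serve just the small iPXE binary over TFTP and have DHCP hand iPXE itself an HTTP script, e.g. with dnsmasq (hostname invented):

    # plain PXE ROMs get the tiny iPXE binary over TFTP; requests coming from
    # iPXE itself (marked by DHCP option 175) get the HTTP boot script instead
    dhcp-match=set:ipxe,175
    dhcp-boot=tag:!ipxe,undionly.kpxe
    dhcp-boot=tag:ipxe,http://boot.internal.example/boot.ipxe
    enable-tftp
    tftp-root=/srv/tftp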

I would bet almost all vendors just use iPXE anyway, and at least you used to be able to replace the firmware on Intel cards with it.