Learning to Boot from PXE(blog.imraniqbal.org)
77 points by speckx 11 hours ago | 33 comments
theandrewbailey 11 hours ago | parent | next [-]

Oh oh oh I know this!

I work in the refurb division of an ewaste recycling company[0]. To prepare a machine for sale, the drive needs to be wiped, and (optionally) an OS loaded. Wiping happens in WipeOS[1], which loads when you PXE boot on the internal company network. To install an OS, I have a separate network on my desk that will load iVentoy[2] when PXE booted, where I can further boot from ISOs I have on my server, but I almost always install Linux Mint. With those 2 things, I can largely do my job without fumbling with and losing USB drives.

I have 2 16 port switches on my desk, with over a dozen ethernet cables plugged into each. The yellow cables will PXE boot WipeOS, and the black ones PXE boot iVentoy.

[0] https://www.ebay.com/str/evolutionecycling

[1] https://www.wipeos.com/

[2] https://www.iventoy.com/en/index.html

servercobra 5 hours ago | parent | next [-]

Ah I did something similar at the university I worked at as a student. Everything was already set to network boot as the first step, so I set up a PXE server that loaded up DBAN. When we needed to wipe a lab before decommissioning, we'd flip their network to the PXE DBAN network, tell them all to reboot, and decom them in the morning.

Saved us a bunch of hours we then used to play Minecraft haha

danudey 5 hours ago | parent [-]

I've seen this done in some settings as well; the 'wipe and install the system' VLAN and the 'normal behaviour' VLAN. When you want to reinstall a server you tell it to reboot and then swap the VLAN; once the installation is done you swap it back.

Alternately, you can have your DHCP server be aware of all of your systems and know which ones need to be reinstalled; then just configure every server to network boot by default with a short timeout. If the DHCP server thinks your system needs to be erased then it serves a boot/wipe/reinstall image; otherwise, it doesn't and the system continues to boot normally.
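A minimal sketch of that second approach, assuming the DHCP side can call out to some inventory lookup. The MAC addresses, the `REINSTALL_FLAGS` table, and the image name are all made up for illustration; a real setup would hook this into whatever the DHCP server offers for per-host boot options.

```python
from typing import Dict, Optional

# Hypothetical inventory: MAC address -> True if the machine is flagged for reinstall.
REINSTALL_FLAGS: Dict[str, bool] = {
    "52:54:00:aa:bb:01": True,
    "52:54:00:aa:bb:02": False,
}


def boot_file_for(mac: str, flags: Dict[str, bool] = REINSTALL_FLAGS) -> Optional[str]:
    """Return a boot filename for flagged machines, else None.

    None means "offer no boot file", so the client's network boot times out
    and it falls through to its local disk.
    """
    if flags.get(mac.lower(), False):
        return "wipe-and-reinstall.ipxe"  # hypothetical image name
    return None
```

Unknown machines get `None` here, which is the conservative choice: nothing is wiped unless it was explicitly flagged.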

servercobra 3 hours ago | parent [-]

Now that I think about it, I think our admin system for the DHCP server was how we handled it instead of VLANs. Also helped with automated installation of Windows on desktops, bootstrapping servers, etc.

neuronflux 5 hours ago | parent | prev | next [-]

I'm guessing you use WipeOS to more easily handle securely erasing disks. Could you have iVentoy host WipeOS to simplify the setup?

thyristan 2 hours ago | parent [-]

Yes, but then you need to select the proper menu option at boot time. Sometimes just moving the hardware stack one to the left and swapping the cables is quicker.

danudey 5 hours ago | parent | prev [-]

I've set up PXE booting at two previous companies for very different use cases.

The first was to automate server deployment; we ran bare metal servers, and even though we had managed hosting in our data centre the installation, configuration, and deployment of a server could potentially take days since it was just me doing it and I had other things to do.

So one day I set to work. I installed an Ubuntu server the same way I always did and then captured the debconf configuration to turn into a preseed file. I set up the disk partitioning, etc., and configured the OS to boot from DHCP. Then I configured the DHCP server with MAC addresses for every server we got and an associated IP address so that a given physical server would always get the same IP address.
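For the MAC-to-IP pinning step, here is a rough sketch that renders one host entry per server. It assumes ISC dhcpd's `host` stanza syntax, which the comment does not actually specify, and the host names and addresses are invented.

```python
def dhcpd_host_stanza(name: str, mac: str, ip: str) -> str:
    """Render one ISC-dhcpd-style host entry that pins a MAC to a fixed IP."""
    return (
        f"host {name} {{\n"
        f"  hardware ethernet {mac.lower()};\n"
        f"  fixed-address {ip};\n"
        f"}}\n"
    )


if __name__ == "__main__":
    # Hypothetical inventory of (hostname, MAC, IP) tuples.
    servers = [
        ("db02", "52:54:00:AA:BB:01", "10.0.0.12"),
        ("web01", "52:54:00:AA:BB:02", "10.0.0.21"),
    ]
    print("".join(dhcpd_host_stanza(*s) for s in servers))
```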

Then I set up an internal apt repository; that's where I put custom packages, backports I had to recompile, third-party packages (e.g. perconadb) and so on.

Lastly, I set up salt (the management orchestration tool, like puppet or chef or ansible) with a nice simple (detailed) configuration.

The machines would be configured to boot via PXE. They'd load the kernel and initrd, which contained the preseed file that answered all of the installation/configuration questions. Lastly it ran the post-install shell script which started salt and ran the initial configuration, much of which was based on hostname. This would turn the current DHCP-provided IP address into a static networking configuration so that the server wasn't reliant on DHCP anymore; it would ensure that SSH keys were installed, and that the right services were enabled or disabled, install some packages based on the hostname (which represented the role, e.g. db02.blah.blah got percona installed). I also had some custom data sources (whatever you would call them) so that I could install the right RAID controller software based on which PCI devices were present; after all that, it would reboot. Once it rebooted from the local disk, salt would pick back up again and do the rest of the configuration (now that it wasn't running from a chroot and had all the required systemd services running). What used to take me several days to do for two servers turned into something one of our co-ops could do in an hour.
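The hostname-driven part can be sketched roughly like this. The role table and package names are placeholders, and the real setup did this inside salt rather than in a standalone script.

```python
import re
from typing import Dict, List

# Hypothetical role -> packages table; the names are placeholders.
ROLE_PACKAGES: Dict[str, List[str]] = {
    "db": ["percona"],
    "web": ["nginx"],
}


def role_from_hostname(fqdn: str) -> str:
    """Extract the role prefix from names like 'db02.blah.blah' -> 'db'."""
    short = fqdn.split(".", 1)[0]
    match = re.fullmatch(r"([a-z]+)(\d+)", short)
    if match is None:
        raise ValueError(f"hostname {fqdn!r} does not look like <role><number>")
    return match.group(1)


def packages_for(fqdn: str) -> List[str]:
    """Look up the packages for a host's role; unknown roles get nothing."""
    return ROLE_PACKAGES.get(role_from_hostname(fqdn), [])
```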

Second was another company that wanted to standardize the version of Linux its developers were running. Again, I set up an Ubuntu installer and configured it to boot iPXE and then fetch the kernel and the root image via HTTPS. The Ubuntu installer at that point was a Snap, and the default 'source' was a squashfs file that it unpacked to the new root filesystem before proceeding with package installation. I set up some scripts and configurations to take the default squashfs filesystem, unpack it, install new packages via apt in a chroot, and then repack it again. This let me do things like ensure Firefox, Thunderbird, and Chrome were installed and configured not from snaps; update to the latest packages; make sure Gnome was installed, etc. A lot of that was stuff the installer would do, of course, but given we were on gigabit ethernet it was significantly faster to download a 2 GB squashfs file than to download a 512 MB squashfs file and then download new or updated packages. Once again what used to start with "Here's a USB, I think it has the latest Ubuntu on it" and take most of a day turned into "Do a one-off boot from the network via UEFI, choose a hostname, username, and password, and then just wait for twenty minutes while you get a coffee or meet your coworkers". I even found a "bug" (misbehaviour) in the installer where it would mount the squashfs and then rsync the files, which seemed significantly slower because the kernel was only using one thread for decompressing; using `unsquashfs` could use all cores and was dramatically faster, so I got to patch that (which I'm not sure ever made it into the installer anyway).
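The unpack, chroot, repack loop might look something like the sketch below. It only builds the command lines and does not run them. The paths and package list are hypothetical, actually running it needs root, and a real chroot usually also needs things like `/proc` and DNS configuration set up first, which is left out here.

```python
import shlex
from typing import List


def repack_commands(src: str, workdir: str, out: str, packages: List[str]) -> List[List[str]]:
    """Build the commands to customise a squashfs root image."""
    return [
        ["unsquashfs", "-d", workdir, src],               # unpack (workdir must not exist yet)
        ["chroot", workdir, "apt-get", "update"],         # refresh package lists in the image
        ["chroot", workdir, "apt-get", "install", "-y", *packages],
        ["mksquashfs", workdir, out, "-noappend"],        # repack into a fresh image
    ]


if __name__ == "__main__":
    # Hypothetical file names and packages, printed for review instead of executed.
    for cmd in repack_commands("base.squashfs", "rootfs", "custom.squashfs", ["firefox"]):
        print(shlex.join(cmd))
```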

The one thing I couldn't make work was the OEM installation, where you put everything down onto the system unattended then put the user through the Ubuntu OOBE process. That would have made it far easier to pre-provision systems for users ahead of time; I did replace the default Plymouth splash screen logo with our company logo though, which was pretty cool.

I also set up network booting of macOS at another job, but that's sort of a very different process because it has all its own tooling, etc. for managing and Apple ended up moving from custom deployment images to static images and MDM for post-install configuration.

TL;DR network booting is pretty great actually; it's a very niche use case but if you're clever you can get a lot done. There's also lots of options for booting into a bootloader that can then present other options, allowing you to choose to netboot Ubuntu Desktop, Ubuntu Server, Windows, RHEL, Gentoo, a rescue image, or anything else you want.

nyrikki 4 hours ago | parent [-]

> The one thing I couldn't make work was the OEM installation, where you put everything down onto the system unattended then put the user through the Ubuntu OOBE process.

Did you try chain booting into iPXE and using SYSLINUX?

I used just nginx try, where I could place a preseed for a known provisioning event, otherwise providing various live and utility images if the MAC address file didn’t exist, for one-off or emergency repair.
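The lookup described here (serve a per-MAC file if one exists, otherwise fall back to a generic menu) is roughly the following. The directory layout and file names are assumptions for illustration, and in the setup above nginx did this, not Python.

```python
from pathlib import Path


def pick_boot_config(root: Path, mac: str, default_name: str = "menu.ipxe") -> Path:
    """Return the per-MAC config if it exists, else the shared default menu."""
    # Hypothetical layout: <root>/hosts/<mac-with-dashes>.cfg
    candidate = root / "hosts" / f"{mac.lower().replace(':', '-')}.cfg"
    return candidate if candidate.is_file() else root / default_name
```

Dropping a file into `hosts/` schedules a provisioning run for that machine; deleting it returns the machine to the generic live and utility menu.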

I could even serve up windows instances.

That is also very useful because occasionally you run into PXE firmware that is crippled. It may not apply now, but having only a tiny iPXE image on TFTP helps with speed and security.

I would bet almost all vendors just use iPXE anyway, and at least you used to be able to replace the firmware on Intel cards with it.

zorlack 8 hours ago | parent | prev | next [-]

The fun thing about learning to boot from PXE, is that you have to learn it every time you onboard a new type of hardware... or a new VM hypervisor... or new NIC firmware... or new BIOS firmware.

God help you if you actually want to install an operating system.

PXE is such a vital capability for working with on-prem servers. But it's ten different things which all have to play nicely together. Every time I build a PXE system I feel like I'm reinventing the universe in my tiny subnet.

hardwaresofton 4 hours ago | parent | next [-]

Agreed, PXE seems ideal for provisioning things, but it's just too hard to use, especially when you're not on a network you fully control.

I just want to start the computer, and have it download an immutable OS image from somewhere I decide (and supply a checksum for, etc). I don't want to set up TFTP or any of this other stuff. It feels like I should be able to just specify an IP (let's say) a checksum (maybe supply that information to the NIC directly somehow), and be off to the races after a reboot.
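The verification half of that wish is the easy part. A minimal sketch, assuming the image has already been downloaded to a local path and the expected SHA-256 digest was supplied out of band:

```python
import hashlib


def sha256_matches(path: str, expected_hex: str, chunk_size: int = 1 << 20) -> bool:
    """Hash the file in chunks and compare against the expected hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large OS images do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.strip().lower()
```

The hard part is the rest: getting firmware to fetch from an arbitrary address and refuse to boot on a mismatch.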

convolvatron an hour ago | parent [-]

replace the PXE stack with an OS installer written in UEFI. This bootloader can be installed through a guest running on the host in the EFI partition, or possibly through PXE or direct UEFI http load.

this allows you to intermediate the boot process without coordinating with the administrative owner of the DHCP server, and is actually less janky than PXE

legooolas 6 hours ago | parent | prev | next [-]

I've not found this at all -- PXE "just works" on legacy boot or UEFI for me. I've used it for years to install hosts via Foreman (https://theforeman.org/), as well as for personal stuff on my home network, and it's so much better than getting people to use USB sticks or whatever else!

generalizations 5 hours ago | parent | prev | next [-]

I’m confused, are you talking about getting PXE enabled in the hardware, or customizing something about your PXE software for the new hardware?

zorlack 4 hours ago | parent | next [-]

There's a lot of nonsense at every level. Especially when dealing with heterogeneous infrastructure.

Some NICs support http. Some NICs support tftp. Some NICs have enough memory for a big iPXE, other NICs don't. Some BMC systems make next-boot-to-lan easy, but not all.

We almost always use iPXE in order to normalize our pxe environment before OS kickstart. There's a lot to it and quite a lot of little things that can go wrong. Oh, and every bit of it becomes critical infra.
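The usual way to do that normalization is chainloading: firmware PXE gets handed an iPXE binary, and once iPXE is running it asks again and gets a script URL. A sketch of the decision, assuming the DHCP server can see the client's user class (iPXE identifies itself as "iPXE"). `undionly.kpxe` and `ipxe.efi` are the standard iPXE build names; the script URL is made up.

```python
from typing import Optional

# Hypothetical URL of the iPXE script served over HTTP.
SCRIPT_URL = "http://boot.example.internal/boot.ipxe"


def next_boot_target(user_class: Optional[str], is_uefi: bool) -> str:
    """Pick what to hand a netbooting client on this DHCP round."""
    if user_class == "iPXE":
        # Second round: iPXE is already running, so give it the real script.
        return SCRIPT_URL
    # First round: vendor PXE firmware, so hand it iPXE over TFTP.
    return "ipxe.efi" if is_uefi else "undionly.kpxe"
```

Without the user-class check, iPXE would be told to load iPXE again and loop forever.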

generalizations 4 hours ago | parent [-]

Ok, that makes more sense. I'm used to iPXE, and I guess that quick bootstrap from PXE->iPXE bypasses a lot of the nonstandard weirdness.

kasabali 5 hours ago | parent | prev [-]

All of 'em.

webdevver 8 hours ago | parent | prev | next [-]

we need to go /stalinmode/ on the whole bootup and initialization industry subsector. it should be required by law for that stuff to be open source and documented.

"but muh competitive advantage??"

it's literally a for loop that reads sectors from disk/network into memory and jumps to the start address.

if a local build of the (vendor provided source code) firmware doesn't match the checksum of the build that's flashed on the actual mobo, you get sent to a cobalt mine.

toast0 4 hours ago | parent | next [-]

Boot by committee (UEFI) doesn't seem much better than boot by fiat (BIOS). For everything nice it gives you, you lose something nice that BIOS gave you ... or you have something nice that you lose when you exit boot services. Or there's an extension for something nice that isn't usable on mainstream hardware.

UEFI gives you nicer video modes, but not a text mode after boot services.

UEFI has an extension for booting images from the network, but afaik, it's impossible to use, and there's no reasonable way to boot from a disk image; working UEFI network boot has to pull pieces out of the filesystem and present them separately; as opposed to MEMDISK which makes the image available as a BIOS disk and the image is labeled so that once the OS is loaded, the image can be used without BIOS hooks. If this is possible on UEFI generally, it isn't widely distributed knowledge. Something that will work on any UEFI system that makes it to iPXE, subject to changes to the OS in the image (which is reasonable... MEMDISK needs changes too, unless the OS runs all disk I/O through BIOS APIs)

pjc50 7 hours ago | parent | prev [-]

You're getting downvotes for being hyperbolic about it, but boot integrity is really both a consumer safety and a national security issue.

happyPersonR 5 hours ago | parent | prev [-]

Yeah in order to automate, you’ve gotta know something about what you’re automating. PXE is no different.

starkparker 6 hours ago | parent | prev | next [-]

I've used PXE (not even iPXE, just DHCP/TFTP without HTTP) mainly in environments where a LAN client-server game would need to be launched on many systems at once. Nothing quite like rolling out a hand-tailored distro for a single game to 16 computers and seeing them all boot and load straight into the game, one after the other, entirely unattended, from one broadcast boot-over-Ethernet trigger.
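If the "boot-over-Ethernet trigger" was a Wake-on-LAN magic packet (an assumption, since the comment does not name the mechanism), building one is short: six 0xFF bytes followed by the target MAC repeated 16 times. The sketch only builds the payload; sending it would typically be a UDP broadcast, commonly to port 9.

```python
def magic_packet(mac: str) -> bytes:
    """Build a Wake-on-LAN magic packet payload for the given MAC address."""
    raw = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(raw) != 6:
        raise ValueError(f"expected a 6-byte MAC address, got {mac!r}")
    # 6 bytes of 0xFF, then the MAC 16 times: 6 + 16 * 6 = 102 bytes.
    return b"\xff" * 6 + raw * 16
```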

I think at one point we were even using distcc to use the clients to speed up rebuilds while iterating on the game. I should revisit that with iPXE and icecream.

bradfa 6 hours ago | parent | prev | next [-]

PXE is awesome, especially if you combine it with systemd's UKI mechanism and its EFI stub. You can load a single file via TFTP or HTTP(S) and boot into a read-only (or ramdisk-only) full Linux system. Most off the shelf distributions can be made to work in this way, with a small bit of effort. A very usable Debian system is a few hundred MB.

You can extend this with secure boot (using your own keys) to sign the entire UKI file, so your firmware will authenticate the full "disk" image that it boots into.

anonymousiam 5 hours ago | parent | prev | next [-]

Having done lots of network booting over the years, here are a few of my lessons learned:

PXE is a big improvement over the boot EPROMs that we needed to install on our NICs back in the day. Those would get an address via DHCP and then TFTP the boot image, and boot it.

I've had some trouble with PXE boot that's been caused by STP. If your PXE boot server has, or is behind a bridge with STP turned on, it can prevent the client from booting. I think this has something to do with the STP "learning state", but turning off STP on the bridge can solve the problem, as long as you're sure that you will not be creating any network loops on the affected interfaces.

There's also a new "https boot", which is supposed to be a PXE replacement, but TLS certs have time validity windows, and some clients may not have an RTC, or might have a dead CMOS battery, and those might not boot if the date is wrong.
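The clock problem is easy to see with a toy check. Certificate validation compares the client's idea of "now" against the certificate's notBefore/notAfter window, so a clock that has reset to some default date falls outside it. The dates below are invented, and the reset date is an assumption about what a dead battery might leave behind.

```python
from datetime import datetime, timezone


def clock_within_validity(now: datetime, not_before: datetime, not_after: datetime) -> bool:
    """True if the client's clock falls inside the certificate validity window."""
    return not_before <= now <= not_after


if __name__ == "__main__":
    # Hypothetical certificate valid for the year 2025.
    not_before = datetime(2025, 1, 1, tzinfo=timezone.utc)
    not_after = datetime(2026, 1, 1, tzinfo=timezone.utc)
    # Hypothetical clock after a dead CMOS battery reset it.
    reset_clock = datetime(2000, 1, 1, tzinfo=timezone.utc)
    print(clock_within_validity(reset_clock, not_before, not_after))
```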

thyristan 2 hours ago | parent | next [-]

You don't need to turn off STP; usually it's enough to set the forward delay to a very small value ("port fast" in Cisco commands). If there is a loop, the port will usually still detect it; at most you get a handful of duplicated packets.

And all the "http boot" firmware I've seen either always ignores certificate errors or doesn't do TLS anyways.
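On the STP point, the arithmetic behind the boot failures: under classic 802.1D spanning tree, a port that comes up spends one forward-delay interval listening and another learning before it forwards traffic. With the default 15-second forward delay, that is 30 seconds during which the client's DHCP requests go nowhere. A small sketch; rapid spanning tree variants behave differently, and this ignores them.

```python
def seconds_until_forwarding(forward_delay_s: float = 15.0, portfast: bool = False) -> float:
    """Time a freshly linked port waits before forwarding under classic STP."""
    if portfast:
        # Edge ports skip the listening and learning states entirely.
        return 0.0
    # Listening + learning, each lasting one forward-delay interval.
    return 2 * forward_delay_s
```

A PXE client that gives up on DHCP in less time than that will fail even though nothing is misconfigured on the server.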

tstack 4 hours ago | parent | prev [-]

> There's also a new "https boot", which is supposed to be a PXE replacement, but TLS certs have time validity windows, and some clients may not have an RTC, or might have a dead CMOS battery, and those might not boot if the date is wrong.

I think the lack of entropy right after boot can also be a problem for the RNG. But, maybe that has been solved in more modern hardware.

pzmarzly 8 hours ago | parent | prev | next [-]

TFTP is crazy slow, even with RFC 7440 (the windowsize option), but the payloads are usually small so few people care.

Thankfully modern BIOSes tend to implement an HTTP boot option, where you can point to any HTTP or HTTPS URL (as long as the URL ends with ".efi", which is a pretty dumb limitation if you ask me).

legooolas 6 hours ago | parent | next [-]

You can also do things like boot with PXE (Legacy or UEFI PXE boot) to get a small image like iPXE, and then have iPXE do the http boot part. This means that you have an extra shim but you can pull larger images than TFTP is any good for.

TFTP is also UDP and I don't think it is pipelined, so it's all req->ack->req->ack, so any additional latency hits it hard too.
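The latency sensitivity falls straight out of the lock-step design: with one block in flight per round trip, throughput is capped at roughly block size divided by round-trip time, regardless of link speed. A back-of-the-envelope sketch; the RTT figures are illustrative, and the `window` parameter models sending several blocks per acknowledgement.

```python
def tftp_max_throughput(block_size: int, rtt_s: float, window: int = 1) -> float:
    """Upper bound in bytes/second for a lock-step transfer.

    Ignores link bandwidth and server processing time, so real numbers are lower.
    """
    return block_size * window / rtt_s


if __name__ == "__main__":
    # Default 512-byte blocks, one block per round trip.
    for rtt_ms in (0.2, 1.0, 10.0):
        rate = tftp_max_throughput(512, rtt_ms / 1000)
        print(f"RTT {rtt_ms} ms: at most {rate / 1e6:.2f} MB/s")
```

Larger blocks or a bigger window raise the ceiling proportionally, which is why a small iPXE image over TFTP followed by HTTP for everything else works well.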

pjc50 7 hours ago | parent | prev [-]

They let you boot off HTTPS? That explains why corp IT pushed out a Dell BIOS vulnerability update today relating to OpenSSL in my BIOS.

kotaKat 6 hours ago | parent [-]

Yup! You can point your BIOS at a .efi and it’ll Just Boot It. We’ve even got Wi-Fi support in some of these as well for a full wireless deploy…

https://www.dell.com/support/manuals/en-us/bios-connect/http...

ronniefalcon 6 hours ago | parent | prev | next [-]

Lots of fun with this and lots of possibilities.

Had great experience using PXE to boot HPC farms, mounting the OS from a NAS and using only a local disk in the machine for tmp and other writable locations. I am not sure how 'diskless' Linux works these days on Rocky flavours but it was solid in CentOS 5 through 7.

happyPersonR 5 hours ago | parent | prev | next [-]

I’m glad a lot of server stuff has Redfish. But something better still needs to be there for non-server stuff for sure. Raspberry Pi style bootloaders would be amazing, ones we could configure to use a certain image before powering on for first boot would be even more amazinger.

latchkey 5 hours ago | parent | prev | next [-]

I figured out how to PXE boot 20,000 PS5 APU blades (BC-250) during covid when I couldn't even get to the actual hardware. Great fun.

Scott-David 3 hours ago | parent | prev [-]

"Clear steps and tips, great for anyone learning PXE booting."

"Excellent explanation, makes PXE boot much less intimidating."

"A practical guide for mastering PXE booting efficiently."