1 # This file is part of Ian Kelling's automated-distro-installer
2 # Copyright (C) 2024 Ian Kelling
3
4 # This program is free software; you can redistribute it and/or
5 # modify it under the terms of the GNU General Public License
6 # as published by the Free Software Foundation; either version 2
7 # of the License, or (at your option) any later version.
8
9 # This program is distributed in the hope that it will be useful,
10 # but WITHOUT ANY WARRANTY; without even the implied warranty of
11 # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
12 # GNU General Public License for more details.
13
14 # You should have received a copy of the GNU General Public License
15 # along with this program; if not, write to the Free Software
16 # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
17
18 PXE install w multi-boot, btrfs & Libreboot support
19
20 Some things are specific to my home network and use files with secrets
21 that are not in this repo. I use this for bare metal and vms, and two
22 of the scripts can run post boot, so I also use them on vps distributed
23 images.
24
25 Features people may find useful: installs encrypted trisquel, debian,
26 ubuntu, arch, and parabola (archlike install is likely broken, I've only
27 done pxe boots recently), in a multi-boot setup using multiple
28 subvolumes of a single btrfs filesystem. Utilizes multiple disks, with
29 scripts to automatically decrypt on intentional reboots, but not after
30 shutdown or power loss.
31
32 Normal install mode for fai is using pxe, but on a libreboot system,
33 there is no pxe. The pxe in a normal computer is nonfree
34 firmware. Alternatives to normal pxe that I've tried:
35
36 * libreboot + seabios + ipxe
37
38 * Use a live cd to call pxe-kexec, this is described later in this file.
39
40 * Use the fai autodiscover iso. This is more automated, so nicer.
41
42 * Use an install method above to setup a gnu/linux disk partition that
43 coordinates with libreboot grub to act like a pxe boot using
44 kexec. The boot process takes a bit longer than normal pxe. This is
45 the bootstrap partition in my scripts.
46
47 Things I haven't tried:
48
49 * The bios chip has enough room for an initrd. This could be setup to
50 work like the partition I use to kexec, but it would be faster, and
51 not require installing to disk.
52
53 The partitioning and filesystem script is at
54 fai/config/hooks/partition.DEFAULT. Disks are grouped as ssd or hdd and
55 raided in raid 1 or raid 0 per configuration. The base partitions are
56 divided into boot, swap, and root, (only boot is unencrypted). There are
57 scripts to resize those partitions post-provision and while the system
58 is running.
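
As a rough illustration of the ssd/hdd grouping, here is a minimal sketch
that classifies a disk from the kernel's rotational flag. The function name
disk_kind and its second argument are made up for this example, and
partition.DEFAULT may well decide the grouping differently.

```shell
# Classify a block device as ssd or hdd from the kernel's rotational flag.
# $1: device name such as sda
# $2: sysfs root (hypothetical parameter, defaults to /sys; lets you point
#     the function at a fake tree for testing)
disk_kind() {
  dev=$1
  sysroot=${2:-/sys}
  flag_file=$sysroot/block/$dev/queue/rotational
  # No such device, or no rotational attribute: report failure.
  [ -r "$flag_file" ] || return 1
  if [ "$(cat "$flag_file")" = 0 ]; then
    echo ssd
  else
    echo hdd
  fi
}

# Example on a live system (not run here):
#   disk_kind sda
```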
59
60 People who use fai may find these things as useful examples: it uses
61 dnsmasq (on an openwrt machine) for dhcp instead of the isc
62 dhcp. fai-wrapper is a small script to use basic fai classes outside of
63 fai. It does not use the fai partitioning tool, but the script is
64 inspired from it and works outside of fai. It supports running a fai
65 server on debian within android via Maru.
66
67 It also automates configuration of an openwrt router after manual
68 initial installation.
69
70 After provisioning is done, I sync files using btrfs, or unison for
71 vps, then automate further setup using a different set of scripts,
72 https://iankelling.org/git/?p=distro-setup;a=tree.
73
74 My network is a wndr3700v2 router with openwrt on it and a few pcs/laptops.
75
76 Since fai requires a debian server as the fai server, there are also
77 scripts to automate a debian install using pxe and preseeding, which can
78 be done from any distro.
79
80 Some of the scripts have dependencies for some simple obvious utility
81 scripts from https://iankelling.org/git, and of course there are some
82 hostnames that are specific to my network.
83
84
85 # Per-host/install configuration
86
87 Before doing a fai install, you will need to populate a class file. I
88 use one called 51-multi-boot, which you can see an example of in
89 fai/config/class/50-host-classes.
90
91 Before doing a fai install, you will need to populate /q/root/luks and
92 /q/root/shadow, see their references. You might also want to copy
93 existing /etc/ssh/*host* to
94 /p/c/machine_specific/HOST/filesystem/etc/ssh
95
96 host-* luks keyfiles generated like:
97 h=demohost; head -c 2048 /dev/urandom | od | se dd of=/q/root/luks/host-$h
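
The one-liner above relies on a helper called se that is not defined in this
README. A minimal sketch of the same idea without that helper might look like
the following. The function name make_luks_keyfile and the overwrite check
are additions for illustration, and the target directory must be writable by
whoever runs it.

```shell
# Write a random keyfile for host $1 into directory $2 and print its path.
# Like the one-liner above, the random bytes are passed through od, so the
# keyfile is text rather than raw binary.
make_luks_keyfile() {
  host=$1
  dir=$2
  keyfile=$dir/host-$host
  # Assumption for this sketch: never clobber an existing key, since disks
  # already encrypted with it would no longer open with the new one.
  if [ -e "$keyfile" ]; then
    echo "refusing to overwrite $keyfile" >&2
    return 1
  fi
  # The subshell keeps the restrictive umask from leaking into the caller.
  ( umask 077; head -c 2048 /dev/urandom | od > "$keyfile" ) || return 1
  echo "$keyfile"
}

# Example (not run here):
#   make_luks_keyfile demohost /q/root/luks
```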
98
99 Configuration of which luks key to use is in
100 fai/config/hooks/partition.DEFAULT
101
102 Configuration of which (if any) shadow file to use is in
103 fai/config/distro-install-common/end
104 and which shadow file / luks file(s) to copy into the new machine depends
105 on fai-redep arguments.
106
107 Also, setup dns in bind and wrt-setup-local.
108
109 After install, run btrbk to set up data, then distro-begin && distro-end.
110 See notes in distro-begin for other configuration.
111
112 # Scripts (meant to be used directly):
113
114
115 # Setup the environment for the install
116
117 # create tiny autodiscover cd
118 # todo: with fai-revm at least, this complains about missing vmlinuz. need to fix this.
119 fai-redep && sudo fai-cd -g $PWD/grub.cfg.autodiscover -f -A $BASEFILE_DIR/autodiscover.iso
120 # create normal fai cd (replace TARGET_HOSTNAME)
121 fai-redep -t TARGET_HOSTNAME && sudo fai-cd -M -g $PWD/grub.cfg.netinst-noreboot -f $BASEFILE_DIR/netinst.iso
122 # note, may need to set hostname, depending on config,
123 # and some other things for environment not on your lan
124 # for example see fai/config/class/LINODE.var. See linode notes below.
125
126 mymk-basefile # Create basefiles for various distros
127 archlike-pxe # Setup pxe boot server from an archlike base image
128 fai-redep # Deploy fai configuration to host "faiserver"
129 faiserver-uninstall # uninstall fai-server
130 faiserver-setup # install fai-server on the current machine
131 myfai-chboot # setup fai tftp and nfs. useful for doing pxe-kexec or booting from a fai-cd.
132 pxe-server # disable/enable pxe dhcp, tftp, and nfs. calls myfai-chboot
133 wrt-setup # setup my router in general: dhcp, dns, etc.
134
135
136 # Scripts to do a distro install
137
138 faiserver-revm # using pxe & preseed, create a vm which is a fai server
139 dsfull # install & post-install a new fai distro
140 arch-init-remote # install arch after it's been booted into its setup env
141 live-kexec # Kexec this or a remote machine using host faiserver. also
142 useful to run as curl live-kexec|bash
143
144
145 # Test scripts
146
147 arch-revm # test arch install on a fresh vm
148 fai-revm # test fai install on a fresh vm
149
150
151 # Scripts to call after a distro install for various reasons
152
153 chboot # Set grub to boot into a different distro (installed earlier)
154 install-chboot # reinstall chboot to /boot subvols, for chboot updates.
155 eboot # reboot without automatic disk decryption
156 fai-wrapper # use fai classes outside of fai. sourced, not called.
157 faiserver-disable # Disable the fai nfs server exports
158 fresize # resize swap or boot partitions in a host
159
160
161 # NAT/forward/vpn tftp
162
163 I tried to get this working, but failed.
164
165 tftp server in theory can be forwarded over a vpn, eg on a wireguard tunnel.
166
167 However, I found that when actually pxe booting, it wouldn't work, only
168 the 1st filename would be requested, eg, in the logs:
169
170 Jun 20 23:51:02 kd in.tftpd[4021350]: RRQ from 10.2.0.12 filename pxelinux.0
171
172
173 To get that far, NATing tftp requires some special attention in iptables, like so:
174
175 https://unix.stackexchange.com/questions/579508/iptables-rules-to-forward-tftp-via-nat
176 iptables -t raw -A PREROUTING -p udp --dport 69 -s 209.51.188.0/24 -j CT --helper tftp
177 modprobe nf_nat_tftp
178
179 to test tftp from a client machine:
180
181 tftp SERVER_IP -c get pxelinux.0
182 rm -fv pxelinux.0
183
184
185 # Common problems
186
187 ## kernel mismatch very early error, no remote logs:
188
189 ERROR: the running kernel does not match the kernel modules inside the nfsroot.
190 ERROR: Kernel modules directory /lib/modules/5.10.0-8-amd not available. Only found /lib/modules/5.10.0-15-amd64
191
192 solution: if running from fai-cd, recreate autodiscover cd as noted above in setup.
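
To catch this before booting a target, one option is to check that the
nfsroot has a modules directory for the kernel the boot media will run. This
is only a sketch: nfsroot_has_kernel is a made-up name, and the nfsroot path
in the example is taken from the nfs logs later in this file.

```shell
# Succeed if the nfsroot at $1 contains modules for kernel release $2
# (default: the kernel running on this machine). On failure, list which
# releases are present, similar to the fai error message.
nfsroot_has_kernel() {
  nfsroot=$1
  kver=${2:-$(uname -r)}
  if [ -d "$nfsroot/lib/modules/$kver" ]; then
    return 0
  fi
  echo "no modules for $kver under $nfsroot/lib/modules; found:" >&2
  ls "$nfsroot/lib/modules" >&2 || true
  return 1
}

# Example (not run here):
#   nfsroot_has_kernel /srv/fai/nfsroot 5.10.0-15-amd64
```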
193
194 ## Weird package dependency errors
195
196 for example: in fai.log, within instsoft.DEBIAN
197 ```
198 The following packages have unmet dependencies:
199 libc6 : Breaks: locales (< 2.36) but 2.35-0ubuntu3.7+11.0trisquel1 is to be installed
200 ```
201
202 In this case, it was because the basefile was missing, and so instead
203 fai decided to use the wrong basefile.
204
205 for example: in fai.log, within instsoft.DEBIAN
206
207 ```
208 ftar: No matching class found in /var/lib/fai/config/basefiles//
209 ftar: extracting /var/tmp/base.tar.zst to /target/
210 ```
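
A quick way to check a saved fai.log for this fallback is to grep for the
ftar message. A minimal sketch, where basefile_fell_back is a name invented
for this example:

```shell
# Succeed if the fai.log at $1 shows ftar finding no basefile that matches
# the host's classes, which is the message quoted above.
basefile_fell_back() {
  grep -q 'ftar: No matching class found in .*basefiles' "$1"
}

# Example (not run here):
#   basefile_fell_back fai.log && echo "basefile missing, check mymk-basefile output"
```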
211
212 # What good logs look like:
213
214 logging nfs traffic from server
215
216 s rpcdebug -m nfsd -s all
217
218
219 normal nfs mount & umount logs look like:
220
221 journalctl -ef | gr nfs
222
223 Jun 20 22:15:36 kd rpc.mountd[2025725]: authenticated mount request from 10.32.2.1:865 for /srv/fai/nfsroot (/srv/fai/nfsroot)
224 Jun 20 22:15:36 kd kernel: nfsd: exp_rootfh(/srv/fai/nfsroot [00000000e8c53e54] *:dm-0/5521225)
225 Jun 20 22:15:36 kd kernel: nfsd: fh_compose(exp 00:1b/5521225 fai/nfsroot, ino=5521225)
226 Jun 20 22:15:36 kd kernel: nfsd: FSINFO(3) 28: 00070001 00543f49 00000000 d185f7b0 58d1a3c6 00000000
227 Jun 20 22:15:36 kd kernel: nfsd: fh_verify(28: 00070001 00543f49 00000000 d185f7b0 58d1a3c6 00000000)
228 Jun 20 22:15:36 kd kernel: nfsd: PATHCONF(3) 28: 00070001 00543f49 00000000 d185f7b0 58d1a3c6 00000000
229 Jun 20 22:15:36 kd kernel: nfsd: fh_verify(28: 00070001 00543f49 00000000 d185f7b0 58d1a3c6 00000000)
230 Jun 20 22:15:36 kd kernel: nfsd: GETATTR(3) 28: 00070001 00543f49 00000000 d185f7b0 58d1a3c6 00000000
231 Jun 20 22:15:36 kd kernel: nfsd: fh_verify(28: 00070001 00543f49 00000000 d185f7b0 58d1a3c6 00000000)
232 Jun 20 22:15:36 kd kernel: nfsd: FSINFO(3) 28: 00070001 00543f49 00000000 d185f7b0 58d1a3c6 00000000
233 Jun 20 22:15:36 kd kernel: nfsd: fh_verify(28: 00070001 00543f49 00000000 d185f7b0 58d1a3c6 00000000)
234 Jun 20 22:15:36 kd kernel: nfsd: GETATTR(3) 28: 00070001 00543f49 00000000 d185f7b0 58d1a3c6 00000000
235 Jun 20 22:15:36 kd kernel: nfsd: fh_verify(28: 00070001 00543f49 00000000 d185f7b0 58d1a3c6 00000000)
236 Jun 20 22:15:45 kd rpc.mountd[2025725]: authenticated unmount request from 10.32.2.1:986 for /srv/fai/nfsroot (/srv/fai/nfsroot)
237
238 normal tftpd logs from:
239
240 after setting -vv in TFTP_OPTIONS in /etc/default/tftpd-hpa
241
242 journalctl -u tftpd-hpa
243
244 Jun 20 23:51:02 kd in.tftpd[4021350]: RRQ from 10.2.0.12 filename pxelinux.0
245 Jun 20 23:51:02 kd in.tftpd[4021351]: RRQ from 10.2.0.12 filename ldlinux.c32
246 Jun 20 23:51:02 kd in.tftpd[4021352]: RRQ from 10.2.0.12 filename pxelinux.cfg/a913a477-fca6-234d-a928-6bb011decd05
247 Jun 20 23:51:02 kd in.tftpd[4021352]: sending NAK (1, File not found) to 10.2.0.12
248 Jun 20 23:51:02 kd in.tftpd[4021353]: RRQ from 10.2.0.12 filename pxelinux.cfg/01-52-54-00-9c-ef-ad
249 Jun 20 23:51:02 kd in.tftpd[4021353]: sending NAK (1, File not found) to 10.2.0.12
250 Jun 20 23:51:02 kd in.tftpd[4021354]: RRQ from 10.2.0.12 filename pxelinux.cfg/0A02000C
251 Jun 20 23:51:02 kd in.tftpd[4021355]: RRQ from 10.2.0.12 filename vmlinuz-5.10.0-15-amd64
252 Jun 20 23:51:03 kd in.tftpd[4021356]: RRQ from 10.2.0.12 filename initrd.img-5.10.0-15-amd64
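
In the log above, pxelinux asks for a config file named after the client
UUID, then the MAC address with an 01- prefix, then the IP address in
uppercase hex, and only the last request gets no NAK. When a boot stalls
after pxelinux.0, it helps to know which names to look for under
pxelinux.cfg. A small sketch that computes the MAC and IP names (the
function names are made up for this example):

```shell
# Print the pxelinux.cfg filename derived from a MAC address:
# "01-" (ethernet hardware type) plus the lower-cased MAC with dashes.
pxe_mac_filename() {
  printf '01-%s\n' "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr ':' '-')"
}

# Print the pxelinux.cfg filename derived from a dotted IPv4 address:
# each octet as two uppercase hex digits. Runs in a subshell so the IFS
# change and positional parameters do not leak into the caller.
pxe_ip_filename() (
  IFS=.
  set -- $1
  printf '%02X%02X%02X%02X\n' "$1" "$2" "$3" "$4"
)

# Values from the log above:
#   pxe_mac_filename 52:54:00:9c:ef:ad   prints 01-52-54-00-9c-ef-ad
#   pxe_ip_filename 10.2.0.12            prints 0A02000C
```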
253
254
255
256 # Replacing a raid 10 disk
257
258 pxe-server -S HOST fai
259
260 # btrfs replace or delete. prefer replace. to setup partitions on replacement drive:
261 scp fai-wrapper HOST:
262 ssh root@HOST
263 . fai-wrapper
264 export SPECIAL_DISK=/dev/REPLACEMENT_DEV
265 /var/lib/fai/config/hooks/partition.DEFAULT
266
267
268 ssh root@HOST
269 for x in /target/* /target; do umount $x; done
270 cat >p
271 PASSWORD HERE(ctrl-d ctrl-d)
272 cd /dev/disk/by-id/
273 for d in ata*part1; do cryptsetup luksOpen -d /root/p $d crypt_dev_$d; done
274 x=(/dev/mapper/*part1); mount -o subvol=root_trisquelflidas $x /mnt
275 # btrfs fi show /mnt
276 # btrfs replace start -f /dev/mapper/OLD_DEV /dev/mapper/NEW_DEV /mnt
277 # btrfs replace status /mnt
278 # nohup btrfs dev delete /dev/sde1 /mnt
279 mount -o subvol=boot_trisquelflidas /dev/sda3 /mnt/boot
280 # also replace or delete disk for boot
281 for x in dev proc sys; do mount -o bind /$x /mnt/$x; done
282 chroot /mnt /bin/bash
283 # replace disk in fstab
284 # replace disk in /etc/crypttab
285 update-grub
286 update-initramfs -u
287 mount /a
288 /a/exe/keyscript-on
289 exit
290 reboot
291
292
293 # Expected output in fai logs
294
295
296 ## On focal:
297
298 fai.log:updatebase.UBUNTU FAILED with exit code 1.
299 the real error is dpkg-reconfigure locales, seems to be related
300 to a workaround for < 20.04, relevant comment:
301 # in case the locales are already included inside the base file (Ubuntu)
302 in config/hooks/instsoft.DEBIAN
303
304
305 ## For flidas,
306
307 when installing systemd, this error happens, and it's
308 a superfluous upstream bug based on reading the post install script:
309
310 addgroup: The group `systemd-journal' already exists as a system group. Exiting.
311 Operation failed: No such file or directory
312
313 ## On nabia/newer,
314
315 python is removed, now it's python3,
316 and it's easier to just let the package get removed than
317 do host class package config.
318 fai.log:WARNING: These unknown packages are removed from the installation list: python python-minimal
319
320 Similar to python, linux-image-amd64 is the debian package name
321 for the kernel, linux-image-generic is for ubuntu, but the
322 DEBIAN class is defined on ubuntu and it's easier to just let
323 the package get removed with this warning:
324 fai.log:WARNING: These unknown packages are removed from the installation list: linux-image-amd64
325 Also, cryptsetup-initramfs is new to buster/nabia, it gets removed
326 on earlier versions.
327
328 ## parted error
329 fai.log:Error: /dev/vda: unrecognised disk label
330 This is from parted -m $d unit MiB print.
331 It happens when there are no partitions yet.
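
When scanning logs after an install, it can help to hide the messages listed
in this section so anything new stands out. The filter below is a sketch, not
something the scripts in this repo do. The name unexpected_fai_errors is made
up, and the flidas "Operation failed" line is deliberately not filtered
because the message is too generic to hide safely.

```shell
# Read log lines on stdin and print those that look like problems, minus the
# expected ones documented above. Exit status is that of the last grep:
# 0 if something unexpected was printed, 1 if nothing was.
unexpected_fai_errors() {
  grep -Ei 'error|failed|warning' |
    grep -Ev \
      -e 'updatebase\.UBUNTU FAILED with exit code 1' \
      -e 'These unknown packages are removed from the installation list: (python python-minimal|linux-image-amd64)' \
      -e 'Error: /dev/[a-z0-9]+: unrecognised disk label'
}

# Example (not run here), from the directory holding the saved logs:
#   grep -H '' fai.log | unexpected_fai_errors
```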
332
333
334 ######## notes on creating a lan with just 2 computers ########
335
336
337 ## below assumes eth0 is the ethernet device used to connect to the target computer.
338
339
340 # this is not strictly needed. I had my connection die at some point,
341 # and I suspected this might help.
342 # based on
343 # https://support.qacafe.com/knowledge-base/how-do-i-prevent-network-manager-from-controlling-an-interface/
344 cat > /etc/NetworkManager/conf.d/99-fai-tmp.conf <<'EOF'
345 [main]
346 plugins=keyfile
347
348 [keyfile]
349 unmanaged-devices=interface-name:eth0
350 EOF
351 ser restart NetworkManager
352
353
354 cat >> /etc/network/interfaces <<'EOF'
355 iface eth0 inet static
356 address 10.0.44.1/24
357 EOF
358
359 ifup eth0
360
361 # note turn off fsf vpn, so route to coresite is the normal route.
362 echo 1 > /proc/sys/net/ipv4/ip_forward
363 m s iptables -t nat -A POSTROUTING -o $(ip -4 route get 8.8.8.8 | sed -nr 's,^.* dev\s+(\S+).*,\1,p') -j MASQUERADE
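
The sed expression in the iptables line pulls the outgoing interface name out
of the ip route output. The same extraction can be sketched as a standalone
filter, which is easier to try on its own before touching iptables. The name
route_dev is made up for this example.

```shell
# Print the interface name that follows the word "dev" in `ip route get`
# output read from stdin.
route_dev() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}

# Example on a live system (not run here):
#   ip -4 route get 8.8.8.8 | route_dev
```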
364
365
366 change /p/c/machine_specific/vps/bind-initial/db.b8.nz
367 faiserver 10.0.44.1
368 TARGET_HOSTNAME 10.0.44.2
369
370 apt install isc-dhcp-server
371
372 cat >> /etc/default/isc-dhcp-server <<'EOF'
373 INTERFACESv4="eth0"
374 EOF
375
376 edit ./dhcpd.conf to change mac address and target host name.
377
378 s cp /b/fai/dhcpd.conf /etc/dhcp/
379 ser restart isc-dhcp-server
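
The dhcpd.conf referred to above is not shown in this README. For a rough
idea of what such a file needs for this two-machine lan, here is a sketch
that prints one. Everything in it is an assumption based on the addresses
used in this section: the faiserver at 10.0.44.1 acting as gateway, dns and
tftp server, the target at 10.0.44.2, and pxelinux.0 as the boot file. The
function name gen_dhcpd_conf is made up.

```shell
# Print a minimal ISC dhcpd.conf for the two-computer lan described above.
# $1: mac address of the target, $2: target host name.
# All addresses are assumptions taken from this section, not from the
# author's real dhcpd.conf.
gen_dhcpd_conf() {
  mac=$1
  host=$2
  cat <<EOF
subnet 10.0.44.0 netmask 255.255.255.0 {
  option routers 10.0.44.1;
  # assumes dns is served from the faiserver
  option domain-name-servers 10.0.44.1;
  # tftp server and boot file for pxe clients
  next-server 10.0.44.1;
  filename "pxelinux.0";
}
host $host {
  hardware ethernet $mac;
  fixed-address 10.0.44.2;
}
EOF
}

# Example (not run here):
#   gen_dhcpd_conf 52:54:00:9c:ef:ad TARGET_HOSTNAME > dhcpd.conf
```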
380
381 edit /a/bin/fai/fai/config/class/51-multi-boot
382
383 pxe-server -d TARGET fai
384
385 Then do a pxe boot on the target host
386
387
388
389 ##### linode notes ######
390
391 * create 2 disks, installer (3000 mb, raw), boot (remaining, raw)
392 * create 2 profiles w direct boot, no helpers:
393 * installer (sda=boot, sdb=installer, boot dev=sdb)
394 * boot (sda=boot)
395 * Boot into rescue mode, ssh in with lish,
396 curl url_to_some_fai_cd_created_image | dd of=/dev/sda
397 poweroff
398 * boot into installer.
399 * Lish shows console, at the end of install, it gives prompt because
400 logs failed to save remotely, check the logs, then reboot into boot
401 profile if all is well. If that doesn't happen, turn off lassie in
402 settings.
403
404
405
406 ###### ubuntu notes ######
407
408 For someone who really needed ubuntu on host tp, otherwise they would
409 end up on a non-gnu os, and I didn't want to figure out how to get all
410 the default software installed, I did the following:
411
412 # On remote host:
413 # install etiona
414 cd /b/fai
415 # set 51-multi-boot to set classes outside of fai-wrapper conditional, including NOWIPE
416 . fai-wrapper
417 ./fai/config/hooks/partition.DEFAULT
418
419 # on remote host
420 # install ubuntu 20.04 using virt-install
421 sudo -i
422 virt-install --os-variant=ubuntu16.04 --cdrom ubuntu-20.04-desktop-amd64.iso --disk path=u2004.qcow2 -r 2048 --vcpus 1 -n u2004
423 qemu-img create -o preallocation=metadata -f qcow2 u2004.qcow2 15G
424 # alternatively, also tried a physical install, because I know the virtual install ends up
425 # with some different things, like some spice service. then pulled the data out with
426 rsync -ahSAX --numeric-ids --exclude=proc --exclude=sys --exclude=dev --exclude=tmp --exclude=run root@tp:/ .; mkdir proc sys dev tmp
427
428 modprobe nbd
429 qemu-nbd --connect=/dev/nbd0 u1804.qcow2 -f qcow2
430 qemu-nbd --connect=/dev/nbd0 u2004.qcow2 -f qcow2
431 mount /dev/nbd0p1 /mnt/1 # bionic
432 mount /dev/nbd0p5 /mnt/1 # focal
433 mount -o bind /mnt/root/root_ubuntubionic /mnt/2
434 mount -o bind /mnt/root/root_ubuntufocal /mnt/2
435 mkdir -p /mnt/2/boot
436 mount -o bind /mnt/boot/boot_ubuntubionic /mnt/2/boot
437 mount -o bind /mnt/boot/boot_ubuntufocal /mnt/2/boot
438 # S = sparse, A = acls, X = xattrs
439 rsync -ahSAX --numeric-ids /mnt/1/ /mnt/2
440
441 cd /mnt/2
442 cp /tmp/fai/crypttab etc
443 sed -i "s#/root/keyscript,#decrypt_keyctl,#" etc/crypttab
444 cp /tmp/fai/fstab etc
445 echo "tmpfs /tmp tmpfs nodev,nosuid,size=50%,mode=1777 0 0" >> etc/fstab
446 chrbind
447 chroot .
448 mv /etc/resolv.conf /etc/resolv.conf.old
449 echo nameserver 1.1.1.1 >/etc/resolv.conf
450 # install programs from /a/bin/fai/fai/config/package_config/STANDARD:
451 apt install -y openssh-client openssh-server cryptsetup keyutils btrfs-progs console-setup kbd pciutils usbutils unattended-upgrades initramfs-tools-core dropbear-initramfs
452 mv /etc/resolv.conf.old /etc/resolv.conf
453 exit
454 d=etc/initramfs-tools
455 mkdir -p $d/root/.ssh etc/dropbear-initramfs root/.ssh
456 chmod 700 $d/root $d/root/.ssh root/.ssh
457 cp -p /root/.ssh/authorized_keys $d/root/.ssh/authorized_keys
458 cp -p /root/.ssh/authorized_keys etc/dropbear-initramfs
459 cp -p /root/.ssh/authorized_keys root/.ssh/authorized_keys
460 chroot .
461 sed -ri 's/^ *GRUB_CMDLINE_LINUX_DEFAULT=.*/GRUB_CMDLINE_LINUX_DEFAULT="rd.luks.crypttab=no"/' /etc/default/grub
462 grub-install --no-floppy $(grub-probe -tdrive -d /dev/sda)
463 update-grub
464 grub-bios-setup -d /boot/grub/i386-pc -s /dev/sda
465 exit
466 umount proc
467 umount dev
468 umount sys
469 reboot
470
471 # for switching the boot to root2
472 zboot
473 # for switching back, efibootmgr, if there is a problem with the root filesystem detection,
474 # boot into the debian bootstrap distro, run partition.DEFAULT using comments for mktab arg.
475 # then manually run iboot and then reboot.
476
477
478 # pine rock64 notes
479 # the only useful image is ubuntu 18.04 ayafun or something.
480 # using emmc usb:
481 s mount /dev/sdb7 /mnt/1
482 s cp `which qemu-arm-static` /mnt/1/usr/bin
483 s chroot /mnt/1 qemu-arm-static /bin/bash
484 usermod --login iank --move-home --home /home/iank rock64
485 groupmod --new-name iank rock64
486 passwd iank
487 # boot it
488 s apt-get update
489 s apt dist-upgrade
490
491
492 ### How to merge upstream fai-config
493
494 git checkout upstream
495 cd path-to-fai-config
496 git pull --stat
497 # the following needs modification if there were deletions or renames
498 rsync --exclude /.git -rlpgoDcvi . /b/fai/fai/config/
499 cd /b/fai/fai/config/
500 # where XXXXX is the git commit hash
501 # note, several files which just had trailing space changes will get ignored.
502 git commit -am "update upstream to XXXXX"
503 git checkout master
504 git merge upstream
505 # fix conflicts
506 git commit
507
508
509 # TODO
510 Rename arch to archlike, and support both arch and parabola
511
512
513 # License
514
515 The license for the project is GPLv2 or later, mostly because fai is and
516 I periodically merge the upstream example config, which contains small
517 scripts. Also, there is a modified encrypt.upstream, which is from the
518 cryptsetup package in arch, which is under the same license.