X-Git-Url: https://iankelling.org/git/?a=blobdiff_plain;f=README;h=0dc966b444daaec21b33185ff1b45979a82626b7;hb=137ffae7de84a51c4b438ccf2fb50f5571f522a6;hp=5f780755ffa4702790b6bcda31e2aacff1fa8253;hpb=fe81034ee9664d8e131bac218b40d99a58a31649;p=automated-distro-installer

diff --git a/README b/README
index 5f78075..0dc966b 100644
--- a/README
+++ b/README
@@ -76,7 +76,7 @@ fai/config/class/50-host-classes.
 Before doing a fai install, you will need to populate /q/root/luks and
 /q/root/shadow, see their references. You might also want to copy
 existing /etc/ssh/*host* to
-/p/c/machine_specific/HOST/filesystem/etc/ssh.
+/p/c/machine_specific/HOST/filesystem/etc/ssh

 host-* luks keyfiles generated like:
 head -c 2048 /dev/urandom | od | s dd of=/q/root/luks/host-demohost
@@ -138,10 +138,84 @@ faiserver-disable # Disable the fai nfs server exports
 fresize # resize swap or boot partitions in a host


-# Replacing a raid 10 disk
+# NAT/forward/vpn tftp
+
+I tried to get this working, but failed.
+
+tftp server in theory can be forwarded over a vpn, eg on a wireguard tunnel.
+
+However, I found that when actually pxe booting, it wouldn't work, only
+the 1st filename would be requested, eg, in the logs:
+
+Jun 20 23:51:02 kd in.tftpd[4021350]: RRQ from 10.2.0.12 filename pxelinux.0
+
+
+To get that far, nating tftp requires some special attention in iptables, like so:
+
+https://unix.stackexchange.com/questions/579508/iptables-rules-to-forward-tftp-via-nat
+iptables -t raw -A PREROUTING -p udp --dport 69 -s 209.51.188.0/24 -j CT --helper tftp
+modprobe nf_nat_tftp
+
+to test tftp from a client machine:
+
+tftp SERVER_IP -c get pxelinux.0
+rm -fv pxelinux.0
+
+
+# Common problems
+
+## kernel mismatch very early error, no remote logs:
+
+ERROR: the running kernel does not match the kernel modules inside the nfsroot.
+ERROR: Kernel modules directory /lib/modules/5.10.0-8-amd not available. Only found /lib/modules/5.10.0-15-amd64
+
+solution: if running from fai-cd, recreate autodiscover cd as noted above in setup.
+
+# What good logs look like:
+
+logging nfs traffic from server
+
+s rpcdebug -m nfsd -s all
+
+
+normal nfs mount & umount logs look like:
+
+journalctl -ef | gr nfs

-# i expect better results with newer kernel and btrfs progs than the default stretch
-fai-server buster
+Jun 20 22:15:36 kd rpc.mountd[2025725]: authenticated mount request from 10.32.2.1:865 for /srv/fai/nfsroot (/srv/fai/nfsroot)
+Jun 20 22:15:36 kd kernel: nfsd: exp_rootfh(/srv/fai/nfsroot [00000000e8c53e54] *:dm-0/5521225)
+Jun 20 22:15:36 kd kernel: nfsd: fh_compose(exp 00:1b/5521225 fai/nfsroot, ino=5521225)
+Jun 20 22:15:36 kd kernel: nfsd: FSINFO(3) 28: 00070001 00543f49 00000000 d185f7b0 58d1a3c6 00000000
+Jun 20 22:15:36 kd kernel: nfsd: fh_verify(28: 00070001 00543f49 00000000 d185f7b0 58d1a3c6 00000000)
+Jun 20 22:15:36 kd kernel: nfsd: PATHCONF(3) 28: 00070001 00543f49 00000000 d185f7b0 58d1a3c6 00000000
+Jun 20 22:15:36 kd kernel: nfsd: fh_verify(28: 00070001 00543f49 00000000 d185f7b0 58d1a3c6 00000000)
+Jun 20 22:15:36 kd kernel: nfsd: GETATTR(3) 28: 00070001 00543f49 00000000 d185f7b0 58d1a3c6 00000000
+Jun 20 22:15:36 kd kernel: nfsd: fh_verify(28: 00070001 00543f49 00000000 d185f7b0 58d1a3c6 00000000)
+Jun 20 22:15:36 kd kernel: nfsd: FSINFO(3) 28: 00070001 00543f49 00000000 d185f7b0 58d1a3c6 00000000
+Jun 20 22:15:36 kd kernel: nfsd: fh_verify(28: 00070001 00543f49 00000000 d185f7b0 58d1a3c6 00000000)
+Jun 20 22:15:36 kd kernel: nfsd: GETATTR(3) 28: 00070001 00543f49 00000000 d185f7b0 58d1a3c6 00000000
+Jun 20 22:15:36 kd kernel: nfsd: fh_verify(28: 00070001 00543f49 00000000 d185f7b0 58d1a3c6 00000000)
+Jun 20 22:15:45 kd rpc.mountd[2025725]: authenticated unmount request from 10.32.2.1:986 for /srv/fai/nfsroot (/srv/fai/nfsroot)
+
+normal tftpd logs from:
+
+after setting -vv in TFTP_OPTIONS in /etc/default/tftpd-hpa
+
+journalctl -u tftpd-hpa
+
+Jun 20 23:51:02 kd in.tftpd[4021350]: RRQ from 10.2.0.12 filename pxelinux.0
+Jun 20 23:51:02 kd in.tftpd[4021351]: RRQ from 10.2.0.12 filename ldlinux.c32
+Jun 20 23:51:02 kd in.tftpd[4021352]: RRQ from 10.2.0.12 filename pxelinux.cfg/a913a477-fca6-234d-a928-6bb011decd05
+Jun 20 23:51:02 kd in.tftpd[4021352]: sending NAK (1, File not found) to 10.2.0.12
+Jun 20 23:51:02 kd in.tftpd[4021353]: RRQ from 10.2.0.12 filename pxelinux.cfg/01-52-54-00-9c-ef-ad
+Jun 20 23:51:02 kd in.tftpd[4021353]: sending NAK (1, File not found) to 10.2.0.12
+Jun 20 23:51:02 kd in.tftpd[4021354]: RRQ from 10.2.0.12 filename pxelinux.cfg/0A02000C
+Jun 20 23:51:02 kd in.tftpd[4021355]: RRQ from 10.2.0.12 filename vmlinuz-5.10.0-15-amd64
+Jun 20 23:51:03 kd in.tftpd[4021356]: RRQ from 10.2.0.12 filename initrd.img-5.10.0-15-amd64
+
+
+
+# Replacing a raid 10 disk

 pxe-server -S HOST fai

@@ -180,12 +254,43 @@ reboot

 # Expected output in fai logs

-For flidas, when installing systemd, this error happens, and it's
+
+## On focal:
+
+fai.log:updatebase.UBUNTU FAILED with exit code 1.
+the real error is dpkg-reconfigure locales, seems to be related
+to a workaround for < 20.04, relevant comment:
+# in case the locales are already included inside the base file (Ubuntu)
+in config/hooks/instsoft.DEBIAN
+
+
+## For flidas,
+
+when installing systemd, this error happens, and it's
 a superflous upstream bug based on reading the post install script:

 addgroup: The group `systemd-journal' already exists as a system group. Exiting.
 Operation failed: No such file or directory

+## On nabia/newer,
+
+python is removed, now its python3,
+and its easier to just let the package get removed than
+do host class package config.
+fai.log:WARNING: These unknown packages are removed from the installation list: python python-minimal
+
+Similar to python, linux-image-amd64 is the debian package name
+for the kernel, linux-image-generic is for ubuntu, but the
+DEBIAN class is defined on ubuntu and its easier to just let
+the package get removed with this warning:
+fai.log:WARNING: These unknown packages are removed from the installation list: linux-image-amd64
+Also, cryptsetup-initramfs is new to buster/nabia, it gets removed
+on earlier versions.
+
+## parted error
+fai.log:Error: /dev/vda: unrecognised disk label
+This is from parted -m $d unit MiB print.
+It happens when there are no partitions yet.

 # linode notes

@@ -268,6 +373,13 @@ umount dev
 umount sys
 reboot

+# for switching the boot to root2
+zboot
+# for switching back, efibootmgr, if there is a problem with the root filesystem detection,
+# boot into the debian bootstrap distro, run partition.DEFAULT using comments for mktab arg.
+# then manually run iboot and then reboot.
+
+
 # pine rock64 notes
 # the only useful image is ubuntu 18.04 ayafun or something.
 # using emmc usb:
@@ -282,6 +394,23 @@ s apt-get update
 s apt dist-upgrade


+### How to merge upstream fai-config
+
+git checkout upstream
+cd path-to-fai-config
+git pull --stat
+# the following needs modification if there was deletions or renames
+rsync --exclude /.git -rlpgoDcvi . /b/fai/fai/config/
+cd /b/fai/fai/config/
+# where XXXXX is the git commit hash
+# note, several files which just had trailing space changes will get ignored.
+git commit -am "update upstream to XXXXX"
+git checkout master
+git merge upstream
+# fix conflicts
+git commit
+
+
 # TODO

 Change arch to archlike and to support arch and parabola