ejolson
Posts: 4931
Joined: Tue Mar 18, 2014 11:47 am

iSCSI Root Like a Data Center

Tue May 19, 2020 11:37 pm

Even if not counterfeit, the least reliable component of the Raspberry Pi is frequently the SD card. While a broken SD card is user-replaceable, using an iSCSI network block device for the root file system improves reliability and also allows easy backup, checkpointing and re-imaging of the OS on demand. This thread will share experiences and best practices on how to do this.

In getting started, it is worth noting that Raspbian is based on Debian and the documentation for Debian states
Debian wrote: Booting Debian with iSCSI root disk

There is no way to do this with the standard Debian initrd. The init4boot project supplies the needed infrastructure (especially an adapted initrd).
This sounds a bit problematic, especially as the init4boot project was last updated January 29, 2010 for Debian 5.0.1 Lenny. To put that in perspective, init4boot development ceased before the first Raspberry Pi was available.

Fortunately, there is already some support for including the necessary iSCSI modules and tools on the initial RAM disk when PXE booting a suitable x86 server as part of the open-iscsi package. This provides a solid starting place from which to make the necessary modifications for Raspbian. Therefore, since
Cantus in Fraggle Rock wrote: [She] has a long way to go, though the journey is short
the first step will be configuring things so one doesn't need to swap SD cards while debugging why it doesn't work.
Last edited by ejolson on Sat May 23, 2020 4:05 pm, edited 4 times in total.

ejolson

Re: iSCSI Root Like a Data Center

Wed May 20, 2020 1:26 am

In anticipation of the inconvenience that occurs when changes to the boot partition lead to an SD card that doesn't boot, I decided to make an extra boot partition. To do this I downloaded 2020-02-13-raspbian-buster-lite.zip, unpacked it and copied the image to a new SD card. You can do this using Etcher, the new Raspberry Pi Imager or

Code: Select all

$ sha256sum 2020-02-13-raspbian-buster-lite.zip 
12ae6e17bf95b6ba83beca61e7394e7411b45eba7e6a520f434b0748ea7370e8  2020-02-13-raspbian-buster-lite.zip
$ unzip 2020-02-13-raspbian-buster-lite.zip
$ su
# dd if=2020-02-13-raspbian-buster-lite.img of=/dev/sdX bs=4M
where the value of X is carefully chosen to be the new SD card and not, for example, a critical system disk.
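
For what it's worth, the checksum comparison can be scripted rather than eyeballed before anything is written to the card. The following is a minimal sketch, assuming GNU coreutils sha256sum; the function name verify_sha256 is made up for illustration and is not part of any package.

```shell
# verify_sha256 FILE EXPECTED: succeed only if FILE hashes to EXPECTED.
# (Hypothetical helper; only sha256sum itself is a real tool here.)
verify_sha256() {
    file=$1
    expected=$2
    # sha256sum -c reads "HASH  FILENAME" lines (two spaces) from stdin;
    # --status suppresses output so only the exit code reports the result.
    printf '%s  %s\n' "$expected" "$file" | sha256sum -c --status -
}

# Example, using the checksum printed above:
# verify_sha256 2020-02-13-raspbian-buster-lite.zip \
#     12ae6e17bf95b6ba83beca61e7394e7411b45eba7e6a520f434b0748ea7370e8 \
#     && echo "checksum matches, safe to unzip and dd"
```

Chaining the dd step behind a successful check is one way to avoid imaging a corrupted download.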

Before booting the newly imaged card, edit cmdline.txt to prevent the automatic resizing of the root filesystem. Otherwise, there will not be space left on the card to squeeze in the extra boot partition. In detail, after editing, cmdline.txt should contain the single line

Code: Select all

console=serial0,115200 console=tty1 root=/dev/mmcblk0p2 rootfstype=ext4 elevator=deadline fsck.repair=yes rootwait sdhci.debug_quirks2=4
I also changed root to /dev/mmcblk0p2 rather than whatever the UUID originally was and touched ssh to enable ssh, both because of personal preference and because the UUID will change when I resize the partition. The mysterious quirks option has also been added. Place the card in a Pi 4B, log in and change the password.

The rest of this post discusses how to create an extra boot partition at the very end of the card and then manually resize the root partition to fill up the space as follows. First, check that the filesystem has not already been resized with

Code: Select all

$ sudo -s
# fdisk /dev/mmcblk0
Command (m for help): p
Disk /dev/mmcblk0: 14.4 GiB, 15489564672 bytes, 30253056 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x738a4d67

Device         Boot  Start     End Sectors  Size Id Type
/dev/mmcblk0p1        8192  532479  524288  256M  c W95 FAT32 (LBA)
/dev/mmcblk0p2      532480 3612671 3080192  1.5G 83 Linux
There should be two partitions. The second is the root partition and since it is 1.5G this confirms that it has not been resized yet. If it has been resized and already fills the card, it is easier to start over than fix things.

To create the extra 256MB boot partition at the end, calculate where it should begin by subtracting 524288 (the number of 512-byte sectors in 256 MiB) from the total number of sectors. In the present case there are 30253056 sectors on the card so

30253056-524288=29728768
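
The same arithmetic can be sketched in shell for cards of a different size. This assumes 512-byte sectors as reported by fdisk above; the function name extra_start is hypothetical.

```shell
# extra_start TOTAL_SECTORS SIZE_MIB
# Print the first sector of a partition of SIZE_MIB placed at the very
# end of a disk with TOTAL_SECTORS sectors of 512 bytes each.
extra_start() {
    total_sectors=$1
    size_mib=$2
    sector_bytes=512
    part_sectors=$(( size_mib * 1024 * 1024 / sector_bytes ))
    echo $(( total_sectors - part_sectors ))
}

extra_start 30253056 256    # prints 29728768
```

The total sector count is the number shown on the first line of the fdisk p output.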

Next type

Code: Select all

Command (m for help): n
Partition type
   p   primary (2 primary, 0 extended, 2 free)
   e   extended (container for logical partitions)
Select (default p): p
Partition number (3,4, default 3): 3
First sector (2048-30253055, default 2048): 29728768
Last sector, +/-sectors or +/-size{K,M,G,T,P} (29728768-30253055, default 30253055): 

Created a new partition 3 of type 'Linux' and of size 256 MiB.

Command (m for help): t
Partition number (1-3, default 3): 3
Hex code (type L to list all codes): c

Changed type of partition 'Linux' to 'W95 FAT32 (LBA)'.
The disk should now look like

Code: Select all

Device         Boot    Start      End Sectors  Size Id Type
/dev/mmcblk0p1          8192   532479  524288  256M  c W95 FAT32 (LBA)
/dev/mmcblk0p2        532480  3612671 3080192  1.5G 83 Linux
/dev/mmcblk0p3      29728768 30253055  524288  256M  c W95 FAT32 (LBA)
Finally, resize partition 2 by carefully deleting it and creating a new one that starts at exactly the same place. Let it default to the maximum size and do not delete the signature.

Code: Select all

Command (m for help): d
Partition number (1-3, default 3): 2

Partition 2 has been deleted.

Command (m for help): n
Partition type
   p   primary (2 primary, 0 extended, 2 free)
   e   extended (container for logical partitions)
Select (default p): p
Partition number (2,4, default 2): 2
First sector (2048-30253055, default 2048): 532480
Last sector, +/-sectors or +/-size{K,M,G,T,P} (532480-29728767, default 29728767): 

Created a new partition 2 of type 'Linux' and of size 13.9 GiB.
Partition #2 contains a ext4 signature.

Do you want to remove the signature? [Y]es/[N]o: N

Command (m for help): w

The partition table has been altered.
Syncing disks.
Note that this clumsy way of deleting and replacing partition 2 has likely changed its UUID. Instead of changing the UUID back, let's create a new post-pandemic normal by labeling the root filesystem.

Code: Select all

# umount /boot
# fatlabel /dev/mmcblk0p1 TBOOT
# mount /boot
# e2label /dev/mmcblk0p2 TROOT
Then edit /etc/fstab so it looks like

Code: Select all

proc        /proc proc defaults         0 0
LABEL=TBOOT /boot vfat defaults         0 2
LABEL=TROOT /     ext4 defaults,noatime 0 1
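
Before rebooting, it may be worth confirming that every label named in /etc/fstab actually resolves to a device. A rough sketch follows; fstab_labels and check_labels are made-up helper names, while blkid -L is the standard util-linux lookup of a device by filesystem label.

```shell
# fstab_labels FILE: print each label used as a LABEL=... source in FILE.
fstab_labels() {
    awk '$1 ~ /^LABEL=/ { sub(/^LABEL=/, "", $1); print $1 }' "$1"
}

# check_labels FILE: report which device, if any, carries each label.
check_labels() {
    for label in $(fstab_labels "$1"); do
        if dev=$(blkid -L "$label"); then
            echo "$label -> $dev"
        else
            echo "$label -> MISSING"
        fi
    done
}

# Example (run as root on the Pi):
# check_labels /etc/fstab
```
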
At this point, it should be fine to reboot the machine. To do this type

Code: Select all

# sync
# sync
# sync
# /sbin/reboot
If all went well, the machine will reboot. Before proceeding, check the partition and then resize the root filesystem.

Code: Select all

$ sudo -s
# fdisk /dev/mmcblk0
Command (m for help): p
Disk /dev/mmcblk0: 14.4 GiB, 15489564672 bytes, 30253056 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x738a4d67

Device         Boot    Start      End  Sectors  Size Id Type
/dev/mmcblk0p1          8192   532479   524288  256M  c W95 FAT32 (LBA)
/dev/mmcblk0p2        532480 29728767 29196288 13.9G 83 Linux
/dev/mmcblk0p3      29728768 30253055   524288  256M  c W95 FAT32 (LBA)

Command (m for help): q
After checking that partition 2 is now much bigger than 1.5G, resize the underlying filesystem with

Code: Select all

# resize2fs /dev/mmcblk0p2
resize2fs 1.44.5 (15-Dec-2018)
Filesystem at /dev/mmcblk0p2 is mounted on /; on-line resizing required
old_desc_blocks = 1, new_desc_blocks = 1
The filesystem on /dev/mmcblk0p2 is now 3649536 (4k) blocks long.
At this point the root filesystem has been expanded to fill partition 2. This can be verified by using df to obtain

Code: Select all

$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/root        14G  1.3G   12G  10% /
devtmpfs        841M     0  841M   0% /dev
tmpfs           970M     0  970M   0% /dev/shm
tmpfs           970M  8.4M  962M   1% /run
tmpfs           5.0M  4.0K  5.0M   1% /run/lock
tmpfs           970M     0  970M   0% /sys/fs/cgroup
/dev/mmcblk0p1  253M   53M  200M  21% /boot
tmpfs           194M     0  194M   0% /run/user/1000
The next post will describe formatting the extra boot partition and checking that it works. This will make it possible to experiment by creating initial RAM filesystems and changing parameters without fear of rendering the entire SD card unbootable.
Last edited by ejolson on Sat May 23, 2020 4:07 pm, edited 2 times in total.

ejolson

Re: iSCSI Root Like a Data Center

Wed May 20, 2020 3:18 am

So far nothing has been accomplished with regard to the objective of network booting the Pi and then mounting root on an iSCSI block device. In a way all I'm doing in these first posts is describing a difficult way to install bog-standard Raspbian on an SD card and there is, indeed, still a long way to go.

To finish creating a Raspbian image with a fail-safe boot partition along with an extra one to experiment with, format the extra partition and copy the necessary boot files to it using

Code: Select all

$ sudo -s
# mkdosfs -F32 -n TXTRA /dev/mmcblk0p3
# mkdir /extra
# mount LABEL=TXTRA /extra
# cd /extra
# umask 000
# (cd /boot; tar clf - . ) | tar xf -
# ls 
bcm2708-rpi-b.dtb         bcm2711-rpi-4-b.dtb  fixup.dat         start4cd.elf
bcm2708-rpi-b-plus.dtb    bootcode.bin         fixup_db.dat      start4db.elf
bcm2708-rpi-cm.dtb        cmdline.txt          fixup_x.dat       start4.elf
bcm2708-rpi-zero.dtb      config.txt           issue.txt         start4x.elf
bcm2708-rpi-zero-w.dtb    COPYING.linux        kernel7.img       start_cd.elf
bcm2709-rpi-2-b.dtb       fixup4cd.dat         kernel7l.img      start_db.elf
bcm2710-rpi-2-b.dtb       fixup4.dat           kernel8.img       start.elf
bcm2710-rpi-3-b.dtb       fixup4db.dat         kernel.img        start_x.elf
bcm2710-rpi-3-b-plus.dtb  fixup4x.dat          LICENCE.broadcom
bcm2710-rpi-cm3.dtb       fixup_cd.dat         overlays
# cd
# umount /extra
Once the boot files are copied and the /extra partition unmounted, check whether the extra boot partition actually works by issuing a reboot with that partition.

Code: Select all

# systemctl reboot 3
Here the 3 indicates to reboot the Pi using partition 3 rather than the default boot partition, which is 1. If the Pi reboots without difficulty, then it's time to install open-iscsi and generate an initial RAM filesystem to use for experimentation.

ejolson

Re: iSCSI Root Like a Data Center

Wed May 20, 2020 3:57 am

The next step is to get the /extra boot partition set up to use an initial RAM filesystem. A properly configured initial RAM filesystem will eventually be needed for network booting and it's easier to debug things one step at a time.

Since we will henceforth be making all experimental changes to the extra boot partition, let's update /etc/fstab so that's the partition mounted under /boot by default. First, unmount the original boot partition with

Code: Select all

# umount /boot
Then, edit /etc/fstab so it reads as

Code: Select all

proc        /proc proc defaults         0 0
LABEL=TXTRA /boot vfat defaults         0 2
LABEL=TROOT /     ext4 defaults,noatime 0 1
Finally, mount the extra boot partition with

Code: Select all

# mount /boot
Verify everything is mounted correctly.

Code: Select all

$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/root        14G  1.3G   12G  10% /
devtmpfs        841M     0  841M   0% /dev
tmpfs           970M     0  970M   0% /dev/shm
tmpfs           970M  8.4M  962M   1% /run
tmpfs           5.0M  4.0K  5.0M   1% /run/lock
tmpfs           970M     0  970M   0% /sys/fs/cgroup
tmpfs           194M     0  194M   0% /run/user/1000
/dev/mmcblk0p3  253M   53M  200M  21% /boot
If /dev/mmcblk0p3 is now mounted as /boot, then it should be safe to start modifying the configuration. If something goes wrong with the experiments, then a simple power cycle will reload the original boot configuration from partition 1 and there will be no need for any further rescue operations (at least in theory).

To enable boot using an initial RAM disk one needs to create the RAM disk and change config.txt to enable it. These commands create the initial RAM disk.

Code: Select all

# uname -a
Linux raspberrypi 4.19.97-v7l+ #1294 SMP Thu Jan 30 13:21:14 GMT 2020 armv7l GNU/Linux
# cd /boot
# update-initramfs -c -k 4.19.97-v7l+
update-initramfs: Generating /boot/initrd.img-4.19.97-v7l+
# mv initrd.img-4.19.97-v7l+ initrd7l.img
Here the kernel version number 4.19.97-v7l+ was discerned from uname and then included on the update-initramfs line to specify the version of the kernel modules to include. Note it is important that the contents of the initial RAM disk always be in sync with the kernel7l.img version. As a result, it will be necessary to update initrd7l.img by using a mv command similar to the one above every time a new initrd.img-X.YY.ZZ-v7l+ is created.
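
Since the rename has to be repeated after every kernel update, it could be wrapped in a small script. The sketch below is only an illustration: initrd_name and refresh_initrd are hypothetical names, and the initrd7.img, initrd8.img and initrd.img names simply extend the initrd7l.img convention used above to the other kernel flavours, which is an assumption on my part.

```shell
# initrd_name VERSION: map a kernel version string to the short initrd
# file name, following the initrd7l.img convention used in this thread.
initrd_name() {
    case "$1" in
        *-v7l+) echo initrd7l.img ;;   # pairs with kernel7l.img
        *-v8+)  echo initrd8.img ;;    # assumed name for kernel8.img
        *-v7+)  echo initrd7.img ;;    # assumed name for kernel7.img
        *)      echo initrd.img ;;     # assumed name for kernel.img
    esac
}

# refresh_initrd [VERSION]: rebuild the initrd and rename it, using the
# same two commands shown above. Must be run as root.
refresh_initrd() {
    version=${1:-$(uname -r)}
    update-initramfs -c -k "$version" &&
        mv "/boot/initrd.img-$version" "/boot/$(initrd_name "$version")"
}

# Example:
# refresh_initrd 4.19.97-v7l+
```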

To activate the initial RAM filesystem edit config.txt so that the last lines appear as

Code: Select all

[pi4]
# Enable DRM VC4 V3D driver on top of the dispmanx display stack
dtoverlay=vc4-fkms-v3d
max_framebuffers=2
kernel=kernel7l.img
initramfs initrd7l.img

[all]
#dtoverlay=vc4-fkms-v3d
Leave all other lines (not shown) in the above file the same.
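
A quick sanity check before rebooting is to confirm that every file named on an initramfs line of config.txt is really present on the boot partition. Here is a minimal sketch, assuming the "initramfs FILE" form shown above; check_initramfs is a hypothetical helper, not a Raspbian tool.

```shell
# check_initramfs BOOTDIR: for each "initramfs FILE" line in
# BOOTDIR/config.txt, report whether FILE exists in BOOTDIR.
# Returns non-zero if any named file is missing.
check_initramfs() {
    bootdir=$1
    rc=0
    for f in $(awk '$1 == "initramfs" { print $2 }' "$bootdir/config.txt"); do
        if [ -f "$bootdir/$f" ]; then
            echo "ok: $f"
        else
            echo "missing: $f"
            rc=1
        fi
    done
    return $rc
}

# Example:
# check_initramfs /boot
```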

It should be pointed out that this example uses the 32-bit LPAE kernel on a Raspberry Pi 4B. There are multiple reasons for this. First, that is the standard kernel on the Pi 4B. Second, I've had some trouble with warm reboots and the 64-bit kernel. While iSCSI root is possible on a Pi 3B, the slowness of its 100 Mbit networking makes it not so much like a data center. At the same time, the modifications needed for older Pi models are straightforward and easy to make.

To test the initial RAM filesystem type

Code: Select all

# sync
# sync
# sync
# systemctl reboot 3
to reboot the system using the extra boot partition just updated. With luck, it will again reboot, but this time loading the initial RAM filesystem before finally pivoting root to the SD card. The next step is to modify the initial RAM filesystem to pivot root to an iSCSI network block device instead.

ejolson

Re: iSCSI Root Like a Data Center

Wed May 20, 2020 8:04 pm

While optimally a Raspberry Pi would be used as the server for the iSCSI target devices, typical data centers are more likely to use a fibre-channel iSCSI SAN storage controller such as the hip hop IBM Storwize V7000F. Another option might be a smaller NAS such as those marketed by QNAP or Synology.

At any rate, since the resulting iSCSI target will appear as simply another block device on the Pi we configured in the previous posts, the Pi can format that block device with any filesystem then read and write to it through the standard Linux unified buffer cache. Not only does this perform better than NFS and Samba, but it allows making Docker containers in ways that can't be done using traditional network file systems.

For good reasons many branded NAS offerings have an interface that allows one to configure an iSCSI target while holding a drink in one hand and a mouse in the other. FreeNAS and Open Media Vault provide similar benefits for thirsty engineers whose employer can't afford a proprietary solution.

While I've been happily creating iSCSI targets without spilling even a drop using a QNAP TS-231P NAS and a mirrored array of two hard disks, a nuanced comparison of the differences between competing mouse interfaces would lengthen this notably short journey in an intolerable way. Therefore, I'll instead describe the commands needed to set up an iSCSI target on a generic Debian-based Linux machine. In particular, the following should work on a second Pi 4B running Raspbian, an Odroid N2 running Armbian or even something as exotic as an old-school Intel-compatible desktop running authentic Debian Buster.

Now, log into the machine that will function as the iSCSI storage controller (not the Pi we configured in the previous posts) and type

Code: Select all

# apt-get install tgt
This installs the target framework that allows a Linux system to provide network block devices over iSCSI transports. While it is better performing to serve logical disk partitions created using LVM2 or volumes in a Ceph cluster, for experimentation it is more convenient to serve disk images contained in regular files.

The machine I will use for the server is named tapir.wulf and has IP 192.168.174.153. A tapir is a pig-like animal, and wulf is a segment of the private network topology ill-conceived by myself during a party in 1999 that I didn't want to do all over again. The Raspberry Pi in the previous posts is turbo.wulf with IP 192.168.174.145 on the same subnet.

Take a moment to read the man pages for tgt-admin and targets.conf and then create the file iroot.conf in /etc/tgt/conf.d readable only by root that reads

Code: Select all

<target iqn.1999-01.wulf.tapir:odroid:iroot>
        initiator-address 192.168.174.145
        backing-store /home/targets/iroot.img
</target>
Note that the name of the server has been embedded in the target name followed by a colon and some made-up stuff which needs to be unique but might as well be a random number, especially if the server is not running on an Odroid single-board computer. The initiator address is the IP number of the Pi 4B configured earlier.

Under the assumption that the wulf segment of the LAN is secure and to simplify debugging I have, for the moment, left out any configuration for password authentication. As frequently pointed out by security professionals, it is easy to get so excited when it finally works that people forget to put the password back in. Fortunately, in these days of quarantine, any such excitement is nearly impossible. At the same time, don't forget to type

Code: Select all

# chmod 600 /etc/tgt/conf.d/iroot.conf
so the file is not world readable.
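
If several targets are needed, the stanza could be generated rather than typed. The following is a sketch only; tgt_stanza is a made-up function, and the three directives it emits are exactly the ones used in iroot.conf above.

```shell
# tgt_stanza IQN INITIATOR_IP BACKING_FILE
# Print a targets.conf stanza in the same shape as iroot.conf above.
tgt_stanza() {
    iqn=$1
    initiator=$2
    store=$3
    cat <<EOF
<target $iqn>
        initiator-address $initiator
        backing-store $store
</target>
EOF
}

# Example: write the file with restrictive permissions from the start,
# so there is no window in which it is world readable.
# ( umask 077
#   tgt_stanza iqn.1999-01.wulf.tapir:odroid:iroot 192.168.174.145 \
#       /home/targets/iroot.img > /etc/tgt/conf.d/iroot.conf )
#
# When the password goes back in, the targets.conf man page describes an
# "incominguser NAME PASSWORD" line for CHAP; check it there before use.
```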

We now need to create the file that will be used as the backing store. Since there is a BTRFS-formatted SSD mounted as /x/lefb on the server, I first made a symbolic link to point /home/targets to a directory on that device and then did some other stuff. The entire sequence of commands is as follows.

Code: Select all

# cd /x/lefb
# mkdir targets
# cd /home
# ln -s /x/lefb/targets .
# cd /x/lefb/targets
# touch iroot.img
# btrfs property set iroot.img compression ""
# dd bs=1024K seek=32768 count=0 if=/dev/zero of=iroot.img
# chmod 600 iroot.img
Note the above makes a sparse image file consisting of a single 32GB hole. The hole will get filled in with data as things are written to the image later. If you are following along but not using BTRFS on your backing store it is fine to omit the btrfs line that turns off compression.
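
To see that the dd trick really produces a file of the requested apparent size, it can be tried at a smaller scale first. This is a sketch assuming GNU dd and GNU stat as found on Debian; make_sparse and image_bytes are hypothetical names.

```shell
# make_sparse PATH SIZE_MIB: create (or extend) PATH as a sparse file of
# SIZE_MIB mebibytes using the same dd invocation as above. With count=0
# nothing is written; dd only seeks and sets the file length.
make_sparse() {
    path=$1
    size_mib=$2
    dd bs=1024K seek="$size_mib" count=0 if=/dev/zero of="$path" 2>/dev/null
}

# image_bytes PATH: print the apparent size of PATH in bytes.
image_bytes() {
    stat -c %s "$1"
}

# Example: 32768 MiB is 34359738368 bytes, which tgtadm would be expected
# to report rounded in decimal megabytes.
# make_sparse iroot.img 32768 && image_bytes iroot.img
# Comparing "ls -l" with "du -k" on the file shows how little is allocated.
```

On GNU systems, truncate -s 32G iroot.img should give a file of the same apparent size.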

Assuming that you are blessed with a version of Debian that has been assimilated by systemd, you may now activate the configuration by restarting the iSCSI target and check it worked.

Code: Select all

# systemctl restart tgt
# tgtadm --mode target --op show
Target 1: iqn.1999-01.wulf.tapir:odroid:iroot
    System information:
        Driver: iscsi
        State: ready
    I_T nexus information:
    LUN information:
        LUN: 0
            Type: controller
            SCSI ID: IET     00010000
            SCSI SN: beaf10
            Size: 0 MB, Block size: 1
            Online: Yes
            Removable media: No
            Prevent removal: No
            Readonly: No
            SWP: No
            Thin-provisioning: No
            Backing store type: null
            Backing store path: None
            Backing store flags: 
        LUN: 1
            Type: disk
            SCSI ID: IET     00010001
            SCSI SN: beaf11
            Size: 34360 MB, Block size: 512
            Online: Yes
            Removable media: No
            Prevent removal: No
            Readonly: No
            SWP: No
            Thin-provisioning: No
            Backing store type: rdwr
            Backing store path: /home/targets/iroot.img
            Backing store flags: 
    Account information:
    ACL information:
        192.168.174.145
Does anyone know what LUN 0 is about?

Anyway, LUN 1 looks correct. The next post will verify the target is working by attaching it to the Pi, then partitioning, formatting and copying the operating system over to it while timing the copy to test performance.
Last edited by ejolson on Mon May 25, 2020 4:16 am, edited 21 times in total.

trejan
Posts: 1685
Joined: Tue Jul 02, 2019 2:28 pm

Re: iSCSI Root Like a Data Center

Wed May 20, 2020 8:21 pm

ejolson wrote:
Wed May 20, 2020 8:04 pm
Does anyone know what LUN 0 is about?
LUN 0 is how the initiator interrogates the target to discover what other LUNs exist.

ejolson

Re: iSCSI Root Like a Data Center

Wed May 20, 2020 11:38 pm

trejan wrote:
Wed May 20, 2020 8:21 pm
ejolson wrote:
Wed May 20, 2020 8:04 pm
Does anyone know what LUN 0 is about?
LUN 0 is how the initiator interrogates the target to discover what other LUNs exist.
That's clever to use the same protocol as a way to query the available targets. While interrogation got a bad name after the replacement of the Medieval Inquisition by the Inquisición Española some time ago, there is fortunately no need for a Pi to perform similar or any different types of interrogation.

This post describes verifying the server set up in the previous post is working and then copying the root files over to the iSCSI target. Log into the Pi 4B with the initial RAM filesystem and extra boot partition configured at the beginning of this thread and install open-iscsi.

Code: Select all

# apt-get install open-iscsi
Now, set up the iSCSI initiator. To avoid the iSCSI inquisition we will carefully enter the exact target name by hand.

Code: Select all

# iscsiadm -m node --target iqn.1999-01.wulf.tapir:odroid:iroot \
    --portal 192.168.174.153 -o new
New iSCSI node [tcp:[hw=,ip=,net_if=,iscsi_if=default] 192.168.174.153,3260,-1 iqn.1999-01.wulf.tapir:odroid:iroot] added
Check it's possible to attach the target with

Code: Select all

# iscsiadm -m node -L all
Logging in to [iface: default, target: iqn.1999-01.wulf.tapir:odroid:iroot, portal: 192.168.174.153,3260] (multiple)
Login to [iface: default, target: iqn.1999-01.wulf.tapir:odroid:iroot, portal: 192.168.174.153,3260] successful.
If instead of successful, you get an error message like

Code: Select all

Logging in to [iface: default, target: iqn.1999-01.wulf.tapis:odroid:iroot, portal: 192.168.174.153,3260] (multiple)
iscsiadm: Could not login to [iface: default, target: iqn.1999-01.wulf.tapis:odroid:iroot, portal: 192.168.174.153,3260].
iscsiadm: initiator reported error (19 - encountered non-retryable iSCSI login failure)
iscsiadm: Could not log into all portals
that means you have made a mistake. While many would suggest using LUN 0 would avoid any such doctrinal and related dogmatic errors, history suggests otherwise. To fix the error, copy the target and portal carefully from the error message into a delete command such as

Code: Select all

# iscsiadm -m node --target iqn.1999-01.wulf.tapis:odroid:iroot \
    --portal 192.168.174.153 -o delete
repent and err no more.

At this point, the Linux kernel log on the 4B should show the presence of an attached iSCSI network block device poetically called /dev/sda. Check with

Code: Select all

# dmesg | tail -n12
[68824.273029] scsi host0: iSCSI Initiator over TCP/IP
[68824.291772] scsi 0:0:0:0: RAID              IET      Controller       0001 PQ: 0 ANSI: 5
[68824.293521] scsi 0:0:0:0: Attached scsi generic sg0 type 12
[68824.295917] scsi 0:0:0:1: Direct-Access     IET      VIRTUAL-DISK     0001 PQ: 0 ANSI: 5
[68824.297879] sd 0:0:0:1: Power-on or device reset occurred
[68824.298028] sd 0:0:0:1: Attached scsi generic sg1 type 0
[68824.300997] sd 0:0:0:1: [sda] 67108864 512-byte logical blocks: (34.4 GB/32.0 GiB)
[68824.301013] sd 0:0:0:1: [sda] 4096-byte physical blocks
[68824.301348] sd 0:0:0:1: [sda] Write Protect is off
[68824.301364] sd 0:0:0:1: [sda] Mode Sense: 69 00 10 08
[68824.302013] sd 0:0:0:1: [sda] Write cache: enabled, read cache: enabled, supports DPO and FUA
[68824.312330] sd 0:0:0:1: [sda] Attached SCSI disk
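
Because the sda name depends on probe order, a persistent name may be safer in scripts. The sketch below builds the path that udev usually creates under /dev/disk/by-path for an iSCSI LUN; that naming scheme is an assumption about the udev rules on your system, and iscsi_by_path is a made-up helper.

```shell
# iscsi_by_path PORTAL_IP IQN LUN [PORT]
# Print the by-path device name udev typically assigns to an iSCSI LUN.
iscsi_by_path() {
    portal=$1
    iqn=$2
    lun=$3
    port=${4:-3260}    # 3260 is the port shown in the login messages above
    echo "/dev/disk/by-path/ip-${portal}:${port}-iscsi-${iqn}-lun-${lun}"
}

# Example: check that the link exists and where it points.
# ls -l "$(iscsi_by_path 192.168.174.153 iqn.1999-01.wulf.tapir:odroid:iroot 1)"
```
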
Before the days of social distancing, I would go for a pleasant walk at this point. As that is ruled out, let's muddle on to the end. Remember, the journey is short, though there's still a long way to go.

Assuming everything is fine, it is now possible to partition, format, mount and then copy files to the iSCSI network block device as if it were a drive physically attached to the system. Since the iSCSI transport occurs at a layer deep within the Linux kernel, it is almost impossible from user space to tell the difference between the network block device and a real physical drive attached to the system.

To partition the network block device type

Code: Select all

# fdisk /dev/sda

Welcome to fdisk (util-linux 2.33.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Device does not contain a recognized partition table.
Created a new DOS disklabel with disk identifier 0x413f8794.

Command (m for help): n
Partition type
   p   primary (0 primary, 0 extended, 4 free)
   e   extended (container for logical partitions)
Select (default p): p
Partition number (1-4, default 1): 1
First sector (2048-67108863, default 2048): 
Last sector, +/-sectors or +/-size{K,M,G,T,P} (2048-67108863, default 67108863): 

Created a new partition 1 of type 'Linux' and of size 32 GiB.

Command (m for help): w
The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.
then format and copy the files with

Code: Select all

# mke2fs -t ext4 -L IROOT /dev/sda1
# mkdir /iroot
# mount /dev/sda1 /iroot
# apt-get install rsync
# time rsync -glopqrtxDH --numeric-ids / /iroot

real    0m50.495s
user    0m12.211s
sys     0m22.811s
It apparently took less than a minute to copy the entire Raspbian Lite directory tree to the iSCSI target. More specifically

1323968/50.495/1024=25.6

implies an effective rate of 25.6 MB/sec. Much of that was likely the time needed to read the files from the SD card.
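
The division can be reproduced in shell with awk. This sketch assumes that 1323968 is the amount copied in KiB, for example the used 1K blocks that df would report; rate_mib_per_s is a hypothetical helper name.

```shell
# rate_mib_per_s KIB SECONDS: print the transfer rate in MiB per second,
# rounded to one decimal place.
rate_mib_per_s() {
    awk -v kib="$1" -v secs="$2" 'BEGIN { printf "%.1f\n", kib / secs / 1024 }'
}

rate_mib_per_s 1323968 50.495    # prints 25.6
```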

Since the network block device looks so much like a physical device, it is also possible to run hdparm to measure performance. This post closes with exactly that.

Code: Select all

# apt-get install hdparm
# hdparm -t /dev/sda

/dev/sda:
SG_IO: bad/missing sense data, sb[]:  70 00 05 00 00 00 00 0a 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 Timing buffered disk reads: 326 MB in  3.00 seconds = 108.66 MB/sec
# hdparm -t /dev/sda

/dev/sda:
SG_IO: bad/missing sense data, sb[]:  70 00 05 00 00 00 00 0a 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 Timing buffered disk reads: 328 MB in  3.01 seconds = 108.95 MB/sec
# hdparm -t /dev/sda

/dev/sda:
SG_IO: bad/missing sense data, sb[]:  70 00 05 00 00 00 00 0a 00 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 Timing buffered disk reads: 326 MB in  3.01 seconds = 108.28 MB/sec
The above output indicates a raw read speed of 108 MB/sec, which is close to the maximum for gigabit Ethernet.
Last edited by ejolson on Fri May 22, 2020 3:58 pm, edited 1 time in total.

ejolson

Re: iSCSI Root Like a Data Center

Thu May 21, 2020 6:03 am

I was zooming with the dog developer about this thread and an argument ensued whether that's a BBC Micro or an Apple II in the background of Sprocket's workshop.

https://www.youtube.com/watch?v=CkeuLfuBVPI

At any rate, starting at 7:25 one can hear the guiding principle followed here concerning root on iSCSI: There is a long way to go, though the journey is short.

Apparently Apple TV is broadcasting a remake of Fraggle Rock where each fraggle is quarantined in a separate cave. The only way they can communicate seems to be through Zoom, though the frame rate is substantially better. The reason I know this is because my wife got a new iPhone for her birthday last December and her courtesy subscription is still running.

https://tv.apple.com/us/show/fraggle-ro ... 36eb9wm3gl

Although the songs were familiar, it was depressing to see each fraggle in the separate window of a conference call. Then Fido started whining, where is Sprocket?

I hope the remake will survive long enough to reflect a complete phased-in recovery after the epidemic. In stage two they might--even worse--be distanced by 6 feet and wearing masks, but after it is safe, how I long to see all the fraggles singing together again!

ejolson

Re: iSCSI Root Like a Data Center

Thu May 21, 2020 10:09 pm

Okay, so nothing is working so far.

As with the magical pipes of Cantus in Fraggle Rock the source seems to be a mysterious and invisible, a mysterious and invisible. A mysterious and invisible what? I don't know; it was so mysterious and invisible. Anyway, add

Code: Select all

sdhci.debug_quirks2=4
to cmdline.txt and all is well.

The fact that this option also slows down SD card speed is likely the reason why Raspberry Pi Imager is now the way to install Raspbian instead of NOOBS or PINN. Since people running a data center generally prefer if things work, as well as the fact that SD card speed doesn't matter with iSCSI root, I updated the previous posts to add this option where needed.

For the record, the mechanism of using

Code: Select all

# systemctl reboot 3
to boot from the extra boot partition requires this option on a Pi 4B.

As far as performance goes, it would appear the read speeds from my cost effective Micro Center branded SD card are now a factor of two slower. With the mysterious option enabled the output from hdparm is

Code: Select all

# hdparm -t /dev/mmcblk0

/dev/mmcblk0:
 HDIO_DRIVE_CMD(identify) failed: Invalid argument
 Timing buffered disk reads:  70 MB in  3.07 seconds =  22.82 MB/sec
# hdparm -t /dev/mmcblk0

/dev/mmcblk0:
 HDIO_DRIVE_CMD(identify) failed: Invalid argument
 Timing buffered disk reads:  70 MB in  3.07 seconds =  22.82 MB/sec
# hdparm -t /dev/mmcblk0

/dev/mmcblk0:
 HDIO_DRIVE_CMD(identify) failed: Invalid argument
 Timing buffered disk reads:  70 MB in  3.07 seconds =  22.82 MB/sec
Again, the SD card worked faster before this quirk was added. In particular,

Code: Select all

# hdparm -t /dev/mmcblk0

/dev/mmcblk0:
 HDIO_DRIVE_CMD(identify) failed: Invalid argument
 Timing buffered disk reads: 132 MB in  3.04 seconds =  43.42 MB/sec
# hdparm -t /dev/mmcblk0

/dev/mmcblk0:
 HDIO_DRIVE_CMD(identify) failed: Invalid argument
 Timing buffered disk reads: 130 MB in  3.00 seconds =  43.32 MB/sec
# hdparm -t /dev/mmcblk0

/dev/mmcblk0:
 HDIO_DRIVE_CMD(identify) failed: Invalid argument
 Timing buffered disk reads: 130 MB in  3.00 seconds =  43.32 MB/sec
Fortunately, due to the post

viewtopic.php?f=91&t=266092#p1646408

the iSCSI medley can begin!
Last edited by ejolson on Sat May 23, 2020 6:04 am, edited 8 times in total.

trejan

Re: iSCSI Root Like a Data Center

Thu May 21, 2020 11:45 pm

ejolson wrote:
Thu May 21, 2020 10:09 pm
As with the magical pipes of Cantus in Fraggle Rock the source seems to be a mysterious and invisible, a mysterious and invisible, undocumented option.

Code: Select all

sdhci.debug_quirks2=4
Whilst it is poorly documented, it isn't a super secret Pi specific option. It is an option for the Linux SDHCI driver to enable certain workarounds. 4 = SDHCI_QUIRK2_NO_1_8_V which means no 1.8V is available and that stops it from entering UHS-I mode since 1.8V is required.
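
As a rough illustration of how that value decodes, the sketch below pulls sdhci.debug_quirks2 out of a kernel command line and tests whether the bit with value 4 is set. The function names quirks2_value and no_1_8_v_set are made up for this example.

```shell
# quirks2_value CMDLINE: print the value of sdhci.debug_quirks2, if present.
quirks2_value() {
    for word in $1; do
        case "$word" in
            sdhci.debug_quirks2=*) echo "${word#sdhci.debug_quirks2=}" ;;
        esac
    done
}

# no_1_8_v_set CMDLINE: succeed if the bit with value 4
# (SDHCI_QUIRK2_NO_1_8_V, as described above) is set in the option.
no_1_8_v_set() {
    value=$(quirks2_value "$1")
    [ -n "$value" ] && [ $(( value & 4 )) -ne 0 ]
}

# Example, on a running Pi:
# no_1_8_v_set "$(cat /proc/cmdline)" && echo "1.8V signalling disabled"
```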

I suspect this quirk option isn't needed on the Pi 4B v1.2 but can't test it.
ejolson wrote:
Thu May 21, 2020 10:09 pm
As far as performance goes, it would appear the read speeds from my inexpensive Micro Center branded SD card are now a factor of two slower.
Yes. You've downgraded from DDR50 (50MB/s) to HS (25MB/s).

ejolson
Posts: 4931
Joined: Tue Mar 18, 2014 11:47 am

Re: iSCSI Root Like a Data Center

Thu May 21, 2020 11:50 pm

Do you know why this setting would affect how the parameter passed to the reboot system call is processed?

trejan
Posts: 1685
Joined: Tue Jul 02, 2019 2:28 pm

Re: iSCSI Root Like a Data Center

Fri May 22, 2020 12:04 am

ejolson wrote:
Thu May 21, 2020 11:50 pm
Do you know why this setting would affect how the parameter passed to the reboot system call is processed?
The next partition number is jammed into "spare" bits in the PM_RSTS register so it can persist over a restart. Why the SD card affects this is related to UHS mode which runs at 1.8V. You have to power cycle the card to get it out of UHS mode and back to 3.3V operation to reinitialise it. The problem is that the 4B v1.1 can only do this by resetting the PMIC which resets the SoC as well and clears the PM_RSTS register.

The quirks line works because you're stopping the SD card from entering UHS mode so it stays on 3.3V. This means no PMIC reset is needed and therefore the PM_RSTS register doesn't get cleared.

There is a commit that mentions a later revision of the 4B having a GPIO to toggle the 1.8V supply to the SD card. I'm guessing this change was made on the v1.2. Anybody got one to try it out on?
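
As a small aside, the value 4 can be checked against the bit position mentioned earlier in the thread. This is only a minimal sketch, assuming SDHCI_QUIRK2_NO_1_8_V is bit 2 of the quirks2 mask (so 1<<2 = 4); the function name is made up for illustration, and the sysfs path in the final comment is an assumption that may not exist on every kernel.

```shell
# SDHCI_QUIRK2_NO_1_8_V is assumed to be bit 2 of the quirks2 mask, so 1<<2 = 4.
has_no_1_8v_quirk() {
    # $1 = quirks2 value, decimal or 0x-prefixed hex (hypothetical helper)
    [ $(( $1 & (1 << 2) )) -ne 0 ]
}

for value in 0 4 6; do
    if has_no_1_8v_quirk "$value"; then
        echo "quirks2=$value: 1.8V signalling disabled"
    else
        echo "quirks2=$value: 1.8V signalling allowed"
    fi
done

# On a running Pi the active value might be readable from (assumed path)
#   /sys/module/sdhci/parameters/debug_quirks2
```

Because the option is a bit mask, other quirk bits could in principle be combined with it, for example 6 still has bit 2 set.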

ejolson
Posts: 4931
Joined: Tue Mar 18, 2014 11:47 am

Re: iSCSI Root Like a Data Center

Sat May 23, 2020 12:54 am

So picking up where we were before watching all those reruns of Fraggle Rock, let's continue on this short journey of iSCSI root like a data center.

Assuming you still have the network block device /dev/sda1 mounted under /iroot on the Pi 4B, edit the file /iroot/etc/fstab so it looks like

Code: Select all

proc        /proc proc defaults         0 0
LABEL=TXTRA /boot vfat defaults         0 2
LABEL=IROOT /     ext4 defaults,noatime 0 1
There is only a single letter change to specify IROOT.

It is also necessary to regenerate the initial RAM filesystem, as we installed the open-iscsi package since creating the original one. Do this as before with

Code: Select all

# cd /boot
# update-initramfs -c -k 4.19.97-v7l+
# mv initrd.img-4.19.97-v7l+ initrd7l.img
Next, make the necessary changes to the cmdline.txt on the extra boot partition, currently mounted under boot, to attach the iSCSI device so the initial RAM file system can subsequently pivot root to it as if it were a local mount.

We need to know the weird iSCSI name of the Pi 4B that was created randomly when we installed the open-iscsi package. Thus, type

Code: Select all

# cat /etc/iscsi/initiatorname.iscsi
## DO NOT EDIT OR REMOVE THIS FILE!
## If you remove this file, the iSCSI daemon will not start.
## If you change the InitiatorName, existing access control lists
## may reject this initiator.  The InitiatorName must be unique
## for each iSCSI initiator.  Do NOT duplicate iSCSI InitiatorNames.
InitiatorName=iqn.1993-08.org.debian:01:e3b7873b683
If you are following along, please resist the temptation to remove that file. Also, the initiator name on your Pi will likely be different in the last colon-delimited field. Please let me know if your IQN is exactly the same as what I posted and we can track down what's going wrong with the Pi random number generator. Either way, now set up cmdline.txt as

Code: Select all

iscsi_initiator=iqn.1993-08.org.debian:01:e3b7873b683 iscsi_target_name=iqn.1999-01.wulf.tapir:odroid:iroot iscsi_target_ip=192.168.174.153 console=serial0,115200 console=tty1 root=LABEL=IROOT rootfstype=ext4 elevator=deadline fsck.repair=yes rootwait sdhci.debug_quirks2=4
Three new fields have been added, one specifying the initiator and the other two describing the target made earlier without a mouse on the NAS. Be sure to also change the root option to specify IROOT, otherwise, after all this work, the Pi will still mount the SD card as root.
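
Since forgetting one of these options is easy, a quick sanity check before rebooting may help. The following is a minimal sketch rather than anything from the open-iscsi package: the function name check_iscsi_cmdline is hypothetical, and it only looks for the three iSCSI options and the IROOT label described above.

```shell
# Hypothetical helper: check that a cmdline.txt carries the iSCSI options
# described above and that root points at the IROOT label.
check_iscsi_cmdline() {
    file=$1
    status=0
    for key in iscsi_initiator= iscsi_target_name= iscsi_target_ip= root=LABEL=IROOT; do
        # Put each space-separated option on its own line, then look for the key.
        if ! tr ' ' '\n' < "$file" | grep -q "^$key"; then
            echo "missing: $key"
            status=1
        fi
    done
    return "$status"
}

# Example run against a throwaway file that still has root on the SD card.
tmp=$(mktemp)
echo 'console=tty1 root=LABEL=TROOT rootfstype=ext4 rootwait' > "$tmp"
check_iscsi_cmdline "$tmp" || echo "cmdline.txt is not ready for iSCSI root"
rm -f "$tmp"
```

On the Pi itself the function would be pointed at the cmdline.txt on the extra boot partition instead of a temporary file.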

Before booting into the iSCSI root, let's tidy things up by also changing the /etc/fstab on the SD card so it reads as

Code: Select all

proc        /proc proc defaults         0 0
LABEL=TBOOT /boot vfat defaults         0 2
LABEL=TROOT /     ext4 defaults,noatime 0 1
At this point the Pi will by default boot into a standard Raspbian distribution with root on the SD card. Then, by entering

Code: Select all

# systemctl reboot 3
your Pi will root like a data center.

It worked for me on the second try, but if you are careful it should work the first time around. Fortunately, if the iSCSI reboot fails, simply press ctrl-alt-del and the Pi will boot from the first boot partition back into the SD card. From there it will be possible to poke around, see what went wrong and fix it.

After the reboot, log in to check if the Pi is really running with an iSCSI root.

Code: Select all

$ df
Filesystem     1K-blocks    Used Available Use% Mounted on
udev              856432       0    856432   0% /dev
tmpfs             198600    2936    195664   2% /run
/dev/sda1       32895760 1325020  29876688   5% /
tmpfs             992996       0    992996   0% /dev/shm
tmpfs               5120       4      5116   1% /run/lock
tmpfs             992996       0    992996   0% /sys/fs/cgroup
/dev/mmcblk0p3    258095   62399    195697  25% /boot
tmpfs             198596       0    198596   0% /run/user/0
tmpfs             198596       0    198596   0% /run/user/1000
Be careful to continue social distancing while doing the happy dance (or unhappy dance depending on the output). Since there is still a long way to go, the next post will discuss how to tighten security.
Last edited by ejolson on Sat May 23, 2020 6:01 am, edited 7 times in total.

ejolson
Posts: 4931
Joined: Tue Mar 18, 2014 11:47 am

Re: iSCSI Root Like a Data Center

Sat May 23, 2020 4:54 am

Security will have to wait. With apologies to Liam Sternberg and The Bangles,

All the old programs on the NAS
They do the iSCSI dance don't you know
If they execute too quick (go boot go)
They're paging out like a domino

It's so bizarre with the Pi
Invisible options for a bet
SDHCI crocodiles (go boot go)
No SD card and it don't boot yet

Binary types connected by pipes say
(Boot oh boot go, IPL oh boot go)
iSCSI root like a data center
They're booting like a data center

All the Pi's in the marketplace say
(Boot oh boot go, IPL oh boot go)
iSCSI root like a data center
They queue in line like a data center

All Pi computers with all of their kin
(Boot oh boot go, IPL oh boot go)
iSCSI root like a data center
iSCSI root like a data center


I would post a link to the original but understand it was banned by the BBC in 1991 and I don't want to get banned myself. However, if someone reading this is a member of a band that records an iSCSI remake based on the above lyrics, please post a link here.

Back on topic, Fido just sent me compelling evidence

https://www.youtube.com/watch?v=CIm3mMmmuUQ

that the computer in Sprocket's workshop is, in fact, the rare and wonderful Apple III, which, according to Steve Wozniak, was designed by the marketing department. Thankfully, all the Pi models (even the 3B) appear to be created by engineers.

ejolson
Posts: 4931
Joined: Tue Mar 18, 2014 11:47 am

Re: iSCSI Root Like a Data Center

Sat May 23, 2020 7:15 pm

In this post we add security to the iSCSI drive based on the challenge handshake authentication protocol. There are four places where the password needs to be added:
  • The NAS providing the share.
  • cmdline.txt on /dev/mmcblk0p3.
  • /etc/iscsi/iscsid.conf on /dev/mmcblk0p2.
  • iscsid.conf on the network block device.
Since adding the CHAP password is likely to make the network block device become temporarily unavailable, first boot back into the SD card by typing ctrl-alt-del on the Pi 4B. Verify the network block device is no longer mounted as root with

Code: Select all

$ df
Filesystem     1K-blocks    Used Available Use% Mounted on
/dev/root       14339504 1290784  12434336  10% /
devtmpfs          860900       0    860900   0% /dev
tmpfs             992996       0    992996   0% /dev/shm
tmpfs             992996   16812    976184   2% /run
tmpfs               5120       4      5116   1% /run/lock
tmpfs             992996       0    992996   0% /sys/fs/cgroup
/dev/mmcblk0p1    258095   53463    204632  21% /boot
tmpfs             198596       0    198596   0% /run/user/1000
Note that /dev/sda1 is no longer root and the size of the partition matches the size of the SD card. We can now safely make the changes needed to add passwords. Although the universal Fido-approved password is bark256, I generated mine using the built-in random number generator on the Pi.

Code: Select all

$ (dd if=/dev/random bs=16 count=1 | base64 | head -c15); echo
1+0 records in
1+0 records out
16 bytes copied, 0.0019375 s, 8.3 kB/s
R2w5hfKJrrv0rKv
Note this password contains

15*log2(64)=90

bits of entropy. If that's not enough, try bark256.
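
The arithmetic above and a variant of the generation command can be sketched as follows. This is not the exact command used in the post: it swaps in /dev/urandom, assumes head and base64 behave as in GNU coreutils, and the function names are illustrative only.

```shell
# Each base64 character encodes 6 bits, since log2(64) = 6.
entropy_bits() {
    # $1 = number of base64 characters kept
    echo $(( $1 * 6 ))
}

# Hypothetical helper: print a random password of $1 base64 characters.
gen_password() {
    n=${1:-15}
    # 32 random bytes encode to 44 base64 characters, so any n up to 42
    # avoids the trailing '=' padding.
    head -c 32 /dev/urandom | base64 | tr -d '\n' | head -c "$n"
    echo
}

echo "bits of entropy in 15 characters: $(entropy_bits 15)"
gen_password 15
```

Asking for more characters raises the entropy by 6 bits each, so 22 characters would give 132 bits.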

Now, log into the NAS (in this case the computer named tapir.wulf) and set the password for the target

Code: Select all

$ slogin tapir.wulf
$ su -
# cd /etc/tgt/conf.d
Then edit iroot.conf in /etc/tgt/conf.d so it reads as follows.

Code: Select all

<target iqn.1999-01.wulf.tapir:odroid:iroot>
    initiator-address 192.168.174.145
    backing-store /home/targets/iroot.img
    outgoinguser turbo R2w5hfKJrrv0rKv
</target>
The added line gives the outgoing user and password. Both of these are entirely arbitrary. Here I have set the user as the shortened name for the Pi 4B with the iSCSI root and the password to be the 90-bit random number generated above. If for some reason you have bypassed the two firewalls shielding the tapir from the rest of the Internet, please note I used a different password in my personal setup and encourage you to do the same.

Before logging out of the server, restart the target and check it with

Code: Select all

# systemctl restart tgt
# tgtadm --mode target --op show
Target 1: iqn.1999-01.wulf.tapir:odroid:iroot
    System information:
        Driver: iscsi
        State: ready
    I_T nexus information:
    LUN information:
        LUN: 0
            Type: controller
            SCSI ID: IET     00010000
            SCSI SN: beaf10
            Size: 0 MB, Block size: 1
            Online: Yes
            Removable media: No
            Prevent removal: No
            Readonly: No
            SWP: No
            Thin-provisioning: No
            Backing store type: null
            Backing store path: None
            Backing store flags: 
        LUN: 1
            Type: disk
            SCSI ID: IET     00010001
            SCSI SN: beaf11
            Size: 34360 MB, Block size: 512
            Online: Yes
            Removable media: No
            Prevent removal: No
            Readonly: No
            SWP: No
            Thin-provisioning: No
            Backing store type: rdwr
            Backing store path: /home/targets/iroot.img
            Backing store flags: 
    Account information:
        turbo (outgoing)
    ACL information:
        192.168.174.145
The user turbo should now be listed under account information.

Next on the list is to update cmdline.txt on the extra boot partition of the Pi 4B. If needed, log out of the NAS and then on the 4B type

Code: Select all

$ sudo -s
# mount LABEL=TXTRA /extra
# cd /extra
and then edit cmdline.txt in /extra so it reads all on one line as

Code: Select all

iscsi_username=turbo iscsi_password=R2w5hfKJrrv0rKv iscsi_initiator=iqn.1993-08.org.debian:01:e3b7873b683 iscsi_target_name=iqn.1999-01.wulf.tapir:odroid:iroot iscsi_target_ip=192.168.174.153 console=serial0,115200 console=tty1 root=LABEL=IROOT rootfstype=ext4 elevator=deadline fsck.repair=yes rootwait sdhci.debug_quirks2=4
Two new options have been added to the beginning that reflect the exact user and password. At this point, it is worth observing that since the boot partition is formatted with the MSDOS file system, it's irritatingly difficult to secure cmdline.txt so it's not world readable. A reasonable workaround is to make the entire boot directory readable only by root. We'll discuss this in the next post.

The last two items on the list are essentially the same and can be done together. In fact, they likely don't need to be done at all. However, in order to avoid a possible human panic when waking up in the middle of the night with a fever and cough in response to an iSCSI root related Linux panic, update iscsid.conf in /etc/iscsi so the lines shown below read

Code: Select all

# To enable CHAP authentication set node.session.auth.authmethod
# to CHAP. The default is None.
node.session.auth.authmethod = CHAP

# To set a CHAP username and password for initiator
# authentication by the target(s), uncomment the following lines:
node.session.auth.username = turbo
node.session.auth.password = R2w5hfKJrrv0rKv
Then copy this file to the network block device using

Code: Select all

# /etc/init.d/open-iscsi restart
# iscsiadm -m node -L all
Logging in to [iface: default, target: iqn.1999-01.wulf.tapir:odroid:iroot, portal: 192.168.174.153,3260] (multiple)
Login to [iface: default, target: iqn.1999-01.wulf.tapir:odroid:iroot, portal: 192.168.174.153,3260] successful.
# mount LABEL=IROOT /iroot
# cd /iroot/etc/iscsi
# cp /etc/iscsi/iscsid.conf .
# cd /
# sync
# umount /iroot
# iscsiadm -m node -U all
Logging out of session [sid: 3, target: iqn.1999-01.wulf.tapir:odroid:iroot, portal: 192.168.174.153,3260]
Logout of [sid: 3, target: iqn.1999-01.wulf.tapir:odroid:iroot, portal: 192.168.174.153,3260] successful.
At this point it should be possible to reboot the Pi 4B into the password protected iSCSI root like a data center.

Code: Select all

# systemctl reboot 3
It worked for me the first time!
Last edited by ejolson on Sun May 24, 2020 4:05 am, edited 14 times in total.

ejolson
Posts: 4931
Joined: Tue Mar 18, 2014 11:47 am

Re: iSCSI Root Like a Data Center

Sat May 23, 2020 7:29 pm

Yikes, I just got a phone call from the canine coder. Fido was barking mad. What kind of security is that? You've just put the password in a world readable file on the boot partition and also on an authenticated but non-encrypted iSCSI network block device. The rest sounded like barking to me.

Do you think using the bark256 password would have saved my ears? Other than an ultrasonic dog collar, does anyone have an idea how to improve things?

ejolson
Posts: 4931
Joined: Tue Mar 18, 2014 11:47 am

Re: iSCSI Root Like a Data Center

Sun May 24, 2020 2:18 am

In the middle of the night a noise woke me up. I looked out and saw the canine coder on the dog house positioning a parabolic antenna pointed distinctly towards the segmented system area network behind the router behind the router where I live.

While I don't think the 8-bit computers which cohabitate in the dog house have enough processing power for the software defined radio decoders needed to snoop gigabit Ethernet, I decided to perform some security as an afterthought before phase two allowed a virus to invade my iSCSI root unlike a data center.

Trying not to panic I logged in and edited the version of iscsid.conf in /etc/iscsi on the network block device so the node session password was

Code: Select all

node.session.auth.password = bark256
My hope is, since the initial RAM file system mounts the iSCSI root device, that the real password does not actually need to be stored on the network block device.

To address the other problem I edited /etc/fstab and added fmask=077 to the boot mount so it now looks like

Code: Select all

proc        /proc proc defaults         0 0
LABEL=TXTRA /boot vfat fmask=077        0 2
LABEL=IROOT /     ext4 defaults,noatime 0 1
Then I rebooted the iSCSI root with

Code: Select all

# systemctl reboot 3
to verify everything works. It did and just in time too! Upon glancing out the window I noticed that Fido had finished fiddling with the antenna and was already back inside the dog house.

In closing, it's worth mentioning that it is equally insecure to store passwords and ssh private keys on an unencrypted NFS or SMB mount. Although both NFSv4 and SMB3 support encryption, depending on network speed, the additional server-side resources can become noticeable. An alternative, only possible with iSCSI, would be to set up an encrypted root using dm-crypt on the Pi 4B client side. This provides full-disk encryption of the network block device without any additional load on the server.

Unfortunately, if Fido has recourse to government levels of funding for mischief making, even dm-crypt may not be enough, because full-disk encryption was not designed to withstand continuous surveillance of the encrypted volume over time. In that case, an additional layer of security would be needed, such as using a peashooter to knock that antenna off the top of the dog house.

These and similar kinds of problems may explain why the data center here has been designing a robot to exterminate security issues inside and around the perimeter of the facility.

https://www.switch.com/switch-sentry/

Could a Raspberry Pi walk like a Dalek?

ejolson
Posts: 4931
Joined: Tue Mar 18, 2014 11:47 am

Re: iSCSI Root Like a Data Center

Sun May 24, 2020 8:25 am

The dog developer sent me an invitation for Zoom. During the meeting it became clear the house where I live was blocking the direct line of sight to the microwave tower for the Internet service. Fido complained, even with the transmitter turned to maximum there is no connection. What a relief, I sighed, I thought you were trying to get my 90-bit iSCSI password.

After an amazing impersonation of a hyena, the dog developer managed to type "tail /proc/cmdline" in the Zoom chat window. Knowing Fido's dislike for cat, I perversely tried

Code: Select all

$ cat /proc/cmdline
coherent_pool=1M 8250.nr_uarts=0 cma=64M cma=256M video=HDMI
-A-1:[email protected],margin_left=0,margin_right=0,margin_top=0
,margin_bottom=0 smsc95xx.macaddr=DC:A6:32:0D:8F:6A vc_mem.m
em_base=0x3ec00000 vc_mem.mem_size=0x40000000  iscsi_usernam
e=turbo iscsi_password=R2w5hfKJrrv0rKv iscsi_initiator=iqn.1
993-08.org.debian:01:e3b7873b683 iscsi_target_name=iqn.1999-
01.wulf.tapir:odroid:iroot iscsi_target_ip=192.168.174.153 c
onsole=ttyS0,115200 console=tty1 root=LABEL=IROOT rootfstype
=ext4 elevator=deadline fsck.repair=yes rootwait sdhci.debug
_quirks2=4
then wished I hadn't. Quickly I cleared the screen hoping the dog developer hadn't been recording the Zoom session and instead entered

Code: Select all

$ sudo chmod 400 /proc/cmdline
There were more hyena sounds and "dmesg | grep iscsi_password" appeared in the chat window. This time I was careful to disable my screen share.

Code: Select all

$ dmesg | grep iscsi_password
[    0.000000] Kernel command line: coherent_pool=1M 8250.nr
_uarts=0 cma=64M cma=256M video=HDMI-A-1:[email protected],margi
n_left=0,margin_right=0,margin_top=0,margin_bottom=0 smsc95x
x.macaddr=DC:A6:32:0D:8F:6A vc_mem.mem_base=0x3ec00000 vc_me
m.mem_size=0x40000000  iscsi_username=turbo iscsi_password=R
2w5hfKJrrv0rKv iscsi_initiator=iqn.1993-08.org.debian:01:e3b
7873b683 iscsi_target_name=iqn.1999-01.wulf.tapir:odroid:iro
ot iscsi_target_ip=192.168.174.153 console=ttyS0,115200 cons
ole=tty1 root=LABEL=IROOT rootfstype=ext4 elevator=deadline 
fsck.repair=yes rootwait sdhci.debug_quirks2=4
I reflected, so this is what it's like to play whack-a-mole with a password that appears on the command line. After a bit of web searching I tried

Code: Select all

$ sudo sysctl -w kernel.dmesg_restrict=1
$ dmesg
dmesg: read kernel buffer failed: Operation not permitted
I then updated /etc/rc.local and added

Code: Select all

echo Some security as an afterthought for iSCSI root...
chmod 400 /proc/cmdline
sysctl -w kernel.dmesg_restrict=1
to make these changes permanent and rebooted with

Code: Select all

$ sudo systemctl reboot 3
to verify that everything still worked.

It should be pointed out it is a bad idea to place any password on a command line and it is not surprising that it leaked out the way it did. Since this is not a remotely exploitable leak, it is a non-issue if the only account on the Pi is the trusted pi user who already has root privilege. On the other hand, if shell access is given to other users or a web server runs amok, then leaving world readable files with critical system passwords lying about might be a bad idea.

The alternative and likely recommended solution is to place the password in /etc/iscsi.initramfs so it's automatically embedded into the initial RAM filesystem and need not appear explicitly on the command line. The next post will explore this idea.
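
To keep score in the game of whack-a-mole, the leak described above can be tested for directly. This is a minimal sketch with a hypothetical function name; it only looks for the iscsi_password= option in a file holding a kernel command line and says nothing about other places a password might surface.

```shell
# Hypothetical helper: succeed if the given command line file exposes
# an iscsi_password= option. Defaults to the running kernel's command line.
cmdline_leaks_password() {
    grep -q 'iscsi_password=' "${1:-/proc/cmdline}"
}

# Example run on two throwaway files instead of the real /proc/cmdline.
tmp=$(mktemp)
echo 'iscsi_username=turbo iscsi_password=bark256 root=LABEL=IROOT' > "$tmp"
cmdline_leaks_password "$tmp" && echo "password visible on the command line"
echo 'iscsi_username=turbo root=LABEL=IROOT' > "$tmp"
cmdline_leaks_password "$tmp" || echo "no password on the command line"
rm -f "$tmp"
```

Run as an unprivileged user with no argument, the same check gives a rough idea of what other accounts on the Pi could read.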

ejolson
Posts: 4931
Joined: Tue Mar 18, 2014 11:47 am

Re: iSCSI Root Like a Data Center

Mon May 25, 2020 3:17 am

I decided to try the iscsi.initramfs idea mentioned in the previous post. As none of this stuff seems documented, I looked at the iscsi script in initramfs-tools/hooks that builds the initial RAM filesystem and discovered the file containing the CHAP password needs to be placed in the /etc/iscsi subdirectory.

Since that's on the network block device, this poses the same problem as before: Data transfers to and from the iSCSI disk are authenticated but not encrypted. As the point is to increase security rather than reduce it, sending critical system passwords in plain text over the network would be counterproductive.

I wanted to ask the dog developer for advice, but as the Internet connection to the dog house was down, the only way to do this was through FidoNet using dial-up. After downloading the correct app for my mobile phone, I eventually heard the distinctive warbling for the carrier of a 1200 baud modem and was able to connect. The resulting PETSCII reply initially looked like barking to me, but eventually I understood the idea: Use symbolic links to place all files which contain sensitive system passwords in a secret directory on the SD card.

Upon thinking about this for a while, I wondered whether it might be easier to win the game of whack-a-mole or maybe use a dm-crypt secured network block device. At the same time to see whether Fido's idea would work I tried it.

On the Pi 4B I typed

Code: Select all

$ df
Filesystem     1K-blocks    Used Available Use% Mounted on
udev              856432       0    856432   0% /dev
tmpfs             198600    5404    193196   3% /run
/dev/sda1       32895760 1325536  29876172   5% /
tmpfs             992996       0    992996   0% /dev/shm
tmpfs               5120       4      5116   1% /run/lock
tmpfs             992996       0    992996   0% /sys/fs/cgroup
/dev/mmcblk0p3    258095   62399    195697  25% /boot
tmpfs             198596       0    198596   0% /run/user/1000
to verify it was booted into the iSCSI root, and then made the suggested changes.

Code: Select all

$ sudo -s
# cd /boot
# mkdir secret
# cd secret
# mv /etc/iscsi/iscsid.conf .
# touch iscsi.initramfs
# cd /etc/iscsi
# ln -s /boot/secret/iscsid.conf .
# ln -s /boot/secret/iscsi.initramfs .
I edited iscsi.initramfs in /boot/secret so it contained the line

Code: Select all

ISCSI_PASSWORD=R2w5hfKJrrv0rKv
and changed iscsid.conf so instead of bark256 it also had the correct password. Finally, I regenerated the initial RAM filesystem with

Code: Select all

# cd /boot
# uname -a
Linux raspberrypi 4.19.97-v7l+ #1294 SMP Thu Jan 30 13:21:14 GMT 2020 armv7l GNU/Linux
# update-initramfs -c -k 4.19.97-v7l+
update-initramfs: Generating /boot/initrd.img-4.19.97-v7l+
# mv initrd.img-4.19.97-v7l+ initrd7l.img
and removed the CHAP password from /boot/cmdline.txt so it won't keep popping up in unexpected places. For reference cmdline.txt now looks like

Code: Select all

iscsi_username=turbo iscsi_initiator=iqn.1993-08.org.debian:01:e3b7873b683 iscsi_target_name=iqn.1999-01.wulf.tapir:odroid:iroot iscsi_target_ip=192.168.174.153 console=serial0,115200 console=tty1 root=LABEL=IROOT rootfstype=ext4 elevator=deadline fsck.repair=yes rootwait sdhci.debug_quirks2=4
Before rebooting I commented out the recently added lines in /etc/rc.local because /proc/cmdline and dmesg no longer contain any sensitive system passwords. For reference, the relevant parts of rc.local now appear as

Code: Select all

#echo Some security as an afterthought for iSCSI root...
#chmod 400 /proc/cmdline
#sysctl -w kernel.dmesg_restrict=1
At this point it seemed reasonable to reboot, so I typed

Code: Select all

# systemctl reboot 3
Amazingly, the Pi 4B rebooted with iSCSI root even more like a data center than before. In particular

Code: Select all

$ cat /boot/secret/iscsi.initramfs 
cat: /boot/secret/iscsi.initramfs: Permission denied
$ cat /proc/cmdline 
coherent_pool=1M 8250.nr_uarts=0 cma=64M cma=256M video=HDMI
-A-1:[email protected],margin_left=0,margin_right=0,margin_top=0
,margin_bottom=0 smsc95xx.macaddr=DC:A6:32:0D:8F:6A vc_mem.m
em_base=0x3ec00000 vc_mem.mem_size=0x40000000  iscsi_usernam
e=turbo iscsi_initiator=iqn.1993-08.org.debian:01:e3b7873b68
3 iscsi_target_name=iqn.1999-01.wulf.tapir:odroid:iroot iscs
i_target_ip=192.168.174.153 console=ttyS0,115200 console=tty
1 root=LABEL=IROOT rootfstype=ext4 elevator=deadline fsck.re
pair=yes rootwait sdhci.debug_quirks2=4
verifies the password no longer leaks through the command line.

Note, a disadvantage of the changes in this post is that the initial RAM file system has to be updated each time the password is changed. Moreover, a separate initial RAM file system needs to be maintained for each different password. For these reasons, under certain circumstances--for example, network booting a large number of independently administered Pi 4B computers in a data center--it's possible whack-a-mole would be more fun.


ejolson
Posts: 4931
Joined: Tue Mar 18, 2014 11:47 am

Re: iSCSI Root Like a Data Center

Tue May 26, 2020 5:54 am

Today I was feeling quite happy with how my Pi 4B was running with an iSCSI root filesystem when I accidentally unplugged the switch. I plugged it back in again, but the 4B had hung. I suspect the dhcpcd network configuration program downed the interface when the carrier was lost and that this action was irreversible because there was no root filesystem from which to run anything later when the carrier came back. If anyone knows how to fix that, help would be appreciated.

Before experimenting further with static IP addresses and more network induced crashes, I decided to make a backup. To do this I logged into the iSCSI target server, the one named tapir, and typed

Code: Select all

# tgtadm --mode target --op show
Target 1: iqn.1999-01.wulf.tapir:odroid:iroot
    System information:
        Driver: iscsi
        State: ready
    I_T nexus information:
        I_T nexus: 16
            Initiator: iqn.1993-08.org.debian:01:e3b7873b683 alias: raspberrypi
            Connection: 0
                IP Address: 192.168.174.145
    LUN information:
        LUN: 0
            Type: controller
            SCSI ID: IET     00010000
            SCSI SN: beaf10
            Size: 0 MB, Block size: 1
            Online: Yes
            Removable media: No
            Prevent removal: No
            Readonly: No
            SWP: No
            Thin-provisioning: No
            Backing store type: null
            Backing store path: None
            Backing store flags: 
        LUN: 1
            Type: disk
            SCSI ID: IET     00010001
            SCSI SN: beaf11
            Size: 34360 MB, Block size: 512
            Online: Yes
            Removable media: No
            Prevent removal: No
            Readonly: No
            SWP: No
            Thin-provisioning: No
            Backing store type: rdwr
            Backing store path: /home/targets/iroot.img
            Backing store flags: 
    Account information:
        turbo (outgoing)
    ACL information:
        192.168.174.145
The nexus information indicates the iroot target is currently mounted by the Raspberry Pi 4B. Since better backups are made from a clean filesystem, I went back to the Pi with iSCSI root and rebooted it back to the SD card.

Code: Select all

$ sudo systemctl reboot
After some time I checked the server again and found that even though the Pi was now running from the SD card the server thought it was still logged in. Yikes, does that mean the Pi is not logging out of the target when it is rebooted?
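
Rather than reading the whole listing each time, the number of logged-in initiators can be counted from the tgtadm output. This is a minimal sketch, assuming the output format shown in this post; the function name count_initiators is made up for illustration.

```shell
# Hypothetical helper: read `tgtadm --mode target --op show` output on stdin
# and count the "I_T nexus: N" entries. The "I_T nexus information:" header
# carries no number and is therefore not counted.
count_initiators() {
    grep -c 'I_T nexus: [0-9]' || true
}

# Example run on a fragment of the listing shown earlier; prints 1.
count_initiators <<'EOF'
    I_T nexus information:
        I_T nexus: 16
            Initiator: iqn.1993-08.org.debian:01:e3b7873b683 alias: raspberrypi
    LUN information:
EOF
```

On the server this would be used as `tgtadm --mode target --op show | count_initiators`, where a result of 0 suggests nothing is logged in.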

I decided the difficulty might be because of the crash I had earlier, so I proceeded to try again. After restarting the target, there was no change. Then I booted the Pi back and forth one more time between the iSCSI root and the SD card. After doing this, somewhat to my surprise, tgtadm now showed no active initiators.

Code: Select all

# tgtadm --mode target --op show
Target 1: iqn.1999-01.wulf.tapir:odroid:iroot
    System information:
        Driver: iscsi
        State: ready
    I_T nexus information:
    LUN information:
        LUN: 0
            Type: controller
            SCSI ID: IET     00010000
            SCSI SN: beaf10
            Size: 0 MB, Block size: 1
            Online: Yes
            Removable media: No
            Prevent removal: No
            Readonly: No
            SWP: No
            Thin-provisioning: No
            Backing store type: null
            Backing store path: None
            Backing store flags: 
        LUN: 1
            Type: disk
            SCSI ID: IET     00010001
            SCSI SN: beaf11
            Size: 34360 MB, Block size: 512
            Online: Yes
            Removable media: No
            Prevent removal: No
            Readonly: No
            SWP: No
            Thin-provisioning: No
            Backing store type: rdwr
            Backing store path: /home/targets/iroot.img
            Backing store flags: 
    Account information:
        turbo (outgoing)
    ACL information:
        192.168.174.145
For now I'll attribute the previous glitch to the recent crash. Would it help to edit initiatorname.iscsi in /etc/iscsi/ so the SD card and iSCSI root use different IQNs when they mount the iroot target? At any rate, I proceeded to back things up. Still on the server, I did this with

Code: Select all

# cd /home/targets
# cp --reflink=always iroot.img iroot-2020.05.25
Note that since iroot.img is a regular file on a BTRFS formatted SSD, it is possible to make an instant backup with the reflink option. Leave this option out if not using BTRFS. If the backing store is an LVM2 partition or Ceph volume, use a different way to copy it. The shortness of this journey, however, leaves no time to go into such details.
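
For a file-backed target, the dated copy can be wrapped in a small function. This is a minimal sketch, assuming GNU cp; it uses --reflink=auto instead of --reflink=always so the same command falls back to an ordinary copy on filesystems without reflink support. The function name backup_image is hypothetical.

```shell
# Hypothetical helper: copy a backing image to a name carrying today's date,
# for example iroot.img -> iroot-2020.05.25, and print the new name.
backup_image() {
    src=$1
    dst="${src%.img}-$(date +%Y.%m.%d)"
    # --reflink=auto clones instantly where the filesystem supports it
    # and falls back to a normal copy otherwise.
    cp --reflink=auto "$src" "$dst" && echo "$dst"
}

# Example (not run here): backup_image /home/targets/iroot.img
```

As in the post, the copy is only a clean backup if no initiator has the target mounted while it is taken.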

Having made a backup, I decided to mount the backup on the server to verify what is there. To do this, I determined the offset of the root partition within the disk image using fdisk.

Code: Select all

# fdisk -l iroot-2020.05.25
Disk iroot-2020.05.25: 32 GiB, 34359738368 bytes, 67108864 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x413f8794

Device             Boot Start      End  Sectors Size Id Type
iroot-2020.05.25p1       2048 67108863 67106816  32G 83 Linux
The partition begins at sector 2048 or with an offset of

2048*512=1048576

bytes. Mount and check it over loopback.

Code: Select all

# mkdir iroot
# mount -o offset=1048576 iroot-2020.05.25 iroot
# cd iroot
# ls
bin   dev  home  lost+found  mnt  proc  run   srv  tmp  var
boot  etc  lib   media       opt  root  sbin  sys  usr
# du -ks *
9480    bin
4   boot
4   dev
3452    etc
44  home
356836  lib
16  lost+found
4   media
4   mnt
41056   opt
4   proc
40  root
4   run
10000   sbin
4   srv
4   sys
24  tmp
594168  usr
262068  var
Since everything looks reasonable, I unmounted the backup with

Code: Select all

# cd /home/targets
# umount iroot
# rmdir iroot
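The offset arithmetic used for the loopback mount can be sketched as a small shell function. The name partition_offset is hypothetical; the two arguments are the Start column and the sector size reported by fdisk.

```shell
# Hypothetical helper: convert a partition's start sector to a byte offset.
# partition_offset START_SECTOR [SECTOR_SIZE]   (sector size defaults to 512)
partition_offset() {
    start_sector=$1
    sector_size=${2:-512}
    echo $(( start_sector * sector_size ))
}

# Values taken from the fdisk listing above.
partition_offset 2048 512
```

With the values from the listing this prints 1048576, the number passed to mount as offset=.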
Now that a backup has been made, my plan is to spend some time changing the configuration in dhcpcd.conf among other places to see if I can get the Pi with iSCSI root to survive the network cable being removed and reinserted without crashing. If I have any luck, I'll post back here; if not, I'll restore the backup just made and continue on as if nothing happened. Again, any ideas how to make a system with network mounted root filesystem more robust against intermittent network errors would be much appreciated.

ejolson
Posts: 4931
Joined: Tue Mar 18, 2014 11:47 am

Re: iSCSI Root Like a Data Center

Tue May 26, 2020 3:48 pm

ejolson wrote:
Tue May 26, 2020 5:54 am
Again, any ideas how to make a system with network mounted root filesystem more robust against intermittent network errors would be much appreciated.
The fix turned out to be amazingly easy. I found it by reading the man page for dhcpcd.conf without any help from the dog developer. Just add the option nolink right after the persistent option. The relevant lines in /etc/dhcpcd.conf are

Code: Select all

# Persist interface configuration when dhcpcd exits.
persistent
nolink

# Rapid commit support.
Now I can unplug and plug the network interface without crashing the iSCSI root. Woohoo! Problem solved on the first try. This also solves a long-standing problem I had with NFS root back before the days of quarantine with students kicking the loose cables under their desks in the computing lab. I'd better remember nolink, in case I'm wearing a face mask while trying to give a lecture next fall.
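If the same change has to be made on several images, the edit can be scripted. This is only a sketch: the function name add_nolink is made up, it assumes GNU sed, and it assumes the file contains a line reading exactly persistent, as the stock Raspbian dhcpcd.conf shown above does. If no such line exists, nothing is added.

```shell
# Hypothetical helper: add "nolink" after the "persistent" line, once only.
# add_nolink CONF_FILE
add_nolink() {
    conf=$1
    # Do nothing if nolink is already present on a line of its own.
    if grep -qx 'nolink' "$conf"; then
        return 0
    fi
    # GNU sed: append a new line after the line that reads exactly "persistent".
    sed -i '/^persistent$/a nolink' "$conf"
}
```

Run it as root with /etc/dhcpcd.conf as the argument; running it a second time leaves the file unchanged.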

I wonder if the official Pi Server setup uses the nolink option, something else or simply crashes when the network cable gets momentarily unplugged.

https://www.raspberrypi.org/blog/piserver/

ejolson
Posts: 4931
Joined: Tue Mar 18, 2014 11:47 am

Re: iSCSI Root Like a Data Center

Wed May 27, 2020 4:23 am

I was dreaming about how perfect the iSCSI root was working when I suddenly realized the Pi was still using a 100 MB swap file very unlike a data center. Clearly the system needed a swap partition.

The usual way to create a swap partition on Raspbian is to make it before the root image on the SD card is expanded in essentially the same way that the extra boot partition was made earlier in this thread. An alternative is to later shrink the root partition so there is room on the SD card. Note that shrinking a partition is noticeably more difficult, because the file system can't be mounted when shrinking.

With iSCSI root there is a third option: Grow the network block device. In the present case, this was easy as the backing store is just a regular file. What could go wrong?

While it would be amusing to find out what would happen if suddenly the network block device appeared bigger to the Pi while it was attached, I was already done with my adventure for the day--going for a short walk around the neighborhood. Therefore, I decided to first boot the Pi into the SD card before messing with the iSCSI targets.

After verifying the Pi was running from the SD card, I logged into the server and increased the size of the network block device using the commands

Code: Select all

# cd /etc/init.d
# ./tgt stop
# cd /home/targets
# dd bs=1024K seek=36864 count=0 if=/dev/zero of=iroot.img
# ls -lh
-rw-r--r-- 1 root root 32G May 26 05:39 iroot-2020.05.25
-rw-r--r-- 1 root root 36G May 27 03:58 iroot.img
# cd /etc/init.d
# ./tgt start
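It is important that the seek position be larger than the current size of the image, since dd truncates the output file at the seek position and a smaller value would cut off the end of the image. Note that iroot.img is now 4 GiB larger than it was before.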
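To guard against accidentally shrinking the image, the resize can be wrapped in a check. The following is a sketch under stated assumptions: grow_image is a hypothetical name, stat -c %s and truncate are the GNU coreutils versions, and truncate -s is used in place of the dd command above to set the file length. As before, stop tgt before touching the backing file.

```shell
# Hypothetical helper: extend an image file, refusing to make it smaller.
# grow_image IMAGE NEW_SIZE_MIB
grow_image() {
    image=$1
    new_bytes=$(( $2 * 1024 * 1024 ))
    # Current apparent size in bytes (GNU stat).
    cur_bytes=$(stat -c %s "$image") || return 1
    if [ "$new_bytes" -le "$cur_bytes" ]; then
        echo "refusing: $image is already $cur_bytes bytes" >&2
        return 1
    fi
    # Extend the file; the added region is sparse until written.
    truncate -s "$new_bytes" "$image"
}
```

For the resize above, the call would be grow_image iroot.img 36864, since 36 GiB is 36864 MiB.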

Now return to the Pi 4B and boot it back into the iSCSI root.

Code: Select all

$ sudo -s
# systemctl reboot 3
After it reboots create the swap partition.

Code: Select all

$ sudo -s
# fdisk /dev/sda
Command (m for help): p
Disk /dev/sda: 36 GiB, 38654705664 bytes, 75497472 sectors
Disk model: VIRTUAL-DISK    
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: dos
Disk identifier: 0x413f8794

Device     Boot Start      End  Sectors Size Id Type
/dev/sda1        2048 67108863 67106816  32G 83 Linux

Command (m for help): n
Partition type
   p   primary (1 primary, 0 extended, 3 free)
   e   extended (container for logical partitions)
Select (default p): p
Partition number (2-4, default 2): 
First sector (67108864-75497471, default 67108864): 
Last sector, +/-sectors or +/-size{K,M,G,T,P} (67108864-75497471, default 75497471): 

Created a new partition 2 of type 'Linux' and of size 4 GiB.

Command (m for help): t
Partition number (1,2, default 2): 2
Hex code (type L to list all codes): 82

Changed type of partition 'Linux' to 'Linux swap / Solaris'.

Command (m for help): p
Disk /dev/sda: 36 GiB, 38654705664 bytes, 75497472 sectors
Disk model: VIRTUAL-DISK    
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: dos
Disk identifier: 0x413f8794

Device     Boot    Start      End  Sectors Size Id Type
/dev/sda1           2048 67108863 67106816  32G 83 Linux
/dev/sda2       67108864 75497471  8388608   4G 82 Linux swap / Solaris

Command (m for help): w
Format the swap partition.

Code: Select all

# mkswap -L ISWAP /dev/sda2
To finish, I added the swap partition to /etc/fstab and, for good luck, also added the option _netdev to both the IROOT and ISWAP entries. The new /etc/fstab looks like

Code: Select all

proc        /proc proc defaults                 0 0
LABEL=TXTRA /boot vfat fmask=077                0 2
LABEL=IROOT /     ext4 defaults,noatime,_netdev 0 1
LABEL=ISWAP none  swap sw,_netdev               0 0
Finally, disable the original 100MB swap file by changing /etc/dphys-swapfile so the relevant line reads

Code: Select all

CONF_SWAPSIZE=0
and turn off the swap file and enable the swap partition with

Code: Select all

# swapoff -a
# rm /var/swap
# swapon -a
# cat /proc/swaps
Filename       Type        Size      Used    Priority
/dev/sda2      partition   4194300   0       -2
The last command above verifies that swap on the network block device is enabled. I decided to reboot the system, just to make sure I hadn't broken anything.

Code: Select all

# systemctl reboot 3
Everything worked!
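The /proc/swaps check can also be scripted, for example to confirm after each reboot that the swap partition came back. This is a sketch with a made-up function name, swap_active; the optional second argument exists only so the function can be tried against a saved copy of the table.

```shell
# Hypothetical helper: succeed if DEVICE is listed as an active swap partition.
# swap_active DEVICE [TABLE]   (TABLE defaults to /proc/swaps)
swap_active() {
    device=$1
    table=${2:-/proc/swaps}
    # Skip the header line, then look for a matching "partition" entry.
    awk -v dev="$device" '
        NR > 1 && $1 == dev && $2 == "partition" { found = 1 }
        END { exit !found }
    ' "$table"
}
```

For example, swap_active /dev/sda2 && echo "swap is on the network block device" reports only when the entry is present.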

Once a physical SD card is full you generally need to buy a bigger one. With a network block device, it is possible to extend the size as needed. If one were planning to make repeated use of such a feature, it would likely be better to place the swap partition first and the root partition second. The only thing left is to watch some more Fraggle Rock.

incognitum
Posts: 467
Joined: Tue Oct 30, 2018 3:34 pm

Re: iSCSI Root Like a Data Center

Wed May 27, 2020 8:46 am

ejolson wrote:
Tue May 26, 2020 3:48 pm
I wonder if the official Pi Server setup uses the nolink option
Correct.

You can see what changes it has compared to standard Raspbian here: https://github.com/raspberrypi/piserver ... _cmds#L112

sdwilsh
Posts: 1
Joined: Sat May 30, 2020 11:33 pm

Re: iSCSI Root Like a Data Center

Sat May 30, 2020 11:51 pm

I was working on getting the Pi 4 to network boot as well for the past week and a half. I actually got it working without an SD card at all (after the initial setup), which is a little different from what you did.

I just finished writing it all up on https://shawnwilsher.com/2020/05/networ ... a-freenas/ and figured I'd share it here, since I came across this thread when I was debugging an issue (turned out to be a typo on my part).

ejolson
Posts: 4931
Joined: Tue Mar 18, 2014 11:47 am

Re: iSCSI Root Like a Data Center

Sun May 31, 2020 12:36 am

incognitum wrote:
Wed May 27, 2020 8:46 am
ejolson wrote:
Tue May 26, 2020 3:48 pm
I wonder if the official Pi Server setup uses the nolink option
Correct.

You can see what changes it has compared to standard Raspbian here: https://github.com/raspberrypi/piserver ... _cmds#L112
Thanks for the link. It looks like nolink is the only addition to the dhcpcd.conf file. I noticed that adding that option to the standard Raspbian image prevents dhcpcd from starting the interface and acquiring a lease. If the network is already up, as happens with the iSCSI root initial RAM filesystem, does dhcpcd at least continue to maintain and renew the existing lease?
