Extend a Personal Network to Azure: Backup Server Spot Instance

July 24, 2023 - 12 min read (2257 words)

Azure Spot Virtual Machines offer deep discounts of up to 90% off pay-as-you-go prices for interruptible workloads. As long as resiliency and fault tolerance can be built into the workload, this model can offer significant cost savings. In the workload described by this post series, the backup system is an excellent candidate for the use of spot instances. This post covers the creation of a spot instance server to replace the existing backup virtual machine.

This post will cover the creation of a spot virtual machine, preparing it for graceful shutdown upon eviction and configuring it to replace an existing Samba server within the project.

Table of Contents

Create then Initialize the Spot VM Instance

A new virtual machine needs to be created to utilize a spot instance for the backup server. Once created, it needs to be initialized and configured.

Create the Spot Virtual Machine

Spot VMs must be configured to the Run with Azure Spot discount option at the point of creation as shown in the screenshot below. Once this option is selected, the Eviction type and Eviction options sections become available. All other steps associated with the creation of a virtual machine are the same as the ones described in the previous post on the original Samba server.

For this workload, we need an eviction policy that stops and deallocates the virtual machine when the Azure region needs the underlying resources. Other workload types may elect to delete the virtual machine and its underlying resources during an eviction. However, we wish to restart the virtual machine when capacity becomes available once more. In a later post, we will develop an external orchestration mechanism that tests for the return of capacity and restarts the virtual machine following an eviction.

Create Spot VM Instance

Perform Base Image Initialization

Once the virtual machine has been allocated, we need to log in and perform some initial configuration steps.

  • Update all base packages
  • Perform initial local firewall configuration
  • Install MOTD modifications
  • Prepare for graceful shutdown in the event of an eviction (manual steps covered below)

As these steps are fairly repetitive, I have taken to scripting them.

A complete script for performing these steps is available here.

Spot Eviction Readiness

As a spot virtual machine running on excess capacity, the virtual machine needs to be ready for eviction by Azure to create capacity for premium workloads with as little as 30 seconds notice. As a result, we need to consistently query the Scheduled Event API and initiate a graceful shutdown in the event an eviction has been initiated by Azure.

The Scheduled Event API is hosted on a non-routable address within the virtual network: 169.254.169.254. We may query this endpoint as often as once per second per the documentation. However, querying and then processing the results require CPU and memory. In our workload, we will query every 10 seconds in the final configuration.

The Eviction Query Script

The core logic of the eviction event polling and graceful shutdown mechanism is a bash script that utilizes curl to query the endpoint. The JSON results of the query are piped into grep to look for instances of the Preempt event. If a preempt event is discovered, a notice is passed to system users and an operating system shutdown is started.

The script requires root permissions to run and should be stored in /usr/local/sbin as a result.

#!/usr/bin/env bash

ENDPOINT_IP="169.254.169.254";
API_VERSION="2020-07-01";

HEADER="Metadata:true";
ENDPOINT_URL="http://${ENDPOINT_IP}/metadata/scheduledevents?api-version=${API_VERSION}";

if [ "$(curl -s -H ${HEADER} ${ENDPOINT_URL} | grep -c Preempt)" -ge 1 ]
    then
        echo "Azure preempt event found... Shutting down.";
        wall "Azure preempt event found... Shutting down cleanly prior to eviction.";
        sleep 5;
        shutdown now "Shutting down. Virtual machine is being evicted.";
    else
        echo "No Azure preempt event found.";
fi

A current version of this complete script can be found here.

The Eviction Query Crontab Configuration

With the logic of the script in place, the next step is to schedule its execution. This is accomplished by placing a crontab snippet into the /etc/cron.d/ directory.

Cron does not allow for scheduling below the minute level. In order to achieve execution of the script every 10 seconds, a little trickery using the sleep command can be performed. All six jobs are kicked off simultaneously with varying levels of sleep in front of script execution.

To support logging to syslogd, both the stdout and stderr outputs of the script are directed to the logger command with a tag of preempt-script. Outputs from the script will be available in /var/log/syslog.

# /etc/cron.d/preempt-query: crontab entries for the vm preempt event query script

SHELL=/bin/sh
PATH=/usr/sbin:/usr/bin:/sbin:/bin

* * * * * root ( /usr/local/sbin/query-for-preempt-event.sh ) 2>&1 | logger -t preempt-query
* * * * * root ( sleep 10 ; /usr/local/sbin/query-for-preempt-event.sh ) 2>&1 | logger -t preempt-query
* * * * * root ( sleep 20 ; /usr/local/sbin/query-for-preempt-event.sh ) 2>&1 | logger -t preempt-query
* * * * * root ( sleep 30 ; /usr/local/sbin/query-for-preempt-event.sh ) 2>&1 | logger -t preempt-query
* * * * * root ( sleep 40 ; /usr/local/sbin/query-for-preempt-event.sh ) 2>&1 | logger -t preempt-query
* * * * * root ( sleep 50 ; /usr/local/sbin/query-for-preempt-event.sh ) 2>&1 | logger -t preempt-query

A current version of this complete file can be found here.

Simulate an Eviction

To prove our work, the next step is to simulate an eviction using the following command from outside the virtual machine. At this point, the spot instance will receive the Preempt event from Azure through the query script and initiate a graceful shutdown. If you are logged into the server via ssh at the time, you will see a wall message in your terminal prior to the system shutting down.

az vm simulate-eviction --resource-group personal-network --name ubuntu-backup-server-spot

Once the system shuts down, we will need to return to the Azure portal to start back up as we do not yet have an external reallocation orchestration in place yet.

Perform Initial Samba Configuration

Following the same steps used in the previous post, Samba needs to be installed with local users and Samba users:

  • Install Samba
  • Create Local Samba User and Group
  • Create Samba Passwords for Use in Shares
  • Configure /etc/samba/smb.conf

To save a little time, I used scp to move the smb.conf file from the original server.

Shutdown the Existing Backup Server

With intention of detaching and moving the backup data disk to the new spot instance, we need to use the portal to stop the original server. Navigate to the old server in the portal and select Stop from the main toolbar.

Snapshot Backup Data Disk

For safety, prior to performing any operations on the data disk, create a snapshot by navigating to the disk in the portal and selecting Create snapshot. Once the disk operations have finished and we have proven everything is working, this snapshot can be removed.

Detach the Data Disk from the Original Backup Server

Within the portal while navigated to the original server, select Settings > Disks and scroll to the Data disks section. Scroll to the far right of the screen and select the Detach button to detach the disk from the virtual machine. Select Save to perform the operation.

Attach the Data Disk to the Spot VM

The next steps can be performed while the new spot server is running.

Navigate to the new spot virtual machine in the portal and select Settings > Disks. Scroll to the Data disks section and select Attach existing disks. Select the existing data disk from the dropdown menu and the click Save from the toolbar. The disk will be presented to the virtual machine live.

Log into the server via ssh. Run a ls command to find the new disk device in the operating system. Recall that it has a single primary partition formatted with ext4 from when it was originally created.

ls -la /dev/sd*
brw-rw---- 1 root disk 8,  0 Jul 24 18:27 /dev/sda
brw-rw---- 1 root disk 8,  1 Jul 24 18:27 /dev/sda1
brw-rw---- 1 root disk 8, 14 Jul 24 18:27 /dev/sda14
brw-rw---- 1 root disk 8, 15 Jul 24 18:27 /dev/sda15
brw-rw---- 1 root disk 8, 16 Jul 24 18:27 /dev/sdb
brw-rw---- 1 root disk 8, 17 Jul 24 18:27 /dev/sdb1
brw-rw---- 1 root disk 8, 32 Jul 24 19:18 /dev/sdcbrw-rw---- 1 root disk 8, 33 Jul 24 19:18 /dev/sdc1

In this case, the device is /dev/sdc and the partition is /dev/sdc1. Run lsblk to verify the size of the partition and that it does not yet have a mount point.

lsblk
NAME    MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
loop0     7:0    0  63.4M  1 loop /snap/core20/1974
loop1     7:1    0  53.3M  1 loop /snap/snapd/19457
loop2     7:2    0  63.4M  1 loop /snap/core20/1950
loop3     7:3    0 111.9M  1 loop /snap/lxd/24322
sda       8:0    0    30G  0 disk
├─sda1    8:1    0  29.9G  0 part /
├─sda14   8:14   0     4M  0 part
└─sda15   8:15   0   106M  0 part /boot/efi
sdb       8:16   0    16G  0 disk
└─sdb1    8:17   0    16G  0 part /mnt
sdc       8:32   0     1T  0 disk└─sdc1    8:33   0  1024G  0 part

Create a backup directory on the file system as a mount point and mount the partition.

sudo mkdir /backup
sudo mount -t ext4 -o rw /dev/sdc1 /backup

List the files, permissions and user/group ownership on the newly mounted device. Note that the group ownership is incorrect on the newly mounted filesystem. This is due to group IDs being different between the new and old system.

ls -la /backup/
total 32
drwxr-xr-x  5 root    root        4096 Jul 18 04:31 .
drwxr-xr-x 22 root    root        4096 Jul 24 19:22 ..
drwxr-xr-x  3 smbuser aad_admins  4096 Jul 24 14:52 applebackupsdrwxr-xr-x  3 smbuser aad_admins  4096 Jul 18 04:45 linuxbackupsdrwx------  2 root    root       16384 Jul 18 03:38 lost+found

To correct this problem, change group ownership recursively with the following commands:

sudo chown -R smbuser:smbgroup /backup/applebackups
sudo chown -R smbuser:smbgroup /backup/linuxbackups

Run blkid to identify the UUID of the new filesystem. Using the UUID is the only reliable way to ensure a clean mount in between boots on a virtual machine.

sudo blkid
/dev/sdb1: UUID="e562be7f-ffa0-46df-b57a-16e0b265aea1" BLOCK_SIZE="4096" TYPE="ext4" PARTUUID="b56e4a70-01"
/dev/sdc1: UUID="44accf23-56c1-486a-af6b-a2146d6a8a63" BLOCK_SIZE="4096" TYPE="ext4" PARTUUID="98e21008-01"/dev/sda15: LABEL_FATBOOT="UEFI" LABEL="UEFI" UUID="B6C3-B75F" BLOCK_SIZE="512" TYPE="vfat" PARTUUID="a295d76d-648c-4a4b-a3e0-b6307dd6e209"
/dev/sda1: LABEL="cloudimg-rootfs" UUID="1c12acfb-8f0c-440f-b6b7-6c22c1f36e1e" BLOCK_SIZE="4096" TYPE="ext4" PARTUUID="ee1583fb-fd94-4b13-98b6-5f25e7fa4580"
/dev/loop1: TYPE="squashfs"
/dev/loop2: TYPE="squashfs"
/dev/loop0: TYPE="squashfs"
/dev/loop3: TYPE="squashfs"
/dev/sda14: PARTUUID="b857fc35-18dc-48df-b3f0-ae83aa33b319"

Now that things are working as expected with the newly mounted filesystem, we can modify /etc/fstab to ensure the filesystem will mount on boot.

# CLOUD_IMG: This file was created/modified by the Cloud Image build process
UUID=1c12acfb-8f0c-440f-b6b7-6c22c1f36e1e       /        ext4   discard,errors=remount-ro       0 1
UUID=B6C3-B75F  /boot/efi       vfat    umask=0077      0 1
/dev/disk/cloud/azure_resource-part1    /mnt    auto    defaults,nofail,x-systemd.requires=cloud-init.service,_netdev,comment=cloudconfig       0       2
UUID=44accf23-56c1-486a-af6b-a2146d6a8a63 /backup ext4 rw 0 0

Reboot with sudo shutdown -r to prove everything is working and then validate the filesystem after logging in following the reboot.

Update the Local Firewall

Next update the firewall rules to allow Samba traffic with the following commands:

sudo ufw allow samba
sudo ufw status numbered

Start the Samba Service

At this point, we are ready to start the Samba service and connect with a client to prove everything is working as expected.

sudo systemctl start smbd
sudo systemctl status smbd

Create a CNAME Record on the Private DNS Zone

My host names are getting long and this exercise of changing out a key server reminded me that there are more convenient methods to setting up name resolution now that private DNS zone has been established on the network. While servers auto-register themselves with the DNS zone, it is also possible to create manual entries. As a result, I created a CNAME record to point to the new server’s A record in the private zone. In the future, I will configure all backup clients to use backup.private.jpatrickfulton.com rather than the FQDN of the backup server, giving me the freedom to switch out the implementation server without revisiting client configurations again.

Create CNAME Record

Configure the Backup Clients

Now that the server has been exchanged for a spot instance and a new internal DNS name is in place, the local network backup clients require reconfiguration.

Reconfigure macOS Clients

From a Finder window, select Go > Connect to server to open a dialog to allow macOS to select the new server and set up authentication.

macOs Connect to Server

Enter credentials and mount the applebackups share.

macOS Connect to Share

Please note that if existing backups were encrypted with a password, they may need to be recreated as the password is matched to the server hostname.

Reconfigure Linux Clients

The Linux clients are much more forgiving. From Ubuntu Desktop, select Activities and search for Backups.

Ubuntu Backup Utility

Select Preferences from the menu.

Ubuntu Backup Preferences

Click on the Location field and then update the smb URL.

Ubuntu Backup Location

Clean up by Deleting the Original Server and Snapshot

Finally, we just need to perform some cleanup to be economical with our cloud spend. Delete the old backup virtual machine. This operation will take its OS disk and network interface with it. The network security group associated with the NIC needs to be deleted manually. The disk snapshot may also be removed if everything is working as expected.

Delete Original Server
Profile picture

Written by J. Patrick Fulton.


Azure IaaS virtual machine spot instance interruptible workload samba backup bash Linux Ubuntu