Using DeployStudio Across Subnets—a Path not Taken

October 9, 2012 Brian Cunnie

At Pivotal Labs we use DeployStudio to rapidly image machines over the network. It was an excellent solution when the DeployStudio server and the client were on the same subnet. It did not work when they were on different subnets.

We found that, with a combination of clever use of tcpdump, a carefully-crafted dhcpd configuration file, and a judicious set of firewall exceptions, we were able to extend DeployStudio so that it worked across subnets.

Unfortunately, it was an epic fail: every third install would cause our firewall (m0n0wall 1.8.0b512) to lock up. We have put the project on ice until we get a new firewall.

Audience

This blog post is intended for IT organizations with the following characteristics:

  • use DeployStudio to deploy OS X workstations
  • have multiple subnets
  • are uncomfortable having a DeployStudio server span multiple networks. The usual concern is security: by compromising the DeployStudio server, a hacker would gain access to all the networks, and a DeployStudio server must run several services, at least one of which, NFS, requires discipline to implement securely
  • use an ISC DHCP server
  • are willing to put their firewall to the test

The easy way

See Ryan’s comments below. With a few lines of Cisco configuration (assuming you have a Cisco router), you can easily configure DeployStudio boots across subnets.

The rest of this blog post is the much more difficult path that I took, and I don’t recommend it unless you really enjoy doing things the hard way.

The Hard Way: Start with tcpdump

To make DeployStudio work across subnets, you first need to use tcpdump to capture how it works within a subnet. In this case, we used a laptop (kate-enet), and our DeployStudio server (deploystudio).

First, we started the capture. We captured to a file so that we could examine the output at our leisure. We ran the following command on our deploystudio server:

sudo tcpdump -w /tmp/kate.tcp -s 1536 host kate-enet

Next, we started a network install:

  • we turned on kate-enet (a 13″ MacBook Air laptop with a Thunderbolt Ethernet adapter)
  • we held down the option-key so that we were presented with a choice of boot options
  • we chose the network install
  • when the DeployStudio runtime screen came up, we ctrl-c’d the tcpdump; we had what we needed.

Then we examined the tcpdump file using the following command:

sudo tcpdump -r /tmp/kate.tcp -vvv | less

There were two packets we were particularly interested in:

deploystudio.sf.pivotallabs.com.bootps > kate-enet.sf.pivotallabs.com.bootpc: [bad udp cksum 2b5a!] BOOTP/DHCP, Reply, length 319, Flags [none] (0x0000)
      Client-IP kate-enet.sf.pivotallabs.com
      Client-Ethernet-Address 40:6c:8f:3d:e6:b4 (oui Unknown)
      Vendor-rfc1048 Extensions
        Magic Cookie 0x63825363
        DHCP-Message Option 53, length 1: ACK
        Server-ID Option 54, length 4: deploystudio.sf.pivotallabs.com
        Vendor-Class Option 60, length 9: "AAPLBSDPC"
        Vendor-Option Option 43, length 56: 1.1.1.4.2.127.209.7.4.130.0.4.56.8.4.130.0.4.56.9.35.130.0.4.56.30.49.48.46.56.95.109.97.99.95.109.105.110.105.95.115.101.114.118.101.114.45.50.48.49.50.45.48.56.48.54
        END Option 255, length 0

And

deploystudio.sf.pivotallabs.com.bootps > kate-enet.sf.pivotallabs.com.bootpc: [bad udp cksum 254b!] BOOTP/DHCP, Reply, length 379, Flags [none] (0x0000)
      Client-IP kate-enet.sf.pivotallabs.com
      Server-IP deploystudio.sf.pivotallabs.com
      Client-Ethernet-Address 40:6c:8f:3d:e6:b4 (oui Unknown)
      sname "deploystudio.sf.pivotallabs.com"
      file "/private/tftpboot/NetBoot/NetBootSP0/10.8_mac_mini_server-2012-0806.nbi/i386/booter"
      Vendor-rfc1048 Extensions
        Magic Cookie 0x63825363
        DHCP-Message Option 53, length 1: ACK
        Server-ID Option 54, length 4: deploystudio.sf.pivotallabs.com
        Vendor-Class Option 60, length 9: "AAPLBSDPC"
        RP Option 17, length 93: "nfs:10.80.28.64:/Library/NetBoot/NetBootSP0:10.8_mac_mini_server-2012-0806.nbi/NetInstall.dmg"
        Vendor-Option Option 43, length 21: 1.1.2.8.4.130.0.4.56.130.10.78.101.116.66.111.111.116.48.53.48
        END Option 255, length 0

Note:

  • You can ignore any ‘bad udp cksum’ messages: those messages are an artifact of the checksums being calculated by the Ethernet hardware (checksum offloading) instead of by the kernel, so the capture sees the packet before the checksum has been filled in.
  • deploystudio responds to DHCP queries even though it is not a DHCP server. It is not dishing out IP addresses; it is merely providing additional data for netbooting to work.

There are four crucial pieces of data that you must capture:

  • The file directive (from the second packet)
  • The RP Option 17 (from the second packet)
  • The two Vendor-Option Option 43 values (one from each packet)
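
The Option 43 payload is a series of option code, length, value triples, so it can be decoded rather than copied blindly. The following Ruby sketch splits the dotted-decimal string from tcpdump into its fields. The helper name decode_bsdp is hypothetical, and the option names in the table are an assumption based on Apple's published BSDP option codes; only the codes that appear in the two packets above are listed.

```ruby
# Option names keyed by BSDP option code.
# ASSUMPTION: names taken from Apple's BSDP documentation, not from the capture itself.
BSDP_OPTIONS = {
  1   => "message type",
  4   => "server priority",
  7   => "default boot image id",
  8   => "selected boot image id",
  9   => "boot image list",
  130 => "machine name"
}

# Split a dotted-decimal Option 43 payload into [code, name, value bytes] triples.
def decode_bsdp(dotted)
  bytes = dotted.split(".").map { |n| n.to_i }
  fields = []
  until bytes.empty?
    code, length = bytes.shift(2)   # each field starts with a code byte and a length byte
    value = bytes.shift(length)     # followed by that many value bytes
    fields << [code, BSDP_OPTIONS.fetch(code, "unknown"), value]
  end
  fields
end

# Option 43 from the second packet above.
select_reply = "1.1.2.8.4.130.0.4.56.130.10.78.101.116.66.111.111.116.48.53.48"

decode_bsdp(select_reply).each do |code, name, value|
  hex = value.map { |b| "%02x" % b }.join(":")
  puts "option #{code} (#{name}), length #{value.length}: #{hex}"
end
```

For the second packet this yields three fields: code 1 with value 02, code 8 with value 82:00:04:38, and code 130 whose ten bytes spell "NetBoot050".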

We then added the information we had culled from the tcpdump to our dhcpd.conf file (special thanks to Pepijn Oomen and Bennett Perkins; see bibliography):

class "netboot" {
    match if substring (option vendor-class-identifier, 0, 9) = "AAPLBSDPC";
    option dhcp-parameter-request-list 1,3,17,43,60;

    if (option dhcp-message-type = 1) {
        option vendor-class-identifier "AAPLBSDPC";
        option vendor-encapsulated-options
            08:04:81:00:00:89;    # bsdp option 8 (length 04) -- selected image id;
    } elsif (option dhcp-message-type = 8) {
        option vendor-class-identifier "AAPLBSDPC";
        if (substring(option vendor-encapsulated-options, 0, 3) = 01:01:01) {
            log(debug, "bsdp_msgtype_list");

            # bsdp image list message:
            # one image, plus one default image (both are the same)
            option vendor-encapsulated-options
                01:01:01:04:02:7f:d2:07:04:82:00:04:38:09:23:82:00:04:38:1e:31:30:2e:38:5f:6d:61:63:5f:6d:69:6e:69:5f:73:65:72:76:65:72:2d:32:30:31:32:2d:30:38:30:36;

        } else {
            log(debug, "bsdp_msgtype_select");

            # details about the selected image
            #
            option vendor-encapsulated-options
                01:01:02:08:04:82:00:04:38:82:0a:4e:65:74:42:6f:6f:74:30:35:30;

            next-server deploystudio.sf.pivotallabs.com;
            filename "/private/tftpboot/NetBoot/NetBootSP0/10.8_mac_mini_server-2012-0806.nbi/i386/booter";
            option root-path = "nfs:10.0.0.64:/Library/NetBoot/NetBootSP0:10.8_mac_mini_server-2012-0806.nbi/NetInstall.dmg";
        }
    }
}

Resist the temptation to substitute a hostname for the NFS server’s IP address in the root-path (i.e. leave it “nfs:10.0.0.64”; do not put “nfs:deploystudio.sf.pivotallabs.com”). IP addresses will work; hostnames won’t.

We used Ruby (irb) to convert the dotted-decimal strings in the tcpdump output to the colon-separated hexadecimal that dhcpd.conf expects. In the following example, we convert “1.1.2.8.4.130.0.4.56.130.10.78.101.116.66.111.111.116.48.53.48”:

 bc$ irb
1.9.3p194 :001 > string="1.1.2.8.4.130.0.4.56.130.10.78.101.116.66.111.111.116.48.53.48"
 => "1.1.2.8.4.130.0.4.56.130.10.78.101.116.66.111.111.116.48.53.48"
1.9.3p194 :002 > string.split(".").each { |n| printf("%02x:",n) }; p
01:01:02:08:04:82:00:04:38:82:0a:4e:65:74:42:6f:6f:74:30:35:30: => nil
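
The one-liner above leaves a trailing colon that has to be trimmed by hand before pasting into dhcpd.conf. A small reusable variant might look like the sketch below; the function names dotted_to_hex and hex_to_dotted are hypothetical, and the reverse direction is included only as a convenience for checking an existing dhcpd.conf entry against a capture.

```ruby
# Convert tcpdump's dotted-decimal bytes to dhcpd.conf's colon-separated hex.
def dotted_to_hex(dotted)
  dotted.split(".").map { |n| "%02x" % n.to_i }.join(":")
end

# Convert colon-separated hex back to dotted-decimal.
def hex_to_dotted(hex)
  hex.split(":").map { |h| h.to_i(16) }.join(".")
end

puts dotted_to_hex("1.1.2.8.4.130.0.4.56.130.10.78.101.116.66.111.111.116.48.53.48")
# prints 01:01:02:08:04:82:00:04:38:82:0a:4e:65:74:42:6f:6f:74:30:35:30
```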

Firewall Rules

If you have a firewall arbitrating traffic between the subnets, you’ll need to allow all inbound traffic to your DeployStudio server. Additionally, if your firewall can’t snoop TFTP traffic, you’ll need to allow outbound UDP traffic on unreserved ports (1024 – 65535).

Troubleshooting

If you’re having problems, check that TFTP and NFS are working, preferably from a machine that’s on the same subnet as the client you’re trying to image.

TFTP

In our example, we know that our TFTP server is deploystudio.sf.pivotallabs.com, and the file we’re downloading is /private/tftpboot/NetBoot/NetBootSP0/10.8_mac_mini_server-2012-0806.nbi/i386/booter. Let’s try from the command line:

bc $ tftp deploystudio.sf.pivotallabs.com
tftp> get /private/tftpboot/NetBoot/NetBootSP0/10.8_mac_mini_server-2012-0806.nbi/i386/booter
Received 993680 bytes in 18.3 seconds

NFS

Testing NFS is a little tricky because the NFS path is slightly mangled. Specifically, a “:” is substituted for the second-to-last “/” in the pathname. For example, the DHCP root-path directive “nfs:10.80.28.64:/Library/NetBoot/NetBootSP0:10.8_mac_mini_server-2012-0806.nbi/NetInstall.dmg” is translated to a pathname of “/net/10.80.28.64/Library/NetBoot/NetBootSP0/10.8_mac_mini_server-2012-0806.nbi/NetInstall.dmg” for testing purposes on a client machine. We take advantage of automount running on a typical OS X client: first do an ls to make sure we can see the file, then do a cp to make sure we can read the file:

 ls /net/10.80.28.64/Library/NetBoot/NetBootSP0/10.8_mac_mini_server-2012-0806.nbi/NetInstall.dmg
 cp /net/10.80.28.64/Library/NetBoot/NetBootSP0/10.8_mac_mini_server-2012-0806.nbi/NetInstall.dmg /dev/null
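
The translation from root-path to automount path can be sketched in Ruby as follows. This assumes the root-path always has the four colon-separated parts seen in our capture (the literal “nfs”, the server address, the exported directory, and the image path inside it); the function name root_path_to_automount is hypothetical.

```ruby
# Translate a DHCP root-path of the form "nfs:<server>:<export>:<image path>"
# into the /net automount path used for testing from an OS X client.
def root_path_to_automount(root_path)
  scheme, server, export, image = root_path.split(":", 4)
  unless scheme == "nfs" && image
    raise ArgumentError, "expected nfs:<server>:<export>:<image path>"
  end
  # File.join collapses the doubled "/" where the export path begins.
  File.join("/net", server, export, image)
end

puts root_path_to_automount(
  "nfs:10.80.28.64:/Library/NetBoot/NetBootSP0:10.8_mac_mini_server-2012-0806.nbi/NetInstall.dmg"
)
```

The printed path is the one passed to ls and cp above.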

Performance

The time required to image a machine will more than double. A typical install will take 40 minutes or more.

Initial Boot-up

Certain operations are much slower. Specifically, the time between selecting the netboot server and being presented with the DeployStudio runtime screen is approximately 7 minutes. We have studied that lag, and over 4 minutes of it is due to abysmal (3.8kBps) TFTP throughput. We are unclear why there is such a gross lag; running the same tftp on the command line completes 20x faster (74.7kBps).
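
As a rough check on those numbers, the arithmetic can be worked through in Ruby. This sketch assumes that the file being transferred during the lag is the 993680-byte booter from the TFTP test in the Troubleshooting section, and that 1 kB means 1000 bytes; neither assumption is confirmed by the capture.

```ruby
BOOTER_BYTES = 993_680   # size reported by the command-line tftp test (assumed to be the file in question)

# Seconds needed to move `bytes` at a given rate, taking 1 kB = 1000 bytes.
def transfer_seconds(bytes, kilobytes_per_second)
  bytes / (kilobytes_per_second * 1000.0)
end

slow = transfer_seconds(BOOTER_BYTES, 3.8)    # rate observed during netboot
fast = transfer_seconds(BOOTER_BYTES, 74.7)   # rate observed from the command line

printf("netboot TFTP: %.0f s (%.1f min)\n", slow, slow / 60)
printf("command line: %.0f s\n", fast)
printf("ratio:        %.1fx\n", slow / fast)
```

Under those assumptions the netboot transfer works out to roughly 261 seconds (about 4.4 minutes) against roughly 13 seconds from the command line, which is consistent with the “over 4 minutes” and “20x” figures above.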

We have a firewall that arbitrates traffic between our subnets, and we are aware that TFTP is challenging for firewalls: the request goes to port 69, but the server sends the data from a different, newly chosen port. (Cisco firewalls have special directives to handle TFTP traffic appropriately.)
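
To see the port behavior for yourself, a minimal TFTP read client can be sketched in Ruby using only the standard socket library. This is a diagnostic sketch based on RFC 1350, not a robust client: it does not retransmit, so a single lost packet ends the transfer with a timeout. The function names are hypothetical.

```ruby
require "socket"

# Opcodes from RFC 1350.
OP_RRQ   = 1
OP_DATA  = 3
OP_ACK   = 4
OP_ERROR = 5

# Read request: opcode, filename, NUL, transfer mode, NUL.
def rrq_packet(filename, mode = "octet")
  [OP_RRQ].pack("n") + filename + "\0" + mode + "\0"
end

# Acknowledgement: opcode followed by the block number being acknowledged.
def ack_packet(block)
  [OP_ACK, block].pack("nn")
end

# Download a file, discard the contents, and report the byte count and
# the UDP port the server sent its data from.
def tftp_fetch(host, filename, timeout = 5)
  sock = UDPSocket.new
  sock.send(rrq_packet(filename), 0, host, 69)   # the request goes to port 69
  bytes = 0
  expected = 1
  loop do
    raise "timed out waiting for data" unless IO.select([sock], nil, nil, timeout)
    packet, addr = sock.recvfrom(516)            # 4-byte header plus up to 512 data bytes
    opcode, block = packet.unpack("nn")
    raise "TFTP error: #{packet[4..-2]}" if opcode == OP_ERROR
    next unless opcode == OP_DATA
    data_port, server_ip = addr[1], addr[3]      # the data comes from a new port, not 69
    sock.send(ack_packet(block), 0, server_ip, data_port)
    next unless block == expected                # ignore duplicate blocks
    bytes += packet.bytesize - 4
    expected = (expected + 1) % 65_536
    # A data packet shorter than 516 bytes marks the end of the file.
    return { :bytes => bytes, :data_port => data_port } if packet.bytesize < 516
  end
ensure
  sock.close if sock
end

# Example call (not run here because it needs a reachable TFTP server):
# p tftp_fetch("deploystudio.sf.pivotallabs.com",
#              "/private/tftpboot/NetBoot/NetBootSP0/10.8_mac_mini_server-2012-0806.nbi/i386/booter")
```

The data_port in the result is the port a firewall must let replies through from. Because it is chosen per transfer, a firewall that does not track TFTP sessions needs the broad UDP exception described under Firewall Rules.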

Bibliography
