William Lam recently posted a tutorial walkthrough series for NSX-T and PKS on virtuallyghetto. I used it as the foundation for my home lab setup, but it was a struggle to get things working, and without the help of a very clever VMware SE, I probably wouldn't have figured it out.
Warning: NSX-T is complicated.
This post is an attempt to fill in some of the gaps in the series with information that wasn't initially obvious to me - and may not be obvious to you.
Full disclosure: I am not a VMware certified anything.
Part 1 of the PKS series is relatively straightforward:
Getting started with VMware Pivotal Container Service (PKS) Part 1: Overview
Part 2 involves prepping your PKS workstation VM. The folks at PKS are fond of Ubuntu, and so the tutorial uses an Ubuntu VM, but any machine that can run the required tools will do. I prepped both my Macbook Pro and my CentOS7 cluster-builder control station with the necessary tools, as per the article:
Getting started with VMware Pivotal Container Service (PKS) Part 2: PKS Client
Part 3 is where things start to get deep into the weeds, and fast. The article on NSX-T assumes a working NSX-T 2.x lab and references an automated PowerShell script for setting one up. The script may work for you, but I wanted to understand how all the parts fit together, and for me doing it by hand is the only way to really learn.
Getting started with VMware Pivotal Container Service (PKS) Part 3: NSX-T
I was stuck on Step #3... for a while. A kindly VMware SE walked me through it, and with his help, it started to make sense.
So here I will augment Mr. Lam's fine work with my own noob's guide to installing an NSX-T/PKS home lab...
- I am assuming you have downloaded all the dependencies from Lam's article.
- It would be wise to review the articles first to get a sense of all the steps. This post is only a few extra tips in addition to the "Lam Series" - I relied heavily on his step-by-steps at all stages.
These topics probably won't mean anything yet, but take note anyway! It could save you hours:
- Beware of the "Down" status. I mistook a status of Down on the transport nodes for a failed installation or configuration. After wasting countless hours experimenting with variations on what appeared to be a correct configuration, I was informed that this is by design: the status won't go green until a cluster has been deployed and there are k8s nodes using the VTEP.
- An MTU of 1600 or greater is required. Make sure to set all of the switches involved (VSS and VDS) to an MTU of at least 1600, as jumbo frames are required for correct NSX-T operation.
- The Edge needs to be LARGE. This is one of the few gotchas I didn't hit, but it isn't clearly called out in the article: PKS requires the Edge to have 8 CPUs and 16GB of RAM. This is a hard limit, and when you get to the very end and try to install PKS, the NSX-T errand will stop you if your Edge VM is too small.
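For the MTU tip above: on a standard VSS switch, the MTU can be checked and raised from the ESXi shell. A minimal sketch, assuming the switch is named vSwitch0 (for a VDS, set the MTU in the vSphere UI in the switch's properties instead):

```shell
# List the standard switches on this host and their current MTU
esxcli network vswitch standard list | grep -E "Name|MTU"

# Raise the MTU on vSwitch0 to 1600 to accommodate the overlay encapsulation
esxcli network vswitch standard set -v vSwitch0 -m 1600
```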
Ok... keep those things in mind...
Now let's start with a diagram:
In this post I take the example scenario described in Lam's series and implement it specifically in my home lab setup. When I first reviewed the diagram from the article, it wasn't clear to me how the two VLANs (3250 and 3251) might relate to the physical network, so I am presenting the entire picture here in an attempt to clear that up for anyone else who finds it fuzzy or confusing.
My friendly neighbourhood VMware SE simplified the configuration to better suit my lab setup ;)
What Mr. Lam references as his 3251 management network (I'm sure he has many) is, in my setup, just my primary home network of 192.168.1.0/24.
His 3250 network isn't actually needed in my setup.
Put simply (very simply):
- The overlay transport zone and its dedicated port group support the overlay networks and traffic in the k8s environment. This VTEP network allows all of the k8s VMs to communicate in their own network spaces on top of the ESXi hosts that are the "transport nodes" for this virtual network (which itself sits on top of vSphere's already-virtual network - head hurt yet?)
- The VLAN transport zone, with its dedicated port group and uplink, represents the client traffic entry point into both the NSX-T k8s dedicated load balancers and the k8s management nodes network.
The labs cluster is used for management and contains two physical ESXi hosts: esxi-5 and esxi-6. These are the compute resources dedicated to the NSX-T/ESXi virtual lab. It is referred to as the management cluster.
The pks-lab cluster is made up of three nested ESXi hosts: vesxi-1, vesxi-2 and vesxi-3. It is referred to as the compute cluster, and it is where PKS will place all the k8s nodes and related artifacts.
Specs for each physical ESXi host: i7-7700, 64GB RAM, 512GB M.2 NVMe SSD local datastores. I run all three of my virtual ESXi hosts on esxi-6, along with my nsx-edge. My virtual ESXi hosts all share a single iSCSI datastore, which is actually just a CentOS7 iSCSI target serving up a VMDK from an SSD on a Mac Mini, over my primary Gigabit ethernet (not even a dedicated network). And still I average nearly 100 MB/s write speed. Remarkable.
Before we dive into NSX-T we should tackle the matter of setting up some virtual ESXi hosts and a dedicated vSphere cluster:
Deploying the virtual ESXi hypervisors as VMs is actually fairly straightforward until you get to the networking with VLANs. If you use VLANs and wish to make them available to your virtual ESXi hosts, make sure to follow this tip.
The physical ESXi hosts need access to a dedicated NSX Tunnel portgroup (NSX requires a dedicated vnic or pnic for this). In my home lab I use standard VSS switches on my physical ESXi boxes so I am not dependent on vSphere, but I used a vSphere VDS for the pks-lab cluster of virtual ESXi hosts I am dedicating to NSX. So I had to create two NSX Tunnel portgroups: one in the local vSwitch0 VSS on each physical ESXi host, and one on the VDS that I use for my virtual ESXi hosts - VLAN 0. The NSX Tunnel basically just provides a dedicated layer 2 portgroup upon which NSX can work its layer 7 magic.
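On the physical hosts, the VSS side of this can be scripted from the ESXi shell. A sketch, assuming vSwitch0 and the portgroup name used in this post:

```shell
# Create the dedicated NSX Tunnel portgroup on the local standard switch
esxcli network vswitch standard portgroup add -p "NSX Tunnel" -v vSwitch0

# Leave it untagged (VLAN 0), as described above
esxcli network vswitch standard portgroup set -p "NSX Tunnel" -v 0
```

The VDS-side portgroup is created in the vSphere UI in the usual way.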
As per the diagram above, set up your ESXi VMs so that their network adapters are as follows:
- Network Adapter 1: VLAN Trunk
- Network Adapter 2: VM Network
- Network Adapter 3: NSX Tunnel
These will then map to vmnic0, vmnic1 and vmnic2 respectively.
This is important later on when mapping our uplink ports.
I like to set up my virtual ESXi hosts with /etc/ssh/keys-root/authorized_keys for passwordless SSH, but this isn't required if you enjoy entering passwords.
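If you want the same convenience, here is a sketch of pushing a key to a nested host, assuming an existing id_rsa keypair and vesxi-1's address (note that ESXi uses /etc/ssh/keys-root/authorized_keys rather than the usual ~/.ssh/authorized_keys):

```shell
# Append the local public key to the ESXi root user's authorized_keys
cat ~/.ssh/id_rsa.pub | ssh [email protected] \
  'cat >> /etc/ssh/keys-root/authorized_keys'
```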
Once the ESXi hosts have been uploaded and installed, they should be added to a dedicated cluster, such as pks-lab in this example.
Now would be a good time to enable DRS on your newly created pks-lab cluster. If you forget, as I did, you will regret it when you deploy, and then have to redeploy, your first k8s cluster with PKS, which leverages and depends on DRS to balance the k8s node VMs across the ESXi hosts.
The following is a screenshot of my vSphere setup for the labs and pks-lab cluster environments, as well as the Networks involved.
#!/bin/bash
ovftool \
--name=vesxi-1 \
--X:injectOvfEnv \
--X:logFile=ovftool.log \
--allowExtraConfig \
--datastore=datastore-ssd \
--network="VLAN Trunk" \
--acceptAllEulas \
--noSSLVerify \
--diskMode=thin \
--powerOn \
--overwrite \
--prop:guestinfo.hostname=vesxi-1 \
--prop:guestinfo.ipaddress=192.168.1.81 \
--prop:guestinfo.netmask=255.255.255.0 \
--prop:guestinfo.gateway=192.168.1.1 \
--prop:guestinfo.dns=8.8.8.8 \
--prop:guestinfo.domain=onprem.idstudios.io \
--prop:guestinfo.ntp=pool.ntp.org \
--prop:guestinfo.password=SuperDuckPassword \
--prop:guestinfo.ssh=True \
../../../pks/nsxt/Nested_ESXi6.5u1_Appliance_Template_v1.0.ova \
vi://[email protected]:[email protected]/?ip=192.168.1.246
ovftool \
--name=vesxi-2 \
--X:injectOvfEnv \
--X:logFile=ovftool.log \
--allowExtraConfig \
--datastore=datastore-ssd \
--network="VLAN Trunk" \
--acceptAllEulas \
--noSSLVerify \
--diskMode=thin \
--powerOn \
--overwrite \
--prop:guestinfo.hostname=vesxi-2 \
--prop:guestinfo.ipaddress=192.168.1.82 \
--prop:guestinfo.netmask=255.255.255.0 \
--prop:guestinfo.gateway=192.168.1.1 \
--prop:guestinfo.dns=8.8.8.8 \
--prop:guestinfo.domain=onprem.idstudios.io \
--prop:guestinfo.ntp=pool.ntp.org \
--prop:guestinfo.password=SuperDuckPassword \
--prop:guestinfo.ssh=True \
../../../pks/nsxt/Nested_ESXi6.5u1_Appliance_Template_v1.0.ova \
vi://[email protected]:[email protected]/?ip=192.168.1.246
ovftool \
--name=vesxi-3 \
--X:injectOvfEnv \
--X:logFile=ovftool.log \
--allowExtraConfig \
--datastore=datastore-ssd \
--network="VLAN Trunk" \
--acceptAllEulas \
--noSSLVerify \
--diskMode=thin \
--powerOn \
--overwrite \
--prop:guestinfo.hostname=vesxi-3 \
--prop:guestinfo.ipaddress=192.168.1.83 \
--prop:guestinfo.netmask=255.255.255.0 \
--prop:guestinfo.gateway=192.168.1.1 \
--prop:guestinfo.dns=8.8.8.8 \
--prop:guestinfo.domain=onprem.idstudios.io \
--prop:guestinfo.ntp=pool.ntp.org \
--prop:guestinfo.password=SuperDuckPassword \
--prop:guestinfo.ssh=True \
../../../pks/nsxt/Nested_ESXi6.5u1_Appliance_Template_v1.0.ova \
vi://[email protected]:[email protected]/?ip=192.168.1.246
Remember to add two additional network adapters so you have the following configuration for each of the nested ESXi hosts:
- Network Adapter 1: VLAN Trunk
- Network Adapter 2: VM Network
- Network Adapter 3: NSX Tunnel
The Network Adapters on your Virtual ESXi hosts should look as follows:
You'll also need to set up some sort of shared datastore for the virtual ESXi hosts. I stopped short of setting up vSAN and opted for a simpler approach. If you don't have access to a SAN in your home lab, you can create one fairly easily with a simple VM on a hypervisor.
Make sure it is on a separate ESXi host, as you can't host an iSCSI datastore from a VM running on the same hypervisor that intends to mount the target.
I used this guide from RedHat to setup a CentOS7 iSCSI target in a VM.
You can then configure your iSCSI adapter on your ESXi host to see the LUNs made available from your iSCSI VM, format them with vmfs, and just like that you have a SAN.
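The initiator side can also be sketched with esxcli from the ESXi shell, assuming the software iSCSI adapter comes up as vmhba64 and the target VM answers at 192.168.1.50:3260 (both assumptions - substitute your own adapter name and target address):

```shell
# Enable the software iSCSI initiator
esxcli iscsi software set --enabled=true

# Point dynamic discovery at the CentOS7 iSCSI target VM
esxcli iscsi adapter discovery sendtarget add -A vmhba64 -a 192.168.1.50:3260

# Rescan so the new LUNs appear, then format them with VMFS in the UI
esxcli storage core adapter rescan --adapter vmhba64
```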
Or you could use vSAN...
I followed the NSX-T install document and it was fairly straightforward up to the point of the Edge and transport nodes.
You should read this guide, as I won't be repeating all of the information involved in the NSX-T setup - just the bits that aren't entirely clear from the docs and are only covered by the automated script referenced in Mr. Lam's article.
Here is the ovftool script I used (several times over), mostly taken right from the docs.
#!/bin/bash
ovftool \
--name=nsx-manager \
--X:injectOvfEnv \
--X:logFile=ovftool.log \
--allowExtraConfig \
--datastore=datastore-m2 \
--network="VM Network" \
--acceptAllEulas \
--noSSLVerify \
--diskMode=thin \
--powerOn \
--prop:nsx_role=nsx-manager \
--prop:nsx_ip_0=192.168.1.60 \
--prop:nsx_netmask_0=255.255.255.0 \
--prop:nsx_gateway_0=192.168.1.1 \
--prop:nsx_dns1_0=192.168.1.10 \
--prop:nsx_domain_0=idstudios.local \
--prop:nsx_ntp_0=pool.ntp.org \
--prop:nsx_isSSHEnabled=True \
--prop:nsx_allowSSHRootLogin=True \
--prop:nsx_passwd_0=SUP3rD^B3r_2!07 \
--prop:nsx_cli_passwd_0=SUP3rD^B3r_2!07 \
--prop:nsx_hostname=nsx-manager \
../../pks/nsxt/nsx-unified-appliance-2.1.0.0.0.7395503.ova \
vi://[email protected]:[email protected]/?ip=192.168.1.245
Nothing complicated about this one.
Log into NSX Manager and poke around.
Next up: the NSX controllers. You really only need one in a lab.
#!/bin/bash
ovftool \
--overwrite \
--name=nsx-controller \
--X:injectOvfEnv \
--X:logFile=ovftool.log \
--allowExtraConfig \
--datastore=datastore-ssd \
--network="VM Network" \
--noSSLVerify \
--diskMode=thin \
--powerOn \
--prop:nsx_ip_0=192.168.1.61 \
--prop:nsx_netmask_0=255.255.255.0 \
--prop:nsx_gateway_0=192.168.1.1 \
--prop:nsx_dns1_0=192.168.1.10,192.168.1.2,8.8.8.8 \
--prop:nsx_domain_0=idstudios.local \
--prop:nsx_ntp_0=pool.ntp.org \
--prop:nsx_isSSHEnabled=True \
--prop:nsx_allowSSHRootLogin=False \
--prop:nsx_passwd_0=SUP3rD^B3r_2!07 \
--prop:nsx_cli_passwd_0=SUP3rD^B3r_2!07 \
--prop:nsx_cli_audit_passwd_0=SUP3rD^B3r_2!07 \
--prop:nsx_hostname=nsx-controller \
../../pks/nsxt/nsx-controller-2.1.0.0.0.7395493.ova \
vi://[email protected]:[email protected]/?ip=192.168.1.246
At this point it is best to refer to the NSX-T installation guide instructions on creating management clusters and controller clusters and the like. See Join NSX Clusters with the NSX Manager for details, and follow the guide up to the NSX Edge installation. It involves SSHing into the manager and controller and executing a few arcane token-based join commands that are common among clustering technologies.
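From memory, the join dance looks roughly like this (a sketch only - follow the install guide for the authoritative sequence and the control-cluster shared secret step):

```shell
# On the NSX Manager CLI: grab the API certificate thumbprint
get certificate api thumbprint

# On the controller CLI: join it to the management plane
# (prompts for the admin password)
join management-plane 192.168.1.60 username admin thumbprint <manager-api-thumbprint>

# Still on the controller, after setting the control-cluster
# shared secret per the guide:
initialize control-cluster
```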
I tried to follow the documentation right to the end but was unable to sort out the transport settings. It was only with the kind help of the VMware SE who guided me through the process, described below, that I got it working...
You don't actually need the Edge OVA file. This is the manual way to install the Edge:
(logged into NSX Manager)
- Add your vCenter as a compute manager under Fabric>Compute Managers.
- Now you can deploy an Edge VM directly from within the UI via Fabric>Nodes.
However, I prefer to install it with ovftool because I can enable SSH right out of the box - and doing anything more than once in web forms gets tiresome.
#!/bin/bash
ovftool \
--name=nsx-edge \
--deploymentOption=large \
--X:injectOvfEnv \
--X:logFile=ovftool.log \
--allowExtraConfig \
--datastore=datastore-ssd \
--net:"Network 0=VM Network" \
--net:"Network 1=NSX Tunnel" \
--net:"Network 2=NSX Tunnel" \
--net:"Network 3=NSX Tunnel" \
--acceptAllEulas \
--noSSLVerify \
--diskMode=thin \
--powerOn \
--prop:nsx_ip_0=192.168.1.65 \
--prop:nsx_netmask_0=255.255.255.0 \
--prop:nsx_gateway_0=192.168.1.1 \
--prop:nsx_dns1_0=8.8.8.8 \
--prop:nsx_domain_0=onprem.idstudios.io \
--prop:nsx_ntp_0=pool.ntp.org \
--prop:nsx_isSSHEnabled=True \
--prop:nsx_allowSSHRootLogin=True \
--prop:nsx_passwd_0=SuperDuckPassword \
--prop:nsx_cli_passwd_0=SuperDuckPassword \
--prop:nsx_hostname=nsx-edge \
../../../pks/nsxt/nsx-edge-2.1.0.0.0.7395502.ova \
vi://[email protected]:[email protected]/?ip=192.168.1.246
You'll want to put your Edge VM in the management cluster (in my example labs).
The networking is of particular importance: the resulting network adapters and associated port groups must match up with those shown:
It was never initially clear to me from the documentation how to lay out the transport zones. Lam's Part 3 assumes you already have your host and Edge transport nodes configured, along with your transport zones, so it doesn't provide much guidance. It is actually fairly simple.
At this stage we basically need to:
- Create two transport zones
- Create two uplink profiles
- Create a VTEP IP Pool
My setup has two transport zones defined:
and
Note the N-DVS logical switch names we assign (and create) here as part of our transport zones. These will be referenced again when we configure our Edge Transport Node a bit further on.
Here is another area where the stock documentation caused me confusion. In this example configuration the settings for the uplink profiles are very specific with respect to the network adapter settings.
Create two uplink profiles under Fabric>Profiles:
- host-uplink
- edge-uplink
(You can guess how they will be respectively used)
Ensure the host-uplink settings are as follows:
Pay particular attention to the Active Uplinks field, as it must be set to vmnic2 - this is the NSX Tunnel portgroup on our virtual ESXi hosts. It would be the dedicated physical NIC required by NSX if we were deploying to physical ESXi hosts. Later on we will reference this when we set up the N-DVS in our host transport node.
Ensure the edge-uplink settings are as follows:
Pay particular attention to the Active Uplinks field, as it must be set to fp-eth1 - this is the internal name of the network interface within the Edge VM. Later on we will reference this when we set up the N-DVS in our Edge transport node.
And a VTEP_IP_POOL IP Pool of 10.20.30.30 to 10.20.30.60.
This IP pool will be used to create virtual tunnel endpoints on each of the ESXi host transport nodes, as well as on the Edge transport node (on the Edge VM), as per the diagram. Remember, it provides the network transport for the overlay networks used by the k8s nodes that make up the Kubernetes clusters.
In NSX Manager under Fabric>Nodes>Hosts, select your compute manager from the Managed By: drop down, and then expand your target cluster (in my example: pks-lab). If you select the checkbox at the cluster level, the option to Configure Cluster will enable.
Note that we allocated the VTEP_IP_POOL for this, and we will do it again when we configure the Edge transport node.
With Configure Cluster in place, NSX will automatically create and configure any new ESXi host added to the pks-lab cluster as a transport node. And with that, our host transport nodes are created.
You'll notice in the diagram that in addition to the VTEP addresses allocated to each of the vesxi hosts, one is also allocated to the Edge VM transport node.
So now we must create the Edge Transport Node:
Under Fabric>Nodes>Transport Nodes click Add. Enter edge-transport-node as the name, and for the Node choose your Edge VM.
Select both of your transport zones. Your General tab should appear as shown below:
There will be two N-DVS entries, one for each of the transport zones we created, and referencing the associated logical switches.
pks-lab-overlay-switch
Pay careful attention to the Virtual NICs mapping. For the Overlay switch it should map from fp-eth0 to fp-eth1.
nsxlab-vlan-switch
Pay careful attention to the Virtual NICs mapping. For the VLAN switch it should map from fp-eth1 to fp-eth1.
At this point the Edge Transport Node should be configured and we can rejoin with Lam's guide on configuring NSX-T:
Getting started with VMware Pivotal Container Service (PKS) Part 3: NSX-T
Everything should be in place to proceed with his direction, with a few adjustments:
- Remember that his 3251 VLAN is really our primary 192.168.1.0 home network, and we don't really need both 3250 and 3251 (though using both is likely good practice).
- Remember to map all of his 172.30.51.0/24 addresses to what we implement as 192.168.1.0/24. And that we don't use 172.30.50.0/24 at all.
- His pfSense router is really just our primary home gateway to the internet, and the two VLANs aren't actually needed in our configuration; we use 192.168.1.8 on the main network as our uplink-1 uplink port address associated with our T0 router:
Although this eliminates the need for VLAN 3250, we do need to set up static routes on our router to enable access to the two k8s networks he uses: 10.10.0.0/24 (the T1 k8s management router) and 10.20.0.0/24 (the IP pool assigned for PKS load balancers).
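On a Linux-based gateway, the equivalent static routes are one-liners; a sketch, using 192.168.1.8 (our T0 uplink address above) as the next hop:

```shell
# Route the k8s management and PKS load balancer networks via the T0 uplink
ip route add 10.10.0.0/24 via 192.168.1.8
ip route add 10.20.0.0/24 via 192.168.1.8
```

On a home router or pfSense box the same two routes are added in its static routes UI.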
At the end of the article there are some useful tips about validating the NSX-T setup.
SSH into the vesxi hosts one by one and ensure the following:
esxcli network ip interface ipv4 get
You should see vmk10 and vmk50 appear similar to this:
Name IPv4 Address IPv4 Netmask IPv4 Broadcast Address Type Gateway DHCP DNS
----- ------------ ------------- --------------- ------------ ------- --------
vmk0 192.168.1.83 255.255.255.0 192.168.1.255 STATIC 0.0.0.0 false
vmk10 10.20.30.30 255.255.255.0 10.20.30.255 STATIC 0.0.0.0 false
vmk50 169.254.1.1 255.255.0.0 169.254.255.255 STATIC 0.0.0.0 false
Remember that those 10.20.30.x addresses came from our VTEP pool.
Verify that the logical switches are shown on our ESXi hosts:
esxcli network ip interface list
vmk10
Name: vmk10
MAC Address: 00:50:56:61:f0:f3
Enabled: true
Portset: DvsPortset-1
Portgroup: N/A
Netstack Instance: vxlan
VDS Name: nsxlab-overlay-switch
VDS UUID: 59 e6 25 ac 27 00 45 6f-a5 92 77 7d d4 ce 87 ae
VDS Port: 10
VDS Connection: 1523044214
Opaque Network ID: N/A
Opaque Network Type: N/A
External ID: N/A
MTU: 1600
TSO MSS: 65535
Port ID: 67108868
vmk50
Name: vmk50
MAC Address: 00:50:56:68:e4:33
Enabled: true
Portset: DvsPortset-1
Portgroup: N/A
Netstack Instance: hyperbus
VDS Name: nsxlab-overlay-switch
VDS UUID: 59 e6 25 ac 27 00 45 6f-a5 92 77 7d d4 ce 87 ae
VDS Port: c6029e74-9952-4960-8d3a-87caafaf4fa5
VDS Connection: 1523044215
Opaque Network ID: N/A
Opaque Network Type: N/A
External ID: N/A
MTU: 1500
TSO MSS: 65535
Port ID: 67108869
You can see the nsxlab-overlay-switch we defined in our overlay transport zone and associated with our Edge transport node and host transport nodes.
And from any of the Virtual ESXi hosts you should be able to ping all the other VTEP addresses:
vmkping ++netstack=vxlan 10.20.30.30 # vesxi-1
vmkping ++netstack=vxlan 10.20.30.31 # vesxi-2
vmkping ++netstack=vxlan 10.20.30.32 # vesxi-3
vmkping ++netstack=vxlan 10.20.30.33 # edge vm
With NSX-T configured we can now proceed with the remainder of the series.
Part 4 involves setting up the Ops Manager and BOSH...
VERY IMPORTANT: make sure the version of the Pivotal Ops Manager you install is Build 249, which maps to version 2.0.5. The name of the OVA you want is pcf-vsphere-2.0-build.249.ova. Anything newer won't work, and you'll be left pulling out hair and bosh sshing around, sifting through failure logs without the help of journald (the stemcells are trusty, and it's old stuff in there). Trust me... just get Build 249.
Getting started with VMware Pivotal Container Service (PKS) Part 4: Ops Manager & BOSH
Part 5 and Part 6 go fairly smoothly from here, as long as you made sure to install Build 249 of the Ops Manager.
If all goes well you'll be kubectling away with PKS.
But if you are like me you'll repeat this entire thing a few dozen times first :)
If you get all the way to the end and go to deploy a k8s cluster, only to have it fail with "1 of 3 post-start scripts failed. Failed Jobs: ncp. Successful Jobs: bosh-dns, kubelet.", like this:
Using environment '192.168.1.201' as client 'ops_manager'
Task 22
Task 22 | 22:36:11 | Preparing deployment: Preparing deployment (00:00:03)
Task 22 | 22:36:17 | Preparing package compilation: Finding packages to compile (00:00:00)
Task 22 | 22:36:17 | Creating missing vms: master/96cb1deb-bc4e-44b3-ac69-220cb0935bf8 (0)
Task 22 | 22:36:17 | Creating missing vms: worker/033088fc-5f91-4b57-b9e3-3cc718031e3b (0)
Task 22 | 22:36:17 | Creating missing vms: worker/5b8d175b-927e-48c1-a97c-f8b7ef099be5 (2)
Task 22 | 22:36:17 | Creating missing vms: worker/d806a4f4-6a1e-4f9c-a2ba-08173699e830 (1)
Task 22 | 22:37:21 | Creating missing vms: worker/5b8d175b-927e-48c1-a97c-f8b7ef099be5 (2) (00:01:04)
Task 22 | 22:37:23 | Creating missing vms: worker/d806a4f4-6a1e-4f9c-a2ba-08173699e830 (1) (00:01:06)
Task 22 | 22:37:28 | Creating missing vms: master/96cb1deb-bc4e-44b3-ac69-220cb0935bf8 (0) (00:01:11)
Task 22 | 22:37:32 | Creating missing vms: worker/033088fc-5f91-4b57-b9e3-3cc718031e3b (0) (00:01:15)
Task 22 | 22:37:32 | Updating instance master: master/96cb1deb-bc4e-44b3-ac69-220cb0935bf8 (0) (canary) (00:01:36)
Task 22 | 22:39:08 | Updating instance worker: worker/033088fc-5f91-4b57-b9e3-3cc718031e3b (0) (canary) (00:03:49)
Task 22 | 22:42:57 | Updating instance worker: worker/5b8d175b-927e-48c1-a97c-f8b7ef099be5 (2) (00:06:20)
Task 22 | 22:49:17 | Updating instance worker: worker/d806a4f4-6a1e-4f9c-a2ba-08173699e830 (1) (00:07:15)
Task 22 | 22:56:32 | Error: Action Failed get_task: Task 2b6efc6b-0a14-4cf5-7f1e-067d80563cce result: 1 of 3 post-start scripts failed. Failed Jobs: ncp. Successful Jobs: bosh-dns, kubelet.
Task 22 Started Wed Apr 11 22:36:11 UTC 2018
Task 22 Finished Wed Apr 11 22:56:32 UTC 2018
Task 22 Duration 00:20:21
Task 22 error
Capturing task '22' output:
Expected task '22' to succeed but state is 'error'
Exit code 1
The likely cause is one of the following:
- Not enabling the NSX-T validation errand that tags everything (it is off by default, because Flannel is the default networking stack for PKS). This is set when configuring PKS itself in Ops Manager: in the PKS tile settings under Errands, the NSX-T Validation errand should be set to "On". Without it, NSX-T requires advanced manual tagging (that is as much as I know) and will not work out-of-the-box. The errand is effectively a requirement of NSX-T, unless you are a big-brained VMware SE.
- Not using PCF Ops Manager for vSphere Build 249.
If things go sideways and you are inclined to drill into why...
bosh ssh -d <bosh deployment instance id> <worker/master/node id>
As mentioned, you won't get any help from journald, but you will find all the logs cleverly hidden at:
/var/vcap/sys/log
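For the ncp failure above, the interesting files are the ncp job logs under that directory. A sketch, with the exact filenames being an assumption - ls the directory to see what is actually there:

```shell
# From within the bosh ssh session on the failing node
cd /var/vcap/sys/log
ls -R

# The ncp job logs live in their own subdirectory
tail -n 100 ncp/ncp.stdout.log
```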
This PKS guide to diagnostic tools has some good tips.