
Warning: Gantry hasn't been tested or worked on for many months.


Travis CI status badge

Gantry

a structure built on a rocket launch pad to facilitate assembly and servicing

NLnet Labs Gantry is a tool for deploying and testing network routers in the cloud, built to support the NLnet Labs Routinator project.

This project exists to answer the question "Does the Routinator work with real routers?" and to make it easy to keep checking that answer as the Routinator grows and the set of test routers and router versions increases.

Here be dragons!

Warning: This is very definitely an early work-in-progress; it has bugs, it's incomplete, and running and hacking on it will require effort. This is not yet for the faint of heart.

Quick Start

Assuming you are running on Ubuntu 18.10 and have Docker, a DigitalOcean account, and a private Docker registry containing one or more supported virtual router images with any necessary licenses, then:

$ cp gantry.cfg.example gantry.cfg
$ vi gantry.cfg                    # edit the values to match your setup
$ ./gantry deploy routinator
$ ./gantry logs routinator
...
RTR: Listening on 0.0.0.0:3323.
$ ./gantry deploy vr-sros:16.0.R6  # or the router that you wish to deploy

Sit back and drink a coffee while the rocket launches!
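
For reference, here is a minimal sketch of the kind of values gantry.cfg holds, based on the DigitalOcean and private registry settings referenced throughout this README. The key names below are hypothetical; gantry.cfg.example defines the real ones.

# Hypothetical key names -- copy gantry.cfg.example and keep its actual keys.
DO_API_TOKEN="your-digitalocean-api-token"      # lets Docker Machine create droplets
DOCKER_REGISTRY_FQDN="docker-reg.example.com"   # private registry holding router images
DOCKER_REGISTRY_USER="you"                      # registry credentials for docker push/pull
DOCKER_REGISTRY_PASS="secret"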

...
TASK [ON ROUTER vr-sros-16.0.R6 @ 134.209.202.139 : WAIT FOR CONNECTION ESTABLISHED ...
ok: [134.209.202.139 ]

TASK [debug] *********************************************************************************
ok: [134.209.202.139 ] => {
    "result.stdout_lines": [
        [
            "===============================================================================",
            "Rpki Session Information",
            "===============================================================================",
            "IP Address         : 134.209.198.136",
            "-------------------------------------------------------------------------------",
            "Port               : 3323               Oper State         : established",
            "UpTime             : 0d 00:00:02        Flaps              : 0",
            "Active IPv4 records: 6986               Active IPv6 records: 1258",
            "===============================================================================",
            "No. of Rpki-Sessions : 1",
            "==============================================================================="
        ]
    ]
}

Configuring and testing with data directory Ansible fragments

Gantry can execute Ansible-based router post-deployment configuration steps, and optional test suites, using Ansible fragments that you supply in the Gantry data directory.

Test execution takes place inside the Gantry Docker container and so only Ansible playbooks accessible to the container via the Gantry data directory can be executed. By default the host directory /tmp/gantry is mapped into the container. You can override this location using the --data-dir <path> command line option.

  • Any config-*.yml files in the Gantry data directory will be included as task sets to be executed post deployment.
  • All test-*.yml files will be executed by ./gantry test all.
  • Individual playbooks in the Gantry data directory can be executed using ./gantry test <filename>, e.g. ./gantry test test-compare-vrps (see the sketch below).
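
As an illustration, here is a hedged sketch of adding and running a trivial test fragment. The playbook body is hypothetical and only demonstrates the test-*.yml convention; real fragments would target the deployed Routinator and routers:

$ cat > /tmp/gantry/test-ping.yml <<'EOF'
# Hypothetical minimal fragment; real tests exercise the deployed components.
- hosts: all
  gather_facts: no
  tasks:
    - name: CHECK THAT EACH HOST RESPONDS
      ping:
EOF
$ ./gantry test test-ping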

Upgrading

When using the ./gantry wrapper script, the Gantry Docker image is fetched the first time you use it. To upgrade it after that, assuming that a newer version of Gantry has been built on Docker Hub, you can issue the following command:

$ ./gantry upgrade
latest: Pulling from nlnetlabs/gantry
Digest: sha256:16c8559eed1543a4cbc8e3324aae131cb0e6246df0668b41bb13dbd8a99c6c40
Status: Downloaded newer image for nlnetlabs/gantry:latest

Preparing router images

Gantry contains support for building and publishing router images to a private Docker registry. Assuming that you have in your possession the router image file and any required license file, you can use Gantry to build and publish it.

Below is an example of how to build and publish a Nokia SROS 19.0.R6 router image:

$ mkdir /tmp/gantry
$ cp /path/to/your/19.0.R6/sros-vm.qcow2 /tmp/gantry/sros-19.0.R6.qcow2
$ cp /path/to/your/19.0.R6/license.txt /tmp/gantry/sros-19.0.R6.qcow2.license
$ ./gantry registry publish
2019-12-09 11:05:21 +0000: Publishing builds a vrnetlab router image and publishes it to your private Docker registry.
2019-12-09 11:05:21 +0000: The router images and licenses must be supplied by you.
2019-12-09 11:05:21 +0000: Do you wish to proceed? [y/N] y
2019-12-09 11:05:22 +0000:
2019-12-09 11:05:22 +0000: Which of the following router image types do you want to publish?
    csr
    nxos
    openwrt
    routeros
    sros
    veos
    vmx
    vqfx
    vrp
    vsr1000
    xrv
    xrv9k
2019-12-09 11:05:22 +0000: Router type, e.g. sros: sros
2019-12-09 11:05:23 +0000:
2019-12-09 11:05:23 +0000: Please copy or sym link your .qcow2 router image, and optionally a .qcow2.license file, into:
2019-12-09 11:05:23 +0000: /tmp/gantry
2019-12-09 11:05:23 +0000:
2019-12-09 11:05:23 +0000: Please read CAREFULLY the instructions that will be printed next on your screen.
2019-12-09 11:05:23 +0000: If you do not name the copied/linked files correctly the build will FAIL.
2019-12-09 11:05:23 +0000: Do you wish to proceed? [y/N] y
2019-12-09 11:05:24 +0000:
vrnetlab / Nokia VSR SROS
=========================
This is the vrnetlab docker image for Nokia VSR / SROS.
...
... skip vrnetlab router specific README ...
...
2019-12-09 11:05:24 +0000: Do you wish to proceed? [y/N] y
...
... skip make and Docker build process output ...
...
2019-12-09 11:05:28 +0000:
2019-12-09 11:05:29 +0000: The image build has finished. Does the following identify the image that was built?
2019-12-09 11:05:29 +0000: vrnetlab/vr-sros:19.10.R1
2019-12-09 11:05:29 +0000: Do you wish to proceed? [y/N] y
...
... skip Docker push output ...
...
2019-12-09 11:06:07 +0000: Done.
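
You can then confirm that the image reached the registry using the registry ls subcommand from the --help output; the listing shown here is illustrative:

$ ./gantry registry ls
...
vrnetlab/vr-sros:19.10.R1
...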

Status

Deployment of components to test:

  • NLnet Labs Krill
  • NLnet Labs Routinator
  • Nokia/Alcatel SROS 16.0.R6 virtual router
  • Juniper VMX 18.2R1.9 virtual router
  • Cisco IOS XRv 9000 virtual router
  • Cisco CSR 1000 virtual router
  • Other RPKI caches (TBD)

Configure virtual routers to populate their RPKI VRP database from the Routinator:

  • Nokia/Alcatel SROS 16.0.R6
  • Juniper VMX 18.2R1.9
  • Cisco IOS XRv 9000 virtual router
  • Cisco CSR 1000 virtual router
  • Configure virtual routers to populate their RPKI VRP database from other RPKI caches (TBD)
  • Configure RPKI caches to use Krill as their data source

Test scenarios:

  • Symmetric diff of virtual router VRP database to that of the VRP source (done with Routinator)
  • Others (TBD)

Architecture

Gantry depends heavily on the vrnetlab project, which is used to build the virtual router Docker images that are deployed and tested. Currently the vr-sros image build is slightly patched to enable outbound connectivity to the Routinator. With a better understanding of vrnetlab and routers the patch might turn out to be unnecessary; otherwise I would like to see if it is something that makes sense to contribute back to the vrnetlab project.

Infrastructure is spun up using Docker Machine and DigitalOcean.
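
A rough hand-written equivalent of what Gantry does to create a host would look something like the following; the machine name and token variable are illustrative, and the ams3 region is the default shown in the --help output:

$ docker-machine create \
    --driver digitalocean \
    --digitalocean-access-token "$DO_API_TOKEN" \
    --digitalocean-region ams3 \
    routinator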

Router images are provisioned using Docker Machine, Docker and Ansible.

Terraform is used to deploy a private Docker registry to store the router images. The reg tool is used to work with the registry.
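
For example, reg can list the repositories in the registry directly (hypothetical FQDN; you may need to supply credentials):

$ reg ls docker-reg.example.com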

Docker, Docker Hub and Bash are used to wrap the project up and make it easy to use. The "simple" Bash wrapper script has already grown beyond the initial expectation and is overdue for a rewrite in Python.

The manner in which routers with different VM size requirements and post-deployment setup commands are supported will likely evolve; at present it is a bit of an Ansible/Docker/Bash hack.

Help

For questions, suggestions, and contributions please use GitHub issues and pull requests.

Consulting the Gantry --help output is a good way to get a feel for what Gantry can do and how to do it:

$ ./gantry --help
Gantry: "a structure built on a rocket launch pad to facilitate assembly and servicing"

NLnet Labs Gantry is a tool for deploying and testing network routers in the cloud, built to support the NLnet Labs Routinator project.

Usage: gantry help|--help

Component management commands:
       gantry deploy   <COMPONENT> [<COMPONENT>..] [--region <REGION:default=ams3>] 
       gantry docker   <COMPONENT> ..commands..
       gantry exec     <COMPONENT> ..commands..
       gantry ip       <COMPONENT>
       gantry logs     <COMPONENT> [--follow]
       gantry ssh      <COMPONENT> [--host]
       gantry status
       gantry undeploy <COMPONENT> [<COMPONENT>..] [--force]

Docker registry commands:
       gantry registry ls|deploy|publish
       gantry registry rm <repo>/<image>:<tag>

Test suite commands:
       gantry test     <COMPONENT>|all [<SINGLE_PLAYBOOK_YML_FILE_IN_DATA_DIR>]

Wrapper commands:
       gantry shell
       gantry upgrade

Wrapper options:
       gantry --data-dir <PATH/TO/YOUR/DATA/FILES:default=/tmp/gantry>
       gantry --version

Where COMPONENT can be one of: (deploy and undeploy also accept special component 'all')
       COMPONENT            VENDOR
       routinator           NLnet Labs
       vr-csr:16.09.02      Cisco CSR1000v
       vr-sros:16.0.R6      Nokia/Alcatel SROS
       vr-vmx:18.2R1.9      Juniper vMX

Note: The list of COMPONENTs shown is the set for which specific playbooks exist in the playbooks/ directory. You will need the appropriate virtual router image published in your Docker registry in order to actually deploy one of these routers.

Upgrading the Routinator version

The Routinator image is pulled from the private Docker registry, NOT from the public registry.

You can upgrade it like so:

$ docker pull nlnetlabs/routinator:latest
$ docker tag nlnetlabs/routinator:latest <YOUR_REGISTRY_FQDN>/nlnetlabs/routinator:latest
$ docker push <YOUR_REGISTRY_FQDN>/nlnetlabs/routinator:latest

The registry FQDN and login details can be found in your gantry.cfg file; you might need to run docker login before you can push.
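
If the push is rejected with an authentication error, log in first with those details:

$ docker login <YOUR_REGISTRY_FQDN>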

Hacking

If you know what you are doing and want to take full control, you can dive into the Gantry wrapper container shell prompt:

$ ./gantry shell
Entering shell mode..

root@a18e60b24e33:/opt/nlnetlabs/gantry# 

TO DO

Finish this README.


Known Issues

VRP comparison

1. Query the actual router VRP database and compare the content to that of the Routinator.
2. Repeat the comparison after the data has changed in the Routinator. Does the router pick up the changes and does its database match the Routinator data?

This probably requires that I know the serial number of the Routinator data and the router data.

For the Routinator, getting the serial number depends on NLnetLabs/routinator#73, though it might be possible to get the serial number from the Routinator logs.

For the router, at least Alcatel/Lucent/Nokia SROS 16.0.R6 exposes the serial via the console:

A:vSIM>show router origin-validation rpki-session detail
...
Serial ID          : 53149              Session ID         : 0

Use environment no more (otherwise the console prompts for the next page) and then show router origin-validation database to get the actual VRPs.
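
On the SROS console that would look like this (both commands are named above; the prompt matches the earlier session):

A:vSIM>environment no more
A:vSIM>show router origin-validation database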

I'm not sure if I can get it via NETCONF or if doing so would be any more standard across routers than using the proprietary console.

Routinator serial number fetch broken

Routinator 0.4+ now deploys but the playbook logic for obtaining the most recent Routinator serial number no longer works:

fatal: [routinator]: FAILED! => {"msg": "Unexpected templating type 
error occurred on ({{ result.stderr_lines | select('match', '^New serial is ([0-9]+)')
| list | last | regex_search('([0-9]+)', '\\\\1') | first }}): expected string or 
bytes-like object"}

Create architecture diagrams

E.g. sequence diagrams, network/deployment topology, visualize the relationship between the various tools and technologies, etc.

Do not log the router credentials

Currently the router username and password are exposed in the Gantry output in the public Travis CI build logs. This wasn't thought to be an issue because the router is only temporarily deployed. However, if the same username and password are used for each build then the credentials from a previous build can be observed and used to connect to a newly deployed router in a subsequent build.

Wait for Routinator

I think that the new support for parallel deployment of the Routinator and routers can lead to a situation where the test checks the Routinator log for a serial number announcement before it exists in the log. This is, I assume, why Travis CI build #34 failed. The test should wait for the Routinator to announce the first serial number.
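
A minimal sketch of such a wait, assuming the "New serial is N" log line matched by the playbook in the issue above (untested):

$ ./gantry logs routinator --follow | grep -m 1 'New serial is'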

Decide testing strategy

When running tests against multiple routers, and later potentially multiple RPKI caches and even extending to use publication servers such as NLnet Labs Krill, what should happen if a single deployment or a test against a single component fails? Should tests be run if one or more components fail to deploy?

Relates to #22.

Re-write CLI wrapper Bash script in Python

The cli Bash script has grown beyond the intended scope and is suffering from a lack of proper types, e.g. get_deployment_report returns a pipe-separated string instead of an array of objects, and should be test-driven/unit-tested. My suggestion would be to re-write it in Python in a test-driven manner.

Use RTR instead of HTTP with Routinator

Fetching VRPs from Routinator via /json and the serial number from /status could result in the received ROAs being for a different serial number than was scraped from /status.

It might be better to use the RTR protocol instead as that communicates the serial number with the VRPs and notifies the client when new data is ready. See RFC-8210.

One way to use the RTR protocol could be to use Python RTRLib.

Finish the README.

The current version is a first, incomplete draft with a TO DO placeholder inside it.

Unable to SSH in to Cisco CSR1000v router.

13:52 $ ./gantry deploy routinator vr-csr:16.09.02
...
<SNIP>
...
[vr-csr:16.09.02] PLAY RECAP *********************************************************************
[vr-csr:16.09.02] vr-csr-16.09.02            : ok=10   changed=3    unreachable=0    failed=0   

14:10 $ ./gantry docker vr-csr:16.09.02 ps
CONTAINER ID        IMAGE                                                 COMMAND                  CREATED             STATUS                    PORTS               NAMES
232245a3d41b        docker-reg.do.nlnetlabs.nl/vrnetlab/vr-csr:16.09.02   "/launch.py --userna…"   14 minutes ago      Up 14 minutes (healthy)                       router

14:10 $ ./gantry ssh vr-csr:16.09.02
2019-05-10 12:10:53 +0000: You are about to be connected to the proprietary interface of router vr-csr:16.09.02.
2019-05-10 12:10:53 +0000: When prompted enter password: ********
2019-05-10 12:10:53 +0000: Do you wish to proceed? [y/N] y

<HANGS>
<CTRL+C>
^C2019-05-10 12:11:14 +0000: Aborting..

14:11 $ ./gantry logs vr-csr:16.09.02
...
<SNIP>
...
2019-05-10 11:59:49,190: vrnetlab   DEBUG    writing to serial console: end
2019-05-10 11:59:49,191: vrnetlab   TRACE    waiting for '#' on serial console
2019-05-10 11:59:49,237: vrnetlab   TRACE    read from serial console: transport input all
            ^
% Invalid input detected at '^' marker.

csr1000v#
2019-05-10 11:59:49,237: vrnetlab   DEBUG    writing to serial console: copy running-config startup-config
2019-05-10 11:59:49,237: vrnetlab   DEBUG    writing to serial console: 
2019-05-10 11:59:49,238: launch     INFO     Startup complete in: 0:03:36.458781

Complete Cisco log attached: vr-csr:16.09.02.log

The router image was built and configured by vrnetlab, see:

Error creating machine: Error installing docker

Transient Docker Machine failure when running ./gantry deploy routinator with the latest Gantry Docker image.

[routinator] Installing Docker...
[routinator] The default lines below are for a sh/bash shell, you can specify the shell you're using, with the --shell flag.
[routinator] 
[routinator] [routinator] Error creating machine: Error running provisioning: error installing docker: 

This problem cannot be fixed in-place by re-issuing ./gantry deploy routinator, as that results in:

[routinator] [routinator] Error checking TLS connection: Error checking and/or regenerating the certs: There was an error validating certificates for host "188.166.36.130:2376": dial tcp 188.166.36.130:2376: connect: connection refused

The only solution with Gantry at present is undeploy followed by redeploy.

  1. The issue should be worked around transparently by Gantry.
  2. Re-deployment should be able to proceed from the current status, not fail due to a previous error.
  3. Undeployment should not be necessary as this incurs additional cloud costs to re-create the underlying VM, even though in this case the underlying VM is fine.

The issue happened again on a second attempt, but not on the third attempt.

Design router test suite extension mechanism

Router tests will need to be added, e.g. as in #4, but ideally not written separately per router and not added to the base router post sanity check tests in the existing Ansible templates. How best to go about this? How portable are origin validation settings and outputs across routers? Would NETCONF help or are they bespoke extensions to NETCONF?

Nokia VR SROS test broken

Build #31.1 failed with this issue:

[vr-sros:16.0.R6] TASK [SHOW RPKI DATABASE SUMMARY] **********************************************
[vr-sros:16.0.R6] Tuesday 07 May 2019  14:27:52 +0000 (0:00:00.079)       0:06:46.407 *********** 
[vr-sros:16.0.R6] fatal: [vr-sros-16.0.R6]: FAILED! => {"changed": false, "msg": "Connection type ssh is not valid for this module"}

Factor common logic into an included/imported playbook.

Currently the Routinator playbook and the SROS router playbook have very similar logic for informing Ansible of Docker Machine hosts. This should be extracted and generalized. Alternatively it could be replaced by a pre-Ansible step that generates the Ansible inventory based on Docker Machine output, e.g. using something like https://github.com/nathanleclaire/dockerfiles/blob/master/ansible/machine.py. A dynamic inventory plugin might be the best way to go as it's the Ansible way.
