srl-telemetry-lab's Introduction

Nokia SR Linux Streaming Telemetry Lab

SR Linux has first-class Streaming Telemetry support thanks to 100% YANG coverage of state and config data. This holistic coverage lets SR Linux users stream any data off the NOS using on-change, sample, or target-defined subscriptions. With SR Linux, there is no discrepancy in visibility across APIs.

This lab represents a small Clos fabric built with Nokia SR Linux switches running as containers. The lab topology consists of the Clos fabric plus a Streaming Telemetry stack composed of the gnmic, prometheus and grafana applications.

pic1

In addition to the telemetry stack, the lab also includes a modern logging stack composed of promtail and loki.

Goals of this lab:

  1. Demonstrate how a telemetry stack can be incorporated into the containerlab topology file.
  2. Explain SR Linux holistic telemetry support.
  3. Provide practical configuration examples for the gnmic collector to subscribe to fabric nodes and export metrics to Prometheus TSDB.
  4. Introduce advanced Grafana dashboarding with FlowChart plugin rendering port speeds and statuses.
  5. Give a sneak peek of the modern logging telemetry stack with Loki and Promtail to consume Syslog data from SR Linux nodes.

Deploying the lab

The lab is deployed with the containerlab project, where the st.clab.yml file declaratively describes the lab topology.

# change into the cloned directory
# and execute
containerlab deploy --reconfigure

To remove the lab:

containerlab destroy --cleanup

Accessing the network elements

Once the lab has been deployed, the different SR Linux nodes can be accessed via SSH through their management IP address, given in the summary displayed after the execution of the deploy command. It is also possible to reach those nodes directly via their hostname, defined in the topology file. Linux clients cannot be reached via SSH, as it is not enabled, but it is possible to connect to them with a docker exec command.

# reach an SR Linux leaf or spine via SSH
ssh admin@leaf1
ssh admin@spine1

# reach a Linux client via Docker
docker exec -it client1 bash

Fabric configuration

The DC fabric used in this lab consists of three leaves and two spines interconnected as shown in the diagram.

pic

Leaves and spines use Nokia SR Linux IXR-D2 and IXR-D3L chassis respectively. Each network element of this topology is equipped with a startup configuration file that is applied at the node's startup.

Once booted, the network nodes come up with interfaces, underlay protocols and an overlay service configured. The fabric runs a Layer 2 EVPN service between the leaves.

Verifying the underlay and overlay status

The underlay network runs eBGP, while iBGP is used for the overlay network. The Layer 2 EVPN service is configured as explained in this comprehensive tutorial: L2EVPN on Nokia SR Linux.

By connecting via SSH to one of the leaves, we can verify the status of those BGP sessions.

A:leaf1# show network-instance default protocols bgp neighbor
------------------------------------------------------------------------------------------------------------------
BGP neighbor summary for network-instance "default"
Flags: S static, D dynamic, L discovered by LLDP, B BFD enabled, - disabled, * slow

+-----------+---------------+---------------+-------+----------+-------------+--------------+--------------+---------------+
| Net-Inst  |     Peer      |     Group     | Flags | Peer-AS  |   State     |    Uptime    |   AFI/SAFI   |[Rx/Active/Tx] |
+===========+===============+===============+=======+==========+=============+==============+==============+===============+
| default   | 10.0.2.1      | iBGP-overlay  | S     | 100      | established | 0d:0h:0m:27s | evpn         | [4/4/2]       |
| default   | 10.0.2.2      | iBGP-overlay  | S     | 100      | established | 0d:0h:0m:28s | evpn         | [4/0/2]       |
| default   | 192.168.11.1  | eBGP          | S     | 201      | established | 0d:0h:0m:34s | ipv4-unicast | [3/3/2]       |
| default   | 192.168.12.1  | eBGP          | S     | 202      | established | 0d:0h:0m:33s | ipv4-unicast | [3/3/4]       |
+-----------+---------------+---------------+-------+----------+-------------+--------------+--------------+---------------+
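When this check needs to be repeated across several nodes, the table above can be parsed programmatically. The following is a minimal Python sketch, assuming the output has already been captured as text; the function names and column keys are hypothetical and not part of the lab repository.

```python
def parse_bgp_neighbors(cli_output: str) -> list[dict[str, str]]:
    """Turn the ASCII table of the BGP neighbor summary into a list of dicts."""
    # Column keys are illustrative names chosen for this sketch.
    columns = ["net_inst", "peer", "group", "flags", "peer_as",
               "state", "uptime", "afi_safi", "routes"]
    neighbors = []
    for line in cli_output.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders, the flags legend and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != len(columns) or cells[0] == "Net-Inst":
            continue  # skip the header row and anything malformed
        neighbors.append(dict(zip(columns, cells)))
    return neighbors


def sessions_not_established(neighbors: list[dict[str, str]]) -> list[str]:
    """Return the peer addresses whose session state is not 'established'."""
    return [n["peer"] for n in neighbors if n["state"] != "established"]


# Two rows copied from the output above, used as sample input.
SAMPLE = """\
+-----------+---------------+---------------+-------+----------+-------------+--------------+--------------+---------------+
| Net-Inst  |     Peer      |     Group     | Flags | Peer-AS  |   State     |    Uptime    |   AFI/SAFI   | Rx/Active/Tx] |
+===========+===============+===============+=======+==========+=============+==============+==============+===============+
| default   | 10.0.2.1      | iBGP-overlay  | S     | 100      | established | 0d:0h:0m:27s | evpn         | [4/4/2]       |
| default   | 192.168.11.1  | eBGP          | S     | 201      | established | 0d:0h:0m:34s | ipv4-unicast | [3/3/2]       |
+-----------+---------------+---------------+-------+----------+-------------+--------------+--------------+---------------+
"""

if __name__ == "__main__":
    parsed = parse_bgp_neighbors(SAMPLE)
    print(sessions_not_established(parsed))
```

An empty list from `sessions_not_established` means every underlay and overlay session in the captured output is up.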

Telemetry stack

As the lab name suggests, telemetry is at its core. The following telemetry stack is used in this lab:

Role                  Software
Telemetry collector   gnmic
Time-Series DB        prometheus
Visualization         grafana

gnmic

gnmic is an OpenConfig project that lets users subscribe to streaming telemetry data from network devices and export it to a variety of destinations. In this lab, gnmic is used to subscribe to the telemetry data from the fabric nodes and export it to the prometheus time-series database.

The gnmic configuration file - gnmic-config.yml - is applied to the gnmic container at startup and instructs it to subscribe to the telemetry data and export it to the prometheus time-series database.
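To picture what "exporting to prometheus" involves, it helps to see how a keyed YANG path can be flattened into a metric name plus labels. The Python sketch below is a simplified illustration of that idea only; it is not gnmic's implementation, the naming scheme is an assumption, and the path is just an example.

```python
import re

# Matches one [key=value] predicate inside a path element.
KEY_RE = re.compile(r"\[([^=\]]+)=([^\]]*)\]")


def split_path(path: str) -> list[str]:
    """Split a path on '/', ignoring slashes inside [key=value] predicates."""
    elems, current, depth = [], "", 0
    for ch in path:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == "/" and depth == 0:
            if current:
                elems.append(current)
            current = ""
        else:
            current += ch
    if current:
        elems.append(current)
    return elems


def to_metric(path: str) -> tuple[str, dict[str, str]]:
    """Flatten a keyed path into (metric_name, labels). Hypothetical naming scheme."""
    names, labels = [], {}
    for elem in split_path(path):
        name = elem.split("[", 1)[0]
        names.append(name)
        for key, value in KEY_RE.findall(elem):
            # List keys become labels, prefixed with the list name.
            labels[f"{name}_{key}".replace("-", "_")] = value
    # Prometheus metric names allow only letters, digits and underscores here.
    metric = re.sub(r"[^a-zA-Z0-9_]", "_", "_".join(names))
    return metric, labels


if __name__ == "__main__":
    print(to_metric("/interface[name=ethernet-1/1]/statistics/out-octets"))
```

The point of the sketch is the split of responsibilities: the path without keys identifies what is measured, while the keys identify which instance it was measured on.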

Prometheus

Prometheus is a popular open-source time-series database. It is used in this lab to store the telemetry data exported by gnmic. The prometheus configuration file - configs/prometheus/prometheus.yml - has a minimal configuration and instructs prometheus to scrape the data from the gnmic collector with a 5s interval.

Grafana

Grafana is another key component of this lab as it provides the visualisation for the collected telemetry data. The lab's topology file includes the grafana node along with configuration parameters such as dashboards, datasources and required plugins.

The Grafana dashboard shipped with this repository offers multiple views of the collected real-time data. Powered by the flowchart plugin, it overlays telemetry-sourced data on graphics such as topology and front panel views:

pic3

Using the flowchart plugin and real telemetry data, users can create interactive topology maps (also known as weathermaps) with a visual indication of link rate and utilization.

pic2
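The link rate and utilization shown on such a map boil down to simple arithmetic over two counter samples. The Python sketch below illustrates that arithmetic; it is not the query used by the lab's dashboard, and the function names, the counter values and the 10 Gbps port speed are assumptions made for the example. The 5-second spacing mirrors the scrape interval mentioned earlier.

```python
def link_rate_bps(prev_octets: int, curr_octets: int, interval_s: float) -> float:
    """Average bit rate between two octet counter samples taken interval_s apart."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    delta = curr_octets - prev_octets
    if delta < 0:
        # The counter went backwards, so assume it was reset and restarted from zero.
        delta = curr_octets
    return delta * 8 / interval_s


def utilization_percent(rate_bps: float, port_speed_bps: float) -> float:
    """Share of the port speed consumed by the measured rate, in percent."""
    if port_speed_bps <= 0:
        raise ValueError("port_speed_bps must be positive")
    return 100.0 * rate_bps / port_speed_bps


if __name__ == "__main__":
    # Hypothetical samples taken 5 seconds apart on a hypothetical 10 Gbps port.
    rate = link_rate_bps(1_000_000, 1_625_000, 5)
    print(rate, utilization_percent(rate, 10_000_000_000))
```

A dashboard can then map the utilization value to a link colour or thickness, which is what gives the weathermap its visual cue.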

Access details

Using containerlab's ability to expose ports of the containers to the host, the following services are available on the host machine:

Traffic generation

When the lab is started, there is no traffic running between the nodes, as the clients are not sending any data. To run traffic between the nodes, use the traffic.sh control script.

To start the traffic:

  • bash traffic.sh start all - start traffic between all nodes
  • bash traffic.sh start 1-2 - start traffic between client1 and client2
  • bash traffic.sh start 1-3 - start traffic between client1 and client3

To stop the traffic:

  • bash traffic.sh stop - stop traffic generation between all nodes
  • bash traffic.sh stop 1-2 - stop traffic generation between client1 and client2
  • bash traffic.sh stop 1-3 - stop traffic generation between client1 and client3

As a result, the traffic will be generated between the clients and the traffic rate will be reflected on the grafana dashboard.

flowchart.mp4

Logging stack

The logging stack leverages the promtail->Loki pipeline, where promtail is a log agent that extracts, transforms and ships logs to Loki, a log aggregation system.

One more element is inserted into this promtail->Loki pipeline, namely Syslog-NG, whose only purpose is to receive Syslog RFC3164 messages from SR Linux and transform them into the RFC5424 format that promtail requires on its input. When SR Linux switches to Syslog RFC5424, this element will be removed from the pipeline.
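The difference between the two Syslog formats is mostly in the header. The Python sketch below shows the kind of header mapping involved; it is not how Syslog-NG is implemented or configured in this lab. The function name and the sample message are made up, and since RFC3164 timestamps carry neither a year nor a time zone, the year is passed in by the caller and UTC is assumed.

```python
import re
from datetime import datetime

MONTHS = {name: number for number, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}

# <PRI>Mmm dd hh:mm:ss HOSTNAME TAG[PID]: MSG
RFC3164_RE = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<mon>[A-Z][a-z]{2}) +(?P<day>\d{1,2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<tag>[^\s:\[]+)(?:\[(?P<pid>\d+)\])?: ?"
    r"(?P<msg>.*)$"
)


def rfc3164_to_rfc5424(line: str, year: int) -> str:
    """Rewrite an RFC3164 (BSD) syslog line with an RFC5424 header."""
    m = RFC3164_RE.match(line)
    if m is None:
        raise ValueError(f"not an RFC3164 message: {line!r}")
    hour, minute, second = (int(part) for part in m["time"].split(":"))
    # Assumption: the caller supplies the year and the time is treated as UTC.
    ts = datetime(year, MONTHS[m["mon"]], int(m["day"]), hour, minute, second)
    procid = m["pid"] or "-"  # "-" is the RFC5424 nil value
    # <PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID STRUCTURED-DATA MSG
    return (f"<{m['pri']}>1 {ts.isoformat()}Z {m['host']} {m['tag']} "
            f"{procid} - - {m['msg']}")


if __name__ == "__main__":
    # Made-up sample message, not captured from the lab.
    print(rfc3164_to_rfc5424("<134>Jan 15 10:00:00 leaf1 app[42]: sample message", 2024))
```

The message body is passed through untouched; only the header fields are rearranged and the missing ones filled with nil values.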

The logging infrastructure logs every message from SR Linux that is above Info level. This includes all the BGP messages, all the system messages, all the interface state changes, etc. The dashboard provides a view on the collected logs and allows filtering on a per-application level.

srl-telemetry-lab's People

Contributors

bclasse, dbitech, hellt, ktodts, steiler

srl-telemetry-lab's Issues

enhance grafana with env vars and bind datasources and dashboards

grafana:
  container_name: grafana
  environment:
    GF_SECURITY_ADMIN_USER: admin
    GF_SECURITY_ADMIN_PASSWORD: admin
    GF_INSTALL_PLUGINS: natel-discrete-panel
    GF_PATHS_PROVISIONING: /etc/grafana/provisioning
  ports:
    - '3000:3000'
  image: grafana/grafana:7.0.3
  volumes:
    - $PWD/datasource.yaml:/etc/grafana/provisioning/datasources/datasource.yaml:ro
    - $PWD/dashboards.yaml:/etc/grafana/provisioning/dashboards/dashboards.yaml:ro
    - $PWD/dashboards:/var/tmp/dashboards

Typo in the topology file

In the topology file, the following typos need to be fixed:

ipv4-subnet -> ipv4_subnet
mgmt-ipv4 -> mgmt_ipv4

Grafana Security Vulnerability on port 3000/tcp

IT scans are detecting vulnerability severity 5 against "Grafana Anonymous Access Enabled" on port 3000/tcp.

Is there anything we can do about this - even post deployment configuration?

Prometheus Security Vulnerability on 9090/tcp

IT scans are detecting vulnerability severity 4 against "Prometheus Sensitive Information Disclosure" on port 9090/tcp.

Is there anything we can do about this - even post deployment configuration?
