nerpa's Introduction

License: MIT

NERPA

Nerpa, short for "Network Programming with Relational and Procedural Abstractions", is a programming framework to simplify the management of a programmable network. It implements an incremental control plane and allows for tighter integration and co-design of the control plane and data plane.

In our current vision for Nerpa, an Open vSwitch Database (OVSDB) management plane, a Differential Datalog (DDlog) control-plane program, and a P4 data-plane program interoperate. This diagram shows how those pieces interact.

Nerpa vision

The Nerpa framework involves several components, located in different subdirectories. This repo is organized as follows:

  1. nerpa_controlplane: Each subdirectory corresponds to a Nerpa program and contains its input files:
  • DDlog program: Serves as the control plane.
  • P4 program: Serves as the dataplane program. Used by p4info2ddlog to generate DDlog output relations.
  • OVSDB schema: Used to set up an OVSDB management plane. The ovsdb2ddlog tool uses this to generate input relations. The schema also defines switches, which we add to and remove from the network.
  2. nerpa_controller: An intermediate Rust program that runs the DDlog program using the generated crate. It uses the management plane to adapt the DDlog program's input relations, and it pushes rows from the output relations into tables in the P4 switch using P4Runtime. Note that the controller's Cargo.toml is not committed; it is generated using the p4info2ddlog tool so that the correct crate dependencies are imported.

  3. ovs: Rust bindings to Open vSwitch libraries.

  4. p4ext: API above P4Runtime for convenience.

  5. p4info2ddlog: Script that reads a P4 program's P4info and generates DDlog relations for the dataplane program.

  6. proto: Protobufs for P4 and P4Runtime, used to generate Rust code.

The above pieces fit together as follows in the tutorial Nerpa program:

Tutorial example

Steps

Building Dependencies

  1. Clone the repository and its submodules.
git clone --recursive git@github.com:vmware/nerpa.git
  2. Install Rust using the appropriate instructions, if it is not already installed.

  3. The required version of grpcio requires CMake >= 3.12, while the Ubuntu default is 3.10. Here are installation instructions for Ubuntu.

  4. We have included a script for Ubuntu that builds all other dependencies and sets the necessary environment variables. On a different operating system, you can execute its steps individually; following the script's organization ensures compatibility with the build and runtime scripts.

. scripts/build-dependencies.sh

Tutorial

After building all dependencies, you can write Nerpa programs. We recommend following the tutorial for a step-by-step introduction to Nerpa. Individual steps for setup are also documented below.

Build

A Nerpa program called example would consist of the following files. For organization, these files should be placed in the same subdirectory of nerpa_controlplane and given the program's name, as follows:

nerpa_controlplane/example/example.dl // DDlog program for the controlplane
nerpa_controlplane/example/example.p4 // P4 program for the dataplane
nerpa_controlplane/example/commands.txt // Initial commands to configure the P4 switch
nerpa_controlplane/example/example.ovsschema // Schema for the OVSDB management plane

These files can also be created using a convenience script: ./scripts/create-new-nerpa.sh example.

In addition to defining the management plane, the .ovsschema is used to add and remove switches on the network.

The switch client configuration has the following fields:

  • target - string - hardware/software entity hosting P4Runtime
  • device_id - integer - unique identifier for the target P4 device
  • role_id - integer - desired role ID for the controller
  • is_primary - boolean - whether or not this client is the primary controller for that (device, role)

The user is responsible for managing (device_id, role_id) pairs and should follow the "arbitration mechanism outside of the server" described in the P4Runtime spec. We recommend reading that section of the spec carefully; doing so avoids conflicting configurations and errors when connecting multiple controllers to the P4Runtime server.

Each client configuration also includes a UUID generated by OVSDB, which serves as a unique identifier for that specific client.
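
For illustration, ovsdb2ddlog generates DDlog input relations from the .ovsschema, so a client configuration row roughly corresponds to a relation of the following shape. This is a minimal sketch assuming a Client table with the fields above; the field types shown (bit<128> for the UUID, bit<64> for the integer columns) are illustrative assumptions, not the exact ovsdb2ddlog output.

// Hypothetical input relation derived from the Client table; field types
// are illustrative assumptions.
input relation Client(
    _uuid: bit<128>,     // row UUID assigned by OVSDB; identifies this client
    target: string,      // hardware/software entity hosting P4Runtime
    device_id: bit<64>,  // unique identifier for the target P4 device
    role_id: bit<64>,    // desired role ID for the controller
    is_primary: bool     // whether this client is the primary for its role
)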

Once these files are written, the Nerpa program can be built by running the build script: ./scripts/build-nerpa.sh nerpa_controlplane/example example. You can also execute the steps in the build script individually, as long as DDlog has been installed. Note that we recommend using the build script, so that all software ends up in the locations expected by the runtime script.

If you are building a new Nerpa program after building a different example (ex., nerpa_controlplane/previous/), you may run into Cargo build errors due to conflicting dependencies. One potential source of errors may be the previous program's DDlog crate. Removing it can resolve these issues:

rm -rf nerpa_controlplane/previous/previous_ddlog

Run

A built Nerpa program can be run using the runtime script. This script (1) configures and runs a P4 software switch; (2) configures and runs the OVSDB management plane; and (3) runs the controller program.

The runtime script's usage is the same as the build script:

./scripts/run-nerpa.sh nerpa_controlplane/example example

If you did not previously use the scripts that build dependencies and Nerpa, you must ensure that all software dependencies are in the expected locations for the runtime script.

Test

The snvs example program includes an automatic test program to check that the MAC learning table functions as expected. To use it, first build it with:

(cd nerpa_controlplane/snvs/ && cargo build)

Then start the behavioral model with the -s option to enable the automated tests:

scripts/run-nerpa.sh -s nerpa_controlplane/snvs snvs

Once it's started (which takes about 2 seconds), from another console run the tests:

nerpa_controlplane/snvs/target/debug/test-snvs ipc://bmv2.ipc

The test will print its progress. If it succeeds, it will print Success! On failure, it will panic before that point.

Writing a Nerpa Program

Write/Build Process

Writing and building a Nerpa program involves several steps. We lay those out here for clarity and to reduce pitfalls for new Nerpa programmers. All of these are steps of the build script, scripts/build-nerpa.sh. That script shows the specific commands and syntax to use if you are rolling your own build process.

  1. Create default Nerpa program files: a P4 program, a DDlog program, an OVSDB schema, and P4 switch configuration commands. This is described above.

  2. Design the OVSDB schema for the management plane and generate DDlog relations using ovsdb2ddlog. Note that OVSDB is also used to add switches to the network configuration.

  3. Write the P4 program. Compile it, making sure to generate P4Runtime files.

  4. Generate DDlog relations and related utilities from the dataplane program by calling p4info2ddlog (a hypothetical sketch of a generated relation appears after this list). Note that running the full build script compiles the stub DDlog program and builds the crate, which can take several minutes.

In order, p4info2ddlog does the following:

  • Generate DDlog output relations representing P4 tables and actions
  • Generate DDlog input relations representing digest messages from P4
  • Generate Cargo.toml for the nerpa_controller crate, so it correctly imports all DDlog-related crates
  • Create the dp2ddlog crate, which can convert digests and packets to DDlog relations
  5. Write the DDlog program. This implements the control-plane rules. At this point, all relations necessary for import should have been generated.

  6. Generate necessary files and build the ovsdb_client crate. Even if your program does not use an OVSDB management plane, nerpa_controller depends on this import.

  7. Build the controller crate. The name of the imported DDlog crate must be changed before building.
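
To make step 4 more concrete, the following is a rough sketch of the kind of relation p4info2ddlog generates for a P4 table, loosely modeled on the snvs example. The table name MyIngress.MyTable, the action MyIngress.SetPort, the input relation LearnedMac, and all field types are hypothetical; the real generator's naming scheme and signatures may differ.

// Hypothetical relations for a P4 table "MyIngress.MyTable" with one
// exact-match key "dst_mac" and one action "MyIngress.SetPort(port)".
typedef MyIngress_MyTableAction = MyIngress_MyTableActionMyIngress_SetPort{port: bit<9>}
output relation MyIngress_MyTable(dst_mac: bit<48>, action: MyIngress_MyTableAction)

// A hypothetical input relation and a control-plane rule (step 5) that
// populates the table relation from it, always choosing port 1.
input relation LearnedMac(mac: bit<48>)
MyIngress_MyTable(mac, MyIngress_MyTableActionMyIngress_SetPort{9'd1}) :- LearnedMac(mac).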

Assumptions

The Nerpa programming framework embeds some assumptions about the structure within P4 and DDlog programs. These are documented below.

  • Multicast: a DDlog output relation meant to push multicast group IDs to the switch must contain "multicast" in its name (case-insensitive). A multicast relation must have two records, one representing the group ID and one representing the port. The group ID record name should include "id" (case-insensitive), and the port record name should include "port" (case-insensitive). A sketch of relations satisfying these naming rules follows this list.

  • PacketOut: a DDlog output relation can contain packets to send as PacketOut messages over the P4Runtime API. Such a relation must be a NamedStruct, and its name must contain "packet" (case-insensitive). One of its output Records must represent the packet to send as an Array; its name should include "packet" (case-insensitive). All other fields represent packet metadata fields in the PacketOut struct (the P4 struct with controller header packet_out).

  • Modifying a table's default action: If a P4 table does not have a constant default_action, its default action can be modified. For such tables, p4info2ddlog generates an output relation with "DefaultAction" in its name. When nerpa_controller processes an output relation whose name includes "DefaultAction", it constructs the table entry update following P4Runtime's requirement for modifying a default action: a Write RPC with a TableEntry message containing an empty FieldMatch field. Because this differs from the typical construction of a table entry update, other relations must not include "DefaultAction" (case-insensitive) in their names; doing so would cause errors when updating the accompanying P4 table entries.

  • Device-specific relations: Relations can be used to update table entries on a specific device. To do this, include a field called client_id of type int in an output relation. The field's value should be the UUID of the corresponding row in OVSDB's Client table; that row configures the client that communicates with the specified device over P4Runtime. If an output relation contains this field, nerpa_controller writes the output only to the switch with the matching UUID.
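
To tie these conventions together, here is a minimal sketch of output relations that would satisfy the naming rules above. Every relation name, field name, and type below is an illustrative assumption chosen only to match the rules; none of this is generated output.

import std  // for Vec (assumes the standard library is not already imported)

// Multicast: the relation name contains "multicast"; one field name
// contains "id" (the group ID) and another contains "port".
output relation MulticastGroup(mcast_id: bit<16>, port: bit<9>)

// PacketOut: the relation name contains "packet"; the field whose name
// contains "packet" carries the payload, and remaining fields map to the
// metadata fields of the packet_out controller header.
output relation PacketOutput(packet: Vec<bit<8>>, egress_port: bit<9>)

// Device-specific relation: client_id holds the UUID of the matching row
// in OVSDB's Client table (bigint is a placeholder for the integer type
// described above), so nerpa_controller writes these entries only to the
// corresponding switch.
output relation AclPerSwitch(dst_mac: bit<48>, client_id: bigint)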

nerpa's People

Contributors

blp, debnil

nerpa's Issues

Cache action and table translation between DDlog and P4

Our current implementation of the controller has to map a DDlog output relation to the correct P4 table and action, to make sure that it specifies the correct table and action in the table entry update. We do not currently cache this operation, since the controller runs only once. However, it would be expensive to do this translation every time changes are pushed. Once OVSDB is hooked in, we should design and implement a sensible cache.

Alternatively, we can change the generation of the DDlog output relations, so that the P4 table/action name and the DDlog output relation name have the same string representation. This would obviate the translation altogether.

p4c-of omits needed ${} in DDlog action matches

When p4c-of generates match patterns that include variables, the actions that it generates include the variables as literal strings rather than interpolating their values.

I see three examples in snvs.dl as currently generated:

SnvsIngress_InputVlanActionSnvsIngress_SetVlan{vid_0} -> "move(vid_0->vid), load(1->present), resubmit(,6)",
SnvsIngress_InputVlanDefaultActionSnvsIngress_SetVlan{vid_0} -> "move(vid_0->vid), load(1->present), resubmit(,6)",
SnvsIngress_LearnedDstDefaultActionSnvsIngress_KnownDst{port_1} -> "move(port_1->${r_SnvsIngress_output(false)}), resubmit(,23)",

Here's what should be generated instead:

SnvsIngress_InputVlanActionSnvsIngress_SetVlan{vid_0} -> "move(${vid_0}->vid), load(1->present), resubmit(,6)",
SnvsIngress_InputVlanDefaultActionSnvsIngress_SetVlan{vid_0} -> "move(${vid_0}->vid), load(1->present), resubmit(,6)",
SnvsIngress_LearnedDstDefaultActionSnvsIngress_KnownDst{port_1} -> "move(${port_1}->${r_SnvsIngress_output(false)}), resubmit(,23)",

ovsdb_client: replace timer with latch

Currently, the ovsdb_client has a timer-based event processing loop. It tries to read updates from OVSDB, waits a few seconds, and tries again. Ben has included latch, the correct primitive to fix this, in ofp4 (#39). Once that is merged, the ovsdb_client loop should be rewritten to use this.

Test controller end-to-end

Now that we have some tests written for the P4 library, we want to write an end-to-end test for the controller. This should do the following:

  1. spawn a software switch
  2. set the config of the switch
  3. build a table entry
  4. use the DDlog output relation to figure out some packets to send and the expected results
  5. send real packets to the switch
  6. confirm the packets were received correctly

Investigate use of u16 in p4ext

Currently, in the p4ext crate, we use u16 to represent values for match-action keys and action parameters. This seems overly restrictive. That crate provides a nicer interface to the generated p4runtime and p4runtime_grpc files; there, these same values are represented by Vec<u8>.

We should investigate any deeper design decision behind the use of u16, in case it has been forgotten. If not, we probably should convert these values into Vec<u8>.

scripts: make build faster

Is your feature request related to a problem? Please describe.

Building Nerpa is slow. A large part of this is because we build many Rust crates in scripts/build-nerpa.sh. It will be useful to figure out how we can call cargo build fewer times.

Describe the solution you'd like

Deleting several cargo build calls from scripts/build-nerpa.sh is likely the best approach.

simple_switch_grpc build failed

Hello,

I'm interested in learning more about this project.
Following the instructions specified here - https://github.com/vmware/nerpa#installation
I see the following error while trying to build the project.

[...]
Byte-compiling python modules...
bmpy_utils.pyruntime_CLI.pyp4dbg.pynanomsg_client.py
Byte-compiling python modules (optimized versions) ...
bmpy_utils.pyruntime_CLI.pyp4dbg.pynanomsg_client.py
make[2]: Leaving directory '/root/nerpa/nerpa-deps/behavioral-model/tools'
make[1]: Leaving directory '/root/nerpa/nerpa-deps/behavioral-model/tools'
make[1]: Entering directory '/root/nerpa/nerpa-deps/behavioral-model'
make[2]: Entering directory '/root/nerpa/nerpa-deps/behavioral-model'
make[2]: Nothing to be done for 'install-exec-am'.
make[2]: Nothing to be done for 'install-data-am'.
make[2]: Leaving directory '/root/nerpa/nerpa-deps/behavioral-model'
make[1]: Leaving directory '/root/nerpa/nerpa-deps/behavioral-model'
+ cd targets/simple_switch_grpc
+ ./autogen.sh
./p4-guide/bin/install-p4dev-v2.sh: line 477: ./autogen.sh: No such file or directory

This issue seems to be coming from the behavioral-model repo. The autogen.sh script seems to be missing from upstream main under targets/simple_switch_grpc, but it is present in the 1.15.0 release branch. As a workaround, I tried building the 1.15.0 release version of behavioral-model instead, but the build fails later on.

behavioral-model$ git branch 
* main
behavioral-model$ ls -l targets/simple_switch_grpc/autogen.sh
ls: cannot access 'targets/simple_switch_grpc/autogen.sh': No such file or directory

behavioral-model$ git branch 
* 1.15.x
  1.9.x
  main
behavioral-model$ ls -l targets/simple_switch_grpc/autogen.sh
-rwxrwxr-x 1 piyengar piyengar 31 Jul  8 16:16 targets/simple_switch_grpc/autogen.sh

Running inside an Ubuntu 18.04 LXC container

root@ofp4_lxc:~/nerpa# hostnamectl 
   Static hostname: ofp4_lxc
         Icon name: computer-container
           Chassis: container
        Machine ID: c46ecc0d6a2549e19a88d909f1c26a30
           Boot ID: 3b87967db8334411b8254d3dc2640f6f
    Virtualization: lxc
  Operating System: Ubuntu 18.04.6 LTS
            Kernel: Linux 5.11.0-40-generic
      Architecture: x86-64

Requesting some assistance to debug this, if possible.

Thanks,
Prashanth

README: add system diagram

We want to add a system diagram to the README. This should be similar to Ben's diagram from the Milestone 1 presentation, shown below. It may need some editing to make sure we are not misrepresenting the current system.

nerpa-snvs-sys-diagram

.github: override issue templates

Is your feature request related to a problem? Please describe.

The GitHub org provides issue templates. These add annoying friction.

Describe the solution you'd like

DDlog has a workaround here.

Additional context

Mostly filing an issue so I remember how to do this after lunch, I am hungry.

p4info2ddlog: handle P4 errors

We need to properly handle errors from the dataplane in p4info2ddlog, when converting a P4 struct to a DDlog input relation.

Document p4 extension library

We want to document p4ext, the P4 extension library. This is also an opportunity to streamline various interfaces to functions.

p4ext: rewrite library tests

We need to rewrite testing for the p4ext library. The tests don't currently work as expected. This should start as part of #27, in case certain p4ext functions are trimmed.

As part of this, we should parallelize running any async tests (as in #3).

ofp4: applying Boolean logic to table.apply().hit result miscompiles to openflow

In the current commit on the ofp4 branch, which is commit b8611d3 ("ofp4: Implement zero-extension for copying between fields."), the following in tests/snvs.p4:

      // Output VLAN processing, including priority tagging.
      bool tag_vlan = OutputVlan.apply().hit;
      VlanID vid = tag_vlan ? meta.vlan : 0;
      bool include_vlan_header = tag_vlan || PriorityTagging.apply().hit;
      if (include_vlan_header && hdr.vlan.present == 0) {
          hdr.vlan.present = 1;
          hdr.vlan.vid = vid;
      } else if (!include_vlan_header && hdr.vlan.present == 1) {
          hdr.vlan.present = 0;
      }

hits the isBool() case here in backend.cpp:

static CFG::Node* findActionSuccessor(const CFG::Node* node, const IR::P4Action* action) {
    for (auto e : node->successors.edges) {
        if (e->isUnconditional()) {
            return e->endpoint;
        } else if (e->isBool()) {
            return nullptr;
        } else {
            // switch statement
            if (e->label == action->name) {
                return e->endpoint;
            }
        }
    }
    return nullptr;
}

which means that the generated ddlog ends up generating resubmit(,0) like this:

// SnvsEgress.OutputVlan
Flow("table=29, priority=${priority}, ${r_out_port(true)}=${port}/0xffff, ${r_vlan(true)}=${vlan as bit<32> << 8}/0xfff00 actions=${actions}") :- SnvsEgress_OutputVlan(port, vlan, priority, action),
   var actions = match(action) {
    SnvsEgress_OutputVlanAction_NoAction{} -> "resubmit(,0)"
}.

// SnvsEgress.OutputVlan
Flow("table=29, priority=1 actions=resubmit(,0)").

The effect of resubmit(,0) is to infinitely loop through the flow table (until OVS kills it with the depth limit).

In an initial quick discussion, @mbudiu-vmw suggested:

Then a workaround may be to only check .hit in an if

No boolean expression

The cfg pass assumes some bmv2 specific invariant that may not be established by the prior passes

nerpa_controlplane/snvs: stop hard-coding management plane

Currently, snvs.dl hard-codes several Ports. We should commit an initial configuration for the management plane.

This likely requires changes to the README and the run script. We may need to check for an initial configuration in the build script as well.

p4info2ddlog: accept more standard P4 syntax

Currently, a Nerpa programmer has to format various P4 entity names to be compatible with DDlog relations. This includes capitalizing P4 types and using camel case, rather than snake case, for table names.

We should write some utility functions so that p4info2ddlog can accept a fully standard-looking P4 program and correctly generate DDlog relations.

Build is broken

p4c has tweaked its CMake files recently, and now this build is broken:

-- Start configuring OFP4 back end
CMake Error at extensions/ofp4/CMakeLists.txt:51 (build_unified):
  Unknown CMake command "build_unified".

p4c-of: Handling overlapping matches for a single field

p4c-of does not properly handle an OF_SeqMatch that has multiple matches on the same bits within a field. ofvisitors.cpp has the following explanatory comment:

               /* Masks from different matches overlap.  There are three cases:
                 *
                 *     1. The values are constants and bits in corresponding
                 *        positions are the same. Then the overlap makes no
                 *        difference.  We could handle this here by tracking
                 *        constant bits that overlap and verifying that they
                 *        are the same.
                 *
                 *     2. The values are constants and there is at least one
                 *        difference in the values for corresponding
                 *        positions. Then the overlap means that the flow
                 *        cannot possibly match.  By the time we arrive here to
                 *        print the match, it is too late to handle this
                 *        correctly.
                 *
                 *     3. At least one value is expanded from a variable. The
                 *        value bits might match or might not.  We would have
                 *        to do a dynamic comparison in DDlog code; again, it
                 *        is too late by the time we arrive here to print the
                 *        match.
                 *
                 * The code already here handles case #1 correctly, but not
                 * case #2 or #3, and can't yet distinguish.
                 */

Alternative build process

I wrote the following script today; it is a simpler way to build Nerpa and its dependencies than the rather elaborate, distro-specific, and (in my experience) error-prone approach of using jafingerhut's install-p4dev-v2.sh script, which is what scripts/install-nerpa.sh delegates to. I'd like to integrate it into Nerpa, but I'm not sure of the best way to do it. We could replace the existing scripts/install-nerpa.sh, we could add this one as an alternative, or we could adopt ideas from it into the existing script.

Run async tests in separate process

We want to run async tests in separate processes. We use rusty-fork to do this in sync tests (#2). Once outstanding pull requests are merged in rusty-fork, we can use the same approach for async tests.
