stratum / fabric-tna
The SD-Fabric data plane
Home Page: https://docs.sd-fabric.org/
gen-p4-constants.py should emit constants for the bitwidths of P4 fields and headers, so we can remove stuff like this:
We currently use the image hosted in the Aether private Docker registry. To make it easier for the community to run PTF tests, we should use the version of the image that is available on Docker Hub. If we haven't published stratum_bfrt to Docker Hub yet, we should do so ASAP.
We observed this issue in the production pod; the cause is still unclear.
This is especially evident when monitoring high-bandwidth (~10 Gbps) TCP flows generated by iperf: DeepInsight shows a rate of dropped reports that is proportional to, and in most cases the same as, the rate of successfully processed reports:
DeepInsight uses the seq_no field in the INT report fixed header to detect dropped reports. In an iperf test, the INT reports delivered to the server have missing seq_nos. From this pcap trace, we see that reports have seq_no:
07 88 0b a0
07 88 0b a1
07 88 0b a3 # skipped 1
07 88 0b a5 # skipped 1
07 88 0b a7 # skipped 1
07 88 0b a9 # skipped 1
07 88 0b ab # skipped 1
07 88 0b ad # skipped 1
07 88 0b b1 # skipped 3
07 88 0b b3 # skipped 1
07 88 0b b5 # skipped 1
07 88 0b b7 # skipped 1
07 88 0b b9 # skipped 1
07 88 0b bb # skipped 1
07 88 0b bd # skipped 1
07 88 0b bf # skipped 1
07 88 0b c3 # skipped 2
...
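The skipped counts annotated above can be computed mechanically by diffing consecutive seq_no values from the pcap. A minimal sketch (the count_dropped helper is hypothetical, not part of the test suite):

```python
def count_dropped(seq_nos):
    """Given INT report seq_no values in arrival order, return the total
    number of reports skipped between consecutive values."""
    dropped = 0
    for prev, cur in zip(seq_nos, seq_nos[1:]):
        # a delta of 1 means no loss; anything larger means skipped reports
        dropped += cur - prev - 1
    return dropped

# seq_no values from the start of the trace above
trace = [0x07880ba0, 0x07880ba1, 0x07880ba3, 0x07880ba5, 0x07880ba7]
print(count_dropped(trace))  # 3 skipped in this prefix
```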
We don't believe it's an issue with seq_no computation in Tofino, as the issue cannot be reproduced when generating low-bit-rate traffic. Instead, we believe it is connected to how we use mirroring sessions and/or recirculation ports, and to the fact that the port attached to the DI server is 10G. The issue does not manifest when running a similar test on the staging server, where the DI port is 40G. On 03/30/2020, we observed the same issue on the staging pod, which uses 40G interfaces for the collector.
The ingress byte counters vary by 4 bytes and the egress counters vary by 50 bytes for FabricSpgwUplinkTest and FabricSpgwDownlinkTest. We need to check how counters work on Tofino and understand whether this behavior is expected.
When reading from the device, direction is always 0x0. The Stratum log suggests it's one of those cases where the compiler optimizes out action arguments that write to unused PHVs.
19:56:55.347 WARN [P4RuntimeFlowRuleProgrammable] Table entry obtained from device device:leaf1 is different from one in in translation store: device=PiTableEntry{tableId=FabricIngress.spgw_ingress.interface_lookup, matchKey={ipv4_dst_addr=0xc0a8fb01/32, gtpu_is_valid=0x1}, tableAction=FabricIngress.spgw_ingress.set_source_iface(skip_spgw=0x0, src_iface=0x1, direction=0x0), priority=N/A, timeout=PERMANENT}, store=PiTableEntry{tableId=FabricIngress.spgw_ingress.interface_lookup, matchKey={ipv4_dst_addr=0xc0a8fb01/32, gtpu_is_valid=0x1}, tableAction=FabricIngress.spgw_ingress.set_source_iface(skip_spgw=0x0, src_iface=0x1, direction=0x1), priority=N/A, timeout=PERMANENT}
19:56:55.348 WARN [P4RuntimeFlowRuleProgrammable] Table entry obtained from device device:leaf1 is different from one in in translation store: device=PiTableEntry{tableId=FabricIngress.spgw_ingress.interface_lookup, matchKey={ipv4_dst_addr=0xafa0000/16, gtpu_is_valid=0x0}, tableAction=FabricIngress.spgw_ingress.set_source_iface(skip_spgw=0x0, src_iface=0x2, direction=0x0), priority=N/A, timeout=PERMANENT}, store=PiTableEntry{tableId=FabricIngress.spgw_ingress.interface_lookup, matchKey={ipv4_dst_addr=0xafa0000/16, gtpu_is_valid=0x0}, tableAction=FabricIngress.spgw_ingress.set_source_iface(skip_spgw=0x0, src_iface=0x2, direction=0x2), priority=N/A, timeout=PERMANENT}
For example:
************************************************
STARTING PTF TESTS...
************************************************
python -u ptf_runner.py --device stratum-bfrt --port-map port_map.veth.json --ptf-dir fabric.ptf --cpu-port 320 --device-id 1 --grpc-addr "127.0.0.1:28000" --p4info /p4c-out/p4info.txt --tofino-pipeline-tar /p4c-out/pipeline.tar.bz2 test.FabricIPv4UnicastGtpTest
INFO:PTF runner:Sending P4 config
INFO:PTF runner:Executing PTF command: ptf --test-dir fabric.ptf -i 0@veth1 -i 1@veth3 -i 2@veth5 -i 3@veth7 -i 4@veth9 -i 5@veth11 -i 6@veth13 -i 7@veth15 --test-params=p4info='/p4c-out/p4info.txt';grpcaddr='127.0.0.1:28000';device_id='1';cpu_port='320';device='stratum-bfrt' test.FabricIPv4UnicastGtpTest
WARNING: No route found for IPv6 destination :: (no default route?)
test.FabricIPv4UnicastGtpTest ... FAIL
======================================================================
FAIL: test.FabricIPv4UnicastGtpTest
----------------------------------------------------------------------
Traceback (most recent call last):
File "/fabric-p4test/tests/ptf/base_test.py", line 813, in handle
return f(*args, **kwargs)
File "fabric.ptf/test.py", line 115, in runTest
self.runIPv4UnicastTest(pkt, next_hop_mac=HOST2_MAC)
File "/fabric-p4test/tests/ptf/fabric_test.py", line 864, in runIPv4UnicastTest
testutils.verify_packet(self, exp_pkt, self.port2)
File "/usr/local/lib/python2.7/dist-packages/ptf/testutils.py", line 2546, in verify_packet
% (device, port, result.format()))
AssertionError: Expected packet was not received on device 0, port 2.
========== EXPECTED ==========
dst : DestMACField = '00:00:00:00:00:02' (None)
src : SourceMACField = '00:00:00:00:aa:01' (None)
type : XShortEnumField = 2048 (0)
--
version : BitField = 4 (4)
ihl : BitField = None (None)
tos : XByteField = 0 (0)
len : ShortField = None (None)
id : ShortField = 1 (1)
flags : FlagsField = 0 (0)
frag : BitField = 0 (0)
ttl : ByteField = 63 (64)
proto : ByteEnumField = 17 (0)
chksum : XShortField = None (None)
src : Emph = '10.0.3.1' (None)
dst : Emph = '10.0.4.1' ('127.0.0.1')
options : PacketListField = [] ([])
--
sport : ShortEnumField = 2152 (53)
dport : ShortEnumField = 2152 (53)
len : ShortField = None (None)
chksum : XShortField = None (None)
--
version : BitField = 1 (1)
PT : BitField = 1 (1)
reserved : BitField = 0 (0)
E : BitField = 0 (0)
S : BitField = 0 (0)
PN : BitField = 0 (0)
gtp_type : ByteField = 255 (255)
length : ShortField = None (None)
teid : IntField = 4009738480 (0)
--
version : BitField = 4 (4)
ihl : BitField = None (None)
tos : XByteField = 0 (0)
len : ShortField = None (None)
id : ShortField = 1 (1)
flags : FlagsField = 0 (0)
frag : BitField = 0 (0)
ttl : ByteField = 64 (64)
proto : ByteEnumField = 17 (0)
chksum : XShortField = None (None)
src : Emph = '10.0.1.1' (None)
dst : Emph = '10.0.2.1' ('127.0.0.1')
options : PacketListField = [] ([])
--
sport : ShortEnumField = 5061 (53)
dport : ShortEnumField = 5060 (53)
len : ShortField = None (None)
chksum : XShortField = None (None)
--
load : StrField = '\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab' ('')
--
0000 00 00 00 00 00 02 00 00 00 00 AA 01 08 00 45 00 ..............E.
0010 00 C0 00 01 00 00 3F 11 60 2B 0A 00 03 01 0A 00 ......?.`+......
0020 04 01 08 68 08 68 00 AC 08 D4 30 FF 00 9C EE FF ...h.h....0.....
0030 C0 F0 45 00 00 9C 00 01 00 00 40 11 63 4F 0A 00 [email protected]..
0040 01 01 0A 00 02 01 13 C5 13 C4 00 88 D5 68 AB AB .............h..
0050 AB AB AB AB AB AB AB AB AB AB AB AB AB AB AB AB ................
0060 AB AB AB AB AB AB AB AB AB AB AB AB AB AB AB AB ................
0070 AB AB AB AB AB AB AB AB AB AB AB AB AB AB AB AB ................
0080 AB AB AB AB AB AB AB AB AB AB AB AB AB AB AB AB ................
0090 AB AB AB AB AB AB AB AB AB AB AB AB AB AB AB AB ................
00a0 AB AB AB AB AB AB AB AB AB AB AB AB AB AB AB AB ................
00b0 AB AB AB AB AB AB AB AB AB AB AB AB AB AB AB AB ................
00c0 AB AB AB AB AB AB AB AB AB AB AB AB AB AB ..............
========== RECEIVED ==========
1 total packets. Displaying most recent 1 packets:
------------------------------
dst : DestMACField = '00:00:00:00:00:02' (None)
src : SourceMACField = '00:00:00:00:aa:01' (None)
type : XShortEnumField = 2048 (0)
--
version : BitField = 3L (4)
ihl : BitField = 0L (None)
tos : XByteField = 255 (0)
len : ShortField = 156 (None)
id : ShortField = 61183 (1)
flags : FlagsField = 6L (0)
frag : BitField = 240L (0)
ttl : ByteField = 69 (64)
proto : ByteEnumField = 0 (0)
chksum : XShortField = 192 (None)
src : Emph = '0.1.0.0' (None)
dst : Emph = '63.17.96.43' ('127.0.0.1')
options : PacketListField = [<IPOption copy_flag=0L optclass=control option=experimental_measurement length=0 value='\x03\x01\n\x00\x04\x01\x08h\x08h\x00\xac\x08\xd4\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab' |>, <IPOption_MTU_Probe copy_flag=1L optclass=1L option=mtu_probe length=171 |>] ([])
--
load : StrField = '\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab\xab' ('')
--
0000 00 00 00 00 00 02 00 00 00 00 AA 01 08 00 30 FF ..............0.
0010 00 9C EE FF C0 F0 45 00 00 C0 00 01 00 00 3F 11 ......E.......?.
0020 60 2B 0A 00 03 01 0A 00 04 01 08 68 08 68 00 AC `+.........h.h..
0030 08 D4 AB AB AB AB AB AB AB AB AB AB AB AB AB AB ................
0040 AB AB AB AB AB AB AB AB AB AB AB AB AB AB AB AB ................
0050 AB AB AB AB AB AB AB AB AB AB AB AB AB AB AB AB ................
0060 AB AB AB AB AB AB AB AB AB AB AB AB AB AB AB AB ................
0070 AB AB AB AB AB AB AB AB AB AB AB AB AB AB AB AB ................
0080 AB AB AB AB AB AB AB AB AB AB AB AB AB AB AB AB ................
0090 AB AB AB AB AB AB AB AB AB AB AB AB AB AB AB AB ................
00a0 AB AB AB AB AB AB AB AB AB AB AB AB AB AB AB AB ................
00b0 AB AB ..
==============================
We observed this issue in production with org.stratumproject.fabric-spgw.stratum_bfrt.mavericks_sde_9_2_0.
The attached evidence includes:
The ping reply coming out of the switch is delivered at the eNB with the wrong IPv4 total length. The reported length is 16 bytes longer than the captured packet:
In Aether, fabric switches programmed with fabric-tna are supposed to forward SCTP traffic between the eNB and the mobile control plane. For this reason, we should add SCTP to the packet types tested with PTF (currently TCP, UDP, GTP, and ICMP).
As per a suggestion from Vladimir/Barefoot, we should not use @auto_init_metadata and should instead set metadata explicitly.
Today we use egress-to-egress mirroring to generate INT reports. When processing such a mirrored packet, we use the egress parser to remove headers we don't want to show up in the INT report (e.g., GTP-U, MPLS). However, that results in a quite intricate parser implementation, with #ifdefs enabling/disabling such branching inside the parser states for the different profiles.
It looks like we could obtain the same behavior in a much simpler and more elegant way by using sub-parsers:
https://p4.org/p4-spec/docs/P4-16-v1.0.0-spec.html#sec-invoke-subparser
In the egress parser, we could have two sub-parsers: (1) for regular packets and (2) for INT report mirrors. Upon detecting the packet type, we could invoke (1) or (2). (2) would have states that by default skip parsing unwanted headers.
If we care about preserving parser resources for the different profiles, we would have to use #ifdef only to wrap (2) and its invocation.
Note that if/when DeepInsight becomes capable of handling tunneling protocols, we won't need to care about removing headers...
Most likely because the egress sub-parser for INT mirrors ignores the fake ethertype.
Switch info table init is in the PacketIn test run function. In loopback mode, dataplane packets also come in as packet-ins; since cpu_port is set only in the PacketIn test, all the other loopback tests fail. Plan: move the switch info table init to the test setUp().
We should move to the bfrt device config builder tool added in this PR https://github.com/stratum/stratum-bfrt/pull/34.
This might require some packaging work on the Stratum side.
The compiler output shows that the p4pp file doesn't change after adding -DWITH_DEBUG to the Makefile.
Because of #118, we no longer pass the necessary context.json and other files required by tofino-model to produce logs with symbol names (table names, action names, etc.)
Possible options:
- change the run script to unpack the binary config file to a temporary directory and pass that to tofino-model
- change the run script to mount the content of ./tmp instead of ./src/main/resources/p4c-out inside the tofino-model container (maybe move/rename ./tmp to something more meaningful, such as ./p4src/build)

The egress will decrease the TTL for every valid IP packet. However, if an IP packet is bridged, the TTL should not be updated.
A possible solution is to introduce a metadata bit that controls the TTL decrement: set it to false by default, and flip it to true only in action set_next_id_routing_v4.
The "simple" next table was introduced in fabric-v1model as an alternative to the hashed table for HW targets that did not support action selectors. We do have support for action selectors in TNA so that logic is obsolete and should be removed.
That means removing:
- #ifdef WITH_SIMPLE_NEXT (TODO: add a profile with simple next or remove references)

Issue with register reset annotations and tofino-model:
bf_status_t bf_pal_pltfm_type_get(bf_dev_id_t dev_id, bool *is_sw_model);

We use the term "INT" everywhere, but that's not correct. Our current implementation does not support inband telemetry. We do support the standard telemetry report format (v0.5), but we're not appending INT metadata to data packets.
We should replace current references to int with something else, like dtel (as in data plane telemetry) or tel.
Current sizing is the default for the bmv2 build of the old fabric-v1model, which is too small. We have plenty of memory resources to make our tables bigger without worrying about pipeline optimizations.
Not sure why we abandoned #6.
Should we rewrite history?
Last time we asked BF they said:
We can allow to publish p4 program + context.json, tofino.bin, and bfrt.json/p4info.txt, no other compilation artifacts.
The .conf file is not mentioned, but we already publish part of it as part of Stratum: tofino_skip_p4.conf. The diff between tofino_skip_p4.conf and fabric-tna.conf is mostly file paths; I can't see any other sensitive information.
However, the tarball contains much more that must be removed:
.
├── bfrt.json
├── fabric-tna.conf
├── fabric-tna.p4pp
├── manifest.json
├── p4info.txt
└── pipe
├── context.json
├── fabric-tna.bfa
├── fabric-tna.dynhash.json
├── fabric-tna.prim.json
├── graphs
│ ├── FabricEgress.dot
│ ├── FabricEgressDeparser.dot
│ ├── FabricEgressParser.dot
│ ├── FabricIngress.dot
│ ├── FabricIngressDeparser.dot
│ ├── FabricIngressParser.dot
│ ├── dep.json
│ ├── egress.power.dot
│ ├── ingress.power.dot
│ ├── placement_graph.dot
│ ├── power_graph.dot
│ ├── program_graph.dot
│ └── table_dep_graph_placement_0.dot
├── logs
│ ├── flexible_packing.log
│ ├── mau.characterize.log
│ ├── mau.json
│ ├── mau.resources.log
│ ├── metrics.json
│ ├── pa.characterize.log
│ ├── pa.results.log
│ ├── parser.characterize.log
│ ├── parser.log
│ ├── phv.json
│ ├── phv_allocation_0.log
│ ├── power.json
│ ├── pragmas.log
│ ├── resources.json
│ ├── table_dependency_graph.log
│ ├── table_placement_1.log
│ └── table_summary.log
└── tofino.bin
stratum-bfrt complains about:
E20200916 20:57:42.347164 40 bfrt_packetio_manager.cc:272] Return Error: DeparsePacketOut(packet, &buf) failed with StratumErrorSpace::ERR_INVALID_PARAM: 'it != packet.metadata().end()' is false. Missing metadata with Id 2 in PacketOut payload: "\034I{\266\036\025T\207\336\255\276\357\010\006\000\001\010\000\006\004\000\002T\207\336\255\276\357\300\250\373\001\034I{\266\036\025\300\250\373\313" metadata { metadata_id: 1 value: "\000\275" }
A fix for this already exists in #40, but we should fix master ASAP if we're not planning to merge #40 soon, as packet-out is broken when using stratum-bfrt.
Even if the P4 program is now chip-independent (#40), we still produce two pipeconf IDs for the different Tofino chip types (montara, mavericks), since some behaviors depend on the pipeconf ID to decide which entries to insert (e.g., for the switch_info table, or INT mirroring sessions).
The plan is to update the behaviors to avoid depending on the pipeconf ID and instead retrieve the chip type via gNMI. This might require support in ONOS core drivers.
We now replace the clone session/mirror with the copy_to_cpu flag for flows that copy packets to the CPU.
However, we still need a mirror for INT.
The current implementation of the mirror is not correct: the egress parser should check the mirror flag and parse the additional mirror header.
The TM container will always add veth pairs and set up DMA when running the tests.
We should do some cleanup after the tests, like removing all veth pairs and reverting the DMA settings.
Currently, we set the MPLS TTL value to a default one (64); however, we should copy the TTL from the IP header.
We also need to copy the TTL back to the IP header when we pop the MPLS label.
There is not much detail in the original MPLS architecture RFC on how to handle the TTL of IP packets:
https://tools.ietf.org/html/rfc3031#section-3.23
But there are some rules in RFC 3032 (MPLS Label Stack Encoding):
https://tools.ietf.org/html/rfc3032#section-2.4.3
2.4.3. IP-dependent rules
We define the "IP TTL" field to be the value of the IPv4 TTL field,
or the value of the IPv6 Hop Limit field, whichever is applicable.
When an IP packet is first labeled, the TTL field of the label stack
entry MUST BE set to the value of the IP TTL field. (If the IP TTL
field needs to be decremented, as part of the IP processing, it is
assumed that this has already been done.)
When a label is popped, and the resulting label stack is empty, then
the value of the IP TTL field SHOULD BE replaced with the outgoing
TTL value, as defined above. In IPv4 this also requires modification
of the IP header checksum.
It is recognized that there may be situations where a network
administration prefers to decrement the IPv4 TTL by one as it
traverses an MPLS domain, instead of decrementing the IPv4 TTL by the
number of LSP hops within the domain.
Also, there are some explanations on these websites:
https://www.ciscopress.com/articles/article.asp?p=680824&seqNum=4
http://wiki.kemot-net.com/mpls-ttl-behavior
which say we need to set the TTL value to TTL-1 from the previous header (on push/swap/pop).
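The RFC rules quoted above can be summarized in a tiny model of the intended push/swap/pop TTL behavior (a sketch with assumed function names, not the actual P4 implementation):

```python
def mpls_push(ip_ttl):
    # On the first label push, copy the (already decremented) IP TTL
    # into the label stack entry, per RFC 3032 2.4.3.
    return ip_ttl

def mpls_swap(label_ttl):
    # At each LSR hop, the label TTL is decremented, not the IP TTL.
    return label_ttl - 1

def mpls_pop(label_ttl):
    # When the stack becomes empty, write the outgoing TTL back into the
    # IP header (IPv4 also requires a checksum update, not modeled here).
    return label_ttl

# A packet entering the MPLS domain with IP TTL 64, decremented on ingress:
ip_ttl = 64 - 1                    # 63, IP processing before labeling
label_ttl = mpls_push(ip_ttl)      # 63 copied into the label
label_ttl = mpls_swap(label_ttl)   # 62 after one LSP hop
ip_ttl = mpls_pop(label_ttl)       # 62 written back to the IP header on pop
```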
Exactly as we do for fabric-v1model in ONOS, to allow catching breaking changes to table names, etc. at compile-time rather than run-time.
Now we are using multiple layers of for-loops to create combinations of input parameters, for example:
@autocleanup
def doRunTest(self, param1, param2, ...., paramN):
    # run the test with parameters

def runTest(self):
    for param1 in [...]:
        for param2 in [...]:
            # skip test for some combinations
            if param1 == A and param2 == B:
                continue
            for param3 in [....]:
                doRunTest(param1, param2, ..., paramN)
See comment below
We might be able to use Python decorators to make the tests more readable, for example:
@autocleanup
@test_param('param1', .......)
@test_param('param2', .......)
@skip_test_if(param1=A, param2=B)
@test_param('param3', .......)
def doRunTest(self, param1, param2, param3):
    # run the test with parameters

def runTest(self):
    doRunTest()
23:18:04.727 WARN [P4RuntimeFlowRuleProgrammable] Table entry obtained from device device:leaf1 is different from one in in translation store: device=PiTableEntry{tableId=FabricEgress.pkt_io_egress.switch_info, matchKey=DEFAULT-ACTION, tableAction=NoAction(), priority=N/A, timeout=PERMANENT}, store=PiTableEntry{tableId=FabricEgress.pkt_io_egress.switch_info, matchKey=DEFAULT-ACTION, tableAction=FabricEgress.pkt_io_egress.set_cpu_port(cpu_port=0x140), priority=N/A, timeout=PERMANENT}
The issue was observed after reloading the fabric-tna pipeconf app in a running ONOS instance (production pod).
The ONOS core does not contain any flow for the switch_info table, which is weird. Seems like a bug with the ONOS translation store, and with the fabric-tna pipeline initialization logic.
The current pipeconf build for bfrt is broken and always builds the old PI pipeline format. This causes the pipeline push to fail; it's a runtime issue.
I suspect commit e87824a broke something.
First bytes of the pushed pipeline:
00000000: 3f00 0000 6f72 672e 7374 7261 7475 6d70 ?...org.stratump
00000010: 726f 6a65 6374 2e66 6162 7269 632d 7370 roject.fabric-sp
00000020: 6777 2d69 6e74 2e73 7472 6174 756d 5f62 gw-int.stratum_b
00000030: 662e 6d6f 6e74 6172 615f 7364 655f 395f f.montara_sde_9_
00000040: 325f 30ac d217 0000 0000 48b4 0000 0002 2_0.......H.....
00000050: 6173 6d5f 7665 7273 696f 6e00 0600 0000 asm_version.....
ONOS log:
onos_1 | 21:43:10.030 INFO [PipelineConfigClientImpl] Setting pipeline config for device:leaf1 to org.stratumproject.fabric-spgw-int.stratum_bf.montara_sde_9_2_0...
We can use git filter-branch:
git filter-branch --tree-filter 'rm -rf src/main/resources/p4c-out' -- --all
This requires rewriting history for all branches.
FAIL: test.FabricIPv4UnicastGroupTestAllPortTcpSport
----------------------------------------------------------------------
Traceback (most recent call last):
File "/fabric-p4test/tests/ptf/base_test.py", line 1006, in handle
return f(*args, **kwargs)
File "/fabric-p4test/tests/ptf/base_test.py", line 982, in handle
return f(*args, **kwargs)
File "fabric.ptf/test.py", line 256, in runTest
[exp_pkt_to2, exp_pkt_to3], [self.port2, self.port3])
File "/fabric-p4test/tests/ptf/base_test.py", line 443, in verify_any_packet_any_port
return testutils.verify_any_packet_any_port(self, pkts, ports)
File "/usr/local/lib/python2.7/dist-packages/ptf/testutils.py", line 2707, in verify_any_packet_any_port
verify_no_other_packets(test, device_number=device_number)
File "/usr/local/lib/python2.7/dist-packages/ptf/testutils.py", line 2579, in verify_no_other_packets
"packets.\n%s" % (result.device, result.port, result.format()))
AssertionError: A packet was received on device 0, port 3, but we expected no packets.
========== RECEIVED ==========
0000 00 00 00 00 00 03 00 00 00 00 AA 01 08 00 45 00 ..............E.
0010 00 56 00 01 00 00 3F 06 64 A0 0A 00 01 01 0A 00 .V....?.d.......
0020 02 01 04 E5 00 50 00 00 00 00 00 00 00 00 50 02 .....P........P.
0030 20 00 77 6B 00 00 00 01 02 03 04 05 06 07 08 09 .wk............
0040 0A 0B 0C 0D 0E 0F 10 11 12 13 14 15 16 17 18 19 ................
0050 1A 1B 1C 1D 1E 1F 20 21 22 23 24 25 26 27 28 29 ...... !"#$%&'()
0060 2A 2B 2C 2D *+,-
==============================
FAIL: test.FabricIPv4UnicastGroupTestAllPortIpSrc
----------------------------------------------------------------------
Traceback (most recent call last):
File "fabric.ptf/test.py", line 405, in runTest
self.IPv4UnicastGroupTestAllPortL4SrcIp("udp")
File "/fabric-p4test/tests/ptf/base_test.py", line 1006, in handle
return f(*args, **kwargs)
File "/fabric-p4test/tests/ptf/base_test.py", line 982, in handle
return f(*args, **kwargs)
File "fabric.ptf/test.py", line 379, in IPv4UnicastGroupTestAllPortL4SrcIp
[exp_pkt_to2, exp_pkt_to3], [self.port2, self.port3])
File "/fabric-p4test/tests/ptf/base_test.py", line 443, in verify_any_packet_any_port
return testutils.verify_any_packet_any_port(self, pkts, ports)
File "/usr/local/lib/python2.7/dist-packages/ptf/testutils.py", line 2711, in verify_any_packet_any_port
"device %d.\n%s" % (ports, device_number, result.format()))
AssertionError: Did not receive any expected packet on any of ports [2, 3] for device 0.
FAIL: test.FabricIPv4UnicastGroupTestAllPortTcpDport
----------------------------------------------------------------------
Traceback (most recent call last):
File "/fabric-p4test/tests/ptf/base_test.py", line 1006, in handle
return f(*args, **kwargs)
File "/fabric-p4test/tests/ptf/base_test.py", line 982, in handle
return f(*args, **kwargs)
File "fabric.ptf/test.py", line 317, in runTest
[exp_pkt_to2, exp_pkt_to3], [self.port2, self.port3])
File "/fabric-p4test/tests/ptf/base_test.py", line 443, in verify_any_packet_any_port
return testutils.verify_any_packet_any_port(self, pkts, ports)
File "/usr/local/lib/python2.7/dist-packages/ptf/testutils.py", line 2711, in verify_any_packet_any_port
"device %d.\n%s" % (ports, device_number, result.format()))
AssertionError: Did not receive any expected packet on any of ports [2, 3] for device 0.
========== RECEIVED ==========
0 total packets.
==============================
FAIL: test.FabricIPv4UnicastGroupTestAllPortIpDst
----------------------------------------------------------------------
Traceback (most recent call last):
File "fabric.ptf/test.py", line 475, in runTest
self.IPv4UnicastGroupTestAllPortL4DstIp("tcp")
File "/fabric-p4test/tests/ptf/base_test.py", line 1006, in handle
return f(*args, **kwargs)
File "/fabric-p4test/tests/ptf/base_test.py", line 982, in handle
return f(*args, **kwargs)
File "fabric.ptf/test.py", line 450, in IPv4UnicastGroupTestAllPortL4DstIp
[exp_pkt_to2, exp_pkt_to3], [self.port2, self.port3])
File "/fabric-p4test/tests/ptf/base_test.py", line 443, in verify_any_packet_any_port
return testutils.verify_any_packet_any_port(self, pkts, ports)
File "/usr/local/lib/python2.7/dist-packages/ptf/testutils.py", line 2707, in verify_any_packet_any_port
verify_no_other_packets(test, device_number=device_number)
File "/usr/local/lib/python2.7/dist-packages/ptf/testutils.py", line 2579, in verify_no_other_packets
"packets.\n%s" % (result.device, result.port, result.format()))
AssertionError: A packet was received on device 0, port 2, but we expected no packets.
========== RECEIVED ==========
0000 00 00 00 00 00 02 00 00 00 00 AA 01 08 00 45 00 ..............E.
0010 00 56 00 01 00 00 3F 06 64 41 0A 00 01 01 0A 00 .V....?.dA......
0020 02 60 04 D2 00 50 00 00 00 00 00 00 00 00 50 02 .`...P........P.
0030 20 00 77 1F 00 00 00 01 02 03 04 05 06 07 08 09 .w.............
0040 0A 0B 0C 0D 0E 0F 10 11 12 13 14 15 16 17 18 19 ................
0050 1A 1B 1C 1D 1E 1F 20 21 22 23 24 25 26 27 28 29 ...... !"#$%&'()
0060 2A 2B 2C 2D *+,-
==============================
Given a watchlist entry matching only on IPv4 source and destination address, we get reports when running iperf in TCP mode, but we don't get any when using UDP mode.
The issue was experienced on the production pod.
The fabric pipeconf in ONOS has unit tests that should be ported to this repo as well
Lines 30 to 34 in 5299544
Otherwise, DeepInsight will not be able to recognize the flow (unknown ethertype)
Currently, we generate constants for fabric-spgw-int, as we assume that profile is the one with all preprocessor flags enabled, but that might no longer be the case. It seems a better idea to pass a list of p4info files to the script, and have the script output all constants for all profiles.
Now we can use the serializable_enums from the p4info file to generate constants. Here is an example of serializable_enums:
type_info {
  serializable_enums {
    key: "BridgedMdType_t"
    value {
      underlying_type {
        bitwidth: 8
      }
      members {
        name: "INVALID"
        value: "\000"
      }
      members {
        name: "INGRESS_TO_EGRESS"
        value: "\001"
      }
      members {
        name: "EGRESS_MIRROR"
        value: "\002"
      }
      members {
        name: "INGRESS_MIRROR"
        value: "\003"
      }
    }
  }
}
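As an illustration, a rough sketch of how the script could turn serializable_enums members into constants. The gen_enum_constants helper and the regex-based parsing are assumptions for the sketch; a real implementation would parse the p4info with the P4Runtime protobuf bindings instead.

```python
import re

# A small text-format p4info fragment, matching the example above.
P4INFO_SNIPPET = r'''
serializable_enums {
  key: "BridgedMdType_t"
  value {
    underlying_type { bitwidth: 8 }
    members { name: "INVALID" value: "\000" }
    members { name: "INGRESS_TO_EGRESS" value: "\001" }
    members { name: "EGRESS_MIRROR" value: "\002" }
  }
}
'''

def gen_enum_constants(p4info_text):
    """Emit one <EnumName>_<MEMBER> -> int constant per enum member."""
    enum_name = re.search(r'key:\s*"(\w+)"', p4info_text).group(1)
    consts = {}
    # text-format bytes values use octal escapes, e.g. "\001" is 1
    member_re = r'members\s*\{\s*name:\s*"(\w+)"\s*value:\s*"\\(\d{3})"\s*\}'
    for name, octal in re.findall(member_re, p4info_text):
        consts["%s_%s" % (enum_name, name)] = int(octal, 8)
    return consts

print(gen_enum_constants(P4INFO_SNIPPET))
```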
tofino-model is appending 2 extra bytes (00 00) to the packet-ins in FabricArpBroadcastUntaggedTest and FabricArpBroadcastMixedTest. This behavior is seen when using loopback mode. Attached are the tofino-model logs for loopback mode for FabricArpBroadcastUntaggedTest:
loopback_arp_broadcast_untagged_model_0.log
The current approach of maintaining two copies of inner/outer L4 headers adds annoying complexity and hogs PHV resources. An alternative approach is to resubmit packets that require decapsulation, and simply skip the outer headers in the parser. This would greatly simplify the pipeline and parser at the cost of reduced bandwidth, which should be acceptable given the very high maximum bandwidth of Tofino. It is also currently unclear how much resubmit really reduces maximum bandwidth; this needs further investigation.
Some tests use code like the following:
flowRuleService.applyFlowRules(eq(expectedFlow));
expectLastCall().andVoid().once();
verify(flowRuleService);
The eq function uses equals from the object, and DefaultFlowRule won't actually compare the treatment.
We should capture the flow rule and use exactMatch to compare the result.
The watchlist is designed to match only on IPv4 flows. This is probably happening because of PHV conflicts (arp/lldp fields ending up in ipv4/udp/tcp fields).
We should update the watchlist table, e.g., by applying it only to valid IPv4 packets (if (hdr.ipv4.isValid()) watchlist.apply()).
We should also check all the other tables in the pipeline: do we check header validity when matching certain fields?
I wish the compiler were capable of emitting warnings for such conditions.
FabricMPLSSegmentRoutingTest fails with a packet mismatch for next_hop_spine_True scenarios. All the tests are run on hardware in loopback mode.
=== RUN FabricMplsSegmentRoutingTest0
=== RUN FabricMplsSegmentRoutingTest0/tcp_next_hop_spine_True
WARN[2020-10-22T10:20:13.412Z]p4rt.go:141 verifyPacketIn() Payloads don't match
Expected: 00 00 00 00 00 02 00 00 00 00 aa 01 88 47 00 06 41 3f 45 00 00 42 00 01 00 00 40 06 63 b4 0a 00 01 01 0a 00 02 01 04 d2 00 50 00 00 00 00 00 00 00 00 50 02 20 00 d6 fb 00 00 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 10 11 12 13 14 15 16 17 18 19
Actual : 00 00 00 00 00 02 00 00 00 00 aa 01 08 00 45 00 00 42 00 01 00 00 40 06 63 b4 0a 00 01 01 0a 00 02 01 04 d2 00 50 00 00 00 00 00 00 00 00 50 02 20 00 d6 fb 00 00 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 10 11 12 13 14 15 16 17 18 19
=== RUN FabricMplsSegmentRoutingTest0/Undo_Write_Requests
--- FAIL: FabricMplsSegmentRoutingTest0 (0.03s)
--- FAIL: FabricMplsSegmentRoutingTest0/tcp_next_hop_spine_True (0.02s)
--- PASS: FabricMplsSegmentRoutingTest0/Undo_Write_Requests (0.00s)
=== RUN FabricMplsSegmentRoutingTest1
=== RUN FabricMplsSegmentRoutingTest1/tcp_next_hop_spine_False
=== RUN FabricMplsSegmentRoutingTest1/Undo_Write_Requests
--- PASS: FabricMplsSegmentRoutingTest1 (0.02s)
--- PASS: FabricMplsSegmentRoutingTest1/tcp_next_hop_spine_False (0.01s)
--- PASS: FabricMplsSegmentRoutingTest1/Undo_Write_Requests (0.00s)
=== RUN FabricMplsSegmentRoutingTest2
=== RUN FabricMplsSegmentRoutingTest2/udp_next_hop_spine_True
WARN[2020-10-22T10:20:13.447Z]p4rt.go:141 verifyPacketIn() Payloads don't match
Expected: 00 00 00 00 00 02 00 00 00 00 aa 01 88 47 00 06 41 3f 45 00 00 42 00 01 00 00 40 11 63 a9 0a 00 01 01 0a 00 02 01 04 d2 00 50 00 2e 8c 04 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f 20 21 22 23 24 25
Actual : 00 00 00 00 00 02 00 00 00 00 aa 01 08 00 45 00 00 42 00 01 00 00 40 11 63 a9 0a 00 01 01 0a 00 02 01 04 d2 00 50 00 2e 8c 04 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f 20 21 22 23 24 25
=== RUN FabricMplsSegmentRoutingTest2/Undo_Write_Requests
--- FAIL: FabricMplsSegmentRoutingTest2 (0.02s)
--- FAIL: FabricMplsSegmentRoutingTest2/udp_next_hop_spine_True (0.01s)
--- PASS: FabricMplsSegmentRoutingTest2/Undo_Write_Requests (0.00s)
=== RUN FabricMplsSegmentRoutingTest3
=== RUN FabricMplsSegmentRoutingTest3/udp_next_hop_spine_False
=== RUN FabricMplsSegmentRoutingTest3/Undo_Write_Requests
--- PASS: FabricMplsSegmentRoutingTest3 (0.02s)
--- PASS: FabricMplsSegmentRoutingTest3/udp_next_hop_spine_False (0.02s)
--- PASS: FabricMplsSegmentRoutingTest3/Undo_Write_Requests (0.00s)
=== RUN FabricMplsSegmentRoutingTest4
=== RUN FabricMplsSegmentRoutingTest4/icmp_next_hop_spine_True
WARN[2020-10-22T10:20:13.487Z]p4rt.go:141 verifyPacketIn() Payloads don't match
Expected: 00 00 00 00 00 02 00 00 00 00 aa 01 88 47 00 06 41 3f 45 00 00 42 00 01 00 00 40 01 63 b9 0a 00 01 01 0a 00 02 01 08 00 64 6c 00 00 00 00 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30
Actual : 00 00 00 00 00 02 00 00 00 00 aa 01 08 00 45 00 00 42 00 01 00 00 40 01 63 b9 0a 00 01 01 0a 00 02 01 08 00 64 6c 00 00 00 00 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30
=== RUN FabricMplsSegmentRoutingTest4/Undo_Write_Requests
--- FAIL: FabricMplsSegmentRoutingTest4 (0.02s)
--- FAIL: FabricMplsSegmentRoutingTest4/icmp_next_hop_spine_True (0.01s)
--- PASS: FabricMplsSegmentRoutingTest4/Undo_Write_Requests (0.00s)
=== RUN FabricMplsSegmentRoutingTest5
=== RUN FabricMplsSegmentRoutingTest5/icmp_next_hop_spine_False
=== RUN FabricMplsSegmentRoutingTest5/Undo_Write_Requests
--- PASS: FabricMplsSegmentRoutingTest5 (0.02s)
--- PASS: FabricMplsSegmentRoutingTest5/icmp_next_hop_spine_False (0.01s)
--- PASS: FabricMplsSegmentRoutingTest5/Undo_Write_Requests (0.00s)
FAIL
Linked Jenkins build here:
https://jenkins.stratumproject.org/blue/organizations/jenkins/fabric-tna-hardware-loopback/detail/fabric-tna-hardware-loopback/45/pipeline/73
ONOS needs them, and https://github.com/stratum/stratum-bfrt/pull/17 showed that wildcard reads have different semantics from normal reads. Therefore, we should test this on both exact-match and ternary tables.
We should not use things like "../define.p4" or "../header.p4", in case we move files to a different path.
I'm sending the following packet:
21:24:22.548175 00:1e:67:d2:ee:ea > 00:1e:67:d2:ce:52, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 54276, offset 0, flags [DF], proto ICMP (1), length 84)
192.168.250.20 > 192.168.250.21: ICMP echo request, id 8766, seq 62, length 64
But ONOS reports that the VLAN broadcast entry is the one being matched, even when there is one for dest MAC 00:1e:67:d2:ce:52: