Comments (55)
Hi Liguang and Eric,
I didn't find a packet-parsing interface in the interface class or in the architecture diagram. Does this DHCP server need to handle incoming packets from the network? Naive question. Thanks.
@xieus @er1cthe0ne
from alcor-control-agent.
Hi Eric,
I'm trying to meet that goal. I'll keep updating you.
from alcor-control-agent.
Hi @cj-chung @er1cthe0ne ,
Added dhcp openflow rule procedure.
Here is the API James may use: ACA_Dhcp_Server::get_instance().dhcps_recv(dhcp_payload).
Will let @cj-chung point out where to change the code to call the DHCP code above. The idea is to let @w2520n2520 make that change in this PR :)
@cj-chung, please guide me on building the whole project locally.
Should I regenerate the makefile using CMake and then run make?
To build it inside the docker container, the Dockerfile should have all the necessary steps to get all the dependencies during container creation: https://github.com/futurewei-cloud/alcor-control-agent/blob/master/build/Dockerfile
To build it on a physical machine or VM, take a look at the machine bring-up script; you can run all the steps except the last two, steps 8 and 9: https://github.com/futurewei-cloud/alcor-control-agent/blob/master/build/aca-machine-init.sh
I think @er1cthe0ne has answered most of the questions. As for where in aca_ovs_control.cpp to call dhcp_server, you can modify the code between #205 and #221 in aca_ovs_control.cpp. The payload at #210 is the UDP (DHCP) payload. You can just call ACA_Dhcp_Server::get_instance().dhcps_recv(payload) there and pass the payload to the function.
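A minimal sketch of the hand-off described above, for illustration only. The `ACA_Dhcp_Server` class below is a stand-in stub: only its singleton shape and the `dhcps_recv` entry point come from this thread, and the parameter type is an assumption. `on_udp_payload` is a hypothetical hook standing in for the packet-in code in aca_ovs_control.cpp.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

namespace sketch {

// Stand-in for the real ACA_Dhcp_Server. The void* parameter is an
// assumption; the real signature may differ.
class ACA_Dhcp_Server {
 public:
  static ACA_Dhcp_Server &get_instance() {
    static ACA_Dhcp_Server instance;  // constructed on first use
    return instance;
  }
  // Stub: only counts non-null payloads instead of parsing them.
  void dhcps_recv(void *payload) {
    if (payload != nullptr) ++received_;
  }
  std::size_t received() const { return received_; }

 private:
  ACA_Dhcp_Server() = default;
  std::size_t received_ = 0;
};

// Hypothetical packet-in hook: forwards only UDP payloads addressed to the
// DHCP server port (67) and reports whether the payload was handed over.
inline bool on_udp_payload(std::uint16_t dst_port,
                           std::vector<std::uint8_t> &payload) {
  const std::uint16_t kDhcpServerPort = 67;
  if (dst_port != kDhcpServerPort || payload.empty()) return false;
  ACA_Dhcp_Server::get_instance().dhcps_recv(payload.data());
  return true;
}

}  // namespace sketch
```

Calling `sketch::on_udp_payload(67, payload)` hands the bytes to the stub server; any other destination port is ignored.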
from alcor-control-agent.
Item 1 is done. Design doc link: https://github.com/futurewei-cloud/alcor-control-agent/blob/master/docs/dhcp_programming.adoc
from alcor-control-agent.
Item 2 is under review: futurewei-cloud/alcor#193
from alcor-control-agent.
I'm interested in this issue, and I have some experience in network stack development. Could this issue be assigned to me? Thanks.
from alcor-control-agent.
@w2520n2520 Absolutely, and thank you! This issue has been assigned.
from alcor-control-agent.
Update to Item 2: PR futurewei-cloud/alcor#193 has been merged to alcor/master.
from alcor-control-agent.
Hi Liguang and Eric,
I didn't find a packet-parsing interface in the interface class or in the architecture diagram. Does this DHCP server need to handle incoming packets from the network? Naive question. Thanks.
@xieus @er1cthe0ne
Hi @w2520n2520 - you asked the right question and are on the right track. This DHCP server needs to intercept the DHCP packets using openflow rules, parse them, and reply with a DHCP_OFFER and later a DHCP_ACK message. More information is available in the reference section of the design doc: https://github.com/futurewei-cloud/alcor-control-agent/blob/master/docs/dhcp_programming.adoc
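As a rough illustration of the parse-and-reply step, the sketch below reads the DHCP message type (option 53, RFC 2132) from a raw DHCP payload and maps DISCOVER to OFFER and REQUEST to ACK. The function names are hypothetical and not taken from the ACA code base; the layout constants (236-byte fixed header, magic cookie 0x63825363) come from RFC 2131.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// DHCP message type values from RFC 2132, option 53.
const int kDhcpDiscover = 1;
const int kDhcpOffer = 2;
const int kDhcpRequest = 3;
const int kDhcpAck = 5;

// Returns the DHCP message type found in the options area, or -1 if the
// payload is too short, lacks the magic cookie, or carries no option 53.
int dhcp_message_type(const std::vector<std::uint8_t> &payload) {
  const std::size_t kFixedHeader = 236;  // BOOTP fields before the cookie
  const std::uint8_t kMagicCookie[4] = {0x63, 0x82, 0x53, 0x63};
  if (payload.size() < kFixedHeader + 4) return -1;
  for (std::size_t i = 0; i < 4; ++i) {
    if (payload[kFixedHeader + i] != kMagicCookie[i]) return -1;
  }
  std::size_t pos = kFixedHeader + 4;
  while (pos < payload.size()) {
    const std::uint8_t code = payload[pos];
    if (code == 255) break;                        // End option
    if (code == 0) { ++pos; continue; }            // Pad option, no length
    if (pos + 1 >= payload.size()) break;          // missing length byte
    const std::size_t len = payload[pos + 1];
    if (pos + 2 + len > payload.size()) break;     // truncated option
    if (code == 53 && len == 1) return payload[pos + 2];
    pos += 2 + len;
  }
  return -1;
}

// Maps a client message type to the server reply type, or -1 if the server
// does not answer that message in this sketch.
int reply_type_for(int client_type) {
  if (client_type == kDhcpDiscover) return kDhcpOffer;
  if (client_type == kDhcpRequest) return kDhcpAck;
  return -1;
}
```

A real server would also read the transaction id and client hardware address from the fixed header to build the reply; that part is left out here.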
from alcor-control-agent.
Running alcor-control-agent and tests
You can run the test (optional):
root@ca62b6feec63:/mnt/host/code/alcor-control-agent# ./build/tests/aca_tests
When building, you may encounter a "libgtest.so can't open or doesn't exist" issue; please refer to https://blog.csdn.net/bocksong/article/details/93207753 to resolve it.
from alcor-control-agent.
Did you encounter this "libgtest.so can't open or doesn't exist" issue? The intent is to run aca_tests inside the build container, which has all the dependencies set up already.
from alcor-control-agent.
Followed build and execution
Well, build and test should be executed in the generated docker container "a1"; my misunderstanding.
I got 18 tests passing but still fail to run the binary.
from alcor-control-agent.
Seeing 18 tests pass in the unit/functional tests is good enough for now. What kind of error do you see when you run ./build/bin/AlcorControlAgent? It will try to connect to Kafka, so those errors may be expected if Kafka was not set up.
The next step on DHCP implementation is to develop a standalone DHCP application based on this design. We can do the integration to AlcorControlAgent later.
from alcor-control-agent.
Thanks
Thanks Eric. Just trying to build up my working ground here.
from alcor-control-agent.
Hi Eric,
One question below:
int Aca_Comm_Manager::update_goal_state()
{
update_vpc_states();
update_subnet_states();
update_port_states();
update_dhcp_states(); //to be
}
So these resources will always be updated together? Is there any chance they can be updated independently? Thanks. @er1cthe0ne
from alcor-control-agent.
Hi Nan Wu,
Good question, the GoalState message contains:
- 0 to N vpc_states
- 0 to N subnet_states
- 0 to N port_states
- 0 to N security_group_states
- 0 to N dhcp_states
Aca_Comm_Manager will try to update the whole GoalState in an efficient manner.
For DHCP create, the likely GoalState message would look like:
- 1 port_states, OperationType::CREATE - create/configure a new port
- 1 dhcp_states, OperationType::CREATE - create the DHCP info for the new port
Or, for a DHCP update, it could look like:
- 1 dhcp_states, OperationType::UPDATE - update the DHCP info for a port
Does it make sense? Let me know if you have other questions. @w2520n2520
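To make the "0 to N of each" point concrete, here is a small sketch with simplified stand-in types (not the real protobuf-generated GoalState classes). Each state list is walked on its own, so a message carrying only dhcp_states touches nothing else. All type and function names below are illustrative assumptions.

```cpp
#include <string>
#include <vector>

enum class OperationType { CREATE, UPDATE };

// Simplified stand-in for one resource state entry.
struct ResourceState {
  std::string id;
  OperationType op;
};

// Simplified stand-in for the GoalState message: every list may be empty.
struct GoalState {
  std::vector<ResourceState> vpc_states;
  std::vector<ResourceState> subnet_states;
  std::vector<ResourceState> port_states;
  std::vector<ResourceState> security_group_states;
  std::vector<ResourceState> dhcp_states;
};

// Number of entries handled per resource kind.
struct UpdateCounts {
  int vpc = 0;
  int subnet = 0;
  int port = 0;
  int security_group = 0;
  int dhcp = 0;
};

// Walks each list independently; an empty list is simply skipped.
UpdateCounts update_goal_state(const GoalState &gs) {
  UpdateCounts counts;
  for (const ResourceState &s : gs.vpc_states) { (void)s; ++counts.vpc; }
  for (const ResourceState &s : gs.subnet_states) { (void)s; ++counts.subnet; }
  for (const ResourceState &s : gs.port_states) { (void)s; ++counts.port; }
  for (const ResourceState &s : gs.security_group_states) {
    (void)s;
    ++counts.security_group;
  }
  for (const ResourceState &s : gs.dhcp_states) { (void)s; ++counts.dhcp; }
  return counts;
}
```

In this sketch a "DHCP create" message has one entry in port_states and one in dhcp_states, while a "DHCP update" message has a single dhcp_states entry and leaves the other lists empty.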
from alcor-control-agent.
The next step on DHCP implementation is to develop a standalone DHCP application based on this design. We can do the integration to AlcorControlAgent later.
Hi Nan Wu,
Do you think you can have the standalone DHCP application available in a few weeks? It would be great if we can complete the integration into AlcorControlAgent by the month of June. @w2520n2520
from alcor-control-agent.
Hi Nan Wu,
Checking in here. Do you think we can meet the target of June to have a standalone DHCP application based on this design and integrate it with AlcorControlAgent? Let me know. @w2520n2520
from alcor-control-agent.
Hi Nan Wu,
Checking in here to see if there is anything I can help with. Maybe we can break down the standalone DHCP application task into smaller pieces? e.g.:
- basic framework for the application, with command-line parsing that doesn't need to be fancy.
- implement a DHCP handler class that inherits from Dhcp_Programming_Interface in https://github.com/futurewei-cloud/alcor-control-agent/blob/master/include/aca_dhcp_programming_if.h
- program the openflow rule to route DHCP packets into and out of the DHCP application
- parse the input parameters (which come from the goalstate message) and pass them to the DHCP handler class
- determine the needed DHCP actions within the DHCP application
- unit test infrastructure and test cases
How does that sound? @w2520n2520
from alcor-control-agent.
Hi Eric,
Actually I've mostly done the 4th item, the DHCP handler part. I'm working on the 2nd and 3rd items, but I have doubts about them.
Per my understanding, here is the code flow for the state message: consumer->comm_mgr-->update_goal-->dhcp_state_handler(newly_added)-->dhcp_prog_if--------??--------->dhcp_server
Q1: Where should I put dhcp_server? Should it be in an independent thread, or run in the same one as aca_main (maybe not a good idea)? About the "??" part: net_handler uses RPC to talk to the transit_daemon of mizar, but dhcp_server is supposed to be on the same node, so RPC may not be necessary here. On the other hand, the network dhcp-server will be on a different node, so using the same communication method would be a benefit. I have limited understanding of the overall alcor-agent design, so I may need your involvement here.
Q2: What does the code flow look like for the 3rd item? I didn't find the interface for packet_in under the current src dir.
Thanks for your guidance and help.
@er1cthe0ne
from alcor-control-agent.
Hi Nan Wu,
Thanks for the questions, I will answer it one by one. Do let me know if you have other questions.
Should it be in a independent thread
Great question, it should be an independent thread spun up by aca_main. We will implement it when integrating the standalone DHCP app into ACA.
About the "??"
After integration, the DHCP code will be part of ACA running in another thread, so no RPC is needed. You can check out https://github.com/futurewei-cloud/alcor-control-agent/blob/164a8a7cbad1f3b46c0d0592d11df875f192326d/include/aca_dataplane_ovs.h as an example of how to consume an ACA programming interface.
network dhcp-server will be on different node
In the future it will be driven by the ACA running on that node, so it uses the same communication flow: the Alcor controller sends the goal state message down to ACA.
How is like the code flow for 3rd item?
Can you tell me which specific code flow? I want to give you the right information. Are you talking about the openflow rule programming, or how to provide the right DHCP response back to the VM?
Thanks,
Eric
from alcor-control-agent.
Hi Eric,
Thanks for the reply.
I still have further questions and may need more of your time; I'm trying to understand the design here. :)
Should it be in an independent thread
Great question, it should be an independent thread spun up by aca_main. We will implement it when integrating the standalone DHCP app into ACA.
[Nan]: OK. I thought I was supposed to start from here. We can do it later.
network dhcp-server will be on different node
In the future it will be driven by the ACA running on that node, so it uses the same communication flow: the Alcor controller sends the goal state message down to ACA.
[Nan]: No, I mean the packet_in flow here, not the control message flow (goal state). The DHCP design doc mentions that openflow table rules will be used to transfer DHCP packets to the dhcp-server. The first question is: if the dataplane is mizar, there will be no openflow tables, right? The second is: if openflow tables are used, there will be two flows: one for the local dhcp-server and the other, with lower priority, for the network-dhcp-server. When the local one fails, its corresponding flow should fail too, so packets will be transferred to the network one.
Is this understanding correct? I'm still confused about the packet_in_handler flow here.
What does the code flow look like for the 3rd item?
Can you tell me which specific code flow? I want to give you the right information. Are you talking about the openflow rule programming, or how to provide the right DHCP response back to the VM?
[Nan]: Yes, about the openflow rule programming part.
@er1cthe0ne
from alcor-control-agent.
Hi Nan Wu,
Still have further questions, may need more your time, trying to understand the design here. :)
No problem, feel free to ask :)
Another one is, if openflow table is used, there will be two flows--one for local dhcp-server, the other is for network-dhcp-server with low priority. When the local one fails, so should its corresponding flow, so packet will be transfer to the network one.
The current focus is the OVS dataplane, and the current design only supports one dataplane per host.
The backup network-dhcp-server is used when the local ACA is down and didn't have a chance to set up the local-dhcp-server flow. If ACA exits gracefully, it should remove the local-dhcp-server flow. If ACA exits unexpectedly, it will try to restart a few times, and if ACA really cannot get back to a running state, the Alcor controller would detect it and perform corrective actions.
In summary, I am not sure how both the local-dhcp-server and network-dhcp-server flows would work at the same time, since only one of them will be used based on priority. Unless we set a timeout on the local-dhcp-server flow, but then ACA will need to keep renewing it.
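The priority behaviour can be illustrated with a toy model: when two flows match the same packet, an OpenFlow table uses the entry with the highest priority, so the lower-priority backup only takes effect once the local flow is gone. The struct, flow names and priority values below are illustrative assumptions, not taken from the design doc.

```cpp
#include <string>
#include <vector>

// Toy model of a flow entry that matches DHCP packets.
struct Flow {
  std::string name;
  int priority;
  bool installed;  // false once the flow has been removed or has timed out
};

// Returns the name of the installed flow with the highest priority, or an
// empty string when no flow is installed.
std::string select_flow(const std::vector<Flow> &flows) {
  const Flow *best = nullptr;
  for (const Flow &f : flows) {
    if (!f.installed) continue;
    if (best == nullptr || f.priority > best->priority) best = &f;
  }
  return best != nullptr ? best->name : std::string();
}
```

With both flows installed, the local one wins; marking it as not installed (graceful exit, or an expired timeout) lets the network-dhcp-server flow take over.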
Still confused about the packet_in_handler flow here.
Did I answer your question above? Let me know.
[Nan]: Yes, about the openflow rule programming part.
OK, please go ahead and program the rules through a system call for now (see execute_system_command). ACA will be adding better openflow client support in the future (per the current design), and the DHCP code can leverage that when it is ready.
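A sketch of what the system-call approach could look like: build an `ovs-ofctl add-flow` command string that sends DHCP client traffic (UDP source port 68, destination port 67) to the controller. The helper name and the priority value are assumptions for illustration. The match uses the `tp_src`/`tp_dst` field names that ovs-ofctl accepts; the resulting string would then be passed to execute_system_command, whose exact signature is not shown in this thread.

```cpp
#include <string>

// Builds an ovs-ofctl command that punts DHCP requests on `bridge` to the
// OpenFlow controller. Only the string is built here; nothing is executed.
std::string build_dhcp_to_controller_cmd(const std::string &bridge,
                                         int priority) {
  return "ovs-ofctl add-flow " + bridge + " \"table=0,priority=" +
         std::to_string(priority) +
         ",udp,tp_src=68,tp_dst=67,actions=CONTROLLER\"";
}
```

For example, `build_dhcp_to_controller_cmd("br-int", 100)` yields a command that can be tried by hand in a shell before wiring it into ACA.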
Hope all of them make sense to you.
BTW, once you have some code implemented, it will be great to send a PR so that we can look at and discuss if needed. @w2520n2520
from alcor-control-agent.
More information on the packet_in_handler flow. In order to have DHCP packets sent to ACA, we will need to implement an openflow controller and have an openflow rule send the matched DHCP packets to the openflow controller, which is ACA in our case.
We may use something similar to the ovs-ofctl implementation, which acts as an openflow controller. Below is an experiment to show that it should work:
root@fw0016589: ping -I 192.168.0.131 -c1 192.168.0.124
PING 192.168.0.124 (192.168.0.124) from 192.168.0.131 : 56(84) bytes of data.
64 bytes from 192.168.0.124: icmp_seq=1 ttl=64 time=0.348 ms
Br-int is letting all the traffic go now:
root@fw0016589: ovs-ofctl dump-flows br-int
cookie=0x0, duration=699.025s, table=0, n_packets=140, n_bytes=15059, priority=0 actions=NORMAL
Adding a new openflow rule to send all packets to CONTROLLER, which is ovs-ofctl in this case:
root@fw0016589: ovs-ofctl add-flow br-int "table=0, priority=100, actions=CONTROLLER"
root@fw0016589: ovs-ofctl dump-flows br-int
cookie=0x0, duration=786.163s, table=0, n_packets=140, n_bytes=15059, priority=0 actions=NORMAL
cookie=0x0, duration=4.482s, table=0, n_packets=0, n_bytes=0, priority=100 actions=CONTROLLER:65535
Ping doesn't work anymore because the packets have been sent to CONTROLLER!
root@fw0016589: ping -I 192.168.0.131 -c1 192.168.0.124
PING 192.168.0.124 (192.168.0.124) from 192.168.0.131 : 56(84) bytes of data.
--- 192.168.0.124 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms
Printed out by ovs-ofctl!
root@fw0016589: ovs-ofctl monitor br-int 1
NXT_PACKET_IN2 (xid=0x0): cookie=0x0 total_len=98 in_port=int0 (via action) data_len=98 (unbuffered)
icmp,vlan_tci=0x0000,dl_src=ee:c3:0f:ee:c3:46,dl_dst=36:f2:97:d5:3a:b9,nw_src=192.168.0.131,nw_dst=192.168.0.124,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=8,icmp_code=0 icmp_csum:947d
NXT_PACKET_IN2 (xid=0x0): cookie=0x0 total_len=42 in_port=int0 (via action) data_len=42 (unbuffered)
arp,vlan_tci=0x0000,dl_src=ee:c3:0f:ee:c3:46,dl_dst=36:f2:97:d5:3a:b9,arp_spa=192.168.0.131,arp_tpa=192.168.0.124,arp_op=1,arp_sha=ee:c3:0f:ee:c3:46,arp_tha=00:00:00:00:00:00
NXT_PACKET_IN2 (xid=0x0): cookie=0x0 total_len=42 in_port=int0 (via action) data_len=42 (unbuffered)
arp,vlan_tci=0x0000,dl_src=ee:c3:0f:ee:c3:46,dl_dst=36:f2:97:d5:3a:b9,arp_spa=192.168.0.131,arp_tpa=192.168.0.124,arp_op=1,arp_sha=ee:c3:0f:ee:c3:46,arp_tha=00:00:00:00:00:00
NXT_PACKET_IN2 (xid=0x0): cookie=0x0 total_len=42 in_port=int0 (via action) data_len=42 (unbuffered)
arp,vlan_tci=0x0000,dl_src=ee:c3:0f:ee:c3:46,dl_dst=36:f2:97:d5:3a:b9,arp_spa=192.168.0.131,arp_tpa=192.168.0.124,arp_op=1,arp_sha=ee:c3:0f:ee:c3:46,arp_tha=00:00:00:00:00:00
OFPT_ECHO_REQUEST (xid=0x0): 0 bytes of payload
The flow rules show that the packets are going to CONTROLLER:
root@fw0016589: ovs-ofctl dump-flows br-int
cookie=0x0, duration=979.012s, table=0, n_packets=140, n_bytes=15059, priority=0 actions=NORMAL
cookie=0x0, duration=197.331s, table=0, n_packets=8, n_bytes=420, priority=100 actions=CONTROLLER:65535
@w2520n2520 - let me know if you have question on the approach or have a better suggestion.
from alcor-control-agent.
Hi Nan Wu,
Let me know if you have any outstanding questions. It will be good if you can join the Alcor community meeting tomorrow to discuss and sync up on the progress. @w2520n2520
from alcor-control-agent.
Hi Nan Wu,
Just checking to see how is the work going? Let me know if you have any questions. @w2520n2520
from alcor-control-agent.
Hi Eric,
I've submitted a PR. Maybe you can help review it and give some suggestions.
Thank you.
@er1cthe0ne
from alcor-control-agent.
@w2520n2520 - one of the action items is for your code to provide an interface to be called when ACA receives a DHCP DISCOVER or REQUEST packet. We have a community call scheduled today, let me know if you want to join to discuss.
from alcor-control-agent.
Hi @er1cthe0ne @cj-chung
One question: Should OVS_Controller use dhcp_programming_if to call dhcps_recv?
I mean, should dhcp_programming_if encapsulate calls from the dataplane the same way as calls from the management plane?
from alcor-control-agent.
Hi @cj-chung @er1cthe0ne ,
Added dhcp openflow rule procedure.
Here is the API James may use: ACA_Dhcp_Server::get_instance().dhcps_recv(dhcp_payload).
@cj-chung, please guide me on building the whole project locally.
Should I regenerate the makefile using CMake and then run make?
from alcor-control-agent.
Hi @er1cthe0ne @cj-chung
One question: Should OVS_Controller use dhcp_programming_if to call dhcps_recv?
I mean, should dhcp_programming_if encapsulate calls from the dataplane the same way as calls from the management plane?
yes, please use dhcp_programming_if as the interface for other modules to communicate.
from alcor-control-agent.
Hi @cj-chung @er1cthe0ne ,
Added dhcp openflow rule procedure.
Here is the API James may use: ACA_Dhcp_Server::get_instance().dhcps_recv(dhcp_payload).
Will let @cj-chung point out where to change the code to call the DHCP code above. The idea is to let @w2520n2520 make that change in this PR :)
@cj-chung, please guide me on building the whole project locally.
Should I regenerate the makefile using CMake and then run make?
To build it inside the docker container, the Dockerfile should have all the necessary steps to get all the dependencies during container creation: https://github.com/futurewei-cloud/alcor-control-agent/blob/master/build/Dockerfile
To build it on a physical machine or VM, take a look at the machine bring-up script; you can run all the steps except the last two, steps 8 and 9: https://github.com/futurewei-cloud/alcor-control-agent/blob/master/build/aca-machine-init.sh
from alcor-control-agent.
Hi @cj-chung ,
Please have a look at the error below. How did this come about? There has been no change since 10 July.
[ 95%] Linking CXX executable ../build/tests/aca_tests
../src/libAlcorControlAgentLib.a(aca_ovs_control.cpp.o): In function `aca_ovs_control::ACA_OVS_Control::control()':
/mnt/host/code/src/ovs/aca_ovs_control.cpp:59: undefined reference to `g_ofctl_target[abi:cxx11]'
/mnt/host/code/src/ovs/aca_ovs_control.cpp:60: undefined reference to `g_ofctl_target[abi:cxx11]'
/mnt/host/code/src/ovs/aca_ovs_control.cpp:60: undefined reference to `g_ofctl_target[abi:cxx11]'
/mnt/host/code/src/ovs/aca_ovs_control.cpp:61: undefined reference to `g_ofctl_target[abi:cxx11]'
/mnt/host/code/src/ovs/aca_ovs_control.cpp:63: undefined reference to `g_ofctl_options[abi:cxx11]'
/mnt/host/code/src/ovs/aca_ovs_control.cpp:64: undefined reference to `g_ofctl_options[abi:cxx11]'
/mnt/host/code/src/ovs/aca_ovs_control.cpp:64: undefined reference to `g_ofctl_options[abi:cxx11]'
/mnt/host/code/src/ovs/aca_ovs_control.cpp:65: undefined reference to `g_ofctl_options[abi:cxx11]'
/mnt/host/code/src/ovs/aca_ovs_control.cpp:67: undefined reference to `g_ofctl_command[abi:cxx11]'
/mnt/host/code/src/ovs/aca_ovs_control.cpp:69: undefined reference to `g_ofctl_command[abi:cxx11]'
/mnt/host/code/src/ovs/aca_ovs_control.cpp:71: undefined reference to `g_ofctl_command[abi:cxx11]'
from alcor-control-agent.
@w2520n2520 just checking, did you change aca_tests.cpp? Somehow I don't see this error in my environment.
from alcor-control-agent.
No. I have doubt too.
from alcor-control-agent.
@w2520n2520 Did you see these error messages in your local environment when you compiled it? If you have the latest ACA build locally, there shouldn't be any aca_ovs_control function calls in /tests/gtests/aca_tests.cpp.
If you cannot bypass it, you can just add those global variables in /tests/gtests/aca_tests.cpp like:
string g_ofctl_command = EMPTY_STRING;
string g_ofctl_target = EMPTY_STRING;
string g_ofctl_options = EMPTY_STRING;
from alcor-control-agent.
@w2520n2520 - we have an open source meeting scheduled: Monday, Aug 17, 2020 06:30 PM Pacific Time (US and Canada), you are welcome to join and raise any questions you may have.
from alcor-control-agent.
I think this is the reason:
If you get linker errors about undefined references to symbols that involve types in the std::__cxx11 namespace or the tag [abi:cxx11] then it probably indicates that you are trying to link together object files that were compiled with different values for the _GLIBCXX_USE_CXX11_ABI macro. This commonly happens when linking to a third-party library that was compiled with an older version of GCC. If the third-party library cannot be rebuilt with the new ABI then you will need to recompile your code with the old ABI.
https://gcc.gnu.org/onlinedocs/libstdc++/manual/using_dual_abi.html
Solution:
https://stackoverflow.com/questions/55406770/gcc-undefined-references-with-abicxx11
But it needs a minimum CMake version of 3.12.4. I tried this in CMakeLists.txt, but the CI environment (3.10.2) does not seem able to satisfy it.
Any idea? @er1cthe0ne @cj-chung
from alcor-control-agent.
@w2520n2520 - allow me to suggest a few things, let me know if they make sense.
The first thing is to set up a local compiling environment:
https://github.com/futurewei-cloud/alcor-control-agent/blob/master/src/README.md
cd ~/dev/alcor-control-agent
./build/build.sh
Once you have the build container setup, you can enter the docker container and rebuild ACA anytime:
docker exec -it a1 /bin/bash
cd /mnt/host/code && cmake . && make
If we don't want to use containers to build, an alternate approach is to set up the physical machine for building and running; please see ./build/aca-machine-init.sh for how to set up the dependencies.
Since @chenpiaoping is looking into ACA, maybe he can give a hand on it.
Once you have the local build setup, we can resolve the issues quickly. If there is a need to update the cmake version on our CI to 3.12.4, we can make that modification in our CI environment assuming that's the solution to resolve all the compiling issues.
from alcor-control-agent.
Tried in local env, same issue.
from alcor-control-agent.
Let's update your local environment's cmake version to 3.12.4 or higher, apply the fix you tried previously on CMakeLists.txt and see if that would address the issues. Please show us the error message so that we can take a look.
from alcor-control-agent.
-- Found ZLIB: /usr/lib/x86_64-linux-gnu/libz.so (found version "1.2.11")
-- Using protobuf
-- Found OpenSSL: /usr/lib/x86_64-linux-gnu/libcrypto.so (found version "1.1.1")
-- Using gRPC 1.24.3
-- Found Protobuf: /usr/local/lib/libprotobuf.a;-lpthread (found version "3.8.0")
-- Found Threads: TRUE
-- Found Protobuf: /usr/local/bin/protoc-3.8.0.0 (found version "3.8.0.0")
-- Using protobuf
-- Using gRPC 1.24.3
-- Check if compiler accepts -pthread
-- Check if compiler accepts -pthread - yes
-- Using protobuf
-- Using gRPC 1.24.3
-- Found GTest: /usr/local/lib/libgtest.so
-- Configuring done
-- Generating done
-- Build files have been written to: /mnt/host/code
[ 2%] Generating goalstateprovisioner.pb.cc, goalstateprovisioner.pb.h, goalstateprovisioner.grpc.pb.cc, goalstateprovisioner.grpc.pb.h
Scanning dependencies of target grpc
[ 4%] Building CXX object src/grpc/CMakeFiles/grpc.dir/goalstateprovisioner.pb.cc.o
:0:1: error: macro names must be identifiers
src/grpc/CMakeFiles/grpc.dir/build.make:94: recipe for target 'src/grpc/CMakeFiles/grpc.dir/goalstateprovisioner.pb.cc.o' failed
make[2]: *** [src/grpc/CMakeFiles/grpc.dir/goalstateprovisioner.pb.cc.o] Error 1
CMakeFiles/Makefile2:262: recipe for target 'src/grpc/CMakeFiles/grpc.dir/all' failed
make[1]: *** [src/grpc/CMakeFiles/grpc.dir/all] Error 2
Makefile:102: recipe for target 'all' failed
make: *** [all] Error 2
from alcor-control-agent.
Hi @w2520n2520 and @gure,
I was able to get your branch to compile, please see the below steps.
- Revert the change in CMakeLists.txt so that it looks like this:
cmake_minimum_required(VERSION 3.10)
project(AlcorControlAgent)
# Set the version number.
set(CMAKE_BUILD_TYPE Debug)
set(CMAKE_CXX_STANDARD 14)
set(CPPKAFKA_VERSION_MAJOR 0)
set(CPPKAFKA_VERSION_MINOR 3)
set(CPPKAFKA_VERSION_REVISION 1)
set(CPPKAFKA_VERSION "${CPPKAFKA_VERSION_MAJOR}.${CPPKAFKA_VERSION_MINOR}.${CPPKAFKA_VERSION_REVISION}")
set(RDKAFKA_MIN_VERSION 0x00090400)
#add_compile_options(-O0) # enable no optimization during development
add_compile_options(-Wall)
#add_compile_definitions(-D_GLIBCXX_USE_CXX11_ABI=0)
add_subdirectory(src)
add_subdirectory(test)
- Add the below global variables under test/gtest/aca_tests.cpp and test/func_tests/gs_test.cpp as mentioned previously:
string g_ofctl_command = EMPTY_STRING;
string g_ofctl_target = EMPTY_STRING;
string g_ofctl_options = EMPTY_STRING;
- Run "cmake ." and then make:
root@28abfb290c2e:/mnt/host/code/aca-dhcp# make
[ 8%] Built target grpc
[ 54%] Built target proto
[ 86%] Built target AlcorControlAgentLib
[ 91%] Built target AlcorControlAgent
[ 95%] Built target aca_tests
Scanning dependencies of target gs_tests
[ 97%] Building CXX object test/CMakeFiles/gs_tests.dir/func_tests/gs_tests.cpp.o
[100%] Linking CXX executable ../build/tests/gs_tests
[100%] Built target gs_tests
from alcor-control-agent.
Hi @er1cthe0ne , @Gzure
Adding g_ofctl_command to both gtest and functest makes the compilation work.
I may have figured out the reason for this issue:
- The aca_tests executable depends on AlcorControlAgentLib, which compiles source files including aca_ovs_control, which has a declaration of g_ofctl_command but no definition.
- The linking error did not show up until the aca_tests executable was linked and its symbols were resolved.
- The AlcorControlAgent executable was OK because it contained the g_ofctl_command definition.
Would it be possible to make g_ofctl_command self-contained inside AlcorControlAgentLib, since it is a lib?
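One possible way to keep such a variable inside the library, shown only as a sketch: replace the extern global with an accessor that owns a function-local static. The namespace and function name are hypothetical. In a real layout the function would be declared in a header and defined in exactly one .cpp file of the library, so executables linking the library would not need their own definition.

```cpp
#include <string>

namespace ofctl_sketch {

// The library owns the storage; callers read and write through the
// returned reference instead of defining a global in every executable.
std::string &ofctl_command() {
  static std::string value;  // initialized once, on first call
  return value;
}

}  // namespace ofctl_sketch
```

Usage would look like `ofctl_sketch::ofctl_command() = "monitor";` in main, and `ofctl_sketch::ofctl_command()` wherever the value is read.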
from alcor-control-agent.
I am thinking about removing it. In point number 4 of issue #120, I am suggesting that we remove g_ofctl_command since we may not need it.
from alcor-control-agent.
All related unit tests passed. Requesting to merge. @er1cthe0ne @cj-chung
from alcor-control-agent.
@w2520n2520 @gure, please refer to this script for physical machine setup of ACA:
https://github.com/futurewei-cloud/alcor-control-agent/blob/master/build/aca-machine-init.sh
from alcor-control-agent.
Hi @er1cthe0ne
ovs_control.packet_in
-->
monitor_vconn
--> monitor
-->control() ----------> has no caller
@Gzure and I did this for testing:
//ACA_OVS_Control::get_instance().monitor("br-tun", "resume");
ACA_OVS_Control::get_instance().monitor("br-int", "resume");
And BTW, why is only one monitor allowed?
from alcor-control-agent.
Hi @er1cthe0ne
ovs_control.packet_in
-->
monitor_vconn
--> monitor
-->control() ----------> has no caller
Hi @w2520n2520, I am not sure I understand the concern. Can you tell me what is your question?
And BTW, why is only one monitor allowed?
This could be a limitation based on the OVS code we use, but I don't think it is a blocking issue because we would only monitor br-int for the scenarios we defined. @cj-chung to correct me if I am wrong.
from alcor-control-agent.
void ACA_OVS_Control::parse_packet(void *packet)
{
  aca_dhcp_server::ACA_Dhcp_Server::get_instance().dhcps_recv();
}
void OVS_Control::monitor_vconn()
{
  ACA_OVS_Control::get_instance().parse_packet(pin.packet);
}
void OVS_Control::monitor(const char *bridge, const char *opt)
{
  monitor_vconn(vconn, true, resume_continuations, bridge);
}
4.1
int ACA_OVS_Control::control()
{
  monitor(target, options);
}
4.2
int main()
{
  ACA_OVS_Control::get_instance().monitor("br-tun", "resume");
}
Since we could not find a caller of control(), we changed the entry point in main() to monitor br-int so that we could debug the packet path.
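As a rough illustration of the parse_packet step in this call stack, the sketch below pulls the UDP payload out of a raw frame before it would be handed to the DHCP server. It assumes an untagged Ethernet frame carrying IPv4/UDP, and extract_dhcp_payload is a hypothetical helper name, not something from aca_ovs_control.cpp:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the UDP payload of an untagged Ethernet/IPv4/UDP frame whose
// destination port is 67 (DHCP server). Returns an empty vector for
// anything else. Assumption: no VLAN tag and no IPv4 fragmentation.
std::vector<uint8_t> extract_dhcp_payload(const std::vector<uint8_t> &frame)
{
  const std::size_t eth_len = 14;
  if (frame.size() < eth_len + 20 + 8) return {};          // too short
  if (frame[12] != 0x08 || frame[13] != 0x00) return {};   // not IPv4
  if ((frame[eth_len] >> 4) != 4) return {};               // bad IP version

  // The low nibble of the first IP byte is the header length in 32-bit words.
  const std::size_t ip_len = static_cast<std::size_t>(frame[eth_len] & 0x0F) * 4;
  if (ip_len < 20 || frame.size() < eth_len + ip_len + 8) return {};
  if (frame[eth_len + 9] != 17) return {};                 // not UDP

  const std::size_t udp = eth_len + ip_len;
  const uint16_t dst_port =
      static_cast<uint16_t>((frame[udp + 2] << 8) | frame[udp + 3]);
  if (dst_port != 67) return {};                           // not for a DHCP server

  // Everything after the 8-byte UDP header is the DHCP message.
  return std::vector<uint8_t>(
      frame.begin() + static_cast<std::ptrdiff_t>(udp + 8), frame.end());
}
```

The returned bytes are what a call such as dhcps_recv(payload) would receive; the exact argument type of dhcps_recv is not shown in this thread, so that hand-off is left out.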
from alcor-control-agent.
Yes, that's the correct call stack.
The current monitor in ACA_OVS_Control is daemonized but not yet multi-threaded, so I think one ACA instance can have only one monitor channel.
from alcor-control-agent.
Hi @cj-chung,
One question:
For the packet-out procedure, we observe that br-tun's TX counter keeps increasing, but no packet shows up in tcpdump. We changed the action to "output:8", but still no luck.
The call below returns no error:
error = parse_ofp_packet_out_str(&po, options,
                                 ports_to_accept(bridge),
                                 tables_to_accept(bridge),
                                 &usable_protocols);
Does another flow need to be installed for the reply packets going from the server to the client?
//bridge = "br-int" opts = "in_port=controller packet=<hex-string> actions=normal"
aca_ovs_control::ACA_OVS_Control::get_instance().packet_out(bridge.c_str(),
                                                            options.c_str());
In short, the code path now reports no errors, but no packet-out is observed on the network. We could use your help to figure it out. Thanks. @Gzure @er1cthe0ne
from alcor-control-agent.
The "in_port" field tells OVS which port to treat the packet as having arrived on, so here the packet is treated as coming from the controller. If you run tcpdump on br-tun or br-int, you should be able to see the packet on these bridges.
You can use the following command to test the packet-out function:
./build/bin/AlcorControlAgent -c packet-out -t br-int -o "in_port=controller packet=02AC10FF002202AC10FF001108004500001C000100000A015A9DAC10FF0BAC10FF160800F7FF00000000 actions=normal"
Then run tcpdump -i br-int -v on the OVS host, and you should be able to capture the packet.
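To see what the hex string in that command actually contains, a small decoder can help. This is only a sketch that assumes an untagged Ethernet frame with a 20-byte IPv4 header; hex_to_bytes, FrameSummary and summarize_frame are illustrative names, not ACA code:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Converts a string of hex digits (two per byte) into raw bytes.
// Assumes an even-length string of valid hex digits.
std::vector<uint8_t> hex_to_bytes(const std::string &hex)
{
  std::vector<uint8_t> out;
  for (std::size_t i = 0; i + 1 < hex.size(); i += 2)
    out.push_back(static_cast<uint8_t>(std::stoul(hex.substr(i, 2), nullptr, 16)));
  return out;
}

struct FrameSummary {
  uint16_t ether_type;  // bytes 12-13 of the Ethernet header
  uint8_t ip_protocol;  // byte 9 of the IPv4 header
  std::string src_ip;
  std::string dst_ip;
};

// Formats four bytes starting at `off` as a dotted-quad IPv4 address.
std::string ipv4_to_string(const std::vector<uint8_t> &b, std::size_t off)
{
  return std::to_string(b[off]) + "." + std::to_string(b[off + 1]) + "." +
         std::to_string(b[off + 2]) + "." + std::to_string(b[off + 3]);
}

// Reads a few header fields; returns zeros and empty strings if the
// frame is shorter than Ethernet (14 bytes) plus IPv4 (20 bytes).
FrameSummary summarize_frame(const std::vector<uint8_t> &f)
{
  FrameSummary s{0, 0, "", ""};
  if (f.size() < 34) return s;
  s.ether_type = static_cast<uint16_t>((f[12] << 8) | f[13]);
  s.ip_protocol = f[23];
  s.src_ip = ipv4_to_string(f, 26);
  s.dst_ip = ipv4_to_string(f, 30);
  return s;
}
```

For the sample string in the command above, this gives EtherType 0x0800 (IPv4), protocol 1 (ICMP), source 172.16.255.11 and destination 172.16.255.22, so the test packet is a complete 42-byte ICMP frame rather than a bare payload.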
from alcor-control-agent.
Hi @cj-chung @er1cthe0ne ,
Packet-Out Syntax
packet=hex-string
The actual packet to send, expressed as a string of hexadecimal
bytes. This field is required.
http://www.openvswitch.org/support/dist-docs/ovs-ofctl.8.txt
It seems this command sends only the "actual packet", which means the DHCP server needs to encapsulate the whole packet, from the application layer down to Ethernet, instead of passing only the DHCP payload.
Am I right?
from alcor-control-agent.
@w2520n2520 Yes. You need the whole packet as the hex string, since I send the packet string directly to OVS.
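A minimal sketch of what "the whole packet" could look like for a DHCP reply: wrap the payload in UDP (port 67 to 68), IPv4 and Ethernet headers, then hex-encode it into the packet-out option string quoted earlier in the thread. All function names here are hypothetical, the TTL of 64 is an arbitrary choice, and the UDP checksum is left at zero (permitted for UDP over IPv4):

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

using Bytes = std::vector<uint8_t>;

// Appends a 16-bit value in network (big-endian) byte order.
void put16(Bytes &b, uint16_t v)
{
  b.push_back(static_cast<uint8_t>(v >> 8));
  b.push_back(static_cast<uint8_t>(v & 0xFF));
}

// One's-complement Internet checksum over an even-length byte range.
uint16_t inet_checksum(const Bytes &b, std::size_t off, std::size_t len)
{
  uint32_t sum = 0;
  for (std::size_t i = 0; i + 1 < len; i += 2)
    sum += (static_cast<uint32_t>(b[off + i]) << 8) | b[off + i + 1];
  while (sum >> 16) sum = (sum & 0xFFFF) + (sum >> 16);
  return static_cast<uint16_t>(~sum & 0xFFFF);
}

// Builds Ethernet + IPv4 + UDP headers around a DHCP payload.
// MACs must be 6 bytes and IPs 4 bytes; the caller supplies all of them.
Bytes build_dhcp_reply_frame(const Bytes &dst_mac, const Bytes &src_mac,
                             const Bytes &src_ip, const Bytes &dst_ip,
                             const Bytes &dhcp)
{
  Bytes f;
  f.insert(f.end(), dst_mac.begin(), dst_mac.end());
  f.insert(f.end(), src_mac.begin(), src_mac.end());
  put16(f, 0x0800);                                  // EtherType: IPv4

  const uint16_t udp_len = static_cast<uint16_t>(8 + dhcp.size());
  const std::size_t ip_off = f.size();
  f.push_back(0x45);                                 // version 4, 20-byte header
  f.push_back(0x00);                                 // DSCP/ECN
  put16(f, static_cast<uint16_t>(20 + udp_len));     // total length
  put16(f, 0);                                       // identification
  put16(f, 0);                                       // flags / fragment offset
  f.push_back(64);                                   // TTL (arbitrary)
  f.push_back(17);                                   // protocol: UDP
  put16(f, 0);                                       // checksum placeholder
  f.insert(f.end(), src_ip.begin(), src_ip.end());
  f.insert(f.end(), dst_ip.begin(), dst_ip.end());
  const uint16_t csum = inet_checksum(f, ip_off, 20);
  f[ip_off + 10] = static_cast<uint8_t>(csum >> 8);
  f[ip_off + 11] = static_cast<uint8_t>(csum & 0xFF);

  put16(f, 67);                                      // source port: DHCP server
  put16(f, 68);                                      // destination port: DHCP client
  put16(f, udp_len);
  put16(f, 0);                                       // UDP checksum not computed
  f.insert(f.end(), dhcp.begin(), dhcp.end());
  return f;
}

// Upper-case hex encoding, two digits per byte.
std::string to_hex(const Bytes &b)
{
  static const char digits[] = "0123456789ABCDEF";
  std::string s;
  for (uint8_t v : b) {
    s.push_back(digits[v >> 4]);
    s.push_back(digits[v & 0x0F]);
  }
  return s;
}

// Option string in the format used with packet_out() in this thread.
std::string packet_out_options(const Bytes &frame)
{
  return "in_port=controller packet=" + to_hex(frame) + " actions=normal";
}
```

The resulting string could then be passed as the options argument of packet_out(bridge, options), as in the snippet quoted above. Whether a broadcast or unicast destination MAC and IP is appropriate depends on the DHCP message and is not covered here.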
from alcor-control-agent.