
Comments (55)

w2520n2520 avatar w2520n2520 commented on August 16, 2024 1

Hi Liguang and Eric,
I didn't find a packet-parsing interface in the interface class or in the architecture diagram. Does this DHCP server need to handle packets coming in from the network? Naive question. Thanks.
@xieus @er1cthe0ne

from alcor-control-agent.

w2520n2520 avatar w2520n2520 commented on August 16, 2024 1

Hi Eric,
I'm trying to meet that goal. I'll keep updating you.

from alcor-control-agent.

cj-chung avatar cj-chung commented on August 16, 2024 1

Hi @cj-chung @er1cthe0ne ,
Added dhcp openflow rule procedure.
Here is the API James may use: ACA_Dhcp_Server::get_instance().dhcps_recv(dhcp_payload).

I will let @cj-chung point out where to change the code to call the DHCP code above. The idea is to let @w2520n2520 make that change in this PR :)

@cj-chung , please guide me on building the whole project locally.
Should I regenerate the makefile with CMake and then run make?

To build it inside the docker container, the Dockerfile should have all the necessary steps to get all the dependencies during container creation: https://github.com/futurewei-cloud/alcor-control-agent/blob/master/build/Dockerfile

To build it on a physical machine or VM, take a look at the machine bring-up script; you can run all the steps except the last two, steps 8 and 9: https://github.com/futurewei-cloud/alcor-control-agent/blob/master/build/aca-machine-init.sh

I think @er1cthe0ne has answered most of the questions. To call dhcp_server from aca_ovs_control.cpp, you can modify the code between lines 205 and 221 of that file. The payload on line 210 is the UDP (DHCP) payload. You can just call ACA_Dhcp_Server::get_instance().dhcps_recv(payload) there to pass the payload to the function.
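
The call shape mentioned above, get_instance() followed by dhcps_recv(), suggests a singleton. Below is a minimal sketch of that pattern, not the project's actual class: the name Dhcp_Server_Sketch, the (pointer, length) signature of dhcps_recv, the packets_seen counter, and the on_udp_payload hook are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the DHCP server singleton discussed in the thread.
class Dhcp_Server_Sketch {
 public:
  // Function-local static: initialized once, thread-safe since C++11.
  static Dhcp_Server_Sketch& get_instance() {
    static Dhcp_Server_Sketch instance;
    return instance;
  }

  Dhcp_Server_Sketch(const Dhcp_Server_Sketch&) = delete;
  Dhcp_Server_Sketch& operator=(const Dhcp_Server_Sketch&) = delete;

  // Assumed signature: receives the raw UDP (DHCP) payload.
  // This sketch only counts non-empty payloads; it is not thread-safe.
  void dhcps_recv(const std::uint8_t* payload, std::size_t len) {
    if (payload == nullptr || len == 0) {
      return;  // nothing to process
    }
    ++packets_seen_;
  }

  std::size_t packets_seen() const { return packets_seen_; }

 private:
  Dhcp_Server_Sketch() = default;
  std::size_t packets_seen_ = 0;
};

// Hypothetical packet-in hook: the place in the OVS control code that
// already holds the UDP payload would forward it like this.
void on_udp_payload(const std::uint8_t* payload, std::size_t len) {
  Dhcp_Server_Sketch::get_instance().dhcps_recv(payload, len);
}
```

The caller does not need to own or construct the server; it only needs the payload pointer and its length at the point where the packet-in is handled.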


xieus avatar xieus commented on August 16, 2024

Item 1 is done. Design doc link: https://github.com/futurewei-cloud/alcor-control-agent/blob/master/docs/dhcp_programming.adoc


xieus avatar xieus commented on August 16, 2024

Item 2 is under review: futurewei-cloud/alcor#193


w2520n2520 avatar w2520n2520 commented on August 16, 2024

I'm interested in this issue, and I have some experience in network stack development. Could this issue be assigned to me? Thanks.


xieus avatar xieus commented on August 16, 2024

@w2520n2520 Absolutely, and thank you! This issue has been assigned.


xieus avatar xieus commented on August 16, 2024

Update to Item 2: PR futurewei-cloud/alcor#193 has been merged to alcor/master.


er1cthe0ne avatar er1cthe0ne commented on August 16, 2024

Hi Liguang and Eric,
I didn't find a packet-parsing interface in the interface class or in the architecture diagram. Does this DHCP server need to handle packets coming in from the network? Naive question. Thanks.
@xieus @er1cthe0ne

Hi @w2520n2520 - you asked the right question and are on the right track. This DHCP server needs to intercept the DHCP packets using openflow rules, parse them, and reply with a DHCP_OFFER and later a DHCP_ACK message. More information is available in the reference section of the design doc: https://github.com/futurewei-cloud/alcor-control-agent/blob/master/docs/dhcp_programming.adoc
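
To make the "parse and reply" step concrete, here is a minimal sketch of reading the DHCP message type (option 53) from a raw UDP payload and choosing the reply type. It follows the standard DHCP wire layout (236-byte fixed header, 4-byte magic cookie, then options); the function and type names are hypothetical and are not taken from the ACA code base.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// DHCP message types carried in option 53 (Unknown is a sketch-only marker).
enum class DhcpMsgType : std::uint8_t {
  Unknown = 0, Discover = 1, Offer = 2, Request = 3, Decline = 4,
  Ack = 5, Nak = 6, Release = 7, Inform = 8
};

constexpr std::size_t kFixedHeaderLen = 236;  // op .. file fields
constexpr std::uint8_t kMagicCookie[4] = {0x63, 0x82, 0x53, 0x63};

// Walks the options area and returns the message type, or Unknown if the
// payload is too short, has a bad cookie, or carries no valid option 53.
DhcpMsgType parse_dhcp_msg_type(const std::vector<std::uint8_t>& payload) {
  if (payload.size() < kFixedHeaderLen + 4) return DhcpMsgType::Unknown;
  for (std::size_t i = 0; i < 4; ++i) {
    if (payload[kFixedHeaderLen + i] != kMagicCookie[i]) return DhcpMsgType::Unknown;
  }
  std::size_t pos = kFixedHeaderLen + 4;
  while (pos < payload.size()) {
    const std::uint8_t code = payload[pos];
    if (code == 255) break;                   // End option
    if (code == 0) { ++pos; continue; }       // Pad option has no length byte
    if (pos + 1 >= payload.size()) break;     // truncated: no length byte
    const std::size_t len = payload[pos + 1];
    if (pos + 2 + len > payload.size()) break;  // truncated option data
    if (code == 53 && len == 1) {
      const std::uint8_t v = payload[pos + 2];
      return (v >= 1 && v <= 8) ? static_cast<DhcpMsgType>(v) : DhcpMsgType::Unknown;
    }
    pos += 2 + len;
  }
  return DhcpMsgType::Unknown;
}

// A server answers DISCOVER with OFFER and REQUEST with ACK.
// Other message types get no reply in this sketch.
DhcpMsgType reply_type_for(DhcpMsgType request) {
  switch (request) {
    case DhcpMsgType::Discover: return DhcpMsgType::Offer;
    case DhcpMsgType::Request:  return DhcpMsgType::Ack;
    default:                    return DhcpMsgType::Unknown;
  }
}
```

A real handler would also validate the op field and use the xid and chaddr fields to build the reply; those parts are left out here.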


w2520n2520 avatar w2520n2520 commented on August 16, 2024

Running alcor-control-agent and tests
You can run the test (optional):
root@ca62b6feec63:/mnt/host/code/alcor-control-agent# ./build/tests/aca_tests

When building, you may encounter a "libgtest.so can't open or doesn't exist" issue; please refer to https://blog.csdn.net/bocksong/article/details/93207753 to resolve it.


er1cthe0ne avatar er1cthe0ne commented on August 16, 2024

Running alcor-control-agent and tests
You can run the test (optional):
root@ca62b6feec63:/mnt/host/code/alcor-control-agent# ./build/tests/aca_tests

When building you may encounter for "libgtest.so can't open or doesn't exist" issue, please refer https://blog.csdn.net/bocksong/article/details/93207753 to resolve.

Did you encounter this issue of "libgtest.so can't open or doesn't exist"? The intent is to run aca_tests inside the build container, which already has all the dependencies set up.


w2520n2520 avatar w2520n2520 commented on August 16, 2024

Followed build and execution

Running alcor-control-agent and tests
You can run the test (optional):
root@ca62b6feec63:/mnt/host/code/alcor-control-agent# ./build/tests/aca_tests

When building you may encounter for "libgtest.so can't open or doesn't exist" issue, please refer https://blog.csdn.net/bocksong/article/details/93207753 to resolve.

Did you encounter this issue of "libgtest.so can't open or doesn't exist"? The intent is to run aca_tests inside the build container which has all the dependency setup already.

Well, the build and tests should be executed in the generated docker container "a1"; that was my misunderstanding.
I got 18 tests to pass but still fail to run the binary.


er1cthe0ne avatar er1cthe0ne commented on August 16, 2024

Followed build and execution

Running alcor-control-agent and tests
You can run the test (optional):
root@ca62b6feec63:/mnt/host/code/alcor-control-agent# ./build/tests/aca_tests

When building you may encounter for "libgtest.so can't open or doesn't exist" issue, please refer https://blog.csdn.net/bocksong/article/details/93207753 to resolve.

Did you encounter this issue of "libgtest.so can't open or doesn't exist"? The intent is to run aca_tests inside the build container which has all the dependency setup already.

Well, build and test should be executed in the generated docker "a1", my misunderstanding.
I can got 18 tests passed but still fail to run the bin.

Seeing 18 tests pass in the unit/functional tests is good enough for now. What kind of error do you see when you run ./build/bin/AlcorControlAgent? It will try to connect to Kafka, so those errors may be expected if Kafka was not set up.

The next step on DHCP implementation is to develop a standalone DHCP application based on this design. We can do the integration to AlcorControlAgent later.


w2520n2520 avatar w2520n2520 commented on August 16, 2024

Thanks

Followed build and execution

Running alcor-control-agent and tests
You can run the test (optional):
root@ca62b6feec63:/mnt/host/code/alcor-control-agent# ./build/tests/aca_tests

When building you may encounter for "libgtest.so can't open or doesn't exist" issue, please refer https://blog.csdn.net/bocksong/article/details/93207753 to resolve.

Did you encounter this issue of "libgtest.so can't open or doesn't exist"? The intent is to run aca_tests inside the build container which has all the dependency setup already.

Well, build and test should be executed in the generated docker "a1", my misunderstanding.
I can got 18 tests passed but still fail to run the bin.

Seeing 18 tests passed on the unit/functional test is good enough for now. What kind of error do you see when you run ./build/bin/AlcorControlAgent? It will try to connect to kafka so those error maybe expected if kafka was not setup.

The next step on DHCP implementation is to develop a standalone DHCP application based on this design. We can do the integration to AlcorControlAgent later.

Thanks Eric. Just trying to build up my working ground here.


w2520n2520 avatar w2520n2520 commented on August 16, 2024

Hi Eric,
One question below:
int Aca_Comm_Manager::update_goal_state()
{
    update_vpc_states();
    update_subnet_states();
    update_port_states();
    update_dhcp_states(); // to be added
}

So will these resources always be updated together? Is there any chance they can be updated independently? Thanks. @er1cthe0ne


er1cthe0ne avatar er1cthe0ne commented on August 16, 2024

Hi Eric,
One question below:
int Aca_Comm_Manager::update_goal_state()
{
    update_vpc_states();
    update_subnet_states();
    update_port_states();
    update_dhcp_states(); // to be added
}

So will these resources always be updated together? Is there any chance they can be updated independently? Thanks. @er1cthe0ne

Hi Nan Wu,

Good question, the GoalState message contains:

  • 0 to N vpc_states
  • 0 to N subnet_states
  • 0 to N port_states
  • 0 to N security_group_states
  • 0 to N dhcp_states

Aca_Comm_Manager will try to update the whole GoalState in an efficient manner.

For DHCP create, the likely GoalState message would look like:

  • 1 port_states, OperationType::CREATE - create/configure a new port
  • 1 dhcp_states, OperationType::CREATE - create the DHCP info for the new port

Or DHCP update, it could look like:

  • 1 dhcp_states, OperationType::UPDATE - update the DHCP info for a port

Does it make sense? Let me know if you have other questions. @w2520n2520
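
The "0 to N of each" structure described above can be sketched as follows. This is only an illustration of why resources can be updated independently: every type and function name here (ResourceState, GoalState, UpdateSummary, apply_all) is a hypothetical stand-in, and the real GoalState is a protobuf message whose fields are not reproduced here.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Only the two operation types named in the thread; others are omitted.
enum class OperationType { CREATE, UPDATE };

// Hypothetical, simplified stand-in for one resource entry.
struct ResourceState {
  OperationType op;
  std::string id;
};

// Each collection may hold 0 to N entries, so any subset can be present.
struct GoalState {
  std::vector<ResourceState> vpc_states;
  std::vector<ResourceState> subnet_states;
  std::vector<ResourceState> port_states;
  std::vector<ResourceState> security_group_states;
  std::vector<ResourceState> dhcp_states;
};

struct UpdateSummary {
  std::size_t vpcs = 0, subnets = 0, ports = 0, security_groups = 0, dhcps = 0;
};

// Applies a handler to every entry and reports how many were processed.
template <typename Handler>
std::size_t apply_all(const std::vector<ResourceState>& states, Handler handler) {
  std::size_t applied = 0;
  for (const auto& state : states) {
    handler(state);
    ++applied;
  }
  return applied;
}

// Every update step runs, but an empty collection makes its step a no-op.
UpdateSummary update_goal_state(const GoalState& gs) {
  auto placeholder = [](const ResourceState&) { /* real programming goes here */ };
  UpdateSummary summary;
  summary.vpcs = apply_all(gs.vpc_states, placeholder);
  summary.subnets = apply_all(gs.subnet_states, placeholder);
  summary.ports = apply_all(gs.port_states, placeholder);
  summary.security_groups = apply_all(gs.security_group_states, placeholder);
  summary.dhcps = apply_all(gs.dhcp_states, placeholder);
  return summary;
}
```

With this shape, a DHCP-only update is just a GoalState whose dhcp_states has one entry and whose other collections are empty.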


er1cthe0ne avatar er1cthe0ne commented on August 16, 2024

The next step on DHCP implementation is to develop a standalone DHCP application based on this design. We can do the integration to AlcorControlAgent later.

Hi Nan Wu,

Do you think you can have the standalone DHCP application available in a few weeks? It would be great if we can complete the integration into AlcorControlAgent by the month of June. @w2520n2520


er1cthe0ne avatar er1cthe0ne commented on August 16, 2024

The next step on DHCP implementation is to develop a standalone DHCP application based on this design. We can do the integration to AlcorControlAgent later.

Hi Nan Wu,

Do you think you can have the standalone DHCP application available in a few weeks? It would be great if we can complete the integration into AlcorControlAgent by the month of June. @w2520n2520

Hi Nan Wu,

Checking in here. Do you think we can meet the target of June to have a standalone DHCP application based on this design and integrate it with AlcorControlAgent? Let me know. @w2520n2520


er1cthe0ne avatar er1cthe0ne commented on August 16, 2024

Hi Nan Wu,

Checking in here to see if there is anything I can help with. Maybe we can break down the standalone DHCP application task into smaller pieces, e.g.:

  1. Basic framework for the application, with command line parsing that doesn't need to be fancy.
  2. Implement a DHCP handler class inheriting from Dhcp_Programming_Interface in https://github.com/futurewei-cloud/alcor-control-agent/blob/master/include/aca_dhcp_programming_if.h
  3. Program the openflow rules to route DHCP packets into and out of the DHCP application.
  4. Parse the input parameters (which come from the goalstate message) for the DHCP handler class.
  5. Determine the needed DHCP actions within the DHCP application.
  6. Unit test infrastructure and test cases.

How does that sound? @w2520n2520


w2520n2520 avatar w2520n2520 commented on August 16, 2024

Hi Nan Wu,

Checking in here and see if there is anything I can help. Maybe we can breakdown the standalone DHCP application task into smaller pieces? e.g.:

  1. basic framework on the application, command line parsing but doesn't need to be fancy.
  2. Implement DHCP handler class inherit from Dhcp_Programming_Interface in https://github.com/futurewei-cloud/alcor-control-agent/blob/master/include/aca_dhcp_programming_if.h
  3. program the openflow rule to route DHCP packets into and out of the DHCP application
  4. parsing of the input parameter (comes from goalstate message) to DHCP handler class
  5. determine the needed DHCP actions within the DHCP application
  6. unit test infrastructure and test cases

How does it sound? @w2520n2520

Hi Eric,
Actually I've done most of the 4th item, the DHCP handler part. I'm working on the 2nd and 3rd items, but I have doubts about them.
As I understand it, here is the code flow for a state message: consumer->comm_mgr-->update_goal-->dhcp_state_handler(newly_added)-->dhcp_prog_if--------??--------->dhcp_server

Q1: Where should I put dhcp_server? Should it be in an independent thread or run in the same one as aca_main (maybe not a good idea)? About the "??" part: net_handler uses RPC to talk to the transit_daemon of Mizar, but dhcp_server is supposed to be on the same node, so RPC may not be necessary here. Then again, the network dhcp-server will be on a different node, so using the same communication method would be beneficial. I have limited understanding of the overall design behind alcor-agent, so I may need your involvement here.

Q2: What does the code flow look like for the 3rd item? I didn't find the interface for packet_in under the current src dir.

Thanks for your guidance and help.
@er1cthe0ne


er1cthe0ne avatar er1cthe0ne commented on August 16, 2024

Hi Nan Wu,

Thanks for the questions, I will answer them one by one. Do let me know if you have other questions.

Should it be in an independent thread
Great question, it should be an independent thread spun up by aca_main. We will implement it when integrating the DHCP standalone app into ACA.

About the "??"
After integration, the DHCP code will be part of ACA running in another thread, so no RPC is needed. You can check out https://github.com/futurewei-cloud/alcor-control-agent/blob/164a8a7cbad1f3b46c0d0592d11df875f192326d/include/aca_dataplane_ovs.h as an example of how to consume an ACA programming interface.

network dhcp-server will be on a different node
It will be driven by the ACA running on that node in the future, so it is the same communication flow: the Alcor controller sends a goal state message down to ACA.

What does the code flow look like for the 3rd item?
Can you tell me which specific code flow? I want to give you the right information. Are you talking about the openflow rule programming, or how to provide the right DHCP response back to the VM?

Thanks,
Eric


w2520n2520 avatar w2520n2520 commented on August 16, 2024

Hi Eric,

Thanks for the reply.
I still have further questions and may need more of your time; I'm trying to understand the design here. :)

Should it be in a independent thread
Great question, it should be independent thread spin up by aca_main. We will implement it during integration with DHCP standalone app into ACA.

[Nan]: OK. I thought i was supposed to start from here. We can do it later.

network dhcp-server will be on a different node
It will be driven by the ACA running on that node in the future, so it is the same communication flow: the Alcor controller sends a goal state message down to ACA.

[Nan]: No, I mean the packet_in flow here, not the control message flow (goal state). The DHCP design doc mentions that openflow table rules will be used to transfer DHCP packets to the dhcp-server. The question is: if the dataplane is Mizar, there will be no openflow tables, right? Another question: if openflow tables are used, there will be two flows, one for the local dhcp-server and the other for the network dhcp-server with low priority. When the local one fails, its corresponding flow should fail too, so packets will be transferred to the network one.
Is this understanding correct? I'm still confused about the packet_in_handler flow here.

How is like the code flow for 3rd item?
Can you tell me which specific code flow? I want to give you the right information. Are you talking about the openflow rule programming, or how to provide the right DHCP response back to the VM?

[Nan]: Yes, about the openflow rule programming part.
@er1cthe0ne


er1cthe0ne avatar er1cthe0ne commented on August 16, 2024

Hi Nan Wu,

Still have further questions, may need more your time, trying to understand the design here. :)

No problem, feel free to ask :)

Another one is, if openflow table is used, there will be two flows--one for local dhcp-server, the other is for network-dhcp-server with low priority. When the local one fails, so should its corresponding flow, so packet will be transfer to the network one.

The current focus is the OVS dataplane, and the current design only supports one dataplane per host.

The backup network-dhcp-server is used when the local ACA is down and didn't have a chance to set up the local-dhcp-server flow. If ACA exits gracefully, it should remove the local-dhcp-server flow. If ACA exits unexpectedly, it will try to restart a few times, and if ACA really cannot get back to a running state, the Alcor controller would detect it and perform corrective actions.

In summary, I am not sure how both the local-dhcp-server and network-dhcp-server flows would work at the same time, since only one of them will be used based on priority. Unless we set a timeout on the local-dhcp-server flow, but then ACA will need to keep renewing it.
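
The priority point can be illustrated with a small sketch. In an OpenFlow table the highest-priority matching entry is the one that is used, so a lower-priority backup flow only takes effect once the higher-priority flow is gone. The FlowSketch type, its fields, and the target labels are hypothetical and exist only for this illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified view of one DHCP-matching flow entry.
struct FlowSketch {
  int priority;        // higher value wins among matching flows
  bool installed;      // false once the flow has been removed or timed out
  std::string target;  // where matching DHCP packets would be sent
};

// Returns the target of the highest-priority installed flow,
// or an empty string if no flow is installed.
std::string pick_dhcp_target(const std::vector<FlowSketch>& flows) {
  const FlowSketch* best = nullptr;
  for (const auto& flow : flows) {
    if (!flow.installed) continue;
    if (best == nullptr || flow.priority > best->priority) best = &flow;
  }
  return best ? best->target : std::string();
}
```

This mirrors the discussion: while the local flow is installed it always wins, and the network flow is reached only after the local flow is removed, either by a graceful ACA exit or by a flow timeout.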

Still confused about the packet_in_handler flow here.

Did I answer your question above? Let me know.

[Nan]: Yes, about the openflow rule programming part.

Ok, please go ahead and execute a system call for now (see execute_system_command). ACA will be adding better openflow client support in the future (per the current design), and then the DHCP code can leverage that when ready.

Hope all of this makes sense to you.

BTW, once you have some code implemented, it would be great to send a PR so that we can look at it and discuss it if needed. @w2520n2520


er1cthe0ne avatar er1cthe0ne commented on August 16, 2024

More information on the packet_in_handler flow. In order to have DHCP packets sent to ACA, we will need to implement an openflow controller, and have an openflow rule send the matched DHCP packets to the openflow controller, which is ACA in our case.

We may use something similar to the ovs-ofctl implementation, which acts as an openflow controller. Below is an experiment to show that it should work:

root@fw0016589: ping -I 192.168.0.131 -c1 192.168.0.124
PING 192.168.0.124 (192.168.0.124) from 192.168.0.131 : 56(84) bytes of data.
64 bytes from 192.168.0.124: icmp_seq=1 ttl=64 time=0.348 ms

br-int is letting all the traffic through now:

root@fw0016589: ovs-ofctl dump-flows br-int
cookie=0x0, duration=699.025s, table=0, n_packets=140, n_bytes=15059, priority=0 actions=NORMAL

Adding a new openflow rule to send all packets to CONTROLLER, which is ovs-ofctl in this case:
root@fw0016589: ovs-ofctl add-flow br-int "table=0, priority=100, actions=CONTROLLER"
root@fw0016589: ovs-ofctl dump-flows br-int
cookie=0x0, duration=786.163s, table=0, n_packets=140, n_bytes=15059, priority=0 actions=NORMAL
cookie=0x0, duration=4.482s, table=0, n_packets=0, n_bytes=0, priority=100 actions=CONTROLLER:65535

Ping doesn’t work anymore because the packets have been sent to CONTROLLER!
root@fw0016589: ping -I 192.168.0.131 -c1 192.168.0.124
PING 192.168.0.124 (192.168.0.124) from 192.168.0.131 : 56(84) bytes of data.

--- 192.168.0.124 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

Printed out by ovs-ofctl!
root@fw0016589: ovs-ofctl monitor br-int 1
NXT_PACKET_IN2 (xid=0x0): cookie=0x0 total_len=98 in_port=int0 (via action) data_len=98 (unbuffered)
icmp,vlan_tci=0x0000,dl_src=ee:c3:0f:ee:c3:46,dl_dst=36:f2:97:d5:3a:b9,nw_src=192.168.0.131,nw_dst=192.168.0.124,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=8,icmp_code=0 icmp_csum:947d
NXT_PACKET_IN2 (xid=0x0): cookie=0x0 total_len=42 in_port=int0 (via action) data_len=42 (unbuffered)
arp,vlan_tci=0x0000,dl_src=ee:c3:0f:ee:c3:46,dl_dst=36:f2:97:d5:3a:b9,arp_spa=192.168.0.131,arp_tpa=192.168.0.124,arp_op=1,arp_sha=ee:c3:0f:ee:c3:46,arp_tha=00:00:00:00:00:00
NXT_PACKET_IN2 (xid=0x0): cookie=0x0 total_len=42 in_port=int0 (via action) data_len=42 (unbuffered)
arp,vlan_tci=0x0000,dl_src=ee:c3:0f:ee:c3:46,dl_dst=36:f2:97:d5:3a:b9,arp_spa=192.168.0.131,arp_tpa=192.168.0.124,arp_op=1,arp_sha=ee:c3:0f:ee:c3:46,arp_tha=00:00:00:00:00:00
NXT_PACKET_IN2 (xid=0x0): cookie=0x0 total_len=42 in_port=int0 (via action) data_len=42 (unbuffered)
arp,vlan_tci=0x0000,dl_src=ee:c3:0f:ee:c3:46,dl_dst=36:f2:97:d5:3a:b9,arp_spa=192.168.0.131,arp_tpa=192.168.0.124,arp_op=1,arp_sha=ee:c3:0f:ee:c3:46,arp_tha=00:00:00:00:00:00
OFPT_ECHO_REQUEST (xid=0x0): 0 bytes of payload

The flow rules show that the packets are going to CONTROLLER:
root@fw0016589: ovs-ofctl dump-flows br-int
cookie=0x0, duration=979.012s, table=0, n_packets=140, n_bytes=15059, priority=0 actions=NORMAL
cookie=0x0, duration=197.331s, table=0, n_packets=8, n_bytes=420, priority=100 actions=CONTROLLER:65535
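
The experiment above sends every packet to the controller. For the DHCP case the match could be narrowed to client-to-server DHCP traffic (UDP source port 68, destination port 67). The sketch below only builds the command string; the helper names are hypothetical, and passing the result to execute_system_command is an assumption about how it would be used, not something shown in the thread.

```cpp
#include <string>

// Builds a flow description that matches DHCP client-to-server packets
// (UDP, source port 68, destination port 67) and sends them to the controller.
std::string dhcp_to_controller_flow(int table, int priority) {
  return "table=" + std::to_string(table) +
         ",priority=" + std::to_string(priority) +
         ",udp,tp_src=68,tp_dst=67,actions=CONTROLLER";
}

// Wraps a flow description in an ovs-ofctl add-flow command line.
// The string is only constructed here, never executed.
std::string add_flow_command(const std::string& bridge, const std::string& flow) {
  return "ovs-ofctl add-flow " + bridge + " \"" + flow + "\"";
}
```

For example, add_flow_command("br-int", dhcp_to_controller_flow(0, 100)) yields a command of the same shape as the one used in the experiment, but limited to DHCP requests so that other traffic keeps following the NORMAL rule.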

@w2520n2520 - let me know if you have questions on the approach or a better suggestion.


er1cthe0ne avatar er1cthe0ne commented on August 16, 2024

Hi Nan Wu,

Let me know if you have any outstanding questions. It will be good if you can join the Alcor community meeting tomorrow to discuss and sync up on the progress. @w2520n2520


er1cthe0ne avatar er1cthe0ne commented on August 16, 2024

Hi Nan Wu,

Just checking to see how the work is going. Let me know if you have any questions. @w2520n2520


w2520n2520 avatar w2520n2520 commented on August 16, 2024

Hi Eric,

I've submitted a PR. Could you help review it and give some suggestions?
Thank you.
@er1cthe0ne


er1cthe0ne avatar er1cthe0ne commented on August 16, 2024

@w2520n2520 - one of the action items is for your code to provide an interface to be called when ACA receives a DHCP DISCOVER or REQUEST packet. We have a community call scheduled today, let me know if you want to join to discuss.


w2520n2520 avatar w2520n2520 commented on August 16, 2024

Hi @er1cthe0ne @cj-chung
One question: should OVS_Controller use dhcp_programming_if to call dhcps_recv?
I mean, should dhcp_programming_if encapsulate calls from the dataplane the same way as calls from the management plane?


w2520n2520 avatar w2520n2520 commented on August 16, 2024

Hi @cj-chung @er1cthe0ne ,
Added dhcp openflow rule procedure.
Here is the API James may use: ACA_Dhcp_Server::get_instance().dhcps_recv(dhcp_payload).

@cj-chung , please guide me on building the whole project locally.
Should I regenerate the makefile with CMake and then run make?


er1cthe0ne avatar er1cthe0ne commented on August 16, 2024

Hi @er1cthe0ne @cj-chung
One question: should OVS_Controller use dhcp_programming_if to call dhcps_recv?
I mean, should dhcp_programming_if encapsulate calls from the dataplane the same way as calls from the management plane?

Yes, please use dhcp_programming_if as the interface for other modules to communicate with.


er1cthe0ne avatar er1cthe0ne commented on August 16, 2024

Hi @cj-chung @er1cthe0ne ,
Added dhcp openflow rule procedure.
Here is the API James may use: ACA_Dhcp_Server::get_instance().dhcps_recv(dhcp_payload).

I will let @cj-chung point out where to change the code to call the DHCP code above. The idea is to let @w2520n2520 make that change in this PR :)

@cj-chung , please guide me on building the whole project locally.
Should I regenerate the makefile with CMake and then run make?

To build it inside the docker container, the Dockerfile should have all the necessary steps to get all the dependencies during container creation: https://github.com/futurewei-cloud/alcor-control-agent/blob/master/build/Dockerfile

To build it on a physical machine or VM, take a look at the machine bring-up script; you can run all the steps except the last two, steps 8 and 9: https://github.com/futurewei-cloud/alcor-control-agent/blob/master/build/aca-machine-init.sh


w2520n2520 avatar w2520n2520 commented on August 16, 2024

Hi @cj-chung ,
Please have a look at the error below. How did this come about? There has been no change since 10 July.
[ 95%] Linking CXX executable ../build/tests/aca_tests
../src/libAlcorControlAgentLib.a(aca_ovs_control.cpp.o): In function `aca_ovs_control::ACA_OVS_Control::control()':
/mnt/host/code/src/ovs/aca_ovs_control.cpp:59: undefined reference to `g_ofctl_target[abi:cxx11]'
/mnt/host/code/src/ovs/aca_ovs_control.cpp:60: undefined reference to `g_ofctl_target[abi:cxx11]'
/mnt/host/code/src/ovs/aca_ovs_control.cpp:60: undefined reference to `g_ofctl_target[abi:cxx11]'
/mnt/host/code/src/ovs/aca_ovs_control.cpp:61: undefined reference to `g_ofctl_target[abi:cxx11]'
/mnt/host/code/src/ovs/aca_ovs_control.cpp:63: undefined reference to `g_ofctl_options[abi:cxx11]'
/mnt/host/code/src/ovs/aca_ovs_control.cpp:64: undefined reference to `g_ofctl_options[abi:cxx11]'
/mnt/host/code/src/ovs/aca_ovs_control.cpp:64: undefined reference to `g_ofctl_options[abi:cxx11]'
/mnt/host/code/src/ovs/aca_ovs_control.cpp:65: undefined reference to `g_ofctl_options[abi:cxx11]'
/mnt/host/code/src/ovs/aca_ovs_control.cpp:67: undefined reference to `g_ofctl_command[abi:cxx11]'
/mnt/host/code/src/ovs/aca_ovs_control.cpp:69: undefined reference to `g_ofctl_command[abi:cxx11]'
/mnt/host/code/src/ovs/aca_ovs_control.cpp:71: undefined reference to `g_ofctl_command[abi:cxx11]'


er1cthe0ne avatar er1cthe0ne commented on August 16, 2024

Hi @cj-chung ,
Please have a look at error below, how did this come? No change since 10 July.
[ 95%] Linking CXX executable ../build/tests/aca_tests
../src/libAlcorControlAgentLib.a(aca_ovs_control.cpp.o): In function `aca_ovs_control::ACA_OVS_Control::control()':
/mnt/host/code/src/ovs/aca_ovs_control.cpp:59: undefined reference to `g_ofctl_target[abi:cxx11]'
/mnt/host/code/src/ovs/aca_ovs_control.cpp:60: undefined reference to `g_ofctl_target[abi:cxx11]'
/mnt/host/code/src/ovs/aca_ovs_control.cpp:60: undefined reference to `g_ofctl_target[abi:cxx11]'
/mnt/host/code/src/ovs/aca_ovs_control.cpp:61: undefined reference to `g_ofctl_target[abi:cxx11]'
/mnt/host/code/src/ovs/aca_ovs_control.cpp:63: undefined reference to `g_ofctl_options[abi:cxx11]'
/mnt/host/code/src/ovs/aca_ovs_control.cpp:64: undefined reference to `g_ofctl_options[abi:cxx11]'
/mnt/host/code/src/ovs/aca_ovs_control.cpp:64: undefined reference to `g_ofctl_options[abi:cxx11]'
/mnt/host/code/src/ovs/aca_ovs_control.cpp:65: undefined reference to `g_ofctl_options[abi:cxx11]'
/mnt/host/code/src/ovs/aca_ovs_control.cpp:67: undefined reference to `g_ofctl_command[abi:cxx11]'
/mnt/host/code/src/ovs/aca_ovs_control.cpp:69: undefined reference to `g_ofctl_command[abi:cxx11]'
/mnt/host/code/src/ovs/aca_ovs_control.cpp:71: undefined reference to `g_ofctl_command[abi:cxx11]'

@w2520n2520 just checking, did you change aca_tests.cpp? Somehow I don't see this error in my environment.


w2520n2520 avatar w2520n2520 commented on August 16, 2024

No. I have doubt too.


cj-chung avatar cj-chung commented on August 16, 2024

@w2520n2520 Did you see these error messages in your local environment when you compiled it? If you have the latest ACA build locally, there shouldn't be any aca_ovs_control function calls in /tests/gtests/aca_tests.cpp.
If you cannot get around it, you can just add those global variables in /tests/gtests/aca_tests.cpp, like:

string g_ofctl_command = EMPTY_STRING;
string g_ofctl_target = EMPTY_STRING;
string g_ofctl_options = EMPTY_STRING;


er1cthe0ne avatar er1cthe0ne commented on August 16, 2024

@w2520n2520 - we have an open source meeting scheduled: Monday, Aug 17, 2020 06:30 PM Pacific Time (US and Canada), you are welcome to join and raise any questions you may have.


w2520n2520 avatar w2520n2520 commented on August 16, 2024

@w2520n2520 Did you see these error messages in your local environment when you compile it? If you have latest aca build in your local, it shouldn't have any aca_ovs_control function calls in /tests/gtests/aca_tests.cpp.
If you cannot bypass it, you can just add those global variables in /tests/gtests/aca_tests.cpp like:

string g_ofctl_command = EMPTY_STRING;
string g_ofctl_target = EMPTY_STRING;
string g_ofctl_options = EMPTY_STRING;

I think this is the reason:

If you get linker errors about undefined references to symbols that involve types in the std::__cxx11 namespace or the tag [abi:cxx11] then it probably indicates that you are trying to link together object files that were compiled with different values for the _GLIBCXX_USE_CXX11_ABI macro. This commonly happens when linking to a third-party library that was compiled with an older version of GCC. If the third-party library cannot be rebuilt with the new ABI then you will need to recompile your code with the old ABI.
https://gcc.gnu.org/onlinedocs/libstdc++/manual/using_dual_abi.html
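
One way to check which ABI a given translation unit is compiled with is to read the _GLIBCXX_USE_CXX11_ABI macro mentioned in the GCC text above. The sketch below is a small diagnostic, with a hypothetical function name, that could be compiled with the same flags as the library and as the test executable to compare the two values.

```cpp
#include <string>  // any libstdc++ header defines the ABI macro

// Returns 1 for the new (C++11) ABI, 0 for the old ABI,
// or -1 when the standard library does not define the macro
// (for example when the code is not built against libstdc++).
int libstdcxx_cxx11_abi() {
#ifdef _GLIBCXX_USE_CXX11_ABI
  return _GLIBCXX_USE_CXX11_ABI;
#else
  return -1;
#endif
}
```

If two objects that are linked together report different values, references to std::string symbols tagged [abi:cxx11] in one of them will not resolve against the other.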

Solution:
https://stackoverflow.com/questions/55406770/gcc-undefined-references-with-abicxx11
But it needs a minimum CMake version of 3.12.4. I tried this in CMakeLists.txt, but the CI environment does not seem to satisfy it (3.10.2).

Any ideas? @er1cthe0ne @cj-chung


er1cthe0ne avatar er1cthe0ne commented on August 16, 2024

@w2520n2520 - allow me to suggest a few things; let me know if they make sense.

First, set up a local compiling environment:
https://github.com/futurewei-cloud/alcor-control-agent/blob/master/src/README.md
cd ~/dev/alcor-control-agent
./build/build.sh
Once you have the build container set up, you can enter the docker container and rebuild ACA anytime:
docker exec -it a1 /bin/bash
cd /mnt/host/code && cmake . && make
If we don't want to use containers to build, an alternate approach is to set up the physical machine for building and running; please see ./build/aca-machine-init.sh for how to set up the dependencies.

Since @chenpiaoping is looking into ACA, maybe he can give a hand with it.

Once you have the local build set up, we can resolve the issues quickly. If there is a need to update the cmake version on our CI to 3.12.4, we can make that modification in our CI environment, assuming that's the solution that resolves all the compiling issues.


w2520n2520 avatar w2520n2520 commented on August 16, 2024

Tried in local env, same issue.


er1cthe0ne avatar er1cthe0ne commented on August 16, 2024

Tried in local env, same issue.

Let's update your local environment's cmake version to 3.12.4 or higher, apply the fix you tried previously on CMakeLists.txt and see if that would address the issues. Please show us the error message so that we can take a look.


w2520n2520 avatar w2520n2520 commented on August 16, 2024

-- Found ZLIB: /usr/lib/x86_64-linux-gnu/libz.so (found version "1.2.11")
-- Using protobuf
-- Found OpenSSL: /usr/lib/x86_64-linux-gnu/libcrypto.so (found version "1.1.1")
-- Using gRPC 1.24.3
-- Found Protobuf: /usr/local/lib/libprotobuf.a;-lpthread (found version "3.8.0")
-- Found Threads: TRUE
-- Found Protobuf: /usr/local/bin/protoc-3.8.0.0 (found version "3.8.0.0")
-- Using protobuf
-- Using gRPC 1.24.3
-- Check if compiler accepts -pthread
-- Check if compiler accepts -pthread - yes
-- Using protobuf
-- Using gRPC 1.24.3
-- Found GTest: /usr/local/lib/libgtest.so
-- Configuring done
-- Generating done
-- Build files have been written to: /mnt/host/code
[ 2%] Generating goalstateprovisioner.pb.cc, goalstateprovisioner.pb.h, goalstateprovisioner.grpc.pb.cc, goalstateprovisioner.grpc.pb.h
Scanning dependencies of target grpc
[ 4%] Building CXX object src/grpc/CMakeFiles/grpc.dir/goalstateprovisioner.pb.cc.o
:0:1: error: macro names must be identifiers
src/grpc/CMakeFiles/grpc.dir/build.make:94: recipe for target 'src/grpc/CMakeFiles/grpc.dir/goalstateprovisioner.pb.cc.o' failed
make[2]: *** [src/grpc/CMakeFiles/grpc.dir/goalstateprovisioner.pb.cc.o] Error 1
CMakeFiles/Makefile2:262: recipe for target 'src/grpc/CMakeFiles/grpc.dir/all' failed
make[1]: *** [src/grpc/CMakeFiles/grpc.dir/all] Error 2
Makefile:102: recipe for target 'all' failed
make: *** [all] Error 2


er1cthe0ne avatar er1cthe0ne commented on August 16, 2024

Hi @w2520n2520 and @gure,

I was able to get your branch to compile, please see the below steps.

  1. Revert the change in CMakeLists.txt so that it looks like this:
    cmake_minimum_required(VERSION 3.10)
    project(AlcorControlAgent)

    # Set the version number.
    set(CMAKE_BUILD_TYPE Debug)
    set(CMAKE_CXX_STANDARD 14)
    set(CPPKAFKA_VERSION_MAJOR 0)
    set(CPPKAFKA_VERSION_MINOR 3)
    set(CPPKAFKA_VERSION_REVISION 1)
    set(CPPKAFKA_VERSION "${CPPKAFKA_VERSION_MAJOR}.${CPPKAFKA_VERSION_MINOR}.${CPPKAFKA_VERSION_REVISION}")
    set(RDKAFKA_MIN_VERSION 0x00090400)

    #add_compile_options(-O0) # enable no optimization during development
    add_compile_options(-Wall)
    #add_compile_definitions(-D_GLIBCXX_USE_CXX11_ABI=0)

    add_subdirectory(src)
    add_subdirectory(test)

  2. Add the global variables below to test/gtest/aca_tests.cpp and test/func_tests/gs_test.cpp, as mentioned previously:
    string g_ofctl_command = EMPTY_STRING;
    string g_ofctl_target = EMPTY_STRING;
    string g_ofctl_options = EMPTY_STRING;

  3. Run "cmake ." and then make:
    root@28abfb290c2e:/mnt/host/code/aca-dhcp# make
    [ 8%] Built target grpc
    [ 54%] Built target proto
    [ 86%] Built target AlcorControlAgentLib
    [ 91%] Built target AlcorControlAgent
    [ 95%] Built target aca_tests
    Scanning dependencies of target gs_tests
    [ 97%] Building CXX object test/CMakeFiles/gs_tests.dir/func_tests/gs_tests.cpp.o
    [100%] Linking CXX executable ../build/tests/gs_tests
    [100%] Built target gs_tests

from alcor-control-agent.

w2520n2520 avatar w2520n2520 commented on August 16, 2024

Hi @er1cthe0ne , @Gzure

Adding g_ofctl_command to both the gtest and the functional test makes the compilation work.
I think I have figured out the cause of this issue:

  1. The aca_tests executable depends on AlcorControlAgentLib, which compiles source files including aca_ovs_control, and that file only declares g_ofctl_command.
  2. The error was deferred until the aca_tests executable was linked and its symbols were resolved.
  3. The AlcorControlAgent executable was fine because it contains the definition of g_ofctl_command.

Would it be possible to make g_ofctl_command self-contained inside AlcorControlAgentLib, since it is a library?

from alcor-control-agent.

er1cthe0ne avatar er1cthe0ne commented on August 16, 2024

I am thinking about removing it. In point 4 of issue #120, I suggest removing g_ofctl_command since we may not need it.

from alcor-control-agent.

w2520n2520 avatar w2520n2520 commented on August 16, 2024

All related unit tests passed. Requesting merge. @er1cthe0ne @cj-chung

from alcor-control-agent.

er1cthe0ne avatar er1cthe0ne commented on August 16, 2024

@w2520n2520 @gure, please refer to this script for the physical machine setup of ACA:
https://github.com/futurewei-cloud/alcor-control-agent/blob/master/build/aca-machine-init.sh

from alcor-control-agent.

w2520n2520 avatar w2520n2520 commented on August 16, 2024

Hi @er1cthe0ne,
ovs_control.packet_in
--> monitor_vconn
--> monitor
--> control() ----------> has no caller

@Gzure and I made this change for testing:
//ACA_OVS_Control::get_instance().monitor("br-tun", "resume");
ACA_OVS_Control::get_instance().monitor("br-int", "resume");
By the way, why is only one monitor allowed?

from alcor-control-agent.

er1cthe0ne avatar er1cthe0ne commented on August 16, 2024

Hi @w2520n2520, I am not sure I understand the concern about the call chain (ovs_control.packet_in --> monitor_vconn --> monitor --> control(), which has no caller). Can you tell me what your question is?

As for why only one monitor is allowed: this could be a limitation of the OVS code we use, but I don't think it is a blocking issue, because we would only monitor br-int for the scenarios we defined. @cj-chung, please correct me if I am wrong.

from alcor-control-agent.

w2520n2520 avatar w2520n2520 commented on August 16, 2024

void ACA_OVS_Control::parse_packet(void *packet)
{
  aca_dhcp_server::ACA_Dhcp_Server::get_instance().dhcps_recv();
}
void OVS_Control::monitor_vconn()
{
  ACA_OVS_Control::get_instance().parse_packet(pin.packet);
}
void OVS_Control::monitor(const char *bridge, const char *opt)
{
  monitor_vconn(vconn, true, resume_continuations, bridge);
}

4.1

int ACA_OVS_Control::control()
{
  monitor(target, options);
}

4.2

int main()
{
  ACA_OVS_Control::get_instance().monitor("br-tun", "resume");
}

Since we couldn't find a caller of control(), we changed the entry point in main to monitor br-int so that we could debug the packet path.

from alcor-control-agent.

cj-chung avatar cj-chung commented on August 16, 2024

Yes, that's the correct call stack.
The current monitor in ACA_OVS_Control is daemonized but not yet multithreaded, so I think one ACA instance can have only one monitor channel.

from alcor-control-agent.

w2520n2520 avatar w2520n2520 commented on August 16, 2024

Hi @cj-chung ,

One question:
For the packet-out procedure, we observe that br-tun's TX counter keeps increasing, but no packet shows up in tcpdump. We changed the action to "output:8", but no luck.
The call below returns no error.

error = parse_ofp_packet_out_str(&po, options,
                                         ports_to_accept(bridge),
                                         tables_to_accept(bridge),
                                         &usable_protocols);

Does another flow need to be installed for reply packets from the server to the client?

//bridge = "br-int" opts = "in_port=controller packet=<hex-string> actions=normal"
aca_ovs_control::ACA_OVS_Control::get_instance().packet_out(bridge.c_str(),
                                                              options.c_str());

In short, we see no errors in the code flow now, but no packet-out is observed on the network. We could use your help to figure it out. Thanks. @Gzure @er1cthe0ne

from alcor-control-agent.

cj-chung avatar cj-chung commented on August 16, 2024

The "in_port" field specifies the port the packet is treated as having arrived on when the actions are processed, so in_port=controller marks the packet as coming from the controller. If you use tcpdump to capture packets on br-tun or br-int, you should be able to see the packet on these bridges.

You can use the following command to test the packet-out function:
./build/bin/AlcorControlAgent -c packet-out -t br-int -o "in_port=controller packet=02AC10FF002202AC10FF001108004500001C000100000A015A9DAC10FF0BAC10FF160800F7FF00000000 actions=normal"

Then run tcpdump -i br-int -v on the OVS host, and you should be able to capture the packet.

from alcor-control-agent.

w2520n2520 avatar w2520n2520 commented on August 16, 2024

Hi @cj-chung @er1cthe0ne ,

Packet-Out Syntax
packet=hex-string
The actual packet to send, expressed as a string of hexadecimal
bytes. This field is required.
http://www.openvswitch.org/support/dist-docs/ovs-ofctl.8.txt

It seems this command only sends the "actual packet", which means the DHCP server needs to build the whole packet, from the Ethernet header up to the application layer, rather than just the DHCP payload.
Am I right?

from alcor-control-agent.

cj-chung avatar cj-chung commented on August 16, 2024

@w2520n2520 Yes. You need the whole packet in the hex string, since I send the packet string directly to OVS.

from alcor-control-agent.
