wimoved's Introduction

WiMoVE - Wireless Mobility through VXLAN EVPN

WiMoVE is a scalable Wi-Fi system that partitions all stations into separate overlay L2 domains to limit the amount of wireless L2 broadcast traffic. In large Wi-Fi systems, broadcast traffic can take up large amounts of airtime. A great talk that explains the problem in more detail is available here. In WiMoVE, overlay L2 domains "follow" the stations and are resized on demand, which preserves handover.

WiMoVE is built with standard network protocols, on top of open‑source technology:

  • The overlay networks use BGP EVPN with VXLAN encapsulation.
  • All BGP speakers run FRRouting to exchange EVPN routes.
  • The access points run OpenWrt with our custom, open‑source daemon called wimoved.

This solution allows for using commodity access points running OpenWrt for large‑scale Wi‑Fi deployments, even from different vendors.

WiMoVE consists of multiple parts. If you want to learn more about the architecture or create a full setup, take a look at our documentation.

This repository

This repository contains the WiMoVE Access Point daemon wimoved. The daemon is responsible for connecting 802.11 stations to BGP EVPN. It does so by letting hostapd create one interface per station (see below), creating VXLAN interfaces, and bridging the two.

The following diagram shows the components and their interactions:

The daemon is responsible for handling hostapd events, creating VXLAN interfaces and bridges. FRR receives events whenever network interfaces change and advertises the corresponding reachability information using BGP EVPN.

Supported architectures

Currently, we support OpenWrt with the architectures ramips-mt7621 and mvebu-cortexa9. We test our software on the Access Point models ZyXEL NWA50AX and Linksys WRT1900ACS. If you need support for other OpenWrt architectures, feel free to open an issue.

Development setup

The development setup is an easy way to test wimoved. Please be aware that after following this guide, you do not have a full WiMoVE installation. You can connect to the Wi-Fi but won't be connected to other devices or the Internet. If you want a complete WiMoVE setup, follow the setup guide instead. Currently, we only support Linux as a development platform.

Setup hostapd

First, we will set up hostapd.

  • Install hostapd using your distribution's package manager. Alternatively, you can build it from source, see here. By default, hostapd comes with VLAN support. If you encounter issues, make sure that it was compiled with CONFIG_NO_VLAN=n.
  • Put the following in your /etc/hostapd.conf, replacing the placeholder values:
    interface=<interface>
    ssid=<ssid>
    ieee80211d=1
    country_code=<Your country code>
    hw_mode=g
    ieee80211n=1
    channel=6
    beacon_int=1000
    dtim_period=2
    max_num_sta=255
    rts_threshold=-1
    fragm_threshold=-1
    macaddr_acl=0
    auth_algs=1
    ignore_broadcast_ssid=0
    wmm_enabled=0
    eapol_key_index_workaround=0
    eap_server=0
    wpa=2
    wpa_key_mgmt=WPA-PSK
    rsn_pairwise=CCMP
    wpa_passphrase=<Secret key>
    per_sta_vif=1
    vlan_file=/etc/hostapd/hostapd.vlan
  • Create a file /etc/hostapd/hostapd.vlan with the following content:
    *   vlan#
  • Hostapd has to be started with the option -g /var/run/hostapd/global. For the service to work properly, you might have to edit the service file for hostapd. Run systemctl status hostapd to locate this file.
  • Start hostapd, either with systemctl start hostapd or on the command line. You might need to stop NetworkManager before starting hostapd since the programs interfere with each other.

Build

  • Install libnl. On a recent Linux system, the corresponding package is probably already installed.
  • Install prometheus-cpp.
    • On Ubuntu, install the package prometheus-cpp-dev.
    • On Arch Linux, the package is available in the AUR as prometheus-cpp-git.
  • Clone the repository by running git clone https://github.com/WiMoVE-OSS/wimoved.
  • Build the project by running cmake . followed by make -j$(nproc).
  • Start wimoved by running ./wimoved.

Coding guidelines

Format the source files by running make format. Lint the source files by running make lint. Build the tests by running make test. Run both checks and the tests by running make precommit. The coding guidelines are enforced via the CI pipeline.

As linting takes a long time, we recommend integrating clang-tidy into your editor.

Sanitization

Build with sanitizers enabled by running cmake . -DWIMOVED_SANITIZE=ON. For full stacktrace support, set the environment variable LSAN_OPTIONS=fast_unwind_on_malloc=0:malloc_context_size=30.

Build for OpenWrt

OpenWrt packages are built on every release. On each commit on main, a pre-release is created and a build is triggered. You can download the packages on the releases page.

The pipeline works as follows:

  • For both supported architectures (mvebu-cortexa9, ramips-mt7621), there is a Docker image containing an OpenWrt build environment. In the image, libnl and prometheus-cpp are already built. The corresponding Dockerfile is located in docker/openwrt-build-env.
  • The pipeline sets the environment variable IMAGE and runs openwrt/build-openwrt.sh. This creates a container from the image, mounting the source code and the output directory.
  • The packages from the output directory are uploaded as pipeline artifacts.

If you want to build for OpenWrt locally, you first need to build or pull a base image. Base images can be found on the Packages page. Then, you can run ./openwrt/build-openwrt.sh:

export IMAGE=ghcr.io/wimove-oss/wimove/wimove-buildenv/mvebu-cortexa9-22.03.3:main
docker pull "$IMAGE"
./openwrt/build-openwrt.sh

The script creates an out directory and the package will be inside that directory.

Configuration

The default configuration file is /etc/wimoved/config. A different configuration file can be specified as the first argument, i.e. by running wimoved /path/to/config. The following configuration options are available:

Name             | Default          | Explanation
hapd_sockdir     | /var/run/hostapd | Socket directory for hostapd. Hostapd sockets are discovered automatically from this directory.
hapd_group       | root             | Group under which hostapd is run. This is needed to set the appropriate permissions when communicating with hostapd. On OpenWrt, this must be "network".
log_path         | wimoved.log      | Path to the log file when logging to a file. Does nothing if logging to syslog.
cleanup_interval | 10               | Duration of the cleanup timer (seconds). Wimoved removes interfaces whenever the cleanup timer expires.
min_vni          | 1                | Minimum VNI used for hashing station MACs to VNIs.
max_vni          | 20               | Maximum VNI used for hashing station MACs to VNIs (exclusive).
sockets          | (none)           | Explicit list of hostapd sockets, comma-separated. If used, hapd_sockdir will not be scanned for sockets. Names are relative to hapd_sockdir.
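
To illustrate how min_vni and max_vni bound the hashing, here is a minimal C++ sketch. The function name vni_for_mac and the modulo-based mapping are assumptions made for illustration; wimoved's actual hash may differ. The only property taken from the table above is that the result lies in the half-open range from min_vni (inclusive) to max_vni (exclusive).

```cpp
#include <array>
#include <cstdint>
#include <stdexcept>

// Hypothetical helper, not wimoved's actual code: maps a station MAC to a
// VNI in the half-open range [min_vni, max_vni).
std::uint32_t vni_for_mac(const std::array<std::uint8_t, 6>& mac,
                          std::uint32_t min_vni, std::uint32_t max_vni) {
    if (max_vni <= min_vni) {
        throw std::invalid_argument("max_vni must be greater than min_vni");
    }
    // Fold the six bytes into one 64-bit number (assumed hashing scheme).
    std::uint64_t value = 0;
    for (std::uint8_t byte : mac) {
        value = (value << 8) | byte;
    }
    // The modulo keeps the offset below (max_vni - min_vni), so max_vni
    // itself is never returned.
    return min_vni + static_cast<std::uint32_t>(value % (max_vni - min_vni));
}
```

With the defaults min_vni=1 and max_vni=20, this sketch only ever yields VNIs 1 through 19.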

Monitoring and Logging

Monitoring

wimoved exposes metrics via Prometheus on port 9500. The endpoint reports how many stations are associated, how many events have been handled, and how long the event handling took. The endpoint is enabled by default. We plan on making the endpoint optional via a config option.

Logging

Wimoved can log to a file or to syslog.

The file can be set through the configuration option log_path.

On OpenWrt, wimoved logs to syslog. This is done by defining ELPP_SYSLOG in CMakeLists.txt. A syslog aggregator like syslog-ng can be used to aggregate those logs on a different host.

Known Issues

  • When a station roams from one AP to another, zebra can get into a state where no packets are forwarded to the station. We track the issue as #68. The cause seems to be an upstream issue in FRR which we reported here.

wimoved's People

Contributors

dasgoogle, linascience, rgwohlbold, sohn123

wimoved's Issues

Add testing infrastructure

Right now, we are only testing manually. With each new feature, the possibility of bugs increases. We should therefore look at possible testing frameworks, pick one and add it to our pipeline.

In separate issues, we can then write tests that test behavior of different subsystems.

Rename Project

Currently, we are still using the Gaffa name in all source code. We need to swap this to WiMoVE.

Make multiple networks on multiple radios work

Our APs have two radios: One for 2.4GHz and one for 5GHz.
We want these APs to be able to send out their network on both frequencies and work with wimoved.

  • Explore what currently works: Can we combine all events into one hostAPD socket?
  • Decide on what needs to be done
  • Implement changes

Problems with Zebra MAC Learning

During testing, we discovered that when using wpa_cli to forcefully roam from one AP to another, zebra gets stuck in a state that prevents the station from receiving packets. The bridge of the station's VXLAN has learned the MAC address on both ports, although zebra reported that it removed the route. Reconnecting the station or roaming again fixes the issue. We opened an issue in the frr repository. There you can find more details and the logs generated by the access points.

Packaging for OpenWRT

Right now, the scripts directory is quite confusing, and the scripts are pretty unorganized. We should rework the scripts and change the directory structure to make adaptations to the packaging easier.

Create tagged binaries on release

It would be nice if we could generate binaries automatically when we release. These binaries should also be clearly labeled with a commit hash to indicate the exact version that was built.

shift exponent 32 is too large for 32-bit type 'int'

When testing with ubsan, we get the following error:

src/MacAddress.cpp:22:29: runtime error: shift exponent 32 is too large for 32-bit type 'int'

We should eliminate the undefined behavior in MacAddress::number.
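
The usual cause of this error is that a uint8_t operand is promoted to a 32-bit int before the shift, so shifting by 32 or 40 bits is undefined. The following is a minimal sketch of a fix, assuming the MAC is stored as six bytes; mac_to_number is a hypothetical stand-in for MacAddress::number, whose real implementation is not shown here.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for MacAddress::number; the six-byte array layout
// is an assumption.
std::uint64_t mac_to_number(const std::array<std::uint8_t, 6>& bytes) {
    std::uint64_t result = 0;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        // Widen to 64 bits *before* shifting. Without the cast, the byte is
        // promoted to int and shifts of 32 or 40 bits are undefined.
        const std::size_t shift = 8 * (bytes.size() - 1 - i);
        result |= static_cast<std::uint64_t>(bytes[i]) << shift;
    }
    return result;
}
```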

Config lines without = are not parsed correctly

When the configuration parser is given a config file that contains a line without =, it sets the value of the line's contents to itself. For example, take the following configuration file:

hapd_sockdir

This is equivalent to hapd_sockdir=hapd_sockdir. I would expect this to throw an error since the configuration file is malformed.

While fixing this parsing bug, we should make sure that we don't rely on any undefined behavior, especially not using previous iterators after calling erase().
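
A minimal sketch of a stricter parser is shown below. The function name parse_config, the use of std::runtime_error, and the decision to skip blank lines are all assumptions for illustration and not the project's actual code. It rejects lines without = and copies substrings instead of erasing in place, so no iterator is reused after a modification.

```cpp
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical parser sketch; names and error type are assumptions.
std::map<std::string, std::string> parse_config(std::istream& in) {
    std::map<std::string, std::string> values;
    std::string line;
    std::size_t line_number = 0;
    while (std::getline(in, line)) {
        ++line_number;
        if (line.empty()) {
            continue;  // Assumption: blank lines are allowed and ignored.
        }
        const std::size_t pos = line.find('=');
        // A missing '=' or an empty key marks the file as malformed.
        if (pos == std::string::npos || pos == 0) {
            throw std::runtime_error("config line " + std::to_string(line_number) +
                                     ": expected key=value");
        }
        // substr() copies, so the original line is never modified.
        values[line.substr(0, pos)] = line.substr(pos + 1);
    }
    return values;
}
```

With this sketch, a file containing only the line hapd_sockdir raises an error instead of silently mapping the key to itself.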

Automatically detect hostapd sockets

Currently, we have a configuration option that allows us to set the hostapd sockets to connect to.
However, for easier basic setup, it would be nice to have the option to automatically detect all sockets and load them as configuration.

An implementation suggestion:

  • When a list of sockets is provided, that list is used
  • When no list is provided, we automatically detect
  • For automatic detection, we look at which files exist in the hapd_sockdir directory
  • We exclude all files that are called global or have a file extension (obsolete, see next point)
  • Is there a better way to detect if a file there is a (hostapd) socket? Yes, we can check that it is a socket using the fs library of C++
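
The detection step suggested above can be sketched with std::filesystem as follows. The function name discover_sockets is hypothetical, and skipping the entry named global follows the suggestion in this issue rather than any confirmed implementation.

```cpp
#include <algorithm>
#include <filesystem>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: returns the names of all Unix sockets in sockdir,
// skipping hostapd's global control socket.
std::vector<std::string> discover_sockets(const fs::path& sockdir) {
    std::vector<std::string> names;
    std::error_code ec;
    // With an error_code, a missing or unreadable directory yields an empty
    // range instead of throwing.
    for (const fs::directory_entry& entry : fs::directory_iterator(sockdir, ec)) {
        std::error_code entry_ec;
        if (!entry.is_socket(entry_ec) || entry_ec) {
            continue;  // Regular files, directories and the like are ignored.
        }
        const std::string name = entry.path().filename().string();
        if (name != "global") {
            names.push_back(name);
        }
    }
    std::sort(names.begin(), names.end());  // Deterministic order.
    return names;
}
```

An explicit sockets list, when configured, would simply bypass this function.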

Handle Ping-Pong for HostAPD better

When we do not receive a message from hostapd for some time, we send a PING command and expect a PONG.
However, hostapd sometimes sends us other events before the PONG. In that case, the daemon currently crashes even though it shouldn't.

  • WiMoVE does not crash when there is something other than PONG received
  • The other received event is properly parsed and used by the system

RAM usage seems to greatly increase over time, might result in crashes

Some days ago, on a weekend, I observed that wimoved caused the RAM usage on one of the APs to increase.

Today, I also observed that the RAM usage on one AP rose to 90% and then dropped back to 20% after I connected. When looking into the logs, I found that wimoved had crashed.
Before that, however, the loop seems to have worked as expected.

As a first step, we might want to check whether we leak memory somewhere causing these issues. It likely will not be very easy to recreate this scenario.

Test hostapd event parsing

The hostapd event parsing code is delicate since there is no specified protocol, only the hostapd implementation. We should document our assumptions about the message format as tests and verify automatically that the code parses the events correctly.

Additionally, we could introduce fuzz testing against our code to check its resistance against invalid input.
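
As a starting point, the assumptions could be written down as a small parser with tests around it. The sketch below assumes messages of the form "<3>AP-STA-CONNECTED 00:11:22:33:44:55", that is, an optional priority prefix in angle brackets, an event name, and a MAC address. The type StationEvent and the function parse_event are hypothetical names, and only two event types are covered.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

struct StationEvent {
    enum class Kind { Connected, Disconnected };
    Kind kind;
    std::string mac;
};

// Checks the xx:xx:xx:xx:xx:xx shape: 17 characters, colons at every third
// position, hex digits everywhere else.
bool looks_like_mac(std::string_view text) {
    if (text.size() != 17) return false;
    for (std::size_t i = 0; i < text.size(); ++i) {
        const bool separator = (i % 3 == 2);
        const bool ok = separator
            ? text[i] == ':'
            : std::isxdigit(static_cast<unsigned char>(text[i])) != 0;
        if (!ok) return false;
    }
    return true;
}

// Returns std::nullopt for anything that does not match the assumed format.
std::optional<StationEvent> parse_event(std::string_view message) {
    // Strip an optional "<N>" priority prefix.
    if (!message.empty() && message.front() == '<') {
        const std::size_t close = message.find('>');
        if (close == std::string_view::npos) return std::nullopt;
        message.remove_prefix(close + 1);
    }
    constexpr std::string_view connected = "AP-STA-CONNECTED ";
    constexpr std::string_view disconnected = "AP-STA-DISCONNECTED ";
    StationEvent::Kind kind;
    if (message.substr(0, connected.size()) == connected) {
        kind = StationEvent::Kind::Connected;
        message.remove_prefix(connected.size());
    } else if (message.substr(0, disconnected.size()) == disconnected) {
        kind = StationEvent::Kind::Disconnected;
        message.remove_prefix(disconnected.size());
    } else {
        return std::nullopt;
    }
    // Fields after the MAC address, if any, are ignored in this sketch.
    const std::string_view mac = message.substr(0, 17);
    if (!looks_like_mac(mac)) return std::nullopt;
    return StationEvent{kind, std::string(mac)};
}
```

Because parse_event never throws and rejects malformed input with std::nullopt, it would also be a convenient entry point for a fuzzing harness.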

Improve packaging for OpenWRT

  • Install wimoved as a service to OpenWRT when installing the package
  • Create config directory and place default config there

Improve logging structure

Right now, we have structured logging with easylogging++, but our usage is inconsistent. We should do the following:

  • Create a consistent format for log message (consistent capitalization, message content)
  • Create a policy of which attached information should be logged, preferably in key-value pairs such as mac=00:00:00:00:00:00 vni=5 iface=vxlan5
  • Enforce format everywhere
  • Remove redundant logging statements
  • Review logging priorities

For example, we could do: cleanup: deleting interface iface=vxlan5
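
A small helper can enforce this format in one place. The following is a sketch only: format_log and its parameters are hypothetical names, and the exact layout (component, colon, message, then space-separated key=value pairs) is taken from the example line above.

```cpp
#include <initializer_list>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical formatter: "<component>: <message> key=value key=value ...".
std::string format_log(const std::string& component, const std::string& message,
                       std::initializer_list<std::pair<std::string, std::string>> fields) {
    std::ostringstream out;
    out << component << ": " << message;
    for (const auto& [key, value] : fields) {
        out << ' ' << key << '=' << value;
    }
    return out.str();
}
```

For instance, format_log("cleanup", "deleting interface", {{"iface", "vxlan5"}}) yields the line shown above, and the result could then be passed to the existing easylogging++ macros.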

Deauth station when VXLAN connection fails

In some rare cases, something might go wrong when connecting a station to a VXLAN interface. In such a case, we want hostapd to disassociate the station from the AP so that we get a clean start and the station can attempt to reconnect.

Basic documentation

For some more project information we need the following:

  • README with setup and build instructions
  • Example config files for FRR, WiMoVE etc.

Improve naming of Socket and Socket80211

Currently, the naming of Socket and Socket80211 does not make the differences between these classes clear. We could rename Socket to something that differentiates itself from Socket80211.

Issues starting wimoved

While starting wimoved in OpenWRT 22.03, the following issue is seen. Could you provide some assistance?

wimoved[3265]: [ERROR] 2023-06-07 13:47:43,012 main.cpp:33 - An Exception was thrown that was not caught.
wimoved[3265]: [FATAL] 2023-06-07 13:47:43,013 main.cpp:37 - 14CivetException: null context when constructing CivetServer. Possible problem binding to port.
[WARNING] 2023-06-07 13:47:43,013 main.cpp:37 - Aborting application. Reason: Fatal log at [wimoved/src/main.cpp:37]
Aborted

This issue is reproducible with version v0.0.2 (the latest release) and with the development build v0.0.3+beta2023-04-26_12.56. Additionally, I reproduced the error on the Linksys models 1200ac and 1900ac.

Link `prometheus-cpp` dynamically and make it optional

Right now, we link prometheus-cpp statically for our OpenWRT builds. It would be better to provide prometheus-cpp in a different package and link dynamically to it. I don't know if we can achieve this or how we can achieve it.

Additionally, we should think about making this functionality optional at compile time to save on disk space.

Unify RAII classes for Netlink

Currently, we have three different classes, Vxlan, Bridge and Link that all serve the same purpose: freeing the internal rtnl_link * when destroyed.

We could use one class, Link instead that receives a rtnl_link * as a constructor argument and takes care of freeing the link on destruction.
This would also have the advantage of making the Netlink code more readable since all libnl calls except the frees are visible.
We could also use this pattern for the LinkCache.
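
The idea can be sketched with std::unique_ptr and a custom deleter. To keep the example compilable without libnl, it uses a stand-in type FakeLink with the hypothetical functions fake_link_alloc and fake_link_put. With libnl, the alias would presumably become Owned<rtnl_link, rtnl_link_put>, fed by whichever allocation call created the link.

```cpp
#include <memory>

// Deleter that calls a plain C-style free function.
template <typename T, void (*Free)(T*)>
struct FreeWith {
    void operator()(T* ptr) const noexcept { Free(ptr); }
};

// Owning handle: Free(ptr) runs exactly once when the handle is destroyed.
template <typename T, void (*Free)(T*)>
using Owned = std::unique_ptr<T, FreeWith<T, Free>>;

// Stand-ins for rtnl_link and its free function (assumptions, so that the
// sketch does not depend on libnl).
struct FakeLink {
    int ifindex;
};
int freed_links = 0;
FakeLink* fake_link_alloc() { return new FakeLink{0}; }
void fake_link_put(FakeLink* link) {
    delete link;
    ++freed_links;
}

// One class for all link kinds; the allocation stays visible at the call site.
using Link = Owned<FakeLink, fake_link_put>;
```

A caller would write Link link(fake_link_alloc()); and pass link.get() to the C API. The same alias template could cover the LinkCache with its own free function.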

MAX_VNI is non-inclusive

The MAX_VNI setting is non-inclusive even though we expect it to be. Thus, there is an off-by-one error in the VNI calculation.

For now, the issue can be solved by simply increasing MAX_VNI by one.

Improve VLAN parsing

Do proper error handling, i.e. ignore events when VLAN ID outside of correct range
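
A minimal sketch of such a check is shown below. It assumes that the "correct range" is the set of usable 802.1Q VLAN IDs, 1 to 4094, and that the ID arrives as a decimal string; parse_vlan_id is a hypothetical name.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Returns the VLAN ID, or std::nullopt when the text is not a plain decimal
// number or lies outside 1..4094 (assumed valid range).
std::optional<int> parse_vlan_id(std::string_view text) {
    if (text.empty()) return std::nullopt;
    int value = 0;
    const char* first = text.data();
    const char* last = first + text.size();
    const auto [ptr, ec] = std::from_chars(first, last, value);
    // Reject parse errors, overflow, and trailing characters such as "12x".
    if (ec != std::errc{} || ptr != last) return std::nullopt;
    if (value < 1 || value > 4094) return std::nullopt;
    return value;
}
```

An event whose VLAN ID yields std::nullopt would then be logged and ignored instead of being processed.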
