FedLive: A Federated Transmission Framework for Panoramic Livecast with Reinforced Variational Inference

1. Abstract

Providing premium panoramic livecast services for worldwide viewers, despite their ultra-high data rate and delay-sensitive demands, is still a significant challenge for current delivery systems. It is therefore imperative to explore an efficient way of improving the quality of experience while conserving bandwidth resources for panoramic livecast. This paper expands the design space of both Field of View (FoV) prediction and multi-viewer 360° live streaming by presenting a novel cost-efficient federated transmission framework called FedLive. We first propose a gradient-based clustering method that groups geo-distributed viewers with similar viewing behavior into content delivery alliances by exploiting geometric properties of the gradient loss. With the viewers' resources integrated, a Reinforced Variational Inference (RVI) structure-based approach is proposed to assist in the collaborative training of the FoV prediction model while accelerating the delivery of multiple multi-rate tiles. We further design a prediction-based asynchronous delivery algorithm, in which both high-accuracy FoV prediction and efficient live 360° video transmission are achieved in a decentralized manner. Finally, we use the synchronized algorithm as a benchmark to evaluate the performance of our solution on a real-world dataset. Additionally, prototype-based experimental results reveal that our approach provides the highest prediction accuracy, reduces delay, and saves bandwidth compared with state-of-the-art solutions.

2. Framework

The figure above shows the diagram of FedLive, the proposed federated transmission framework for PLS. FedLive involves multiple types of nodes: content providers, CDN servers, and viewers with HMDs. It has two major phases, a distributed learning phase and a transmission phase, spread across three blocks (viewer side, CDN server side, and content provider side). This repository covers the distributed learning phase.

In the distributed learning phase, each viewer trains a local FoV prediction model whose inputs are its local viewing records and the prefetching records provided by the FoV prediction model. The loss information it outputs is used for local backward propagation and is also captured by the nearby CDN servers as the input to the user clustering algorithm. Specifically, the loss matrix is calculated by adding the predicted FoV binary matrix to the ground truth binary matrix. Once the CDN servers have collected the loss information from all viewers, they invoke the gradient-based user clustering algorithm to divide the viewers into multiple viewer clusters. Meanwhile, the CDN servers continuously summarize the loss value for each cluster and send it to the content provider, along with the clustering results, as the input for unified model training. With the loss information provided by the CDN servers, the content provider updates the unified model for each cluster using the weighted average loss value. This process extends the concept of federated learning by adding clustering. The unified models are then distributed to viewers as part of the live stream, and the prefetching priority of different tiles is determined by the predictions of the unified models.
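The loss matrix and the per-cluster weighted average described above can be illustrated with a minimal sketch. It is not the repository's implementation: the function names, the use of the mismatch fraction as a scalar loss, and the per-viewer weights are all assumptions made for illustration, and the cluster assignments are taken as given rather than computed by the gradient-based clustering algorithm.

```python
from collections import defaultdict
from typing import Dict, Hashable, List

Matrix = List[List[int]]


def loss_matrix(predicted: Matrix, ground_truth: Matrix) -> Matrix:
    """Element-wise sum of two binary tile matrices.

    2 = tile predicted and actually viewed,
    1 = the two matrices disagree on this tile,
    0 = tile neither predicted nor viewed.
    """
    return [
        [p + g for p, g in zip(pred_row, gt_row)]
        for pred_row, gt_row in zip(predicted, ground_truth)
    ]


def mismatch_rate(loss: Matrix) -> float:
    """Hypothetical scalar loss: fraction of tiles where the matrices disagree."""
    cells = [cell for row in loss for cell in row]
    return sum(1 for cell in cells if cell == 1) / len(cells)


def cluster_weighted_loss(
    losses: Dict[Hashable, float],
    clusters: Dict[Hashable, int],
    weights: Dict[Hashable, float],
) -> Dict[int, float]:
    """Weighted average loss per cluster (the weights are an assumption,
    e.g. the number of viewing records each viewer contributed)."""
    totals: Dict[int, float] = defaultdict(float)
    weight_sums: Dict[int, float] = defaultdict(float)
    for viewer, loss in losses.items():
        cluster = clusters[viewer]
        totals[cluster] += weights[viewer] * loss
        weight_sums[cluster] += weights[viewer]
    return {cluster: totals[cluster] / weight_sums[cluster] for cluster in totals}


if __name__ == "__main__":
    loss = loss_matrix([[1, 1, 0], [0, 0, 0]], [[0, 1, 1], [0, 0, 0]])
    print(loss)  # [[1, 2, 1], [0, 0, 0]]
    print(cluster_weighted_loss(
        {"a": 0.2, "b": 0.4, "c": 0.5},
        {"a": 0, "b": 0, "c": 1},
        {"a": 1, "b": 3, "c": 2},
    ))
```

In this sketch, viewers "a" and "b" share cluster 0, so their losses are averaged with weights 1 and 3, while viewer "c" alone determines the value for cluster 1.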

3. Installation

3.1 Install dependent packages

FedLive is built with Torch and Gym. Install the following versions:

Torch: 1.10.1+cu113
Python: 3.8
Gym: 0.19.0
OpenCV: 4.5.4.60
Other Python packages: please refer to <requestments.txt>
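To confirm that an environment matches the versions listed above, a small check along the following lines may help. It is only a sketch: the distribution names ("torch", "gym", "opencv-python") are assumptions about how the packages were installed, and `find_mismatches` is a hypothetical helper, not part of FedLive.

```python
from importlib import metadata
from typing import Callable, Dict

# Version pins copied from the list above; the distribution names are assumed.
EXPECTED = {
    "torch": "1.10.1+cu113",
    "gym": "0.19.0",
    "opencv-python": "4.5.4.60",
}


def find_mismatches(
    expected: Dict[str, str],
    lookup: Callable[[str], str] = metadata.version,
) -> Dict[str, str]:
    """Return {package: found_version} for every package that is missing
    or whose installed version differs from the expected one."""
    problems: Dict[str, str] = {}
    for name, wanted in expected.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = "not installed"
        if found != wanted:
            problems[name] = found
    return problems
```

Calling `find_mismatches(EXPECTED)` returns an empty dictionary when all three packages are installed at the listed versions.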

3.2 Download FedLive

First, clone the FedLive repository:

git clone https://gitee.com/uglyghost123/FedLive.git

4. How To Use

4.1 Folders

./game/: Files for RL agent and gym environment. It includes:
- game/agent.py: RL agent
- game/grid_video_world.py: the custom environment for 360-degree video FoV prediction
./log/: Files for the experimental log.
./nn_model/: Files for different RL policies.
- nn_model/CNN.py: A simple convolutional neural network for saliency detection.
- nn_model/AC.py: Actor-Critic
- nn_model/DDPG.py: Deep Deterministic Policy Gradient
- nn_model/DDQN.py: Double Deep Q-learning
- nn_model/DQN.py: Deep Q-Learning
- nn_model/PPO.py: Proximal Policy Optimization
- nn_model/RVI.py: Reinforced Variational Inference
- nn_model/SAC.py: Soft Actor-Critic
- nn_model/TD3.py: Twin Delayed DDPG
./pic/: Some pictures of the experiment results.
./save_model/: Saved RL models for different user clusters.
./utils/: Some data processing scripts.
./main.py: Main function.
./arguments.py: Argument configuration. For example, --policy SAC selects the Soft Actor-Critic policy.
./get_frame.py: Get the video frames.

4.2 Prepare the datasets

./Saliency: 360-degree saliency dataset. link
./Videos: Panoramic videos. link
./frames: Frames extracted from the panoramic videos. (Note: configure the path first!)

python get_frames.py

./VRdataset: A head tracking dataset composed of 48 users (24 males and 24 females) watching 18 spherical videos from 5 categories. link

4.3 Run with SAC policy

Set the parameters and file paths before running the code.
Then start training with the SAC policy:

python main.py --policy SAC
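How a `--policy` flag like the one above might be parsed can be sketched with `argparse`. This is not the content of arguments.py: only the `--policy` flag and the value SAC appear in this README. The list of accepted values (taken from the file names under ./nn_model/), the default, and the helper names are assumptions.

```python
import argparse
from typing import List, Optional

# Assumed policy names, mirroring the files listed under ./nn_model/
# (CNN.py is left out because it is described as a saliency network).
POLICIES = ["AC", "DDPG", "DDQN", "DQN", "PPO", "RVI", "SAC", "TD3"]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing a single --policy option."""
    parser = argparse.ArgumentParser(description="FoV prediction training (sketch)")
    parser.add_argument(
        "--policy",
        choices=POLICIES,
        default="SAC",  # assumed default, not confirmed by the repository
        help="RL policy used to train the FoV prediction model",
    )
    return parser


def parse_policy(argv: Optional[List[str]] = None) -> str:
    """Return the selected policy name; argv=None reads sys.argv."""
    return build_parser().parse_args(argv).policy
```

With this sketch, `parse_policy(["--policy", "TD3"])` returns "TD3", and a value outside the list makes `argparse` exit with a usage error.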

5. Selected Results

5.1 Viewing experience

Results obtained with the custom environment "grid_video_world.py":

(RVI-SAC)	(RVI-AC)	(Ground truth)

(RVI-A3C)	(RVI-DDPG)	(RVI-TD3)

(RVI-DDQN)	(RVI-PPO)	(RVI-DQN)

5.2 Accuracy, precision, and recall

We further evaluate our solution in terms of prediction accuracy, precision, and recall, and compare it, in an asynchronous manner, with three state-of-the-art solutions: LiveDeep, LiveObj, and PanoSalNet.
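For reference, accuracy, precision, and recall can be computed per tile from a predicted FoV binary matrix and a ground truth binary matrix as in the sketch below. It applies the standard definitions; the exact evaluation procedure used for the reported results may differ, and `tile_metrics` is a hypothetical helper name.

```python
from typing import Dict, List

Matrix = List[List[int]]


def tile_metrics(predicted: Matrix, ground_truth: Matrix) -> Dict[str, float]:
    """Standard accuracy, precision, and recall over all tiles.

    A tile counts as positive when its value is 1 (inside the FoV).
    """
    tp = fp = fn = tn = 0
    for pred_row, gt_row in zip(predicted, ground_truth):
        for p, g in zip(pred_row, gt_row):
            if p and g:
                tp += 1  # predicted and viewed
            elif p:
                fp += 1  # predicted but not viewed
            elif g:
                fn += 1  # viewed but not predicted
            else:
                tn += 1  # neither predicted nor viewed
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        # Guard against division by zero when no tile is predicted or viewed.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


if __name__ == "__main__":
    predicted = [[1, 1, 0], [0, 0, 0]]
    ground_truth = [[0, 1, 1], [0, 0, 0]]
    print(tile_metrics(predicted, ground_truth))
```

In the example, one of the two predicted tiles is actually viewed and one of the two viewed tiles is predicted, so precision and recall are both 0.5, while four of the six tiles are classified correctly.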

5.3 The objective function (as QoE performance)

6. Contributors

7. Citation

Contact

Xingyan Chen ([email protected]), Southwestern University of Finance and Economics
