
Spectral embedding of dynamic multiplex networks

This repository contains a Python library supporting the work on spectral embedding of dynamic multiplex graphs by Maximilian Baum, Francesco Sanna Passino, and Axel Gandy (Department of Mathematics, Imperial College London).

The library dmprdpg can be installed in editable mode as follows:

pip3 install -e lib/

The library can then be imported in any Python session:

import dmprdpg

A demo on how to use the library can be found in notebooks/test_library.ipynb.

In this work, we explore a spectral embedding method for dynamic graphs with multiple layers. The base model is a network in which a shared set of nodes exhibits connections across a number of different layers, and this multiplex network is observed at a fixed number of points in time. We extend the theory of Unfolded Adjacency Spectral Embedding (UASE) to the dynamic case and plan to provide stability guarantees as well as a central limit theorem.

Background and Notation

We consider an undirected network with $K$ layers observed at $T$ points in time, for $K, T \in \mathbb{N}$. This network can be encoded in a collection of adjacency matrices $\textbf{A} = \lbrace\textbf{A}_{k,t}\rbrace$ where $k = 1,\dots, K$ and $t = 1,\dots, T$. For our model, we adopt the latent position framework, in which the connection probabilities between nodes are determined by each node's latent position in an underlying $d$-dimensional embedding space. Specifically, each node is represented by a position in two different embedding spaces $\mathcal{X} \subset \mathbb{R}^d$ and $\mathcal{Y} \subset \mathbb{R}^d$, where the positions $\textbf{X}^{k}_i \in \mathcal{X}$ are shared across time but differ across layers, and the positions $\textbf{Y}^{t}_j \in \mathcal{Y}$ are shared across layers but vary over time. The connection probability between nodes $i$ and $j$ at time $t$ in layer $k$ is given by the inner product of these positions. We can therefore express the adjacency matrices probabilistically as

$$\textbf{A}_{k,t, i,j} \sim \mathrm{Bernoulli}\left(\textbf{X}^{k \intercal}_i \textbf{Y}^{t}_j\right).$$
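As a rough illustration of this sampling model (not the library's API), the generative step can be sketched in NumPy. The latent positions here are arbitrary placeholders scaled so that all inner products are valid probabilities, and for simplicity the entries are sampled independently rather than symmetrised:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, K, T = 100, 2, 3, 4

# Placeholder latent positions: X[k] is n x d (layer-specific, shared over time),
# Y[t] is n x d (time-specific, shared over layers). The uniform(0, 0.7) range
# keeps every inner product X[k][i] . Y[t][j] below 1.
X = rng.uniform(0, 0.7, size=(K, n, d))
Y = rng.uniform(0, 0.7, size=(T, n, d))

# Sample each adjacency matrix A[k, t] entrywise from Bernoulli(X[k] @ Y[t].T).
A = np.zeros((K, T, n, n), dtype=int)
for k in range(K):
    for t in range(T):
        P = X[k] @ Y[t].T  # n x n matrix of connection probabilities
        A[k, t] = (rng.uniform(size=(n, n)) < P).astype(int)
```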

Methodology

Given a set of adjacency matrices $\textbf{A}_{k,t} \in \lbrace 0,1\rbrace^{n \times n}$, we define the doubly unfolded adjacency matrix $\textbf{A} \in \lbrace 0,1\rbrace ^{nK \times nT}$ as:

$$\textbf{A} = \begin{bmatrix} \textbf{A}_{1,1} & \dots & \textbf{A}_{1,T} \\\ \vdots & \ddots & \vdots \\\ \textbf{A}_{K,1} & \dots & \textbf{A}_{K,T} \end{bmatrix}.$$
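The unfolding above can be formed with `numpy.block`; the helper name `unfold` is illustrative, not part of the library:

```python
import numpy as np

def unfold(A):
    """Doubly unfold a (K, T, n, n) array of adjacency matrices into the
    nK x nT block matrix with A[k, t] as its (k, t) block."""
    K, T, n, _ = A.shape
    return np.block([[A[k, t] for t in range(T)] for k in range(K)])
```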

In order to estimate the latent positions $\textbf{X}$ and $\textbf{Y}$, we propose Doubly Unfolded Adjacency Spectral Embedding (DUASE) for dynamic multiplex graphs. Given the realized adjacency matrices $\textbf{A}_{k,t}$, $k = 1,\dots, K$, $t = 1,\dots, T$, we use a truncated SVD of rank $d$ to obtain a low-rank approximation of the doubly unfolded matrix, $\textbf{A} \approx \textbf{UDV}^{\intercal}$, where $\textbf{D}$ contains the top $d$ singular values on its diagonal and $\textbf{U}$ and $\textbf{V}$ contain the corresponding singular vectors. The estimated latent positions for each node are recovered as

$$\hat{\textbf{X}} = \textbf{UD}^{1/2}, \hspace{1cm} \hat{\textbf{Y}} = \textbf{VD}^{1/2}.$$

We then retrieve the layer-specific and time-specific estimates $\hat{\textbf{X}}^k$ and $\hat{\textbf{Y}}^t$ by unstacking $\hat{\textbf{X}}$ and $\hat{\textbf{Y}}$ into $n \times d$ blocks.
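The estimation steps above can be sketched as follows. This is a minimal illustration of the truncated-SVD recipe, not the dmprdpg implementation, and the function name `duase` is assumed:

```python
import numpy as np

def duase(A_unfolded, d):
    """Rank-d truncated SVD of the doubly unfolded adjacency matrix,
    returning Xhat = U D^{1/2} (nK x d) and Yhat = V D^{1/2} (nT x d)."""
    U, s, Vt = np.linalg.svd(A_unfolded, full_matrices=False)
    U, s, Vt = U[:, :d], s[:d], Vt[:d]
    Xhat = U * np.sqrt(s)      # scale each left singular vector by sqrt(s_i)
    Yhat = Vt.T * np.sqrt(s)   # scale each right singular vector by sqrt(s_i)
    return Xhat, Yhat
```

Since the rows of $\hat{\textbf{X}}$ are stacked layer by layer, `Xhat.reshape(K, n, d)` recovers the per-layer estimates $\hat{\textbf{X}}^k$, and likewise `Yhat.reshape(T, n, d)` recovers $\hat{\textbf{Y}}^t$.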

Simulation Results

We have run a number of computer simulations to empirically test the properties of our embedding technique in the simplified case of a stochastic block model. This represents the setting where each node is assigned to one of a fixed number of communities, and all nodes in a community share a latent position. In this case, our model has $1000$ nodes, equally distributed among four communities, with latent positions defined such that the $(i,j)^{th}$ entry of $B^{(k)}_t$ is the connection probability between a node in community $i$ and a node in community $j$ at time $t$ in layer $k$.

$$B^{(1)}_1 = \begin{bmatrix} 0.08 & 0.02 & 0.18 & 0.10\\\ 0.02 & 0.20 & 0.04 & 0.10\\\ 0.18 & 0.04 & 0.02 & 0.02\\\ 0.10 & 0.10 & 0.02 & 0.06 \end{bmatrix}, % B^{(1)}_2 = \begin{bmatrix} 0.16 & 0.16 & 0.04 & 0.10\\\ 0.16 & 0.16 & 0.04 & 0.10\\\ 0.04 & 0.04 & 0.09 & 0.02\\\ 0.10 & 0.10 & 0.02 & 0.06 \end{bmatrix}, % B^{(1)}_3 = \begin{bmatrix} 0.08 & 0.02 & 0.18 & 0.10\\\ 0.02 & 0.20 & 0.04 & 0.10\\\ 0.18 & 0.04 & 0.02 & 0.02\\\ 0.10 & 0.10 & 0.02 & 0.06 \end{bmatrix}$$ $$B^{(2)}_1 = \begin{bmatrix} 0.08 & 0.02 & 0.18 & 0.10\\\ 0.02 & 0.20 & 0.04 & 0.10\\\ 0.18 & 0.04 & 0.02 & 0.02\\\ 0.10 & 0.10 & 0.02 & 0.06 \end{bmatrix}, % B^{(2)}_2 = \begin{bmatrix} 0.16 & 0.16 & 0.04 & 0.10\\\ 0.16 & 0.16 & 0.04 & 0.10\\\ 0.04 & 0.04 & 0.09 & 0.02\\\ 0.10 & 0.10 & 0.02 & 0.06 \end{bmatrix}, % B^{(2)}_3 = \begin{bmatrix} 0.08 & 0.02 & 0.18 & 0.10\\\ 0.02 & 0.20 & 0.04 & 0.10\\\ 0.18 & 0.04 & 0.02 & 0.02\\\ 0.10 & 0.10 & 0.02 & 0.06 \end{bmatrix}$$ $$B^{(3)}_1 = \begin{bmatrix} 0.08 & 0.08 & 0.08 & 0.08\\\ 0.08 & 0.08 & 0.08 & 0.08\\\ 0.08 & 0.08 & 0.08 & 0.08\\\ 0.08 & 0.08 & 0.08 & 0.08 \end{bmatrix}, % B^{(3)}_2 = \begin{bmatrix} 0.08 & 0.08 & 0.08 & 0.08\\\ 0.08 & 0.08 & 0.08 & 0.08\\\ 0.08 & 0.08 & 0.08 & 0.08\\\ 0.08 & 0.08 & 0.08 & 0.08 \end{bmatrix}, % B^{(3)}_3 = \begin{bmatrix} 0.08 & 0.08 & 0.08 & 0.08\\\ 0.08 & 0.08 & 0.08 & 0.08\\\ 0.08 & 0.08 & 0.08 & 0.08\\\ 0.08 & 0.08 & 0.08 & 0.08 \end{bmatrix}$$
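A single layer-time slice of this stochastic block model can be simulated as follows; this sketch uses $B^{(1)}_1$ from the display above and samples a symmetric adjacency matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
# Community labels: nodes split equally among four communities.
z = np.repeat(np.arange(4), n // 4)

# Block-probability matrix B^{(1)}_1 from the display above.
B = np.array([[0.08, 0.02, 0.18, 0.10],
              [0.02, 0.20, 0.04, 0.10],
              [0.18, 0.04, 0.02, 0.02],
              [0.10, 0.10, 0.02, 0.06]])

# Entrywise connection probabilities P[i, j] = B[z[i], z[j]],
# sampled on the upper triangle and mirrored for an undirected graph.
P = B[z][:, z]
upper = np.triu(rng.uniform(size=(n, n)) < P, k=1)
A = (upper | upper.T).astype(int)
```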

In the figures below, the true latent position for each node is one of four points denoted with an orange $\times$ and the estimated latent positions for each community are plotted by color. The mean embedding for each group is denoted with a red dot. These simulations show that our estimates for each group are generally clustered around the true latent positions and suggest that the desired stability properties are likely to apply.

The errors of our latent position estimators also appear to be normally distributed, and a comparison of estimator variance across networks of different sizes is roughly consistent with $1/n$ scaling. These results are consistent with the central limit theorem we aim to prove.

Contributors

mjbaum, fraspass
