simplifyNet

Package for network sparsification.

Description

An R package for network sparsification that implements a variety of novel and known techniques. All of these methods reduce the number of edges, not the number of nodes. A network is usually a large, complex weighted graph obtained from real-world data, commonly stored as an adjacency matrix or an edge list. Network sparsification is sometimes referred to as network dimensionality reduction.
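
As an illustration of the two storage formats, the following base R sketch converts a small undirected weighted network between an adjacency matrix and an edge list with columns n1, n2, and weight. The example network is made up, and the conversion shown here is not taken from the package.

```r
# Small undirected weighted network as a symmetric adjacency matrix.
Adj <- matrix(0, nrow = 4, ncol = 4)
Adj[1, 2] <- Adj[2, 1] <- 0.9
Adj[2, 3] <- Adj[3, 2] <- 0.4
Adj[3, 4] <- Adj[4, 3] <- 0.1

# Adjacency matrix -> edge list. Only the upper triangle is read,
# so each undirected edge appears once.
idx <- which(upper.tri(Adj) & Adj != 0, arr.ind = TRUE)
E_List <- cbind(n1 = idx[, "row"], n2 = idx[, "col"], weight = Adj[idx])

# Edge list -> adjacency matrix. Fill one triangle, then symmetrize.
Adj2 <- matrix(0, nrow = 4, ncol = 4)
Adj2[E_List[, c("n1", "n2")]] <- E_List[, "weight"]
Adj2 <- Adj2 + t(Adj2)
```

Sparsification removes rows from the edge list (or zeroes entries of the matrix) while the node count, here 4, stays the same.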

Getting Started

Install and load devtools package:

install.packages("devtools")
library(devtools)

Use install_github function to pull and install simplifyNet in your session:

install_github("kramera3/simplifyNet")

Prerequisites

The following packages are required:

igraph, sanic, Matrix, tidyr, methods, fields, stats, dplyr

Also set up the working directory:

setwd("working directory")

simplifyNet

simplifyNet is an R package for network sparsification. It contains a suite of network sparsification algorithms, each of which outputs a sparsified network.

Global Network Sparsification:

Removes all edges whose weight falls below a threshold cutoff, or removes a given proportion of the lowest-weight edges.

gns(E_List, remove.prop, cutoff)

Arguments

  • E_List: An edge list formatted | n1 | n2 | weight |.
  • remove.prop: The proportion of highest weighted edges to retain. A value between 0 and 1.
  • cutoff: Threshold value for edge weight thresholding.
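
The idea behind global thresholding can be sketched in a few lines of base R. This is a minimal illustration, not the package's gns() implementation: the function name threshold_edges and its keep.prop argument are hypothetical, and keep.prop is named explicitly as the proportion of highest-weight edges kept.

```r
# Hypothetical helper (not the package's gns()): global edge thresholding.
# E_List is a matrix with columns n1, n2, weight.
threshold_edges <- function(E_List, cutoff = NULL, keep.prop = NULL) {
  w <- E_List[, 3]
  if (!is.null(cutoff)) {
    # Keep every edge whose weight is at least the cutoff.
    keep <- w >= cutoff
  } else {
    # Keep the ceiling(keep.prop * m) heaviest of the m edges.
    n_keep <- ceiling(keep.prop * length(w))
    keep <- rank(-w, ties.method = "first") <= n_keep
  }
  E_List[keep, , drop = FALSE]
}

E_List <- cbind(n1 = c(1, 1, 2, 3, 4),
                n2 = c(2, 3, 3, 4, 5),
                weight = c(0.9, 0.1, 0.5, 0.3, 0.7))

by_cutoff <- threshold_edges(E_List, cutoff = 0.5)     # weights 0.9, 0.5, 0.7
by_prop   <- threshold_edges(E_List, keep.prop = 0.4)  # weights 0.9, 0.7
```

Because a single threshold is applied to the whole network, regions with uniformly low weights can lose all of their edges, which is the motivation for local methods such as LANS.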

LANS:

Local Adaptive Network Sparsification from the paper by Foti et al.

lans(Adj, remove.prop, output)

Arguments

  • Adj: Weighted adjacency matrix of the network.
  • remove.prop: Alpha value threshold to designate statistically "unimportant" edges by edge weight.
  • output: Designates if the output should be directed or undirected. Default is that the output is the same as the input based on adjacency matrix symmetry. If the default value is to be overridden, set as either "undirected" or "directed".
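
The following is a rough base R sketch of the local test described by Foti et al., as we understand it: each node's edge weights are normalized to fractions, and an edge is kept when the share of that node's edges with an equal or smaller fraction is high enough, that is, when 1 - F(p) < alpha for the node's empirical CDF F. The function name lans_sketch is hypothetical, the rule used for undirected output (keep an edge if it passes at either endpoint) is an assumption, and none of this is taken from the package's lans() source.

```r
# Hypothetical sketch of the LANS edge test (not the package's lans()).
lans_sketch <- function(Adj, alpha) {
  n <- nrow(Adj)
  keep <- matrix(FALSE, n, n)
  for (i in seq_len(n)) {
    nbrs <- which(Adj[i, ] > 0)
    if (length(nbrs) == 0) next
    p <- Adj[i, nbrs] / sum(Adj[i, nbrs])  # fractional edge weights at node i
    Fhat <- ecdf(p)(p)                     # empirical CDF evaluated at each p
    keep[i, nbrs] <- (1 - Fhat) < alpha    # locally significant edges
  }
  keep <- keep | t(keep)  # assumption: keep if significant at either endpoint
  Adj * keep
}

Adj <- matrix(0, 4, 4)
Adj[1, 2] <- Adj[2, 1] <- 10
Adj[1, 3] <- Adj[3, 1] <- 1
Adj[1, 4] <- Adj[4, 1] <- 1
Adj[2, 3] <- Adj[3, 2] <- 5

S <- lans_sketch(Adj, alpha = 0.05)
```

In this example the edge between nodes 1 and 3 is dropped because it is not the strongest edge at either endpoint, while the equally weak edge between nodes 1 and 4 survives because it is the only edge of node 4. Each node's strongest edge always passes the test, so no node that had an edge becomes isolated.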

Sparsification by Edge Effective Resistances:

Sparsification by sampling edges proportional to their effective resistances as formulated by Spielman and Srivastava. This requires two discrete steps: (1) approximating the effective resistances for all edges, (2) sampling them according to the method devised by Spielman and Srivastava.

effR = EffR(E_List, epsilon, type="kts", tol)
EffRSparse(n, E_List, q, effR)
  1. EffR, effective resistances calculator.

    • E_List: Edge list formatted | n1 | n2 | weight |.

    • epsilon: Governs the relative fidelity of the approximation methods 'spl' and 'kts'. The smaller the value, the greater the fidelity of the approximation and the greater the space and time requirements. Default value is 0.1.

    • type: There are three methods.

      (1) 'ext' which exactly calculates the effective resistances (WARNING! Not ideal for large graphs).

      (2) 'spl' which approximates the effective resistances of the inputted graph using the original Spielman-Srivastava algorithm.

      (3) 'kts' which approximates the effective resistances of the inputted graph using the implementation by Koutis et al. (ideal for large graphs where memory usage is a concern).

    • tol: Tolerance for the linear algebra (conjugate gradient) solver to find the effective resistances. Default value is 1e-10.

  2. EffRSparse, network sparsification through sampling effective resistances.

    • n: The number of nodes in the network.

    • E_List: Edge list formatted | n1 | n2 | weight |.

    • q: The number of samples taken. As the number of samples increases, fidelity to the original network increases but sparseness decreases.

    • effR: Effective resistances corresponding to each edge. Should be in the same order as "weight".
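
The two steps can be sketched in base R for small graphs. This is an illustration under stated assumptions, not the package's EffR or EffRSparse code: the function names eff_res_exact and sample_by_eff_res are hypothetical, the exact resistances come from a dense eigendecomposition of the graph Laplacian (comparable in spirit to the 'ext' option), and the sampling follows the Spielman-Srivastava scheme of drawing q edges with probability proportional to weight times effective resistance and reweighting each draw by w / (q * p).

```r
# Step 1 (hypothetical helper): exact effective resistances via the
# pseudoinverse of the weighted graph Laplacian. Small graphs only.
eff_res_exact <- function(n, E_List) {
  L <- matrix(0, n, n)
  for (e in seq_len(nrow(E_List))) {
    u <- E_List[e, 1]; v <- E_List[e, 2]; w <- E_List[e, 3]
    L[u, u] <- L[u, u] + w; L[v, v] <- L[v, v] + w
    L[u, v] <- L[u, v] - w; L[v, u] <- L[v, u] - w
  }
  eig <- eigen(L, symmetric = TRUE)
  pos <- eig$values > 1e-9 * max(eig$values)   # drop the zero eigenvalue(s)
  V <- eig$vectors[, pos, drop = FALSE]
  Lp <- V %*% (t(V) / eig$values[pos])         # pseudoinverse of L
  u <- E_List[, 1]; v <- E_List[, 2]
  # R_uv = Lp[u, u] + Lp[v, v] - 2 * Lp[u, v]
  Lp[cbind(u, u)] + Lp[cbind(v, v)] - 2 * Lp[cbind(u, v)]
}

# Step 2 (hypothetical helper): sample q edges with replacement,
# with probability proportional to weight * effective resistance.
sample_by_eff_res <- function(E_List, q, effR) {
  w <- E_List[, 3]
  p <- w * effR / sum(w * effR)
  draws <- sample.int(length(w), q, replace = TRUE, prob = p)
  counts <- tabulate(draws, nbins = length(w))
  new_w <- counts * w / (q * p)                # reweight sampled edges
  out <- cbind(E_List[, 1:2, drop = FALSE], weight = new_w)
  out[counts > 0, , drop = FALSE]              # edges never drawn are removed
}

# Unit-weight triangle: every edge has effective resistance 2/3.
E_List <- cbind(n1 = c(1, 2, 1), n2 = c(2, 3, 3), weight = c(1, 1, 1))
effR <- eff_res_exact(3, E_List)
set.seed(42)
sparse <- sample_by_eff_res(E_List, q = 50, effR = effR)
```

Edges with high effective resistance, such as bridges, are sampled with high probability, which is why this method tends to preserve connectivity better than weight thresholding.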

Authors

  • Dr. Andrew Kramer - Primary Author simplifyNet
  • Alexander Mercier - Package Maintainer
  • Shubhankar Tripathi - Contributor
  • Tomlin Pulliam - Contributor
  • John Drake - Contributor

Method Acknowledgements

License

  • GNU General Public License

simplifynet's Issues

Understanding Line 90

floor takes a numeric argument (here remove.prop times the length of o) and returns the largest integer less than or equal to that argument.
Am I understanding this correctly?

d[o[1:floor(remove.prop*length(o))]] <- 0
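
A small worked example may help confirm the reading. It assumes that o holds the indices of d sorted by increasing value (as order(d) would return), which is a guess from context rather than something stated in the issue.

```r
d <- c(0.5, 0.2, 0.9, 0.1, 0.7)  # example edge weights
o <- order(d)                    # assumed: indices of d, smallest weight first
remove.prop <- 0.5

n_remove <- floor(remove.prop * length(o))  # floor(2.5) is 2
d[o[seq_len(n_remove)]] <- 0                # zero the 2 smallest weights
d
# → 0.5 0.0 0.9 0.0 0.7
```

The sketch uses seq_len(n_remove) instead of 1:n_remove on purpose. When the floor evaluates to 0, 1:0 is the vector c(1, 0), so the original expression would still zero the smallest element, whereas seq_len(0) is empty and leaves d unchanged.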

Starting point

@shubhankar1201 Here is the new repository for our work on the network simplification code. We will work in branches here.

You can start by reviewing this code. I will also share some additional papers.

Dependencies

Change the package so that required dependencies are installed if they are missing from the user's library.
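
One possible approach, sketched below, is a small helper that checks which required packages are missing and installs only those. The function names missing_pkgs and install_missing are hypothetical. The package list is the one given under Prerequisites.

```r
required <- c("igraph", "sanic", "Matrix", "tidyr",
              "methods", "fields", "stats", "dplyr")

# Return the subset of pkgs that cannot be loaded on this machine.
missing_pkgs <- function(pkgs) {
  ok <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!ok]
}

# Install only what is missing (hypothetical helper).
install_missing <- function(pkgs) {
  to_install <- missing_pkgs(pkgs)
  if (length(to_install) > 0) install.packages(to_install)
  invisible(to_install)
}

# Usage: install_missing(required)
```

For an installed package, the more conventional route is to list these packages under Imports in the DESCRIPTION file, so that install.packages() and install_github() resolve them automatically.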

How to make QGen less RAM Costly

The random matrix generator for approximating the effective resistance between all pairs of nodes (QGen) relies on the Johnson–Lindenstrauss lemma, which requires a large, dense random matrix with entries ±1/sqrt(k), where k = 24 log(n)/epsilon^2, n is the number of edges, and epsilon is the degree of error in the approximation.
R stores these matrices in RAM, so networks with many edges and a fine degree of approximation (small epsilon) require large quantities of RAM.

This needs to be changed to disk or other means of storage for larger network analysis.

Or, since all that is really needed from this matrix is matrix multiplication, a way to perform the same operation but entry-wise without the large random matrix could also be done.

Any suggestions are welcome.
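
One way to avoid holding the full random matrix, sketched below, is to generate it a block of rows at a time and multiply each block immediately, so that only block * m entries exist at once instead of k * m. This is a minimal sketch under assumptions: the function name jl_project_blocked, the block argument, and the input M (standing in for whatever matrix QGen's output is multiplied with) are hypothetical and not taken from the package.

```r
# Hypothetical sketch: compute Q %*% M without ever storing all of Q,
# where Q is k x nrow(M) with independent entries +1/sqrt(k) or -1/sqrt(k).
jl_project_blocked <- function(M, k, block = 64, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)   # note: this changes the global RNG state
  m <- nrow(M)
  out <- matrix(0, nrow = k, ncol = ncol(M))
  for (start in seq(1, k, by = block)) {
    rows <- start:min(start + block - 1, k)
    # Generate only this block of rows of Q.
    Qb <- matrix(sample(c(-1, 1), length(rows) * m, replace = TRUE),
                 nrow = length(rows)) / sqrt(k)
    out[rows, ] <- Qb %*% M            # the block is discarded after use
  }
  out
}

# Projecting the identity returns Q itself, which makes the entries easy to check.
Q <- jl_project_blocked(diag(5), k = 10, block = 3, seed = 1)
```

The result still has k rows, so this reduces peak memory for the random matrix but not for the projection itself. Smaller block values trade speed for memory.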

package help

Add documentation so that "?simplifyNet" brings up a brief description of the package. It can be as simple as ?base, but it would be nice if it listed the main function(s) so that users know where to start.
