Giter Club home page Giter Club logo

fas's Introduction

Computing Feedback Arc Sets for Very Large Graphs

This repository contains efficient implementations for computing feedback arc sets on large graphs. The details of the implementations and their optimizations are described in the following paper:

Michael Simpson, Venkatesh Srinivasan, Alex Thomo: Efficient Computation of Feedback Arc Set at Web-Scale. PVLDB 10(3): 133-144 (2016)

Remarks

All algorithms load the graph in compressed form in main memory. Because of the superb compression that WebGraph provides, we can fit in the main memory of a moderate machine a large graph, such as Clueweb 2012, which has more than 42 billion edges (http://law.di.unimi.it/webdata/clueweb12/).

Compiling

Compile as follows:

javac -cp "lib/*" -d bin ./*

Input

The graphs should be in WebGraph format.

There are three files in this format:

basename.graph
basename.properties
basename.offsets

There many available datasets in this format in: http://law.di.unimi.it/datasets.php

Let us see for an example dataset, cnr-2000, in http://law.di.unimi.it/webdata/cnr-2000

There you can see the following files available for download.

cnr-2000.graph
cnr-2000.properties
cnr-2000-t.graph
cnr-2000-t.properties
...
(you can ignore the rest of the files)

The first two files are for the forward (regular) cnr-2000 graph. The other two are for the transpose (inverse) graph. If you only need the forward graph, just download:

cnr-2000.graph
cnr-2000.properties

What's missing is the "offsets" file. This can be easily created by running:

java -cp "lib/*" it.unimi.dsi.webgraph.BVGraph -o -O -L cnr-2000

Edgelist format

This section is for the case when your graph is given a text file of edges (known as edgelist). If your graph is already in WebGraph format, skip to the next section.

It is very easy to convert an edgelist file into WebGraph format. I am making the folloiwng assumptions:

  1. The graph is unlabeled and the vertices are given by consecutive numbers, 0,1,2,...
    (If there are some vertices "missing", e.g. you don't have a vertex 0 in your file, it's not a problem. WebGraph will create dummy vertices, e.g. 0, that does not have any neighbor.)

  2. The edgelist file is TAB separated (not comma separated).

Now, to convert the edgelist file to WebGraph format execute the following steps:

Sort the file, then remove any duplicate edges:

sort -nk 1 edgelistfile | uniq > edgelistsortedfile

(If you are on Windows, download sort.exe and uniq.exe from http://gnuwin32.sourceforge.net/packages/coreutils.htm)

Run:

java -cp "lib/*" it.unimi.dsi.webgraph.BVGraph -1 -g ArcListASCIIGraph dummy basename < edgelistsortedfile

(This will create basename.graph, basename.offsets, basename.properties. The basename can be e.g. simplegraph)

Running

java -cp "bin:lib/*" algorithm_name basename

e.g.
java -cp "bin:lib/*" ArrayFAS simplegraph

(Change : to ; if you are on Windows)

The result will be output to the console.

Contact

For any questions, send email to [email protected]

fas's People

Contributors

stamps avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar

fas's Issues

helped needed for weighted graphs and input format

Dear Michael Simpson,

I am Shuai Wang and I am currently a PhD student of Computer Science in the Netherlands. I have read your paper titled Efficient Computation of Feedback Arc Set at Web-Scale. I have a few questions related to this paper and Iā€™d really appreciate it if you may answer them.

  1. For the probabilistic case of Berger-Shor algorithm, would we have the risk of having cycles after removing the FAS? Since all the arcs have a probabilistic chance of being removed, it is logical to suspect that there might be a case (even with low probability) where all the edges in a cycle are not removed.

  2. I am dealing with weighted graphs rather than probabilistic graphs. I am not an expert on graph theory. Do you think I can develop a weighted case based on your greedy algorithm presented in section 5.1.1? Any advice is more than welcome!

  3. I am not familiar with the Webgraph format. I followed your guidance on your github page for unweighted cases. But Iā€™m not sure how to deal with weighted cases. Could you please give me a bit of guidance?

Thank you very much!
Regards,
Shuai Wang

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    šŸ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. šŸ“ŠšŸ“ˆšŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ā¤ļø Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.