sbromberger / lightgraphs.jl Goto Github PK

An optimized graphs package for the Julia programming language

License: Other

Julia 99.96% TeX 0.04%

julia graph lightgraphs graph-theory graph-generation graph-analytics graph-algorithms

lightgraphs.jl's Introduction

LightGraphs

Project Status: As of 8 October 2021 LightGraphs is no longer under active development. It will remain available on Github at sbromberger/LightGraphs.jl. The JuliaGraphs organization will continue to maintain packages that use LightGraphs and transition development over the long term.

LightGraphs offers both (a) a set of simple, concrete graph implementations -- Graph (for undirected graphs) and DiGraph (for directed graphs), and (b) an API for the development of more sophisticated graph implementations under the AbstractGraph type.

The project goal is to mirror the functionality of robust network and graph analysis libraries such as NetworkX while being simpler to use and more efficient than existing Julian graph libraries such as Graphs.jl. It is an explicit design decision that any data not required for graph manipulation (attributes and other information, for example) is expected to be stored outside of the graph structure itself. Such data lends itself to storage in more traditional and better-optimized mechanisms.

Additional functionality may be found in a number of companion packages, including:

LightGraphsExtras.jl: extra functions for graph analysis.
MetaGraphs.jl: graphs with associated meta-data.
SimpleWeightedGraphs.jl: weighted graphs.
GraphIO.jl: tools for importing and exporting graph objects using common file types like edgelists, GraphML, Pajek NET, and more.

Documentation

Full documentation is available at GitHub Pages. Documentation for methods is also available via the Julia REPL help system. Additional tutorials can be found at JuliaGraphsTutorials.

Installation

Installation is straightforward: enter Pkg mode by hitting ], and then

(v1.0) pkg> add LightGraphs

Supported Versions

LightGraphs master is generally designed to work with the latest stable version of Julia (except during Julia version increments as we transition to the new version).
Julia 0.3: LightGraphs v0.3.7 is the last version guaranteed to work with Julia 0.3.
Julia 0.4: LightGraphs versions in the 0.6 series are designed to work with Julia 0.4.
Julia 0.5: LightGraphs versions in the 0.7 series are designed to work with Julia 0.5.
Julia 0.6: LightGraphs versions in the 0.8 through 0.12 series are designed to work with Julia 0.6.
Julia 0.7 / 1.0: LightGraphs versions in the 1.x series are designed to work with Julia 0.7 and Julia 1.0.
Later versions: Some functionality might not work with prerelease / unstable / nightly versions of Julia. If you run into a problem, please file an issue.

Contributing and Reporting Bugs

We welcome contributions and bug reports! Please see CONTRIBUTING.md for guidance on development and bug reporting.

Citing

We encourage you to cite our work if you have used our libraries, tools or datasets. Starring the repository on GitHub is also appreciated. See the Zenodo badge above or refer to CITATION.bib.

lightgraphs.jl's People

Contributors

Stargazers

Watchers

Forkers

jpfairbanks jiahao gitter-badger luluxu9466 zaccranko yuyichao marcliberatore hayd carlolucibello pranavtbhat mbeltagy tkelman cjamadagni rschwarz abhijithanilkumar arekfu afternone amckay1 michaelphstumpf transgirlcodes azzaare issamt suzhu1988 jgoldfar scottpjones bicycle1885 hronellenfitsch memoiry athulappadan nadesai chrisrackauckas apilastri gmarcais bisraelsen rohitvarkey dsrivastavv wtkranz thuener martinwolke nickeubank cstjean gordstephen mschauer j143-zz okwh tejus-gupta blegat iaciac dellison lelogrott kristofferc djsegal shizejin matbesancon complexsystem myuuuuun mathtd iaravena scheidan sohamtamba mburq fredrikekre simonschoelly jhwaltjim tpapalex tinayan328 slyles1001 cmcaine maximerischard mcognetta bovine3dom vpetukhov kshyatt mjc401 skleinbo gunnarfarneback oxinabox hadirgax juliadocsforks cguyomar gkappler chamanagrawal cjprybol abhinavmehndiratta eqt petershintech jordancluts chethega ethan-lord thomasyang1207 ropenta pranjaliraturi sharathga eulerkochy dahtah saiharsh sinhatushar billyrrr eliasbcd sschuldenzucker

lightgraphs.jl's Issues

edge_dists as a matrix (and misc)

I discovered Julia recently, and was originally put-off by the complicated approach of Graphs.jl. As someone who uses graphs everyday, I can say that the simplicity of the library is a major adoption criterion---look at the success of NetworkX!

I had a couple of questions regarding the use of edges_dists

Why keeping weights out of the graph structure? Does it lead to major performance improvements?
Any particular reasons to have it as a matrix? In particular, if the network is huge but sparse, you will be storing a lot of 0s.

Also, just as an aside, I think Julia will be adopted for graphs analysis if we can get the best of all worlds, that is

A library as simple to use and patch as NetworkX
A library as fast as graph-tool
Easy-parallelism for resource-heavy computations (Betweenness centrality for instance)

I'd like to get involved in some way, what are the current priorities?

Create matvec for integration with GraphMatrices.jl

Tracking for
jpfairbanks/GraphMatrices.jl#1 (comment)

Build benchmarks into the repo

@jpfairbanks suggested in #31 (comment) that we develop performance benchmarks into the repo (so that we can compare the effects of changes to data structures and algorithms).

IMO, this is an excellent idea. I've opened this issue to figure out how we do this.

betweenness_centrality docstring

So, this is because

g: SimpleGraph
    A Graph, directed or undirected.
weights: AbstractArray{Float64, 2}, optional
    Matrix containing the weight associated with each edge (i,j)
    `ω[i, j]` is the weight between vertices `i` and `j`.
    If no weights are specified, shortest paths are computed using equal
    weights. If values are missing, they are assumed equal to `1`.
    *Advice* Use sparse matrices for better performances.

also on rtd https://lightgraphsjl.readthedocs.org/en/latest/centrality/

I'm not sure this is a Markdown bug (although I know that Markdown does not yet parse newlines properly nor does it do nested lists - IIRC that's on the TODO list). meh

Perhaps easiest/cleanest to just do:

##### g: SimpleGraph
A Graph, directed or undirected.
##### weights: AbstractArray{Float64, 2}, optional
Matrix containing the weight associated with each edge (i,j)
`ω[i, j]` is the weight between vertices `i` and `j`.

Honestly I kind hate all these sections...

Note: Writing up some quick slides in case I've been roped into presenting at the next meetup!

Possible issue with adjacency matrices

cc @jpfairbanks

I noticed something unusual today:

julia> using LightGraphs

julia> g = PathDiGraph(3)
{3, 2} directed graph

julia> edges(g)
Set([edge 1 - 2,edge 2 - 3])

julia> adjacency_matrix(g)
3x3 sparse matrix with 2 Int64 entries:
    [2, 1]  =  1
    [3, 2]  =  1

julia> adjacency_matrix(g,:out)
3x3 sparse matrix with 2 Int64 entries:
    [2, 1]  =  1
    [3, 2]  =  1

julia> adjacency_matrix(g,:in)
3x3 sparse matrix with 2 Int64 entries:
    [1, 2]  =  1
    [2, 3]  =  1

julia> adjacency_matrix(g)[1,2]
0

This looks backwards. The adjacency matrix A[i,j] should be 1 if there's an edge from i to j.

See https://github.com/JuliaGraphs/LightGraphs.jl/blob/master/src/spectral.jl#L9-L35 for the relevant code.

The problem is, when I change it to what I think is correct code, katz_centrality() breaks.

Subgraph Operations

I would like to kick around some ideas on supporting subgraphs.

It would be nice to have induced subgraphs accessible. If you are willing to copy out the subgraph, which is useful in some use cases, it is easy to implement. You just take a subset of integers in 1:n and iterate over the them grabbing their neighbors and renumbering everything based on scan(+, input).

For copy-free subgraphs like subarrays things are a bit more complicated.
A subtype is the way to go. If you take a subgraph as a Vector{Int} and the original graph it is a subgraph of, then you overload the basic operations like neighbors, edges, degree... to use one level of indirection.

type AbstractSubGraph <: AbstractGraph
vertices::Range
supergraph::AbstractGraph
vertexmap::AbstractArray 
inversevertexmap::AbstractArray
end

Where vertexmap maps from the subgraph index to the supergraph index and inversevertexmap maps from supergraph index to subgraph index.
So neighbors becomes something like

function neighbors(h::AbstractSubgraph, v)
    superneighbors = neighbors(h.supergraph, h.vertexmap[v])
    subneighbors = Array(Int,0)
    for u in superneighbors
        if u in h.inversevertexmap
            push!(subneighbors, h.inversevertexmap[u])
        end
    end
    return subneighbors
end

This approach leads to subgraphs which take time linear in the original graph size to use. So it would only be worth if it you had subgraphs which are almost the same size as the original graph. In this case the benefit is to avoid the space requirements of the copies. And would be useful if you had some function that required you to remove each vertex in a set and evaluate a function on the induced subgraph.

For non-induced subgraphs we could treat the property of being in the subgraph as an externally stored property like distance or edge weight and use a masking technique.

If anyone has other ideas they want to discuss just put them out there on this issue.

RFC: Parameterized Graphs {T<:Integer}

I've got a use case where I need the ability to reference nodes that have a range of 2^128. That is, I could represent the graph without external mapping if I were able to use 128-bit ints instead of what we currently have, which is Int (which defaults to 64 or 32 depending on system).

Conversely, if I know I have a small graph, it might make sense to specify an integer size < 64 or 32 bits. This would result in memory savings, and perhaps some speed.

Finally, allowing parameterization can eliminate the inconsistency of relying on system default Int sizes.

Drawbacks (will add more as I think of them):

Increased complexity
need to decide what happens when add_vertices!(n) produces nv(g) > chosen typemax
how we handle defaultdistance / edge weights may have to change as well

Anyway - thoughts on this idea?

smallgraphs should be using unsafe_add_edge!

Per @CarloLucibello's suggestion in #163 (comment)

Since we know (as much as that's possible) that the smallgraphs are sound, we should use the faster routines to add edges.

Should I create a FlowDescriptor type?

I was implementing the Push-Relabel algorithm for maximum flows. The problem is that there are a large number of arrays that need to be passed to each function. For example the relabel function needs 7 arguments.

function relabel!{T<:Number}(
    flow_graph::LightGraphs.DiGraph, 
    u::Int,                               
    capacity_matrix::AbstractArray{T, 2}, 
    flow_matrix::AbstractArray{T,2},   
    height::AbstractArray{Int,1},    
    count::AbstractArray{Int,1},      
    Q::AbstractArray{Int,1}        
    )
    count[height[u]]--;
    height[u] = 2*nv(flow_graph);

    for v in fadj(flow_graph, u)
        if capacity_matrix[u,v] > flow_matrix[u,v]
            height[u] = min(height[u], height[v]+1)
        end
    end
    count[height[u]]++
    unshift!(Q, u)
end

I could create a new FlowDescriptor type, something like

type FlowDescriptor{T<:Number}
    flow_graph::LightGraphs.DiGraph                  
    capacity_matrix::AbstractArray{T, 2}
    flow_matrix::AbstractArray{T,2}   
    height::AbstractArray{Int,1}    
    count::AbstractArray{Int,1}      
    Q::AbstractArray{Int,1}   
end

and pass it to functions. Will doing this have any effect on performance/memory?

WIP: Final bits of coverage to get to 100%!

https://coveralls.io/builds/3335340/source?filename=src%2Fspectral.jl line 82 (cc @hayd since I'll bet you have a really good solution for this)
https://coveralls.io/builds/3335340/source?filename=src%2Fpersistence.jl - need a bad graph to test.

The getting started section has disappeared from the docs

compose() shadows Compose.compose

The compose function shadows Compose.jl's primary function, compose, which makes it very difficult to work with together, e.g. when drawing custom graph layouts building upon GraphLayout.jl.

using Compose
using LightGraphs
compose(context(), rectangle(0, 0, 1, 1))
#Error:`compose` has no method matching compose(::Context, ::Form{RectanglePrimitive{P<:Point{XM<:Measure{S,T},YM<:Measure{S,T}},M1<:Measure{S,T},M2<:Measure{S,T}}})

using Compose
compose(context(), rectangle(0, 0, 1, 1))
using LightGraphs
#Warning: using LightGraphs.compose in module Main conflicts with an existing identifier.

Can we do something about this? Maybe rename compose to something else?

Metadata outside of LightGraphs.jl

I felt that PR #163 had gotten off topic so I'm opening up this discussion issue.
cc @CarloLucibello @sbromberger.

LightGraphs.jl current state and goals

I think that it is a good decision to focus on the core graph data structure, traversal, classic graph problems, and modern graph analysis. That is enough of a task for one package. As Graphs.jl shows us, there are quite a few ways to implement graphs. There are many more ways to implement metadata containing graphs. Networkx implemented their graphs in the most flexible way possible. In Networkx everything is a Dict{Any, Any} so you can store whatever metadata you want anywhere. This leads to significant performance problems.

I agree with @sbromberger's diagram about the tradeoffs between performance, simplicity, and flexibility. Since LightGraphs aims to be in the simple and performant part of the space, excluding metadata from the internal graph structure is a sound policy.

For supporting metadata outside of the graph data structure.

However, I think that if someone wants to use a LightGraphs.Graph and attach metadata to it in some external package; we should be supportive of that, without "supporting" it.

A data structure that looks like:

type Network{G<:LightGraphs.SimpleGraph, TE, TV}
g::G
edgemeta::Dict{Edge,TE}
vertmeta::Dict{Int, TV}
end

should work fine and a Networks.jl package could easily support this definition.
The traversal functions in LightGraphs are flexible enough to allow you to reuse quite a bit of the code while adding support for using the metadata in the traversals.

While not standard usage, like the idea of a graph being just the sets G=(V,E) and a network as a graph with some metadata about the vertices and edges.
Networks.jl exists, but the top google hit is a github repo with 3 commits last updated 3 years ago.

Weights work somewhat like this

Some functions that work with edge weights or distances take a graph and a distance matrix. For example, dijkstra_shortest_path.

function dijkstra_shortest_paths{T}(
    g::SimpleGraph,
    srcs::Vector{Int},
    distmx::AbstractArray{T, 2}=DefaultDistance();
    allpaths=false
)

One could formalize this pattern in a type,

type DistGraph{G<:LightGraphs.SimpleGraph, D}
g::G
distmx::AbstractMatrix{D}
end

then use metaprogramming to define methods for all of the functions in LightGraphs to pass them either just g, or g and distmx.

These are my thoughts on the matter. I'm not trying to start a debate. I just think it is good to get our thoughts out there so that people understand how decisions are made on these issues. What do you think?

doc issue with bulleted lists?

@hayd - could use your help here. In http://lightgraphsjl.readthedocs.org/en/latest/pathing/, under Graph Traversal, the source shows:

## Graph Traversal

*Graph traversal* refers to a process that traverses vertices of a graph following certain order (starting from user-input sources). This package implements three traversal schemes:
* `BreadthFirst`,
* `DepthFirst`, and
* `MaximumAdjacency`.

This results in the following markdown:

Graph Traversal

Graph traversal refers to a process that traverses vertices of a graph following certain order (starting from user-input sources). This package implements three traversal schemes:

BreadthFirst,
DepthFirst, and
MaximumAdjacency.

But what's being rendered in rtd isn't right.

Totally not urgent, but I'd appreciate any advice.

Please use unsafe=true with parse_dict

hi, if you're using ParserCombinator.Parsers.GML.parse_dict to parse GML format data, please can you add "unsafe=true" to the call if you don't care about lists other graph, edge and node?

see the docs at https://github.com/andrewcooke/ParserCombinator.jl#parsers for a brief justification.

thanks + sorry for all the emails you must be getting from me on various issues over the last 24 hours.

RFC: inconsistency between readgraphml and readgml

readgraphml can handle multiple graphs and returns a vector of (name::String, g::SimpleGraph) tuples.

readgml can handle multiple graphs but only returns the first one as a SimpleGraph.

read can't handle multiple graphs at the moment so returns SimpleGraph.

Ideally, these functions would be consistent. What's the best way of handling this?

a) leave as-is
b) change readgml to return tuples (will require a change to ParserCombinator)
c) change readgraphml to return the first graph

If b), then possibly
d) retool the proprietary format to allow multiple graphs.

connected components ordering and singletons

I noticed when writing tests for my new version of connected components that the old version does not produced the components in sorted order. The faster version does produce them in sorted order.

also the vertices without neighbors were not listed as singleton components by the old version.
The new version gives them as singleton Vectors

Pagerank computation

Rita asked on the list for Pagerank computation.
This paper corso2005 shows how to compute it as either and eigenvector problem or a linear system in the laplacian. The fastest and easiest way to do this in julia is to dump the adjacency matrix out of the Graph and do the linear algebra with either eigs or Krylov.jl

explanation of outneighbors!

@jpfairbanks -

function outneighbors!(outneighborhood, g, j)
    edgecollection = out_edges(g, j)
    degj = outdegree(g, j)
    for i =1:degj
        outneighborhood[i] = dst(edgecollection[i])
    end
    return sub(outneighborhood, 1:degj)
end

is there ever a case where outdegree(g,j) will NOT equal length(edgecollection)?

[PkgEval] LightGraphs may have a testing issue on Julia 0.3 (2015-08-12)

PackageEvaluator.jl is a script that runs nightly. It attempts to load all Julia packages and run their tests (if available) on both the stable version of Julia (0.3) and the nightly build of the unstable version (0.4). The results of this script are used to generate a package listing enhanced with testing results.

On Julia 0.3

On 2015-08-03 the testing status was Tests pass.
On 2015-08-12 the testing status changed to Tests fail.

This issue was filed because your testing status became worse. No additional issues will be filed if your package remains in this state, and no issue will be filed if it improves. If you'd like to opt-out of these status-change messages, reply to this message saying you'd like to and @IainNZ will add an exception. If you'd like to discuss PackageEvaluator.jl please file an issue at the repository. For example, your package may be untestable on the test machine due to a dependency - an exception can be added.

Test log:

>>> 'Pkg.add("LightGraphs")' log
INFO: Installing AutoHashEquals v0.0.5
INFO: Installing LightGraphs v0.3.0
INFO: Installing LightXML v0.2.0
INFO: Installing ParserCombinator v1.6.3
INFO: Building LightXML
INFO: Package database updated

>>> 'Pkg.test("LightGraphs")' log
Julia Version 0.3.11
Commit 483dbf5* (2015-07-27 06:18 UTC)
Platform Info:
  System: Linux (x86_64-unknown-linux-gnu)
  CPU: Intel(R) Core(TM) i7-4960HQ CPU @ 2.60GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas
  LIBM: libopenlibm
  LLVM: libLLVM-3.3
INFO: Testing LightGraphs
ERROR: BoundsError()
 in indexed_next at tuple.jl:20
 in readgraph at /home/vagrant/.julia/v0.3/LightGraphs/src/persistence.jl:44
 in readgraph at /home/vagrant/.julia/v0.3/LightGraphs/src/persistence.jl:39
 in include at ./boot.jl:245
 in include_from_node1 at loading.jl:128
 in process_options at ./client.jl:285
 in _start at ./client.jl:354
while loading /home/vagrant/.julia/v0.3/LightGraphs/test/runtests.jl, in expression starting on line 32
=============================[ ERROR: LightGraphs ]=============================

failed process: Process(`/home/vagrant/julia/bin/julia /home/vagrant/.julia/v0.3/LightGraphs/test/runtests.jl`, ProcessExited(1)) [1]

================================================================================
INFO: No packages to install, update or remove
ERROR: LightGraphs had test errors
 in error at error.jl:21
 in test at pkg/entry.jl:718
 in anonymous at pkg/dir.jl:28
 in cd at ./file.jl:20
 in cd at pkg/dir.jl:28
 in test at pkg.jl:67
 in process_options at ./client.jl:213
 in _start at ./client.jl:354

>>> End of log

Pkg.test error with 6-day-old master

ERROR: LoadError: LoadError: MethodError: `convert` has no method matching convert(::Type{LightGraphs.MinCutVisitor{T}}, ::LightGraphs.Graph, ::BitArray{1}, ::Array{Int64,1}, ::Float64, ::Float64, ::Int64, ::Base.SparseMatrix.SparseMatrixCSC{Float64,Int64}, ::Array{Int64,1})
This may have arisen from a call to the constructor LightGraphs.MinCutVisitor{T}(...),
since type constructors fall back to convert methods.
Closest candidates are:
  LightGraphs.MinCutVisitor{T}(::Union{LightGraphs.DiGraph,LightGraphs.Graph}, ::AbstractArray{Bool,1}, ::Array{Int64,1}, ::T, ::T, ::Integer, ::AbstractArray{T,2}, ::Array{Int64,1})
  LightGraphs.MinCutVisitor{T}(::Union{LightGraphs.DiGraph,LightGraphs.Graph}, ::AbstractArray{T,2})
  call{T}(::Type{T}, ::Any)
  ...
 in call at /Users/seth/.julia/v0.4/LightGraphs/src/maxadjvisit.jl:100
 in mincut at /Users/seth/.julia/v0.4/LightGraphs/src/maxadjvisit.jl:174
 in include at ./boot.jl:254
 in include_from_node1 at loading.jl:133
 in anonymous at no file:68
 in include at ./boot.jl:254
 in include_from_node1 at loading.jl:133
 in process_options at ./client.jl:306
 in _start at ./client.jl:406
while loading /Users/seth/.julia/v0.4/LightGraphs/test/maxadjvisit.jl, in expression starting on line 39
while loading /Users/seth/.julia/v0.4/LightGraphs/test/runtests.jl, in expression starting on line 65

but what I don't understand:

convert(::Type{LightGraphs.MinCutVisitor{T}}, ::LightGraphs.Graph, ::BitArray{1}, ::Array{Int64,1}, ::Float64, ::Float64, ::Int64, ::Base.SparseMatrix.SparseMatrixCSC{Float64,Int64}, ::Array{Int64,1})

should map to
LightGraphs.MinCutVisitor{T}(::Union{LightGraphs.DiGraph,LightGraphs.Graph}, ::AbstractArray{Bool,1}, ::Array{Int64,1}, ::T, ::T, ::Integer, ::AbstractArray{T,2}, ::Array{Int64,1})

Performance comparison

How well a library performs compared to others is a major factor (along with comprehensiveness and ease of use) in its adoption. This information is in particular useful for people like me who are generally curious about Julia's alleged performance, but are not ready to make the jump unless there is something to be gained---the typical position of someone in technical computing. So far, there is only speedups evidence for scripts that are not used very often (fibonacci numbers, anyone?), a performance comparison in an applied context would go a long way to show Julia's relevance.

A first list of contenders based on my own experience of graph libraries (add yours!)

NetworkX is a pure Python library.
Graph-tool is a Python interface to the C++ Boost Graph Library.
igraph is a C library with a Python and R interface.
Boost Graph Library is a C++ library.

A first list of benchmark algorithms

Building large graphs (Erdos-Renyi, BA...)
Random deletion of nodes/edges
PageRank
Single-source shortest path
Betweenness centrality

All these would go in test/perf along with the LG-only benchmarks, as mentioned in #45 and following what is done in Julia's repo.

WIP: Datasets

Now that we've got our own github org, it may make sense to move smallgraphs out from LightGraphs and into its own module. The graphs are reasonably stable (despite an enhancement we made last week - see #178 / #181 ). Ref #185 for additional discussion.

Two approaches:

keep Smallgraphs as a submodule of LightGraphs (similar to how @andrewcooke does ParserCombinator)
move Smallgraphs into its own peer package.

I'm leaning towards 1). The submodule would be inextricably linked to LightGraphs, since we're calling unsafe_add_edge!() which is unlikely to exist in other graphs packages. Plus, we need smallgraphs for tests, so we'd have to REQUIRE smallgraphs in any case.

Thoughts?

Images for docs

RFC: Parameterize graphs

Right now we have Graph with vertices as a range of Integer. It might make more sense (per this thread: https://groups.google.com/forum/#!topic/julia-users/dmO6vx6icmo) to parameterize the number of vertices. This is likely a significant effort.

unnecessary BFSVisitorVector as parameter in bfs_tree?

@jpfairbanks
https://github.com/JuliaGraphs/LightGraphs.jl/blob/master/src/traversals/bfs.jl#L168-L173

Why do we pass in a BFSVisitorVector when we're creating one inside the function?

add_edge! without error on duplicate

Is there a performance benefit to saying for some applications we won't check if the edge already exists? the two possible semantics are:

you insert a duplicate edge and nothing changes, you don't see an error.
you insert a duplicate and it gets appended to the incidence list, when iterating over out_neighbors(g,v) you see the duplicate edges twice.

Do these lead to difficulties later?

Redesign floyd_warshall_shortest_paths?

floyd_warshall_shortest_paths is quite a long function name. Given that the name Floyd-Warshall is sufficient to identify the algorithm, can we rename this function to floyd_warshall?

Also, can we make FloydWarshallState.dists a matrix rather than the vector of vectors that it currently is? The distance matrix is a key input to graph layout algorithms, such as GraphLayout.layout_stressmajorize_adj.

Should help with #23 jpfairbanks/GraphMatrices.jl#1

(Mistakenly posted as JuliaIO/LightXML.jl#24)

Placeholder for graph parallelism

Just wanted to start this to collect thoughts on good ways of taking advantage of parallelism in Julia.

erdos_renyi is n^2 algorithm

You can instead compute the number of edges to draw as ne ~ binomial((n^2-n)/2 , p) and then pick i,j uniformly at random until you hit ne edges. This leads to an O(ne) algorithm for small p For large p you end up doing O(n^2) work anyway.

Slow to create large sub-graphs?

Hello,

I'm trying to create large sub-graphs efficiently. I'm not sure if I'm doing this in the correct way (I'm personally new to Julia), but it seems very slow at present?

e.g.

using StatsBase
using LightGraph
g = Graph(1000000)
m = sample(1:1000000, 200000, replace=false)
@time g[m] # also tried induced_subgraph(g,m)

For some benchmarks on my machine (Julia 0.4.0):

length(m)    @time
300000       329.899664 seconds (900.20 k allocations: 86.183 MB, 0.04% gc time)
200000       145.917879 seconds (600.19 k allocations: 64.346 MB, 0.06% gc time)
100000       38.010782 seconds (300.17 k allocations: 30.509 MB)

Is there any way to speed this up? I'm asking for a general problem with lots of edges, but even in this simple case with no edges, this seems far too slow.

I just tried writing my own naive function and tried it on one of my real graphs with lots of edges (e.g. 262144 vertices, 786432 edges):

new_g = Graph(length(m))
map = [m[i] => i for i = 1:length(m)]
for edge in g.edges
    try
        new_source = map[e[1]]
        try
            new_destination = map[e[2]]
            add_edge!(new_g, new_source, new_destination)
        end
    end
end

This seems like a terrible function, and it does run much much slower for small sub-graphs, but it gives the same result (g[m]==new_g) and seems to scale far better, e.g. for selecting a random sub-graph containing half the original vertices of my {262144, 786432} graph, the standard way takes about 10 minutes, and my naive function above takes about 30 seconds. It actually gets faster as the size of my subgraph gets larger (presumably because fewer of those try statements fail). For large random sub-graphs (e.g. 90% of the vertices in the original graph), the naive function takes about 7.5s while the standard way never seems to actually finish.

Thanks for any help, and apologies if this is the wrong place to ask!

Split persistence functions into separate files based on format

persistence.jl is getting kinda crowded.

add docs to read_graphml

Just a tickler so I don't forget.

decision: make mincut() return induced subgraphs?

Right now, mincut returns a vector of booleans and the cut cost. I'd like to explore whether it makes more sense to return two induced subgraphs indicating the partitions instead of the vector of booleans. It will be an easy change to make, but are there any obvious drawbacks here?

Memory might be one. A boolean vector of length n takes up n bytes.

is there a reason that Laplacian matrices aren't defined for DiGraphs?

This one's for @jpfairbanks :)

tests for connected_components!

Hey @jpfairbanks - would you mind writing some tests for the new connected_components!() functions? Thanks.

Enhancement request: DOT file parser

ref: https://en.wikipedia.org/wiki/DOT_(graph_description_language)

cc @andrewcooke :)

[PkgEval] LightGraphs may have a testing issue on Julia 0.4 (2015-06-30)

On Julia 0.4

On 2015-06-27 the testing status was Tests pass.
On 2015-06-30 the testing status changed to Tests fail.

Test log:

>>> 'Pkg.add("LightGraphs")' log
INFO: Cloning cache of LightGraphs from git://github.com/JuliaGraphs/LightGraphs.jl.git
INFO: Installing DataStructures v0.3.10
INFO: Installing Distributions v0.7.3
INFO: Installing LightGraphs v0.1.14
INFO: Installing LightXML v0.1.11
INFO: Installing PDMats v0.3.3
INFO: Building LightXML
INFO: Package database updated

>>> 'Pkg.test("LightGraphs")' log
Julia Version 0.4.0-dev+5700
Commit 147fa0b* (2015-06-29 20:31 UTC)
Platform Info:
  System: Linux (x86_64-unknown-linux-gnu)
  CPU: Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Nehalem)
  LAPACK: libopenblas
  LIBM: libopenlibm
  LLVM: libLLVM-3.3
INFO: Testing LightGraphs
Warning: could not import Base.@math_const into Distributions
ERROR: LoadError: LoadError: LoadError: LoadError: UndefVarError: @math_const not defined
 in include at ./boot.jl:254
 in include_from_node1 at ./loading.jl:133
 in include at ./boot.jl:254
 in include_from_node1 at ./loading.jl:133
 in reload_path at ./loading.jl:157
 in _require at ./loading.jl:69
 in require at ./loading.jl:55
 in include at ./boot.jl:254
 in include_from_node1 at ./loading.jl:133
 in reload_path at ./loading.jl:157
 in _require at ./loading.jl:69
 in require at ./loading.jl:52
 in include at ./boot.jl:254
 in include_from_node1 at loading.jl:133
 in process_options at ./client.jl:305
 in _start at ./client.jl:405
while loading /home/vagrant/.julia/v0.4/Distributions/src/constants.jl, in expression starting on line 6
while loading /home/vagrant/.julia/v0.4/Distributions/src/Distributions.jl, in expression starting on line 252
while loading /home/vagrant/.julia/v0.4/LightGraphs/src/LightGraphs.jl, in expression starting on line 5
while loading /home/vagrant/.julia/v0.4/LightGraphs/test/runtests.jl, in expression starting on line 2
=============================[ ERROR: LightGraphs ]=============================

failed process: Process(`/home/vagrant/julia/bin/julia --check-bounds=yes --code-coverage=none --color=no /home/vagrant/.julia/v0.4/LightGraphs/test/runtests.jl`, ProcessExited(1)) [1]

================================================================================
INFO: No packages to install, update or remove
ERROR: LightGraphs had test errors
 in error at ./error.jl:21
 in test at pkg/entry.jl:746
 in anonymous at pkg/dir.jl:31
 in cd at file.jl:22
 in cd at pkg/dir.jl:31
 in test at pkg.jl:71
 in process_options at ./client.jl:281
 in _start at ./client.jl:405

>>> End of log

matvec issues

@jpfairbanks

Pkg.test("LightGraphs") is throwing warnings here:

running /Users/seth/.julia/v0.4/LightGraphs/test/graphmatrices.jl ...
WARNING: replacing module TestLightGraphs

and doesn't seem to be returning. Is there a reason TestLightGraphs is its own module?

Also, CombinatorialAdjacency(g) doesn't seem to work as I'd expect (that is, at all).

length(graph)

What should the length of a graph be? It looks like it is nv^2. Why is that?

DISCUSSION: new version of LG will be for Julia 0.4 (and higher) only.

I've been working on a v4 branch for a bit and I think it's ready to merge. My thought is that it's been a pain to support 0.3 and 0.4 (which we've done since we started), and that I'd like to cut the 0.3 cord now that 0.4 has been out for a bit. To this end, here's what I propose:

sometime in the next week we tag a final version of LG that passes tests on 0.3
I merge the v4 branch (this has some fixes and gets rid of Compat.jl, among other things)
further development happens on master, which is now 0.4-only.

In the event there's a "critical" bug we can always backport it, but all new development should be in the 0.4-specific master.

Thoughts?

explicit cc @jpfairbanks @hayd @pranavtbhat

adjacency matrices are technically incorrect.

Adjacency matrix values for self-loops should be 2, not 1. Easily fixed in the old code, need to find an efficient way of doing it for sparse matrices.

Bug?: strongly_connected_components

(Copied from QuantEcon/QuantEcon.jl#32, comment)

I was considering the digraph in Figure 8 on page 12 in "Graph-Theoretic Analysis of Finite Markov Chains" by J. P. Jarvis and D. R. Shier, which is clearly strongly connected, but LightGraphs.strongly_connected_components returns three components:

julia> g = DiGraph(6)
{6, 0} directed graph

julia> eds = [(1, 3), (3, 4), (4, 2), (2, 1), (3, 5), (5, 6), (6, 4)]
7-element Array{Tuple{Int64,Int64},1}:
 (1,3)
 (3,4)
 (4,2)
 (2,1)
 (3,5)
 (5,6)
 (6,4)

julia> for (ed_from, ed_to) in eds
           add_edge!(g, ed_from, ed_to)
       end

julia> edges(g)
Set([edge 3 - 5,edge 5 - 6,edge 6 - 4,edge 3 - 4,edge 1 - 3,edge 2 - 1,edge 4 - 2])

julia> strongly_connected_components(g)
3-element Array{Array{Int64,1},1}:
 [6]      
 [5]      
 [1,3,4,2]

I wonder if I am missing anything.

Just in case, NetworkX returns the whole set of nodes as a unique strongly connected component:

>>> import networkx as nx
>>> nodes = list(range(1, 7))
>>> edges = [(1, 3), (3, 4), (4, 2), (2, 1), (3, 5), (5, 6), (6, 4)]
>>> G = nx.DiGraph()
>>> G.add_nodes_from(nodes)
>>> G.add_edges_from(edges)
>>> G.edges()
[(1, 3), (2, 1), (3, 4), (3, 5), (4, 2), (5, 6), (6, 4)]
>>> nx.number_strongly_connected_components(G)
1
>>> list(nx.strongly_connected_components(G))
[[1, 3, 5, 6, 4, 2]]

Precompilation interferes with _HAVE_GRAPHMX

I installed LightGraphs on a machine without GraphMatrices, and ran the following session:

Pkg.add("LightGraphs")
Pkg.test("LightGraphs")
# GraphMatrices is not installed warning
Pkg.add("GraphMatrices")
Pkg.test("LightGraphs")
# GraphMatrices is still not installed! WAT?!

I added a print statement to LightGraphs which forced a reprecompile so now it works.
I am not sure if it is worth fixing, but it is definitely worth posting so that people (myself included) don't have to figure it out again. Maybe a note in the warning to recompile after installing Pkg.add(GraphMatrices) is useful.

Problems reading a xml graph

Hi,

I am new in LightGraphs.jl. It seems to be a great library when our graph files and our needs fit with the library's features.

I am trying to read a simple xml graph that was created using Julia + PyCall + python-Igraph (command write_graphml(file) - http://igraph.org/python/doc/igraph-pysrc.html#GraphBase.write_graphml) .

I have Julia 0.4.0-dev+5920 (2015-07-10) and LightGraphs.jl installed. I am using the following commands:

using LightGraphs
net1 = readgraphml("network.xml")

And I got the following error:
ERROR: Unknown node key
in error at ./error.jl:21
in readgraphml at /home/cdesantana/.julia/v0.4/LightGraphs/src/persistence.jl:106

I couldn't find anything about this error in other forums.

In file persistence.jl the error corresponds to the following piece of code:

        if name(e) == "graph"
            nodes = Dict{String,Int}()
            edges = @compat Tuple{Int, Int}[]
            graphname = has_attribute(e, "id") ? attribute(e, "id") : nothing
            edgedefault = attribute(e, "edgedefault")
            isdirected = edgedefault=="directed" ? true :
                         edgedefault=="undirected" ? false : error("Unknown value of edgedefault: $edgedefault")
        else
            error("Unknown node $(name(e))")
        end

Does it mean that my XML file is bad? The file is here: https://github.com/cndesantana/filestoshare/blob/master/network.xml

If so, what is the best way to generate an XML file to be read by LightGraphs.jl?

Thanks for your attention! Sorry the large message.

Charles

possible bug in betweenness_centrality with self-loops

NetworkX:

>>> g = nx.DiGraph()
...
>>> g.edges()
[(1, 2), (2, 3), (3, 3)]
>>> nx.betweenness_centrality(g)
{1: 0.0, 2: 0.5, 3: 0.0}

LightGraphs:

julia> betweenness_centrality(g)
3-element Array{Float64,1}:
 0.0
 0.25
 1.0

Discrepancy only exists with self-loops. Email sent to Dr. Hagberg requesting clarification.

MASSIVE performance regression with recent 0.4 and latest LightGraphs

Here for tracking. Betweenness centrality for the pgp2 graph didn't finish after 45 minutes when before it was ~1160 seconds. In order to bisect, I'm timing a subset of the graph:

julia> h = g[1:3000]
{3000, 39576} directed graph

julia> @time z = betweenness_centrality(h);
 10.568626 seconds (170.98 M allocations: 3.982 GB, 5.85% gc time)

Small graphs for debugging

I used BullGraph and TutteGraph to catch a bug in my code. Having test cases available like that is really handy. I drew them and visually computed the correct answer and then used that to debug against.

More of a public positive feedback than an issue.

Removing edges in digraphs

This excerpt from the test for max flow is suspicious

# Construct DiGraph
flow_graph = DiGraph(8);

# Add edges
add_edge!(flow_graph,1,2);
add_edge!(flow_graph,1,3);
add_edge!(flow_graph,1,4);
add_edge!(flow_graph,2,3);
add_edge!(flow_graph,2,5);
add_edge!(flow_graph,2,6);
add_edge!(flow_graph,3,4);
add_edge!(flow_graph,3,6);
add_edge!(flow_graph,4,7);
add_edge!(flow_graph,5,6);
add_edge!(flow_graph,5,8);
add_edge!(flow_graph,6,7);
add_edge!(flow_graph,6,8);
add_edge!(flow_graph,7,3);
add_edge!(flow_graph,7,8);


@show neighbors(flow_graph, 1)
@show flow_graph.fadjlist
@show flow_graph.badjlist
for dst in neighbors(flow_graph, 1)
    rem_edge!(flow_graph, 1, dst) 
end
print(neighbors(flow_graph, 1))
# neighbors of 1 is not empty!
@show flow_graph.fadjlist
@show flow_graph.badjlist
for dst in neighbors(flow_graph, 1)
    rem_edge!(flow_graph, 1, dst) 
end
print(neighbors(flow_graph, 1))
v, P, S, flag = LightGraphs.fetch_path(flow_graph, 1, 8, flow_matrix, capacity_matrix)
@test flag != 0

produces the following output

neighbors(flow_graph,1) = [2,3,4]
flow_graph.fadjlist = [[2,3,4],[3,5,6],[4,6],[7],[6,8],[7,8],[3,8],Int64[]]
flow_graph.badjlist = [Int64[],[1],[1,2,7],[1,3],[2],[2,3,5],[4,6],[5,6,7]]
[3]flow_graph.fadjlist = [[3],[3,5,6],[4,6],[7],[6,8],[7,8],[3,8],Int64[]]
flow_graph.badjlist = [Int64[],Int64[],[1,2,7],[3],[2],[2,3,5],[4,6],[5,6,7]]
Int64[]

Docile / docstrings

OK, I'm convinced we need them. Opening this up to track.