Giter Club home page Giter Club logo

gsql-graph-algorithms's Introduction

README for GSQL Algorithm Library
for TigerGraph version 3.0 or higher
8/19/2020

The GSQL Graph Algorithm Library is a collection of high-performance GSQL queries,
each of which implements a standard graph algorithm. Each algorithm is ready to be
installed and used, either as a stand-alone query or as a building block of a larger
analytics application.GSQL running on the TigerGraph platform is particularly
well-suited for graph algorithms:

* Turing-complete with full support for imperative and procedural programming,
 ideal for algorithmic computation.

* Parallel and Distributed Processing, enabling computations on larger graphs.

* User-Extensible.
  Because the algorithms are written in standard GSQL and compiled by the user,
  they are easy to modify and customize.

* Open-Source. Users can study the GSQL implementations to learn by
  example, and they can develop and submit additions to the library.

Key Changes since version 3.0
-----------------------------
* New algorithms: estimate_diameter, kcore, maximal_indep_set
* Nearly all the algorithms now work as schema-free algorithms, making them much easier
  to use
* Instead of having up to 3 versions of an algorithm, to handle 3 different choices
  for output style, they have been merge to one version, with parameters for output
  filename and/or attribute to store the result.
* Parameter lists have been updated:
  - to make an algorithm schema-free, runtime parameters are needed
  - to specify how to handle the results
  - other parameters may have been added to make the algorithms easier to tune

Library Structure
-----------------

You can download the library from github:
https://github.com/tigergraph/gsql-graph-algorithms

The library contains two main sections: Algorithms and Tests.

The Algorithms folder contains template algorithms and scripts to help you customize
and install them. There are three folders:
		
	schema-free/
	    contains GSQL queries which are ready to use as is, relying on
	    input parameters to specify vertex types and edge types.
	        
	templates/
		contains template algorithms with some placeholder code
		and markers which need to be acted on by the installation script.

	examples/
		contains GSQL queries generated from the templates by the installation script.
		
The Tests folder contains small sample graphs that you can use to experiment with the
algorithms. In our online documentation, we use the test graphs to show you the expected
result for each algorithm. The graphs are small enough that you can manually calculate
and sometimes intuitively see what the answers should be.


List of GSQL Graph Algorithms
-----------------------------
as of Aug 19, 2020

betweenness_cent                Betweenness Centrality
closeness_cent                  Closeness Centrality
conn_comp                       Connected Component Detection
kcore                           K-Core
wcc_fast                        Connected Components (Fast)
scc                             Strongly Connected Component Detection
label_prop                      Label Propagation Method for Community Detection
louvain_parallel                Parallel Louvain Modularity Method with Refinement for Community Detection
pageRank [2]                    PageRank measurement of relative influence of each vertex
pageRank_wt [2]                 Weighted PageRank
pageRank_pers [2]               Personalized PageRank
shortest_ss_no_wt               Single-Source Shortest Paths without weight
shortest_ss_pos_wt              Single-Source Shortest Paths with positive weight
shortest_ss_any_wt              Single-Source Shortest Paths
maximal_indep_set               Maximal Independent Set
mst                             Minimum Spanning Tree (MST)
msf                             Minimum Spanning Forest (MSF)
cycle_detection                 Rocha–Thatte algorithm for cycle detection
estimate_diameter               Heuristic estimate of graph diameter
tri_count [1]                   Count all the triangles, memory effient
tri_count_fast [1]              Count all the triangles, faster but using more memory
cosine_nbor_ss                  Cosine Similarity from a single vertex
cosine_nbor_ap                  Cosine Similarity for each pair of vertices
jaccard_nbor_ss [2]             Jaccard Similarity from a single vertex
jaccard_nbor_ap [2]             Jaccard Similarity for each pair of vertices
knn_cosine_ss       k-Nearest Neighbor classification, using Cosine Similarity, single source
knn_cosine_all      k-Nearest Neighbor classification, using Cosine Similarity, batch
knn_cosine_cv       Cross validation for k-Nearest Neighbor, using Cosine Similarity

Notes:
[1] This algorithm is not in the schema-first folder; use the install.sh script to customize it
    for your target graph schema
[2] The schema-free version of this algorithm can use only one edge type.
    If you have a set of edge types, use the template algorithm and the install.sh script

Each non-schema-free template algorithm comes in three forms:

pageRank.gsql - base version. Results are provided as JSON output.
		Not persisted to the graph database.

pageRank_file.gsql - Results are in CSV format to a file.
		Not persisted to the graph database.

pageRank_attr.gsql - Results are written to vertex or edge attributes
		which the user specifies.

The schema-free algorithms ofter all three options in one algorithm.

Get Started
-----------
If you want to use one of the test graphs, load it before installing the algorithms:
See the README.test file in the tests folder


* Install Algorithms:
1) You should create a graph schema in GSQL first.
2) Change into the algorithms folder.
3) Run a installation script, i.e.,
   bash install.sh
   and answer the questions.
   
* Schema free Algorithms:
1) Change the graph name specified in CREATE statement.
2) Use the script directly.

More detailed documentation and examples are available on the web at
https://docs.tigergraph.com/graph-algorithm-library





gsql-graph-algorithms's People

Contributors

rkahhaleh avatar suxiaocai avatar jerrychentg avatar victorleetg avatar ramko9999 avatar y-tigergraph avatar jonherke-tg avatar xchang1986 avatar changranliu avatar gaolk avatar

Stargazers

Chengsr avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.