Welcome to Nebula Algorithm


nebula-algorithm is a Spark application based on GraphX. It currently provides the following algorithms:

|           Name            |                                    Use Case                                    |
|:-------------------------:|:------------------------------------------------------------------------------:|
|         PageRank          | page ranking, key node discovery                                               |
|          Louvain          | community discovery, hierarchical clustering                                   |
|           KCore           | community detection, financial risk control                                    |
|     LabelPropagation      | community detection, information propagation, advertising recommendation      |
|           Hanp            | community detection, information propagation                                   |
|    ConnectedComponent     | community detection, isolated island detection                                 |
| StronglyConnectedComponent | community detection                                                           |
|       ShortestPath        | path planning, network planning                                                |
|       TriangleCount       | network structure analysis                                                     |
|    GraphTriangleCount     | network structure and tightness analysis                                       |
|   BetweennessCentrality   | key node discovery, node influence calculation                                 |
|    ClosenessCentrality    | key node discovery, node influence calculation                                 |
|       DegreeStatic        | graph structure analysis                                                       |
|   ClusteringCoefficient   | recommendation, telecom fraud analysis                                         |
|          Jaccard          | similarity calculation, recommendation                                         |
|            BFS            | level-order traversal, shortest path planning                                  |
|         Node2Vec          | graph machine learning, recommendation                                         |

You can submit the whole Spark application, or invoke the algorithms from the lib library to apply graph algorithms to a DataFrame.

Get Nebula Algorithm

  1. Build Nebula Algorithm

    $ git clone https://github.com/vesoft-inc/nebula-algorithm.git
    $ cd nebula-algorithm
    $ mvn clean package -Dgpg.skip -Dmaven.javadoc.skip=true -Dmaven.test.skip=true
    

    After the build completes, the target file nebula-algorithm-3.0-SNAPSHOT.jar is placed under nebula-algorithm/target.

  2. Download from Maven repo

    Alternatively, the package can be downloaded from the following Maven repo:

    https://repo1.maven.org/maven2/com/vesoft/nebula-algorithm/

Use Nebula Algorithm

  • Option 1: Submit the nebula-algorithm package

    • Configuration

    Refer to the configuration example.

    • Submit Spark Application
    ${SPARK_HOME}/bin/spark-submit --master <mode> --class com.vesoft.nebula.algorithm.Main nebula-algorithm-3.0-SNAPSHOT.jar -p application.conf
    
    • Limitation

    Because the Nebula Algorithm jar does not encode string IDs, the source and target of every edge must be of type Int during algorithm execution. The vid_type of the Nebula space may be String, but the ID values themselves must be of type Int.

  • Option 2: Call the nebula-algorithm interface

    The lib package of nebula-algorithm provides more than 10 algorithms, which can be invoked programmatically as follows:

    • Add the dependency to pom.xml.
     <dependency>
          <groupId>com.vesoft</groupId>
          <artifactId>nebula-algorithm</artifactId>
          <version>3.0.0</version>
     </dependency>
    
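    • Instantiate the algorithm's config. Below is an example for PageRank.
    import com.vesoft.nebula.algorithm.config.{Configs, PRConfig, SparkConfig}
    import com.vesoft.nebula.algorithm.lib.PageRankAlgo
    import org.apache.spark.sql.{DataFrame, SparkSession}
    
    // Local Spark session for trying the library out.
    val spark = SparkSession.builder().master("local").getOrCreate()
    // Edge list read from CSV: source id, target id and, optionally, weight.
    val data  = spark.read.option("header", true).csv("src/test/resources/edge.csv")
    // PageRank config: 5 iterations, reset probability 1.0.
    val prConfig = new PRConfig(5, 1.0)
    // The last argument states whether the edges carry a weight column.
    val prResult = PageRankAlgo.apply(spark, data, prConfig, false)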
    

    If your vertex IDs are Strings, see Pagerank Example for how to encode and decode them.
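
    The encode and decode round trip can be sketched with plain Scala collections. This is a minimal illustration only, not the code from the linked example: the object `VertexIdCodecSketch` and its helpers are hypothetical names, and a real job would more likely keep the mapping in a DataFrame and join against it.

    ```scala
    object VertexIdCodecSketch {
      // Hypothetical helper: give every distinct string id a Long, in first-seen order.
      def buildDictionary(edges: Seq[(String, String)]): Map[String, Long] =
        edges
          .flatMap { case (src, dst) => Seq(src, dst) }
          .distinct
          .zipWithIndex
          .map { case (id, index) => id -> index.toLong }
          .toMap

      // Replace string endpoints with their numeric ids before running an algorithm.
      def encode(edges: Seq[(String, String)], dict: Map[String, Long]): Seq[(Long, Long)] =
        edges.map { case (src, dst) => (dict(src), dict(dst)) }

      // Map numeric ids in an algorithm result back to the original string ids.
      def decode[A](result: Seq[(Long, A)], dict: Map[String, Long]): Seq[(String, A)] = {
        val reverse = dict.map(_.swap)
        result.map { case (id, value) => (reverse(id), value) }
      }
    }
    ```

    The dictionary has to be kept until the algorithm finishes, since decoding the result depends on the same mapping that was used for encoding.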

    For examples of other algorithms, see examples.

    Note: The first column of the DataFrame passed to an algorithm represents the source vertices, the second the target vertices, and the third the edge weights.
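
    As a small sketch of that column layout, the snippet below reorders an arbitrary DataFrame into source, target, weight order using standard Spark calls. The object `EdgeLayoutSketch`, the helper `toEdgeLayout`, and the column names `src`, `dst` and `weight` are placeholders chosen for illustration; they are not part of nebula-algorithm.

    ```scala
    import org.apache.spark.sql.{DataFrame, SparkSession}

    object EdgeLayoutSketch {
      // Hypothetical helper: select columns so that position 1 is the source,
      // position 2 the target and position 3 the weight.
      def toEdgeLayout(df: DataFrame, src: String, dst: String, weight: String): DataFrame =
        df.select(src, dst, weight)

      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().master("local").appName("edge-layout-sketch").getOrCreate()
        import spark.implicits._

        // Sample data whose columns are deliberately in the wrong order.
        val raw = Seq((1.0, 1L, 2L), (0.5, 2L, 3L)).toDF("weight", "src", "dst")

        val edges = toEdgeLayout(raw, "src", "dst", "weight")
        edges.show()
        spark.stop()
      }
    }
    ```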

Nebula config

If you want to write the algorithm result into Nebula, make sure your tag has a property with the corresponding name and type:
|        Algorithm         |     property name       |property type|
|:------------------------:|:-----------------------:|:-----------:|
|         pagerank         |         pagerank        |double/string|
|          louvain         |          louvain        | int/string  |
|          kcore           |           kcore         | int/string  |
|     labelpropagation     |           lpa           | int/string  |
|   connectedcomponent     |            cc           | int/string  |
|stronglyconnectedcomponent|            scc          | int/string  |
|         betweenness      |         betweenness     |double/string|
|        shortestpath      |        shortestpath     |   string    |
|        degreestatic      |degree,inDegree,outDegree| int/string  |
|        trianglecount     |       trianglecount     | int/string  |
|  clusteringcoefficient   |    clustercoefficient   |double/string|
|         closeness        |         closeness       |double/string|
|            hanp          |            hanp         | int/string  |
|            bfs           |            bfs          |    string   |
|         jaccard          |          jaccard        |    string   |
|        node2vec          |          node2vec       |    string   |
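
One way to catch a missing property before submitting a job is a simple lookup against the table above. The sketch below is illustrative only: `ResultPropertySketch` and `missingProps` are hypothetical names, the property names are copied from the table, and how you obtain the set of properties defined on your tag is left to you.

```scala
object ResultPropertySketch {
  // Algorithm name -> property names its result is written to (from the table above).
  val requiredProps: Map[String, Set[String]] = Map(
    "pagerank"                   -> Set("pagerank"),
    "louvain"                    -> Set("louvain"),
    "kcore"                      -> Set("kcore"),
    "labelpropagation"           -> Set("lpa"),
    "connectedcomponent"         -> Set("cc"),
    "stronglyconnectedcomponent" -> Set("scc"),
    "betweenness"                -> Set("betweenness"),
    "shortestpath"               -> Set("shortestpath"),
    "degreestatic"               -> Set("degree", "inDegree", "outDegree"),
    "trianglecount"              -> Set("trianglecount"),
    "clusteringcoefficient"      -> Set("clustercoefficient"),
    "closeness"                  -> Set("closeness"),
    "hanp"                       -> Set("hanp"),
    "bfs"                        -> Set("bfs"),
    "jaccard"                    -> Set("jaccard"),
    "node2vec"                   -> Set("node2vec")
  )

  // Hypothetical check: which required properties are absent from the tag?
  def missingProps(algorithm: String, tagProps: Set[String]): Set[String] =
    requiredProps.getOrElse(algorithm, Set.empty[String]) -- tagProps
}
```

For example, `ResultPropertySketch.missingProps("degreestatic", Set("degree"))` reports that `inDegree` and `outDegree` still need to be added to the tag.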

Version match

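| Nebula Algorithm Version | Nebula Version |
|:------------------------:|:--------------:|
|          2.0.0           |  2.0.0, 2.0.1  |
|          2.1.0           |  2.0.0, 2.0.1  |
|          2.5.0           |  2.5.0, 2.5.1  |
|          2.6.0           |  2.6.0, 2.6.1  |
|          2.6.1           |  2.6.0, 2.6.1  |
|          2.6.2           |  2.6.0, 2.6.1  |
|          3.0.0           |  3.0.x, 3.1.x  |
|       3.0-SNAPSHOT       |    nightly     |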

Contribute

Nebula Algorithm is open source, and you are welcome to contribute in the following ways:

  • Discuss in the community via the forum or raise issues here.
  • Write or improve the documentation.
  • Submit pull requests to help improve the code here.
