Spark-Milvus

This is a Milvus connector library for Spark. To use it, build the jar from source and include it in your Spark application.

Dependencies

Please ensure you have Spark 3.1.2, Scala 2.12 (the version the jar is built for, as the _2.12 suffix in its name indicates) and Java 8 installed. The library itself handles the other dependencies.

Building the jar

git clone https://github.com/kpan2034/spark-milvus.git
cd spark-milvus
sbt assembly

Include the resulting jar, spark-milvus_2.12-0.1.0-SNAPSHOT.jar, in the spark-submit command for your Spark application.

Example spark-submit command:

spark-submit --class org.bdad.sparkmilvus.Main --jars jars/milvus-sdk-java-2.3.3.jar target/scala-2.12/spark-milvus_2.12-0.1.0-SNAPSHOT.jar
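The jar name carries a Scala binary suffix (_2.12), and a jar built for one Scala binary version generally will not load under another. As a minimal sketch, the check below compares that suffix with the Scala library on the current classpath. The object and method names are hypothetical and not part of this library.

```scala
import scala.util.Properties

// Hypothetical helper for illustration only; not part of spark-milvus.
object ScalaBinaryCheck {

  // Extracts the Scala binary version from an artifact name such as
  // "spark-milvus_2.12-0.1.0-SNAPSHOT.jar". Returns None when the name
  // has no "_<major>.<minor>-" suffix.
  def jarScalaBinaryVersion(jarName: String): Option[String] =
    """_(\d+\.\d+)-""".r.findFirstMatchIn(jarName).map(_.group(1))

  // Binary version (major.minor) of the Scala library on the classpath.
  def runtimeScalaBinaryVersion: String =
    Properties.versionNumberString.split('.').take(2).mkString(".")

  // True when the jar's suffix matches the running Scala binary version.
  def isCompatible(jarName: String): Boolean =
    jarScalaBinaryVersion(jarName).contains(runtimeScalaBinaryVersion)
}
```

Calling ScalaBinaryCheck.isCompatible("spark-milvus_2.12-0.1.0-SNAPSHOT.jar") from a spark-shell session is one way to confirm the build matches the Scala version your Spark distribution ships with.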

Usage

You can use this connector wherever you use spark.read, as follows:

val df = spark.read
    .format("com.milvus.spark.connector.MilvusTableProvider") // specify connector
    .option("spark.milvus.uri", "<HOST URI>") // host for your Milvus cluster
    .option("spark.milvus.token", "<TOKEN>") // connection token
    .option("spark.milvus.collectionName", "search_article_in_medium") // collection name in Milvus
    .option("spark.milvus.numPartitions", 128) // number of partitions you want to read in
    .option("spark.milvus.predicateFilter", "publication==\"The Startup\"") // any filtering you want to perform at Spark level
    .load()
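If the same options are reused across several reads, it can help to gather them in one place. The following is a sketch only: MilvusReadConfig is a hypothetical helper, not part of the connector, and its default of one partition is an assumption made for this example. It uses the option keys shown above and produces a Map that can be passed to Spark's DataFrameReader.options.

```scala
// Hypothetical helper for illustration only; not part of spark-milvus.
final case class MilvusReadConfig(
    uri: String,
    token: String,
    collectionName: String,
    numPartitions: Int = 1, // assumed default for this sketch only
    predicateFilter: Option[String] = None
) {
  require(uri.nonEmpty, "uri must not be empty")
  require(collectionName.nonEmpty, "collectionName must not be empty")
  require(numPartitions > 0, "numPartitions must be positive")

  // Spark stores reader options as strings, so numeric values are
  // converted here. The filter key is added only when a filter is set.
  def toOptions: Map[String, String] = {
    val required = Map(
      "spark.milvus.uri" -> uri,
      "spark.milvus.token" -> token,
      "spark.milvus.collectionName" -> collectionName,
      "spark.milvus.numPartitions" -> numPartitions.toString
    )
    required ++ predicateFilter.map(f => "spark.milvus.predicateFilter" -> f).toList
  }
}

// Usage with an existing SparkSession named `spark`:
//
//   val cfg = MilvusReadConfig("<HOST URI>", "<TOKEN>", "search_article_in_medium", 128)
//   val df = spark.read
//     .format("com.milvus.spark.connector.MilvusTableProvider")
//     .options(cfg.toOptions)
//     .load()
```

Keeping the keys in a single case class means a misspelled option name only needs to be fixed once, and the require checks fail before Spark tries to contact the cluster.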

Contributors

kpan2034, jayanthreddy1997, manasvegi
