annoy4s

A JNA wrapper around spotify/annoy which calls the C++ library of annoy directly from Scala/JVM.

Installation

For linux-x86-64 or Mac users, add the library dependency directly:

libraryDependencies += "net.pishen" %% "annoy4s" % "0.10.0"

If you get an error like the one below when using annoy4s, you may have to compile the native library yourself.

java.lang.UnsatisfiedLinkError: Unable to load library 'annoy': Native library

To compile the native library and install annoy4s on your local machine:

  1. Clone this repository.
  2. Check the values of organization and version in build.sbt. You may change them to whatever you want, but it's recommended to give version the -SNAPSHOT suffix.
  3. Run compileNative in sbt (g++ must be installed).
  4. Run test in sbt to see if the native library is successfully compiled.
  5. Run publishLocal in sbt to install annoy4s on your machine.

Now you can add the library dependency as (organization and version may be different according to your settings):

libraryDependencies += "net.pishen" %% "annoy4s" % "0.10.0-SNAPSHOT"

The library file generated by the g++ command in compileNative can also be installed independently on your machine. Please refer to JNA's documentation on library search paths for details on how to make JNA able to load the library.
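If the compiled library is kept outside the default search locations, one option is to point JNA at it through the jna.library.path system property. The build.sbt fragment below is only a sketch: the directory /opt/annoy/lib is a hypothetical placeholder, and forking is enabled because sbt applies javaOptions only to forked JVMs.

```scala
// build.sbt (sketch): /opt/annoy/lib is a hypothetical directory holding the compiled annoy library
fork := true                                          // javaOptions are ignored unless the JVM is forked
javaOptions += "-Djna.library.path=/opt/annoy/lib"    // extra directory for JNA to search
```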

Usage

Create and query the index in memory mode:

import annoy4s._

val annoy = Annoy.create[Int]("./input_vectors", numOfTrees = 10, metric = Euclidean, verbose = true)

val result: Option[Seq[(Int, Float)]] = annoy.query(itemId, maxReturnSize = 30)
  • Each line of ./input_vectors has the format <item id> <vector>. Here is an example:
3 0.2 -1.5 0.3
5 0.4 0.01 -0.5
0 1.1 0.9 -0.1
2 1.2 0.8 0.2
  • <item id> can be Int, Long, String, or UUID; just change the type parameter in Annoy.create[T]. You can also implement your own KeyConverter[T] to support another type.
  • metric can be Euclidean, Angular, Manhattan, or Hamming.
  • result is a list of (id, distance) tuples, which includes the query item itself.
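
If your vectors live in memory rather than in a file, the input file can be produced with the standard library alone. The following is a minimal sketch that assumes Int ids and space-separated components, as in the example above; InputVectors, formatLine, and write are hypothetical helper names, not part of annoy4s.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}

// Hypothetical helper, not part of annoy4s.
object InputVectors {
  // One line per item: the id, then the vector components, separated by single spaces.
  def formatLine(id: Int, vector: Seq[Float]): String =
    (id.toString +: vector.map(_.toString)).mkString(" ")

  // Writes all items to `path`, one item per line.
  def write(path: String, items: Seq[(Int, Seq[Float])]): Unit = {
    val text = items.map { case (id, v) => formatLine(id, v) }.mkString("\n")
    Files.write(Paths.get(path), text.getBytes(StandardCharsets.UTF_8))
  }
}
```

For example, InputVectors.write("./input_vectors", Seq(3 -> Seq(0.2f, -1.5f, 0.3f))) writes a file whose only line is the first line of the example above.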

To use the index in disk mode, you need to provide an outputDir:

val annoy = Annoy.create[Int]("./input_vectors", 10, outputDir = "./annoy_result/", Euclidean)

val result: Option[Seq[(Int, Float)]] = annoy.query(itemId, maxReturnSize = 30)

annoy.close()

// load a previously created index
val reloadedAnnoy = Annoy.load[Int]("./annoy_result/")

val reloadedResult: Option[Seq[(Int, Float)]] = reloadedAnnoy.query(itemId, 30)
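
Because query returns an Option and the returned list contains the query item itself, callers often want to unwrap the result and drop that entry. The helper below is a sketch under two assumptions: ids are Int, and an absent result (None) can be treated as "no neighbors". QueryResults and neighborsOnly are hypothetical names, not part of annoy4s.

```scala
// Hypothetical helper, not part of annoy4s.
object QueryResults {
  // Removes the query item from the (id, distance) list; None becomes an empty list.
  def neighborsOnly(itemId: Int, result: Option[Seq[(Int, Float)]]): Seq[(Int, Float)] =
    result.getOrElse(Seq.empty).filterNot { case (id, _) => id == itemId }
}
```

With the snippets above, this would be called as QueryResults.neighborsOnly(itemId, reloadedResult).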
