google-all-pairs-similarity-search's People

Contributors

roberto-bayardo

google-all-pairs-similarity-search's Issues

How to use two data sets to compute their intersection?

Roberto: Sorry for the late reply but for whatever reason, the first
notification about your Jan 2nd question got lost in my spam filter.
Since you closed the original ticket I am opening a new one with
clarifications.

What I meant is the ability to provide as input not one dataset but two
datasets.

In this setting, one dataset would be a "reference" set and the second
a "query" set.
The goal would be to find all items in the "query" set whose similarity to
some item in the "reference" set is above a certain threshold: basically
returning the similarity intersection between the two sets, as opposed to
the current setting where only pairs within the same set are considered. I
guess one way could be to merge the sets and discard pairs returned from
the same set, though that does seem pretty naive.
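
A minimal sketch of the merge-and-discard idea described above, in Python. Everything here is an assumption for illustration: `all_pairs` is a brute-force stand-in for a single-dataset search (it is not the project's algorithm or its interface), vectors are modeled as binary feature sets, and cosine similarity is just one possible measure. The names `cosine`, `all_pairs`, and `cross_set_pairs` are hypothetical.

```python
from itertools import combinations
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two binary vectors stored as feature sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def all_pairs(vectors, threshold):
    """Brute-force stand-in for an all-pairs search over ONE dataset.

    Yields (i, j, sim) for every index pair i < j with sim >= threshold.
    """
    for i, j in combinations(range(len(vectors)), 2):
        sim = cosine(vectors[i], vectors[j])
        if sim >= threshold:
            yield i, j, sim


def cross_set_pairs(reference, query, threshold):
    """Merge both datasets, run the single-set search, keep only mixed pairs."""
    merged = reference + query
    n_ref = len(reference)
    results = []
    for i, j, sim in all_pairs(merged, threshold):
        # Since i < j, a mixed pair has i in the reference part and j in the
        # query part. Pairs from the same original set are discarded.
        if i < n_ref <= j:
            results.append((i, j - n_ref, sim))
    return results


reference = [{1, 2, 3}, {4, 5, 6}]
query = [{1, 2, 3, 4}, {7, 8}, {4, 5, 6}]
pairs = cross_set_pairs(reference, query, threshold=0.8)
# Each result is (reference index, query index, similarity).
```

The drawback of this approach is the one hinted at above: the search still spends time on reference-reference and query-query pairs that are thrown away afterwards.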

Original issue reported on code.google.com by [email protected] on 26 Jan 2010 at 6:59

dblp_le.bin dataset has sets containing duplicate values

The dblp_le.bin file in the downloads section (uploaded Aug 2007) does not
satisfy the input requirements of the all-pairs implementation: there are a
handful of vectors which contain duplicate features. This leads to a few
strange results.
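
A quick way to spot such vectors, sketched in Python. This assumes the dataset has already been loaded into memory as a list of feature-id lists; parsing the binary file is not shown, and `find_duplicate_features` is a hypothetical helper, not part of the project.

```python
from collections import Counter


def find_duplicate_features(vectors):
    """Map vector index -> sorted list of feature ids that occur more than once."""
    duplicates = {}
    for idx, features in enumerate(vectors):
        repeated = sorted(f for f, n in Counter(features).items() if n > 1)
        if repeated:
            duplicates[idx] = repeated
    return duplicates


# Toy data: vectors 1 and 2 each repeat at least one feature id.
vectors = [[1, 2, 3], [4, 4, 5], [6, 7, 7, 7, 6]]
report = find_duplicate_features(vectors)
```

An empty report means every vector has distinct features.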

Original issue reported on code.google.com by [email protected] on 8 Jan 2010 at 12:28
