Giter Club home page Giter Club logo

csharp-vptree's Introduction

CSharp-vptree

CSharp (C#) version of vantage-point tree (VP tree)

Build status

.NET

Nuget

vptree

Introduction to this project

This is one implementation of vantage-point tree (VP tree) for C#. As usual most of the error handling code has been stripped away, so YMMV.

Introduction to vantage-point tree (VP tree)

Vantage-point tree as Wiki says: "A vantage-point tree (or VP tree) is a metric tree that segregates data in a metric space by choosing a position in the space (the "vantage point") and partitioning the data points into two parts: those points that are nearer to the vantage point than a threshold, and those points that are not. By recursively applying this procedure to partition the data into smaller and smaller sets, a tree data structure is created where neighbors in the tree are likely to be neighbors in the space."

So VP tree (once created) can be used to find data that is distance-wise near certain point. e.g. this project provides implementation (in Example.cs) that allows one to find X nearest cities of given latitude and longitude point.

Implementation

This project is a C# port of C++ code that was given to the world by Steve Hanov in his blog.

VpTree.cs has the VP tree related code (you can copy + paste this file to your project).
LinearSearch.cs provides a basic linear search that you can use for performance comparision.
Example.cs gives sample code and some performance measurements.
tests folder contains test cases.

You can get the full cities.txt file used in Example.cs and Test.cs from here. When extracted the size of the file is 125 806 517 bytes (~119MB). The cities.txt contains duplicate latitude and longitude entries for certain places, so order of results might vary on different runs.

You can find smaller (and zipped) cities example file from tests/testfiles folder

Differences between C++ and C# version

Biggest change is that this C# version uses zero based indexing and not iterators (like the C++ version does).

Don't optimize Math.Sqrt away!

As Steve Hanov said in his blog: "It is worth repeating that you must use a distance metric that satisfies the triangle inequality. I spent a lot of time wondering why my VP tree was not working. It turns out that I had not bothered to find the square root in the distance calculation. This step is important to satisfy the requirements of a metric space, because if the straight line distance to a <= b+c, it does not necessarily follow that a^2 <= b^2 + c^2."

Which means that square root calculations for distance metrics are MANDATORY!

Examples

VpTree<Point> vpTree = new VpTree<Point>();
vpTree.Create(points, CalculatePointDistance);
vpTree.Search(targetPoint, 5, out resultsVpTree, out distancesVpTree);

Benchmarks

You can run benchmarks (which compare vptree and linear search) by moving to benchmarks folder and running following command

dotnet run -c Release

License

This document and source code files are released into the public domain. See LICENSE file

csharp-vptree's People

Contributors

mcraiha avatar

Stargazers

 avatar  avatar  avatar  avatar

Watchers

 avatar

csharp-vptree's Issues

Consider publishing the library to Nuget

I'm using CSharp-vptree library and find it very useful for my project. However, I am facing difficulty in using it as it is not available on NuGet. I have to manually download and include the library in my project, which is not an optimal solution.

Therefore, I would like to request you to publish the binary of CSharp-vptree to NuGet. This would make it much easier for developers to use your library in their projects and can increase your library's reach and visibility.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.