Approximate port of scikit-learn neighbors using Numba.
If you want to install/modify (recommended at this point):
git clone https://github.com/jackd/numba-neighbors.git
pip install -e numba-neighbors
Quick-start:
pip install git+https://github.com/jackd/numba-neighbors.git
You may see performance benefits from fastmath
by installing Intel's short vector math library (SVML).
conda install -c numba icc_rt
Debugging is often simpler without jitting. To disable numba,
export NUMBA_DISABLE_JIT=1
and re-enable with
export NUMBA_DISABLE_JIT=0
Be wary of setting os.environ["NUMBA_DISABLE_JIT"] = "1"
from Python code - it must be set before numba (or any module that imports numba) is imported.
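If the flag has to be set from Python, a minimal sketch of the required ordering looks like the following. The function add_one is a made-up example, and the try/except is only there so the snippet also runs on a machine without numba installed.

```python
import os

# Must run before the first import of numba (directly or via another package).
os.environ["NUMBA_DISABLE_JIT"] = "1"

try:
    import numba  # reads NUMBA_DISABLE_JIT while it is being imported
except ImportError:
    numba = None  # numba not installed; the ordering above is still the point


def add_one(x):
    # Hypothetical example function, not part of numba-neighbors.
    return x + 1


# With JIT disabled the decorated function runs as ordinary Python,
# so breakpoints and print statements inside it behave normally.
if numba is not None:
    add_one = numba.njit(add_one)

result = add_one(41)
```

If numba was already imported earlier in the same process, setting the variable at this point is too late, which is why the shell export is usually the safer route.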
- All operations are done using reduced distances. E.g. the provided KDTree implementations use squared distances rather than actual distances, both for inputs and outputs.
- query_radius-like functions must specify a maximum number of neighbors. Over-estimating this is fairly cheap - it just means we allocate more data than necessary - but if the limit is reached, the first max_count neighbors that are found are returned. These aren't necessarily the closest max_count neighbors.
- Query outputs aren't sorted, though they can be using binary_tree.simultaneous_sort_partial.
- Intel's short vector math library (SVML) is used if installed. This makes computation faster, but may result in very small errors.
- n_nodes is inconsistent with the scikit-learn implementation... scikit-learn bug?
- Port NeighborsHeap from scikit-learn.
- query implementations.
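To make the reduced-distance and max_count behaviour described above concrete, here is a rough brute-force sketch in plain Python. It is not the numba-neighbors API: brute_force_query_radius and its arguments are hypothetical names chosen for illustration, and the real tree-based implementations visit points in a different order.

```python
import math


def brute_force_query_radius(data, query, reduced_r, max_count):
    """Hypothetical stand-in for a query_radius-like function.

    reduced_r is the squared radius, and the returned distances are
    squared as well (reduced distances for both inputs and outputs).
    """
    # Preallocate max_count slots, as a fixed-size output buffer would.
    reduced_dists = [math.inf] * max_count
    indices = [-1] * max_count
    count = 0
    for i, point in enumerate(data):
        if count == max_count:
            break  # limit reached: later (possibly closer) points are ignored
        d2 = sum((p - q) ** 2 for p, q in zip(point, query))
        if d2 <= reduced_r:  # compare in squared space; no sqrt needed
            reduced_dists[count] = d2
            indices[count] = i
            count += 1
    return reduced_dists, indices, count


data = [(3.0, 0.0), (2.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
query = (0.0, 0.0)

# Radius 5 is passed in as its square, 25.
d2, idx, count = brute_force_query_radius(data, query, reduced_r=5.0 ** 2, max_count=2)
# idx[:count] == [0, 1]: the point at index 2 is closer, but the limit was hit first.
# d2[:count] == [9.0, 4.0]: squared distances, in discovery order (unsorted).

# Sort afterwards and convert to actual distances only if they are needed.
order = sorted(range(count), key=lambda k: d2[k])
actual_dists = [math.sqrt(d2[k]) for k in order]  # [2.0, 3.0]
sorted_idx = [idx[k] for k in order]  # [1, 0]
```

Raising max_count to 3 in this example would return all three points within the radius, at the cost of one extra preallocated slot per query.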