Comments (6)
By the way, I often get the following error in my runs: "Error! startEpsilon should be reduced for the specified range." Do you see any obvious implications of it? Thank you so much!
"got a train set of size (941 * 1024)
got 100 queries
ngt::reconstructGraph: Extract the graph data.
Reconstruction time=0.00072567:0.00114803:0.00380066
original edge size=10
reverse edge size=60
ngt::Graph reconstruction time=0.00719446 (sec)
GraphReconstructor::adjustPaths: graph preparing time=0.000302463 (sec)
GraphReconstructor::adjustPaths extracting removed edge candidates time=0.152378 (sec)
ngt::Path adjustment time=0.182657 (sec)
adjustBaseSearchEdgeSize::Extract queries for GT...
adjustBaseSearchEdgeSize::create GT...
adjustBaseSearchEdgeSize::explore for the mergin 0.2...
adjustBaseSearchEdgeSize::Base edge size=10, distance computation=1.48209
Warning: Cannot adjust the base edge size for mergin 0.2. /home/app/ngt/lib/NGT/Optimizer.h:427: exploreEpsilonForAccuracy:
Error! startEpsilon should be reduced for the specified range.
Try again for the next mergin.
adjustBaseSearchEdgeSize::explore for the mergin 0.25...
adjustBaseSearchEdgeSize::Base edge size=10, distance computation=1.48209
Warning: Cannot adjust the base edge size for mergin 0.25. /home/app/ngt/lib/NGT/Optimizer.h:427: exploreEpsilonForAccuracy:
Error! startEpsilon should be reduced for the specified range.
Try again for the next mergin.
adjustBaseSearchEdgeSize::explore for the mergin 0.3...
adjustBaseSearchEdgeSize::Base edge size=10, distance computation=1.48209
Warning: Cannot adjust the base edge size for mergin 0.3. /home/app/ngt/lib/NGT/Optimizer.h:427: exploreEpsilonForAccuracy:
Error! startEpsilon should be reduced for the specified range.
Try again for the next mergin."
from ngt.
Since the “Byte” type is used in our own applications, it basically works well. Note that “Byte” is implemented as an unsigned variable. If you implement your distance function properly, “Byte” works. According to your logs, you are using NGT with ann-benchmarks, aren't you? If so, the dataset you used (941 objects) is too small, because NGT on ann-benchmarks tries to optimize the search parameters for the specified dataset. However, even if errors occur during the optimization, the default search parameters are used for the benchmarks.
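As a general illustration of why the unsigned note matters for a hand-written distance function, here is a minimal sketch in plain Python. It is not NGT's code; the function names `l2_wrapping` and `l2_widened` are hypothetical, and the `& 0xFF` mask only simulates what happens when a difference is kept in an 8-bit unsigned variable.

```python
import math

def l2_wrapping(a: bytes, b: bytes) -> float:
    # Simulates subtracting in 8-bit unsigned arithmetic:
    # a negative difference wraps around (e.g. 10 - 20 becomes 246).
    return math.sqrt(sum(((x - y) & 0xFF) ** 2 for x, y in zip(a, b)))

def l2_widened(a: bytes, b: bytes) -> float:
    # Widens to a signed type before subtracting, so the sign is preserved.
    # Python ints are already wide, so a plain subtraction is enough here.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

if __name__ == "__main__":
    a = bytes([10, 200])
    b = bytes([20, 100])
    print("wrapping:", l2_wrapping(a, b))
    print("widened: ", l2_widened(a, b))
```

In C or C++ the equivalent fix would be to cast each component to a wider signed type before subtracting.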
from ngt.
Thank you so much for the response!
You are right, I am running ann-benchmarks. I noticed that you updated the Python driver "onng_ngt.py" in that project.
[https://github.com/erikbern/ann-benchmarks/blob/master/ann_benchmarks/algorithms/onng_ngt.py]
For the fit function, could you please give some descriptions to help me understand the logic? Thank you!
from ngt.
> For the fit function, could you please give some descriptions to help me understand the logic? Thank you!
The process is the same as the one described at the following link.
https://github.com/yahoojapan/NGT/blob/master/bin/ngt/README.md#onng
First, create ANNG. Second, convert the ANNG to ONNG.
The conversion consists of three steps:
- Adjust the incoming and outgoing edges.
- Reduce the shortcut edges.
- Optimize the search parameters.
Does that answer your question?
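The two stages described above can be sketched as a pair of command lines. This is only a sketch: the helper `onng_commands` is hypothetical, and the flags are written as I recall them from the ONNG section of the linked README, so please check them against that page before use. The default values are the ones that appear in the benchmark log in this thread (edge 100, outdegree 10, indegree 120, epsilon 0.1). The function only builds the argument lists and does not run anything.

```python
def onng_commands(anng_index, onng_index, data_file, dim,
                  edge_size=100, outdegree=10, indegree=120, epsilon=0.1):
    """Build (but do not run) the two command lines for ANNG -> ONNG.

    The flags are an assumption based on the NGT command-line README.
    """
    # Stage 1: create the ANNG from the vector data
    # (float objects, L2 distance, assumed here for illustration).
    create_anng = [
        "ngt", "create", "-i", "t", "-g", "a", "-S", "0",
        "-e", str(epsilon), "-E", str(edge_size), "-d", str(dim),
        "-o", "f", "-D", "2", anng_index, data_file,
    ]
    # Stage 2: convert the ANNG to an ONNG. Per the description above, this
    # adjusts in/out edges, reduces shortcut edges, and tunes search parameters.
    convert_to_onng = [
        "ngt", "reconstruct-graph", "-m", "S",
        "-o", str(outdegree), "-i", str(indegree),
        anng_index, onng_index,
    ]
    return [create_anng, convert_to_onng]

if __name__ == "__main__":
    for cmd in onng_commands("anng-index", "onng-index", "vectors.dat", 128):
        print(" ".join(cmd))
```

Each list could then be passed to `subprocess.call`, which is how a driver script would typically invoke the `ngt` command.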
from ngt.
Yes, that makes sense!
What confused me was the log output, e.g. print('ONNG: create ANNG') and print('ONNG: ANNG construction time(sec)=' + str(time.time() - t)). In some runs, the output of the subprocess seems to dominate stdout, and the messages I mentioned above do not appear. Maybe this is caused by the subprocess call. But I believe the code was executed successfully, because self.index was set.
I think we are close to closing this issue. Thanks!
from ngt.
The messages from onng_ngt.py do appear in our log below, although they are a little out of order. The difference might be due to the Docker or OS environment.
Trying to instantiate ann_benchmarks.algorithms.onng_ngt.ONNG(['euclidean', 'Float', 0.1, {'edge': 100, 'outdegree': 10, 'indegree': 120}])
ONNG: edge_size=100
ONNG: outdegree=10
ONNG: indegree=120
ONNG: edge_size_for_search=-2
ONNG: epsilon=0.1
ONNG: metric=euclidean
ONNG: object_type=Float
got a train set of size (1000000 * 128)
got 10000 queries
Processed 100000 objects. time= 199.102 (sec)
Processed 200000 objects. time= 449.442 (sec)
Processed 300000 objects. time= 538.9 (sec)
Processed 400000 objects. time= 613.441 (sec)
Processed 500000 objects. time= 627.649 (sec)
Processed 600000 objects. time= 708.047 (sec)
Processed 700000 objects. time= 696.557 (sec)
Processed 800000 objects. time= 779.332 (sec)
Processed 900000 objects. time= 748.134 (sec)
Processed 1000000 objects. time= 837.672 (sec)
ngt::reconstructGraph: Extract the graph data.
Processed 1000000 objects.
Processed 100000 nodes
Processed 200000 nodes
Processed 300000 nodes
Processed 400000 nodes
Processed 500000 nodes
Processed 600000 nodes
Processed 700000 nodes
Processed 800000 nodes
Processed 900000 nodes
Processed 1000000 nodes
Reconstruction time=0.739276:18.1838:7.54052
original edge size=10
reverse edge size=120
ngt::Graph reconstruction time=27.9895 (sec)
GraphReconstructor::adjustPaths: graph preparing time=0.992566 (sec)
GraphReconstructor::adjustPaths extracting removed edge candidates time=719.449 (sec)
ngt::Path adjustment time=905.001 (sec)
adjustBaseSearchEdgeSize::Extract queries for GT...
adjustBaseSearchEdgeSize::create GT...
adjustBaseSearchEdgeSize::explore for the mergin 0.2...
adjustBaseSearchEdgeSize::Base edge size=10, distance computation=2.25153
adjustBaseSearchEdgeSize::Base edge size=20, distance computation=2.2905
Reconstruct Graph: adjust the base search edge size. 10
ngtpy::Index read only!
ONNG: start indexing...
ONNG: # of data=1000000
ONNG: dimensionality=128
ONNG: index=indexes/ONNG-100-10-120
ONNG: create ANNG
ONNG: ANNG construction time(sec)=6211.593377828598
ONNG: degree adjustment
ONNG: degree adjustment time(sec)=951.6256878376007
ONNG: index already exists! indexes/ONNG-100-10-120
ONNG: open time(sec)=5.4101881980896
ONNG: end of fit
Built index in 7168.630221128464
Index size: 2625880.0
Running query argument group 1 of 8...
ONNG: epsilon=0.6
Run 1/1...
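One plausible cause of the out-of-order messages in the log above is output buffering: when stdout is a pipe or a file rather than a terminal, Python usually buffers `print` output, while a child process writes to the same stream on its own schedule. The sketch below reproduces that situation with a hypothetical stand-in driver (the `DRIVER` script and the `captured_lines` helper are illustrative, not part of onng_ngt.py) and shows that `flush=True` keeps the messages in program order.

```python
import subprocess
import sys

# Hypothetical stand-in for a driver that prints around a subprocess call.
DRIVER = r'''
import subprocess, sys
flush = sys.argv[1] == "flush"
print("driver: create ANNG", flush=flush)
subprocess.call([sys.executable, "-c", "print('tool: processed objects')"])
print("driver: construction done", flush=flush)
'''

def captured_lines(mode):
    # Run the driver with its stdout captured through a pipe, as happens
    # when a benchmark harness or Docker collects the logs.
    result = subprocess.run(
        [sys.executable, "-c", DRIVER, mode],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()

if __name__ == "__main__":
    print("with flush:   ", captured_lines("flush"))
    # Without flushing, the driver's lines typically land after the tool's.
    print("without flush:", captured_lines("noflush"))
```

Calling `sys.stdout.flush()` before each subprocess call, or running Python with `-u`, should have a similar effect.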
from ngt.