Hi, is the object type "Byte" well developed and fully tested in the current code? By

For the fit function, could you please give some deions to help me

"Byte" object type about ngt HOT 6 CLOSED

yahoojapan commented on July 17, 2024

"Byte" object type

from ngt.

Comments (6)

chunjiangzhu commented on July 17, 2024

By the way, I often got the following error in my run: "Error! startEpsilon should be reduced for the specified range.". Can you see any obvious implications from it? Thank you so much!

"got a train set of size (941 * 1024)
got 100 queries
ngt::reconstructGraph: Extract the graph data.
Reconstruction time=0.00072567:0.00114803:0.00380066
original edge size=10
reverse edge size=60
ngt::Graph reconstruction time=0.00719446 (sec)
GraphReconstructor::adjustPaths: graph preparing time=0.000302463 (sec)
GraphReconstructor::adjustPaths extracting removed edge candidates time=0.152378 (sec)
ngt::Path adjustment time=0.182657 (sec)
adjustBaseSearchEdgeSize::Extract queries for GT...
adjustBaseSearchEdgeSize::create GT...
adjustBaseSearchEdgeSize::explore for the mergin 0.2...
adjustBaseSearchEdgeSize::Base edge size=10, distance computation=1.48209
Warning: Cannot adjust the base edge size for mergin 0.2. /home/app/ngt/lib/NGT/Optimizer.h:427: exploreEpsilonForAccuracy:
Error! startEpsilon should be reduced for the specified range.
Try again for the next mergin.
adjustBaseSearchEdgeSize::explore for the mergin 0.25...
adjustBaseSearchEdgeSize::Base edge size=10, distance computation=1.48209
Warning: Cannot adjust the base edge size for mergin 0.25. /home/app/ngt/lib/NGT/Optimizer.h:427: exploreEpsilonForAccuracy:
Error! startEpsilon should be reduced for the specified range.
Try again for the next mergin.
adjustBaseSearchEdgeSize::explore for the mergin 0.3...
adjustBaseSearchEdgeSize::Base edge size=10, distance computation=1.48209
Warning: Cannot adjust the base edge size for mergin 0.3. /home/app/ngt/lib/NGT/Optimizer.h:427: exploreEpsilonForAccuracy:
Error! startEpsilon should be reduced for the specified range.
Try again for the next mergin."

from ngt.

masajiro commented on July 17, 2024

Since the “Byte” type is used for our applications, it basically works well. Note that the “Byte” is implemented as an unsigned variable. If you properly implement your distance function, the “Byte” works. According to your logs, you use NGT on ann benchmarks, don’t you? If so, the size of dataset (941 objects) you used is too small, because NGT on ann benchmarks try to optimize search parameters for the specified dataset. However, even though you find errors during the optimization, default search parameters are used for the benchmarks.

from ngt.

chunjiangzhu commented on July 17, 2024

Thank you so much for the response!
You are right, I am running the ann benchmarking. I noticed that you updated the python driver "onng_ngt.py" under that project.
[https://github.com/erikbern/ann-benchmarks/blob/master/ann_benchmarks/algorithms/onng_ngt.py]
For the fit function, could you please give some descriptions to help me understand the logic? Thank you!

from ngt.

masajiro commented on July 17, 2024

For the fit function, could you please give some descriptions to help me understand the logic? Thank you!

The process is the same of the following.
https://github.com/yahoojapan/NGT/blob/master/bin/ngt/README.md#onng
First, create ANNG. Second, convert the ANNG to ONNG.
The conversion consists of

Adjust incoming and outgoing edges
Reduce shortcut edges
Optimize search parameters.

Does that answer your question?

from ngt.

chunjiangzhu commented on July 17, 2024

Yes, that makes sense!

What confused me was the log information, e.g. print('ONNG: create ANNG'), print('ONNG: ANNG construction time(sec)=' + str(time.time() - t)). In some runs, the log of the subprocess seems to dominate the stdout, and the log I mentioned above does not appear. Maybe this is because of the subprocess calling. But I believe the code were executed successfully because the self.index was successfully set.

I think we are approaching to close this issue. Thanks!

from ngt.

masajiro commented on July 17, 2024

There are messages of the onng_ngt.py in our log below, although they are a little out of order. The difference might be due to the docker or os environment.

Trying to instantiate ann_benchmarks.algorithms.onng_ngt.ONNG(['euclidean', 'Float', 0.1, {'edge': 100, 'outdegree': 10, 'indegree': 120}])
ONNG: edge_size=100
ONNG: outdegree=10
ONNG: indegree=120
ONNG: edge_size_for_search=-2
ONNG: epsilon=0.1
ONNG: metric=euclidean
ONNG: object_type=Float
got a train set of size (1000000 * 128)
got 10000 queries
Processed 100000 objects. time= 199.102 (sec)
Processed 200000 objects. time= 449.442 (sec)
Processed 300000 objects. time= 538.9 (sec)
Processed 400000 objects. time= 613.441 (sec)
Processed 500000 objects. time= 627.649 (sec)
Processed 600000 objects. time= 708.047 (sec)
Processed 700000 objects. time= 696.557 (sec)
Processed 800000 objects. time= 779.332 (sec)
Processed 900000 objects. time= 748.134 (sec)
Processed 1000000 objects. time= 837.672 (sec)
ngt::reconstructGraph: Extract the graph data.
Processed 1000000 objects.
Processed 100000 nodes
Processed 200000 nodes
Processed 300000 nodes
Processed 400000 nodes
Processed 500000 nodes
Processed 600000 nodes
Processed 700000 nodes
Processed 800000 nodes
Processed 900000 nodes
Processed 1000000 nodes
Reconstruction time=0.739276:18.1838:7.54052
original edge size=10
reverse edge size=120
ngt::Graph reconstruction time=27.9895 (sec)
GraphReconstructor::adjustPaths: graph preparing time=0.992566 (sec)
GraphReconstructor::adjustPaths extracting removed edge candidates time=719.449 (sec)
ngt::Path adjustment time=905.001 (sec)
adjustBaseSearchEdgeSize::Extract queries for GT...
adjustBaseSearchEdgeSize::create GT...
adjustBaseSearchEdgeSize::explore for the mergin 0.2...
adjustBaseSearchEdgeSize::Base edge size=10, distance computation=2.25153
adjustBaseSearchEdgeSize::Base edge size=20, distance computation=2.2905
Reconstruct Graph: adjust the base search edge size. 10
ngtpy::Index read only!
ONNG: start indexing...
ONNG: # of data=1000000
ONNG: dimensionality=128
ONNG: index=indexes/ONNG-100-10-120
ONNG: create ANNG
ONNG: ANNG construction time(sec)=6211.593377828598
ONNG: degree adjustment
ONNG: degree adjustment time(sec)=951.6256878376007
ONNG: index already exists! indexes/ONNG-100-10-120
ONNG: open time(sec)=5.4101881980896
ONNG: end of fit
Built index in 7168.630221128464
Index size:  2625880.0
Running query argument group 1 of 8...
ONNG: epsilon=0.6
Run 1/1...

from ngt.

"Byte" object type about ngt HOT 6 CLOSED

Comments (6)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent