Comments (2)
I also encountered the same issue: QPS and recall didn't change as expected when I varied the search parameters. Here is the code I tested, using the default configuration for the sift1m dataset:
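import os
import shutil
import time

import numpy as np
import SPTAG
from tqdm import tqdm

vector_number = 100000
vector_dimension = 1000
x = np.random.rand(vector_number, vector_dimension).astype(np.float32)
q = np.random.rand(1000, vector_dimension).astype(np.float32)
# Metadata: one line per vector, holding that vector's id
m = ''
for i in range(vector_number):
    m += str(i) + '\n'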
index = SPTAG.AnnIndex('SPANN', 'Float', vector_dimension)
index.SetBuildParam("IndexAlgoType", "BKT", "Base")
index.SetBuildParam("IndexDirectory", "spann_index", "Base")
index.SetBuildParam("DistCalcMethod", "L2", "Base")
index.SetBuildParam("isExecute", "true", "SelectHead")
index.SetBuildParam("NumberOfThreads", '64', "SelectHead")
index.SetBuildParam("Ratio", "0.16", "SelectHead") # index.SetBuildParam("Count", "200", "SelectHead")
index.SetBuildParam("TreeNumber", "1", "SelectHead")
index.SetBuildParam("BKTKmeansK", "32", "SelectHead")
index.SetBuildParam("BKTLeafSize", "8", "SelectHead")
index.SetBuildParam("SaveBKT", "false", "SelectHead")
index.SetBuildParam("SplitFactor", "6", "SelectHead")
index.SetBuildParam("SplitThreshold", "100", "SelectHead")
index.SetBuildParam("BKTLambdaFactor", "-1", "SelectHead")
index.SetBuildParam("SamplesNumber", "1000", "SelectHead")
index.SetBuildParam("SelectThreshold", "50", "SelectHead")
index.SetBuildParam("isExecute", "true", "BuildHead")
index.SetBuildParam("NeighborhoodSize", "32", "BuildHead")
index.SetBuildParam("TPTNumber", "32", "BuildHead")
index.SetBuildParam("TPTLeafSize", "2000", "BuildHead")
index.SetBuildParam("MaxCheck", "8192", "BuildHead")
index.SetBuildParam("MaxCheckForRefineGraph", "8192", "BuildHead")
index.SetBuildParam("RefineIterations", "3", "BuildHead")
index.SetBuildParam("NumberOfThreads", "64", "BuildHead")
index.SetBuildParam("BKTLambdaFactor", "-1", "BuildHead")
index.SetBuildParam("isExecute", "true", "BuildSSDIndex")
index.SetBuildParam("BuildSsdIndex", "true", "BuildSSDIndex")
index.SetBuildParam("InternalResultNum", "64", "BuildSSDIndex")
index.SetBuildParam("ReplicaCount", "8", "BuildSSDIndex")
index.SetBuildParam("PostingPageLimit", "12", "BuildSSDIndex")
index.SetBuildParam("NumberOfThreads", "64", "BuildSSDIndex")
index.SetBuildParam("MaxCheck", "8192", "BuildSSDIndex")
if os.path.exists("spann_index"):
    shutil.rmtree("spann_index")
print ("Build.............................")
st = time.time()
index.BuildWithMetaData(x, m, vector_number, False, False)
et = time.time()
build_time = et - st
print("Build time : ", build_time)
maxcheck = [100, 200, 400, 1000, 2000]
searchPostingPageLimit = [1, 5, 10, 40, 100]
# mc is the MaxCheck value for this run (m already holds the metadata string)
for mc in maxcheck:
    for s in searchPostingPageLimit:
        index.SetSearchParam("isExecute", "true", "SearchSSDIndex")
        index.SetSearchParam("BuildSsdIndex", "false", "SearchSSDIndex")
        index.SetSearchParam("InternalResultNum", "32", "SearchSSDIndex")
        index.SetSearchParam("NumberOfThreads", "4", "SearchSSDIndex")
        index.SetSearchParam("HashTableExponent", "4", "SearchSSDIndex")
        index.SetSearchParam("ResultNum", "10", "SearchSSDIndex")
        index.SetSearchParam("MaxCheck", str(mc), "SearchSSDIndex")
        index.SetSearchParam("MaxDistRatio", "10000", "SearchSSDIndex")
        index.SetSearchParam("SearchPostingPageLimit", str(s), "SearchSSDIndex")
        st = time.time()
        for t in tqdm(range(q.shape[0])):
            result = index.SearchWithMetaData(q[t], 3)  # Search k=3 nearest vectors for query vector q
        et = time.time()
        search_time = et - st
        print(f"{mc}/{s} Search time : ", search_time)
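The loop above only prints wall-clock time, so here is a rough sketch of how QPS and recall could be computed for each parameter pair. It compares the returned ids against an exact brute-force ground truth built with NumPy. The helper names (brute_force_topk, recall_at_k, measure) and the search_fn callback are my own for illustration, not part of SPTAG.

```python
import time

import numpy as np


def brute_force_topk(base, queries, k):
    """Exact k nearest neighbours under L2; returns one row of ids per query."""
    # ||q - x||^2 = ||q||^2 - 2 q.x + ||x||^2; ||q||^2 is constant per query
    # row, so dropping it does not change the ranking.
    d = (base ** 2).sum(axis=1)[None, :] - 2.0 * (queries @ base.T)
    return np.argsort(d, axis=1)[:, :k]


def recall_at_k(found_ids, true_ids):
    """Mean fraction of the true neighbours that appear in the returned ids."""
    hits = 0
    for found, truth in zip(found_ids, true_ids):
        hits += len(set(map(int, found)) & set(map(int, truth)))
    return hits / (len(true_ids) * len(true_ids[0]))


def measure(search_fn, queries, true_ids):
    """Run search_fn (hypothetical callback: vector -> ids) on every query.

    Returns (queries per second, recall against true_ids).
    """
    start = time.perf_counter()
    found = [search_fn(v) for v in queries]
    elapsed = time.perf_counter() - start
    return len(queries) / elapsed, recall_at_k(found, true_ids)
```

With the script above this might be used as truth = brute_force_topk(x, q, 3) once before the loops, then qps, recall = measure(lambda v: index.SearchWithMetaData(v, 3)[0], q, truth) inside them. That assumes the first element of the search result is the list of ids, as in the SPTAG Python examples; I have not verified it for SPANN indexes. Note that for 1000 queries against 100000 vectors the float32 distance matrix takes roughly 400 MB, so the ground truth may need to be computed in batches.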
Hi, I encountered the same issue. Have you figured out the reason?
Thanks in advance!