Benchmark for vector indexing and searching #2481
Comments
Thanks for helping benchmark it! I'll fix the bug ASAP
Hiii @PragmaTwice, I have some updates on the issue. <3
I tried debugging using:
➜ kvrocks git:(unstable) ✗ gdb ./build/kvrocks
(gdb) run -c kvrocks.conf
After running
From line 13, we can see that there is a closure attempting to access a However, logically, there should be no situation where
Given this, I guess addressing the consistency issue will probably resolve the issue here in turn. I saw that PR #2310 aims to solve this. Should we come back to this issue after that PR is finalized and merged? Let me know if you think there are other possibilities or solutions we should consider, as I'm not entirely certain this is the underlying cause. Thanks!
Thank you for looking into this! I agree that it's likely related to RocksDB concurrency and transaction management. However, besides ensuring the consistency of RocksDB snapshots, do we need to consider anything else when constructing HNSW indexes from multiple threads? Should we also introduce some locks?
Yeah, introducing locks sounds good to me. I also think this is probably a common issue for all indexes rather than only HNSW (including the text index in the future, if it also uses a fairly complicated algorithm), so we probably want a lock mechanism that applies to all of them. I'll look into it over the next two days and open a new issue to discuss my idea and ask for suggestions if needed.
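To make the "one lock mechanism for all indexes" idea a bit more concrete, here is a minimal sketch in Python. It is not kvrocks code and not a proposal for the actual design: `IndexLockRegistry`, `insert_vector`, and `demo` are hypothetical names, and the dict-based "graph" is only a toy stand-in for a multi-step index update such as an HNSW insert. The point it illustrates is that writers to the same index are serialized by a per-index mutex, regardless of the index type.

```python
import threading
from collections import defaultdict
from contextlib import contextmanager


class IndexLockRegistry:
    """Hypothetical helper: one mutex per index name, shared by all index types."""

    def __init__(self):
        self._guard = threading.Lock()  # protects the lock table itself
        self._locks = defaultdict(threading.Lock)

    @contextmanager
    def exclusive(self, index_name):
        # Look up (or create) the lock for this index under the guard,
        # then hold only the per-index lock while the caller works.
        with self._guard:
            lock = self._locks[index_name]
        with lock:
            yield


def insert_vector(registry, graph, index_name, key, neighbors):
    """Toy stand-in for an index insert: a multi-step read-modify-write."""
    with registry.exclusive(index_name):
        nodes = graph.setdefault(index_name, {})
        nodes[key] = list(neighbors)
        # Add the back-links; without the lock, concurrent writers could
        # observe a half-updated neighbor list.
        for n in neighbors:
            nodes.setdefault(n, []).append(key)


def demo(num_threads=8, per_thread=200):
    """Insert from several threads into one index and return the toy graph."""
    registry, graph = IndexLockRegistry(), {}

    def worker(t):
        for i in range(per_thread):
            insert_vector(registry, graph, "idx", f"k{t}:{i}", ["entry"])

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(num_threads)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return graph
```

A coarse per-index mutex like this trades write parallelism for simplicity; whether that is acceptable, and how it should interact with RocksDB snapshots and transactions, is exactly what would need discussing in the follow-up issue.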
Search before asking
Motivation
We can try to use this repo: https://github.com/qdrant/vector-db-benchmark
After some simple patching:
We can start a kvrocks instance and run this benchmark:
python3 run.py --engines redis-default --datasets 'glove-25-angular'
cc @Beihao-Zhou
Solution
Currently we get some core dumps in the vector indexing phase, e.g.:
Maybe we can try to solve it first.
Are you willing to submit a PR?