Comments (2)
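When using bge-large-zh embeddings with the vector store and the Temperature parameter set to 0.1, the right knowledge-match score threshold depends mainly on your specific needs. The threshold controls how strict retrieval is: a lower threshold (such as 0.4) retrieves more data but may include less relevant results, while a higher threshold (such as 0.7) retrieves less data but with higher relevance.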
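With a bge-large-zh vector store, the knowledge-match score is computed by the FAISS library using the inner-product distance strategy together with L2 normalization. In other words, the embedding vectors are L2-normalized before the similarity score is computed, and the score itself is the inner product between the vectors. These scores are then used to filter and select the matching knowledge entries.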
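The scoring described above can be illustrated with a minimal sketch. It uses plain NumPy rather than FAISS itself, and the helper names `l2_normalize` and `inner_product_scores` are hypothetical, not part of Langchain-Chatchat. It only shows that once vectors are L2-normalized, the inner product equals cosine similarity and therefore falls between -1 and 1.

```python
import numpy as np

def l2_normalize(vectors):
    """Scale each row to unit L2 norm (guarding against zero vectors)."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.clip(norms, 1e-12, None)

def inner_product_scores(query, docs):
    """Inner product between a normalized query and normalized documents.

    With unit-length vectors this is the cosine similarity.
    """
    q = l2_normalize(query.reshape(1, -1))
    d = l2_normalize(docs)
    return (d @ q.T).ravel()

# Toy data standing in for real embeddings (assumption: 8 dimensions).
rng = np.random.default_rng(0)
docs = rng.normal(size=(5, 8))
query = docs[2] * 3.0  # same direction as docs[2], different length

scores = inner_product_scores(query, docs)
best = int(np.argmax(scores))
```

Because normalization removes vector length, the query that points in the same direction as `docs[2]` scores highest against it, regardless of its magnitude.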
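There is no fixed rule for choosing the threshold. A reasonable approach is to start experimenting from a mid-range value and adjust it step by step based on the quality of the retrieved data. For example, if a threshold of 0.7 returns too little data or none at all, try lowering it, say to 0.4, and check whether more relevant data is retrieved. The goal is to find a balance point that retrieves enough data while keeping it relevant.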
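The tuning loop described above could be sketched as follows. This is an illustration only: it assumes scores are similarities where higher means more relevant, as the answer describes, and the function names and sample scores are made up for the example rather than taken from the project.

```python
def filter_by_threshold(scored_docs, threshold):
    """Keep only (doc, score) pairs whose score meets the threshold."""
    return [(doc, score) for doc, score in scored_docs if score >= threshold]

def sweep_thresholds(scored_docs, thresholds):
    """Count how many documents survive at each candidate threshold."""
    return {t: len(filter_by_threshold(scored_docs, t)) for t in thresholds}

# Hypothetical retrieval results for one query: (document id, score).
scored_docs = [
    ("doc_a", 0.82),
    ("doc_b", 0.66),
    ("doc_c", 0.51),
    ("doc_d", 0.38),
]

counts = sweep_thresholds(scored_docs, [0.4, 0.55, 0.7])
for threshold, kept in counts.items():
    print(f"threshold={threshold}: {kept} document(s) retrieved")
```

Running a sweep like this over a handful of representative questions makes it easier to see where retrieval starts returning nothing and where it starts admitting weak matches.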
I hope this helps! If you have any other questions, feel free to ask.
Sources
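- [FEATURE] When using bge-series embeddings, a SCORE_THRESHOLD range of 0-1 cannot meet requirements
- How to improve the accuracy of retrieving the matching knowledge for a question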
- server/agent/tools/search_knowledgebase_once.py
- configs/kb_config.py.example
- server/knowledge_base/kb_service/faiss_kb_service.py
- server/chat/knowledge_base_chat.py
- server/knowledge_base/kb_cache/faiss_cache.py
- server/chat/completion.py
- server/reranker/reranker.py
from langchain-chatchat.
Temperature has nothing to do with the database, does it?