Comments (3)
To address the issues you're experiencing with Langchain-ChatGLM, particularly after enabling VLLM acceleration, consider the following steps:
- Model Loading: Ensure the model is fully loaded before it starts serving requests. Implement a health check that confirms the model's readiness. This can help avoid initial call failures caused by the model not yet being loaded.
- Concurrency Settings: The InvalidChunkLength errors suggest a need to adjust how concurrent requests are handled. Experiment with the limit_worker_concurrency parameter to find an optimal setting for your workload.
- VLLM Configuration: Since the issue appears only after enabling VLLM acceleration, closely review and adjust VLLM-related configurations such as tensor_parallel_size, block_size, swap_space, and gpu_memory_utilization.
- System Resources: Monitor your system resources (CPU, GPU, memory) and make sure they are not overwhelmed by the load, as this can lead to errors.
- Error Handling and Retries: Implement error handling and retry mechanisms for intermittent InvalidChunkLength errors, which can improve application stability.
- Software Updates: Consider updating to the latest version of Langchain-ChatGLM if available, as it may contain fixes and improvements for your issues.
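As a rough illustration of the concurrency and VLLM settings named in the list, the sketch below collects them in a plain dictionary and runs a basic sanity check. The dictionary name, the helper function, and every value are assumptions chosen for illustration, not recommended settings; the real keys and defaults should be checked against configs/server_config.py.example in your installed version.

```python
# Hypothetical settings block. Only the key names come from the answer above;
# the values are placeholders to be tuned for your hardware and workload.
vllm_worker_settings = {
    "limit_worker_concurrency": 5,   # max requests a worker handles at once
    "tensor_parallel_size": 1,       # number of GPUs the model is split across
    "block_size": 16,                # KV-cache block size, in tokens
    "swap_space": 4,                 # CPU swap space per GPU, in GiB
    "gpu_memory_utilization": 0.90,  # fraction of GPU memory vLLM may use
}


def check_vllm_settings(settings):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if settings["limit_worker_concurrency"] < 1:
        problems.append("limit_worker_concurrency must be at least 1")
    if settings["tensor_parallel_size"] < 1:
        problems.append("tensor_parallel_size must be at least 1")
    if settings["block_size"] < 1:
        problems.append("block_size must be a positive number of tokens")
    if settings["swap_space"] < 0:
        problems.append("swap_space cannot be negative")
    if not 0 < settings["gpu_memory_utilization"] <= 1:
        problems.append("gpu_memory_utilization must be in (0, 1]")
    return problems


if __name__ == "__main__":
    for problem in check_vllm_settings(vllm_worker_settings):
        print("config problem:", problem)
```

Changing one value at a time and re-running the same request load makes it easier to see which setting, if any, affects the InvalidChunkLength errors.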
These steps are derived from analyzing the provided configuration and the issues you're facing. Adjusting these areas should help mitigate the problems with initial call failures and intermittent InvalidChunkLength errors.
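The readiness check and the retry advice can be sketched on the client side as follows. This is a minimal sketch using only the standard library: wait_until_ready and call_with_retries are hypothetical helper names, and the probe and call arguments stand in for whatever request your client actually sends. With the requests library, a broken chunked response typically surfaces as requests.exceptions.ChunkedEncodingError, which could be passed in retry_on; that mapping is an assumption to verify in your own setup.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def wait_until_ready(probe: Callable[[], bool],
                     timeout: float = 120.0,
                     interval: float = 2.0) -> bool:
    """Poll probe() until it returns True or `timeout` seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            if probe():
                return True
        except Exception:
            pass  # a failing probe is treated as "not ready yet"
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def call_with_retries(call: Callable[[], T],
                      retry_on: Tuple[Type[BaseException], ...],
                      attempts: int = 3,
                      base_delay: float = 0.5) -> T:
    """Run call(), retrying on the given exception types with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return call()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the last error
            time.sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

A caller would first block on wait_until_ready with a probe that sends a tiny request to the model worker, then wrap each chat request in call_with_retries. Retrying hides intermittent failures but does not remove their cause, so it is worth logging each retried error.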
Details
Sources
- Error when using vllm acceleration
- startup.py
- configs/server_config.py.example
- webui_pages/utils.py
- tests/api/test_stream_chat_api_thread.py
from langchain-chatchat.
Same here for problem 1. It only happens on the first call.
This project does not implement concurrency. You need to handle concurrency-related issues yourself.