Comments (5)

medwang1 commented on August 16, 2024

#1619

These look like the same problem to me: when the content is very long and the concurrency is high, this issue gets triggered.

from lmdeploy.

zhyncs commented on August 16, 2024

Please provide the code for the client that can be used for reproduction, thanks.

josephrocca commented on August 16, 2024

I will do my best to get a reliable reproduction of this detokenize issue. Possibly not related, since there don't seem to be any tokenizer issues in the logs, but maybe worth referencing since it has the same "an illegal memory access" message:

medwang1 commented on August 16, 2024

@zhyncs It can be reproduced like this:

wrk -t10 -c100 -d30s -s 01_post.lua --latency http://0.0.0.0:8081/v1/chat/completions

01_post.lua file:

wrk.method = "POST"
wrk.body = [[
	{
		"model": "yi",
		"temperature": 0.7,
		"messages": [
			{
				"role": "user",
				"content": "worker_rlimit_nofile 是一个在 Nginx 或其他基于 Unix-like 系统的 Web 服务器配置中的指令,用于设置工作进程可以打开的最大文件描述符数。这个设置对于服务器性能有重要影响,因为它决定了服务器可以同时处理多少个并发连接。在这里,655350 是设置的具体数值。这个数值设置的相当高,意味着服务器配置了非常高的并发处理能力。在 Unix-like 系统中,文件描述符用于访问所有类型的文件,包括网络套接字。因此,增加这个限制可以让服务器处理更多的并发请求,特别是对于需要处理大量静态文件或者提供大量 Web 服务的场景。设置这个值通常需要服务器管理员有适当的权限,并且可能需要在系统级别进行相应的调整,因为操作系统也有自己的限制。在实际应用中,服务器管理员需要根据服务器的硬件资源、预期的负载以及实际的应用场景来合理设置这个值,以确保服务器既能充分利用资源,又不会因为超过系统限制而导致性能问题。"
			}
		],
		"stream": false,
		"max_tokens": 0
	}]]
wrk.headers["Content-Type"] = "application/json"

The model was deployed with:
CUDA_VISIBLE_DEVICES=0 lmdeploy serve api_server ./Yi-1.5-9B-Chat --server-port 8081 --model-name yi --cache-max-entry-count 0.9 --tp 1 --session-len 4096 --enable-prefix-caching
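
For anyone without wrk, the same load pattern can be approximated with a small Python client. This is only a rough sketch using the standard library: the helper names (`build_payload`, `post_once`, `run_load`) are made up for illustration, the request body mirrors 01_post.lua, and a pool of 100 threads is assumed to be a close enough stand-in for wrk's `-c100`. It has not been confirmed to trigger the crash.

```python
import json
import urllib.error
import urllib.request
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def build_payload(content, model="yi", max_tokens=0):
    """Build the same chat-completions body as 01_post.lua, as UTF-8 bytes."""
    body = {
        "model": model,
        "temperature": 0.7,
        "messages": [{"role": "user", "content": content}],
        "stream": False,
        "max_tokens": max_tokens,
    }
    # ensure_ascii=False keeps the long Chinese prompt as raw UTF-8.
    return json.dumps(body, ensure_ascii=False).encode("utf-8")


def post_once(url, payload, timeout=60.0):
    """Send one POST; return the HTTP status code, or the error class name."""
    req = urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with a 4xx/5xx status.
        return exc.code
    except Exception as exc:
        # Connection refused, timeout, reset, and so on.
        return type(exc).__name__


def run_load(url, payload, concurrency=100, total=1000):
    """Fire `total` requests from `concurrency` threads and tally the outcomes."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = pool.map(lambda _: post_once(url, payload), range(total))
        return Counter(results)
```

Calling `run_load("http://0.0.0.0:8081/v1/chat/completions", build_payload(long_prompt))`, with `long_prompt` set to the content string from 01_post.lua, returns a tally of status codes and error names. Unlike wrk, this sends a fixed number of requests rather than running for a fixed 30 seconds, and it does not reuse connections.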

lvhan028 commented on August 16, 2024

@lzhangzz could you please investigate this issue?
