Comments (3)
Do you see the same results using vLLM directly?
We do not maintain the LangChain integration, so I don't know how it's implemented.
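For reference, a minimal sketch of querying vLLM directly through its offline Python API, bypassing the server entirely (the sampling settings here are illustrative assumptions, not values from this thread):

from vllm import LLM, SamplingParams

# Load the same model with vLLM's offline inference API (no server involved)
llm = LLM(model="meta-llama/Meta-Llama-3-8B-Instruct")
params = SamplingParams(temperature=0.7, max_tokens=256)

# Multi-line prompts are passed through as-is
prompt = "Hello, world!\nThis is me Mario!\nI am a bot!\nHere's my brother luigi"
outputs = llm.generate([prompt], params)
print(outputs[0].outputs[0].text)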
Yes, when I used:

from openai import OpenAI

def process_request(question: str) -> str:
    # Query the vLLM OpenAI-compatible server directly with the OpenAI client
    client = OpenAI(
        base_url="http://151.106.13.150:8088/v1",
        api_key="EMPTY",
    )
    completion = client.chat.completions.create(
        model="meta-llama/Meta-Llama-3-8B-Instruct",
        messages=[{"role": "user", "content": question}],
    )
    return completion.choices[0].message.content

multiple_lines_text_2 = "Hello, world!\nThis is me Mario!\nI am a bot!\nHere's my brother luigi"
print(process_request(multiple_lines_text_2))
it returns normal results:
Welcome to the world, Mario!
Ah, you're a bot, eh? That's-a cool!
And you've brought your brother Luigi along, too! The dynamic duo of the Mushroom Kingdom!
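For comparison, a minimal sketch of the equivalent request routed through LangChain's OpenAI integration and pointed at the same vLLM server (assuming the langchain-openai package; this is the path the original report went through, not code from the thread):

from langchain_openai import ChatOpenAI

# Point LangChain's ChatOpenAI wrapper at the vLLM OpenAI-compatible server
llm = ChatOpenAI(
    base_url="http://151.106.13.150:8088/v1",
    api_key="EMPTY",
    model="meta-llama/Meta-Llama-3-8B-Instruct",
)

multiple_lines_text_2 = "Hello, world!\nThis is me Mario!\nI am a bot!\nHere's my brother luigi"
response = llm.invoke(multiple_lines_text_2)
print(response.content)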
Closing, since this seems to be an issue in LangChain rather than on our end. Please open an issue in LangChain's repo instead.
Related Issues (20)
- [Bug]: errors when loading mixtral 8x7b
- [Bug]: The MixtralForCausalLM architecture and the mistralai/Mixtral-8x7B-Instruct-v0.1 model are stated to be supported by vLLM, but an error occurs during model loading.
- [Bug]: Unable to use fp8 kv cache with chunked prefill on ampere
- [Doc]: AutoAWQ quantization example fails
- [Bug]: Error loading microsoft/Phi-3.5-vision-instruct
- [Bug]: torch.OutOfMemoryError: CUDA out of memory
- [Bug]: Using CPU for inference, an error occurred. [Engine iteration timed out. This should never happen!]
- [Usage]: How to use FP8 or other quantization algorithms for Minicpmv2_6
- [Bug]: Unexpected non-determinism with vLLM 0.5.4 and Llama 3.1
- [New Model]: MiniCPM-V-2_6-int4
- [Usage]: Potential Hardware Failure when running vllm
- [New Model]: ValueError: Model architectures ['PhiMoEForCausalLM'] are not supported for now
- [Bug]: vLLM server not supporting stabilityai/stablelm-3b-4e1t model on CPU
- [Usage]: Is there an option to obtain attention matrices during inference, similar to the output_attentions=True parameter in the transformers package?
- [Usage]: About bitsandbytes
- [Feature]: phi-3.5 is a strong model for its size, including vision support. Has multi-image support, but vllm does not support
- [Usage]: Wait for the response for each prediction
- [Bug]: Requesting Prompt Logprobs with an MLP Speculator Crashes the Server
- [Installation]: Failed to build vLLM from source due to https://github.com/vllm-project/vllm/pull/7174 by bisecting the most recent changes
- [Bug]: llama 3.1 405B RuntimeError: start (1024) + length (256) exceeds dimension size (1024)