Comments (14)
@tsantra We are trying to reproduce the error. Would you please share the code above? :)
As `ipex_llm.langchain.llms.TransformersLLM` does not support loading LLaVA directly, you could follow these steps instead:

- Follow the LLaVA repo guide to load the model.
- Optimize the model with `ipex_llm.optimize_model()` to transform it to low-bit.
- Integrate it with `TransformersLLM` by passing it as the `model` parameter, so that you can use it in langchain.
Example code:

```python
from ipex_llm import optimize_model

# Load the pretrained model.
# Adapted from LLaVA.llava.model.builder.load_pretrained_model.
def load_pretrained_model(model_path, model_base, model_name, load_8bit=False, load_4bit=False,
                          device_map="auto", device="cpu"):
    # Refer to the LLaVA repo to load the model
    ......
    model = optimize_model(model)

from ipex_llm.langchain.llms import TransformersLLM
llm = TransformersLLM(model_id="", model=model, tokenizer=tokenizer)
```
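Once wrapped this way, the llm behaves like any other LangChain LLM. A minimal usage sketch (the prompt string is just an assumed example, and `.invoke` assumes a langchain version where LLMs expose the Runnable interface; older versions call `llm(...)` directly):

```python
# Assumed example prompt; `llm` comes from the snippet above.
response = llm.invoke("Describe this model's capabilities in one sentence.")
print(response)
```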
@ivy-lv11 Thank you! This works!

Finally, I want to use the llm in a RAG pipeline. Using `llm = TransformersLLM(model_id="", model=model, tokenizer=tokenizer)` is not generating any output for me. My code uses a base64-encoded image as input, as this is the way Ollama works, and my solution works with the LLaVA model from Ollama. Now I am trying to use the IPEX-LLM LLaVA model instead of the one from Ollama. Here is my code:
```python
# Imports (assumed module paths for langchain >= 0.1)
import base64
import io
import os

from PIL import Image
from langchain_community.vectorstores import Chroma
from langchain_core.messages import HumanMessage
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableLambda, RunnablePassthrough
from langchain_experimental.open_clip import OpenCLIPEmbeddings

# Create chroma
vectorstore = Chroma(
    collection_name="mm_rag_clip_photos",
    embedding_function=OpenCLIPEmbeddings(),
    persist_directory="./chroma_test_ipex_llm/",
)
vectorstore.persist()

# Get image URIs with .png extension only
image_uris = sorted(
    [
        os.path.join(output_folder, image_name)
        for image_name in os.listdir(output_folder)
        if image_name.endswith(".png")
    ]
)

with open(text_path, 'r') as textfile:
    texts = textfile.readlines()

# Add images
vectorstore.add_images(uris=image_uris)

# Add documents
vectorstore.add_texts(texts=texts)

# Make retriever
retriever = vectorstore.as_retriever(search_kwargs={"k": 1})
```
```python
def resize_base64_image(base64_string, size=(128, 128)):
    """
    Resize an image encoded as a Base64 string.

    Args:
        base64_string (str): Base64 string of the original image.
        size (tuple): Desired size of the image as (width, height).

    Returns:
        str: Base64 string of the resized image.
    """
    # Decode the Base64 string
    img_data = base64.b64decode(base64_string)
    img = Image.open(io.BytesIO(img_data))
    # Resize the image
    resized_img = img.resize(size, Image.LANCZOS)
    # Save the resized image to a bytes buffer
    buffered = io.BytesIO()
    resized_img.save(buffered, format=img.format)
    # Encode the resized image to Base64
    return base64.b64encode(buffered.getvalue()).decode("utf-8")


def is_base64(s):
    """Check if a string is Base64 encoded."""
    try:
        return base64.b64encode(base64.b64decode(s)) == s.encode()
    except Exception:
        return False
```
```python
def split_image_text_types(docs):
    """Split base64-encoded images and texts."""
    images = []
    text = []
    for doc in docs:
        doc = doc.page_content  # Extract Document contents
        if is_base64(doc):
            print("found image doc")
            # Resize image to avoid OAI server error
            images.append(
                resize_base64_image(doc, size=(250, 250))
            )  # base64 encoded str
        else:
            text.append(doc)
    return {"images": images, "texts": text}


def prompt_func(data_dict):
    # Joining the context texts into a single string
    # NOTE: formatted_texts is built but never added to the message below.
    formatted_texts = "\n".join(data_dict["context"]["texts"])
    messages = []

    # Adding image(s) to the messages if present
    if data_dict["context"]["images"]:
        for x in data_dict["context"]["images"]:
            image_message = {
                "type": "image_url",
                "image_url": f"data:image/jpeg;base64,{x}",
            }
            messages.append(image_message)

    # Adding the text message for analysis
    text_message = {
        "type": "text",
        "text": "\n You are an AI Assistant for summarizing videos.\n",
    }
    messages.append(text_message)
    return [HumanMessage(content=messages)]
```
```python
# RAG pipeline
chain = (
    {
        "context": retriever | RunnableLambda(split_image_text_types),
        "question": RunnablePassthrough(),
    }
    | RunnableLambda(prompt_func)
    | llm
    | StrOutputParser()
)

d = chain.invoke("what are the key takeaways from the images?")
```
Could you please help? How do I format the image and text input for the model?
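One possible direction to check (a sketch, not a confirmed fix from this thread): `TransformersLLM` ultimately feeds a plain string to the tokenizer, so the OpenAI-style content list built in `prompt_func` likely never reaches the vision tower. Driving LLaVA directly through the Hugging Face `llava-hf` checkpoint and its processor shows the input format the model expects, with a literal `<image>` token in the text prompt; the checkpoint name and the `b64_str` variable below are assumptions:

```python
# Sketch: call LLaVA directly via transformers, bypassing TransformersLLM.
import base64, io
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-7b-hf"  # assumed checkpoint
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(model_id)

# b64_str is an assumed base64-encoded image, e.g. one returned by the retriever.
img = Image.open(io.BytesIO(base64.b64decode(b64_str)))
# LLaVA-1.5 expects a literal <image> token in the text prompt, one per image.
prompt = "USER: <image>\nWhat are the key takeaways from this image?\nASSISTANT:"
inputs = processor(text=prompt, images=img, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(output[0], skip_special_tokens=True))
```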
Hi, we will try to reproduce your error.
Hi @lzivan, could you please let me know if you have any update? Thank you very much.
Hi @tsantra, we are now debugging the "load_image" issue and will get back to you once we have a solution.
Hi @lzivan, could you please let me know if there is any update? Thanks a lot!
We are still getting "nobody knows" for the output. I'm not sure whether LLaVA is compatible with Langchain.
We tried on our RTX machine, using a local LLaVA-1.5-7B model, and still get the same "nobody knows" output.
@lzivan LLaVA works fine with Langchain using Ollama. So LLaVA is compatible.
@lzivan what is the "nobody knows" output?
> @lzivan LLaVA works fine with Langchain using Ollama. So LLaVA is compatible.

If (official) Ollama works okay, could you please try IPEX-LLM enabled Ollama on GPU and see if it works? For how to install IPEX-LLM enabled Ollama and run `ollama serve` on GPU (e.g. iGPU and Arc), refer to this guide.

IPEX-LLM enabled Ollama is used the same way as official Ollama, so you don't have to change your langchain code. Just remember to change the `base_url` if you're running `ollama serve` on another machine. For example, if you're running `ollama serve` on `your_machine_ip`, set the Ollama `base_url` as below in your langchain code:
```python
from langchain_community.llms import Ollama

llm = Ollama(
    base_url='http://your_machine_ip:11434',
    model="llava"
)
```
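Since the question involves images, note that with the Ollama integration the base64 image is usually attached via `bind` rather than embedded in the prompt. A sketch following the LangChain multimodal Ollama example (`image_b64` is an assumed variable):

```python
# Attach a base64-encoded image to the request (image_b64 is assumed,
# e.g. one of the strings returned by split_image_text_types above).
llm_with_image = llm.bind(images=[image_b64])
print(llm_with_image.invoke("What are the key takeaways from this image?"))
```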
@shane-huang does it work only on GPU with Ollama? Also, one of the reasons I do not want to continue with Ollama is that the official Ollama seems unstable in performance.
Hi @tsantra, would you please attach the image and the text file you are using?