Comments (4)
test.py.txt
The demo Python code
from bigdl.
This is the expected behavior of the Python virtual machine: we can't force the Python VM to release its CPU memory back to the OS. After you del the model, that memory is free inside the VM, so if you load a new model, the Python process reuses it instead of requesting new memory.
from bigdl.
Hi Qiuxin,
Based on my observations, if you repeat this sequence (load model + model.generate + del model) multiple times in the same process, the process's VM usage becomes huge, and eventually the system OOM killer kills the process.
Could you try to reproduce this case to see whether we can improve the situation?
Thanks
Gang
from bigdl.
I can't reproduce this after 20 iterations on the current nightly 2.1.0b20240701 + oneAPI 2024.0 + intel-extension-for-pytorch 2.1.10+xpu.
from bigdl.
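For anyone who wants to check this on their own setup, here is a minimal sketch of the load + generate + del loop discussed above, logging the process's resident memory after each cycle. It is not the attached test.py. The names rss_mb, run_cycles, fake_load and fake_use are hypothetical, and fake_load / fake_use are stand-ins that only allocate and touch a buffer; swap in your real model loading and model.generate calls. The memory reading assumes Linux (/proc/self/statm).

```python
import gc
import os


def rss_mb():
    """Current resident set size in MiB (assumes Linux: reads /proc/self/statm)."""
    with open("/proc/self/statm") as f:
        resident_pages = int(f.read().split()[1])
    return resident_pages * os.sysconf("SC_PAGE_SIZE") / (1024 * 1024)


def run_cycles(load_fn, use_fn, n_cycles):
    """Repeat load -> use -> del in one process; record RSS after each cycle."""
    readings = []
    for _ in range(n_cycles):
        model = load_fn()
        use_fn(model)
        del model          # drop the only reference to the model
        gc.collect()       # collect any reference cycles left behind
        readings.append(rss_mb())
    return readings


def fake_load():
    # Hypothetical stand-in for loading a model: allocate 50 MiB.
    return bytearray(50 * 1024 * 1024)


def fake_use(model):
    # Hypothetical stand-in for model.generate: touch the buffer.
    model[0] = 1


if __name__ == "__main__":
    for i, mb in enumerate(run_cycles(fake_load, fake_use, 5), start=1):
        print(f"cycle {i}: RSS = {mb:.1f} MiB")
```

If the readings level off after the first cycle or two, the process is reusing freed memory as described earlier in the thread; if they keep climbing every cycle, that points to the growth reported above.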