Comments (4)
Alright, that is quite a lot to break down. This does not appear to be a KoboldAI bug, since you're running into issues with the dependencies themselves, but let's dig in.
So let's begin at the start with your setup process: the bundled requirements.txt is really only meant to be used with the Colabs and may or may not work well on your system. For most use cases we advise creating a conda environment from the bundled environment files (make sure to use the official Miniconda3 install script, not something like apt-get install python-conda) instead of setting up a venv yourself; for now this will also be the finetuneanon version (unless you really want that CPU mode to work). I don't think your specific issue is related to that, but it's good to know for the future if you ever run into dependency issues.
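For reference, the conda route looks roughly like this (a sketch, assuming the bundled environment files under environments/ and a standard Miniconda3 install; the environment name comes from the yml file, so adjust if yours differs):

```shell
# Sketch of the recommended conda setup (install Miniconda3 via the
# official installer first, not a distro package like python-conda).
cd KoboldAI-Client
conda env create -f environments/finetuneanon.yml
# the environment name is defined inside the yml; adjust if yours differs
conda activate koboldai
python aiserver.py
```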
The second issue is that Half mode does not work on the CPU, which is indeed correct and to be expected (as you mentioned, we documented this). We are phasing out Finetune's branch entirely in the upcoming version, and since we ran into similar issues in the upcoming 0.17 builds it may or may not be fixed by other fixing efforts. In that upcoming version we implemented Half mode for the GPU on the official transformers, so you get a GPU mode with even more efficient loading as well as fully functional CPU support. Finetune's branch has fallen quite far behind upstream, and most of its features have been integrated into KoboldAI on that development branch, so once 0.17 is finished we will no longer recommend anyone use it.
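The precision policy described above can be sketched like this (a hypothetical illustration, not KoboldAI's actual code: in real code you would branch on torch.cuda.is_available() and use torch.float16 / torch.float32):

```python
# Hypothetical sketch of the precision policy described above, not
# KoboldAI's implementation: half precision (float16) is only usable
# on a CUDA GPU, so CPU inference falls back to full precision.
def pick_dtype(cuda_available: bool) -> str:
    return "float16" if cuda_available else "float32"
```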
Then the next part gets a bit more complicated. First of all, the behavior of the models inside the official version of Transformers is entirely to be expected: it is supposed to load wrong, spew gibberish, or not load at all. The reason is that Finetune invented his own format for 6B that the upstream version chose not to use. In the version of KoboldAI you are using we had not yet implemented the official format, so it all ends up loading completely wrong, as it is trying to load a Neo model that is in reality a fake 6B format. So for now you will need either Finetune's fork, or the development version of KoboldAI (currently at https://github.com/henk717/koboldai ) along with models converted to the official 6B format, which to avoid confusion we dubbed the HFJ format.
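One rough way to tell which format a downloaded 6B checkpoint uses is to peek at the model_type field in its config.json; this helper is hypothetical (not part of KoboldAI), and assumes the official GPT-J format reports "gptj" while the fork's fake-6B format presents itself as a Neo model:

```python
import json

# Hypothetical helper, not part of KoboldAI: guess a checkpoint's 6B
# format from its config.json. The official ("HFJ") format reports
# model_type "gptj"; Finetune's fork stores 6B under a Neo-style config.
def detect_6b_format(config_text: str) -> str:
    model_type = json.loads(config_text).get("model_type", "")
    if model_type == "gptj":
        return "hfj"        # official format, needs upstream transformers
    if model_type == "gpt_neo":
        return "neo-style"  # real Neo model, or the fork's fake-6B format
    return "unknown"
```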
So, that explains the error you're getting in CPU mode, and the gibberish you get when you're not using the appropriate version for the models you are using.
That leaves one more issue to tackle: the fact that it's not working for you. I downloaded the same model and loaded up the same branch of KoboldAI you're using (0.16), then loaded the Finetune version of Transformers (in my case the ROCm one, since I have AMD). Generation went smoothly and I did not run into any generation errors.
So the issue is most likely somewhere in your environment. Are you certain it's the model you list in your notes? Because in our downloads we also have gpt-hfj-6b-adventure.7z, which is the newer model format. If it is, then I highly recommend retesting with our tried and tested conda environments, or with the play-cuda.sh Docker launcher if you have Docker configured to use your Nvidia GPU.
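Before trying play-cuda.sh it can help to confirm Docker can actually see the Nvidia GPU; this is the standard NVIDIA Container Toolkit smoke test (the CUDA image tag is an assumption and may need updating for your setup):

```shell
# Quick check that Docker + NVIDIA Container Toolkit can reach the GPU.
# If this prints the nvidia-smi table, play-cuda.sh should have GPU access.
docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi
```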
If you'd like more one-on-one support I recommend joining our Discord at https://discord.gg/XuQWadgU9k ; we can help you get going there, and it's quicker than resolving it over a GitHub issue.
from koboldai-client.
Thanks for the quick reply! It really clarified a lot of things.
I've tried the new version with the gpt-hfj-6b-adventure model and it indeed works. Also, I am really surprised at how fast it works in GPU+CPU mode: when specifying only 4 VRAM blocks it generates the output after about 60 seconds, which I think is at least tolerable. And the RAM usage is also relatively low: only 12 GB or so.
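As a rough mental model of that GPU+CPU split (a simplified sketch, not KoboldAI's actual allocator: it assumes each VRAM block corresponds to one transformer layer kept on the GPU, with the rest staying in system RAM):

```python
# Simplified sketch of a layer split between GPU and CPU, not KoboldAI's
# actual allocator: gpu_blocks layers go to VRAM, the remainder to RAM.
def split_layers(total_layers: int, gpu_blocks: int) -> tuple:
    gpu = max(0, min(gpu_blocks, total_layers))
    return gpu, total_layers - gpu

# GPT-J-6B has 28 transformer layers, so under this assumption
# 4 VRAM blocks leaves 24 layers running on the CPU.
```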
BTW, I used venv just as before. But I also tested the current version with Conda (using environments/finetuneanon.yml) and the gpt-j-6b-adventure-hf model, and got exactly the same error (module 'keras.backend' has no attribute 'is_tensor'). So I guess it's really some sort of model-transformers incompatibility and not a package issue. Maybe it's NVIDIA-only, since there's no issue on AMD. Or maybe it's some weird problem with my OS or host environment (though I have run a lot of other CUDA projects and they worked). I didn't test it in Docker, though. I tried play-cuda.sh, but it got stuck while building the image in the middle of installing the packages. I'm not sure whether it's supposed to be like that and I just needed to wait longer. Maybe I'll re-test it in the future.
the bundled requirements.txt really is only meant to be used with the Colabs and may or may not work well on your system.
But in the docs it says:
If you do not want to use conda install the requirements listed in requirements.txt and make sure that CUDA is properly installed.
If it's really not recommended or not supported, then I think the docs should say so. I don't think there should be any difference, but I agree that the Conda way may be more fool-proof (though it didn't change anything in my case).
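For completeness, the non-conda route quoted from the docs looks roughly like this (a sketch assuming a standard venv workflow and a working system CUDA install; exact commands may differ per platform):

```shell
# The requirements.txt route: an ordinary venv plus pip install.
# CUDA must already be properly installed on the host for GPU mode.
cd KoboldAI-Client
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python aiserver.py
```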
In 1.17 you can now use the regular version of transformers (huggingface.yml) for everything. I also updated the readme to let people know requirements.txt is not recommended.
The suitable models are all in the menu now. Let me know if you're still having issues.
Yeah, I've been using this version for a while now, and everything works correctly (except play-cuda.sh). 6B models load without errors (though I loaded them from a folder, not from the menu). Also, requirements.txt works for me (I didn't try Conda again).
This can probably be closed.