Comments (4)
Hi @raghavbj24 -- thank you for submitting this issue! Question for you: I see that your base_model is /home/ubuntu/llama-2-7b-hf_for_merge. Would the same "small size" phenomenon happen if you try to use meta-llama/Llama-2-7b-hf from https://huggingface.co/meta-llama/Llama-2-7b-hf? Please let me know. Thank you.
from ludwig.
Hi @alexsherstinsky -- as per your suggestion, I tried meta-llama/Llama-2-7b-hf from Hugging Face as the base model, but there is no difference: the saved model is still very small.
@raghavbj24 Could you please point me to the HuggingFace location where your model is saved and enable me to access it with "read" privileges? I am going to look into it thoroughly in the next few days. Thank you.
@raghavbj24 In parallel, if you do not mind: could you please rerun your experiment using this base model: alexsherstinsky/Mistral-7B-v0.1-sharded -- and let me know here what you see for the merged model size (and please also tell me the location where it will be saved). Thank you very much for your collaboration.
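For anyone reporting a merged model size in this thread, here is a minimal sketch of one way to measure it and compare it with a rough expectation. It is not part of Ludwig: the helper names, the example path, the round figure of 7 billion parameters, and the 2 bytes per parameter (16-bit weights) are all assumptions made for illustration.

```python
from pathlib import Path


def directory_size_bytes(model_dir):
    """Sum the sizes of all regular files found under model_dir (recursively)."""
    return sum(p.stat().st_size for p in Path(model_dir).rglob("*") if p.is_file())


def expected_weight_bytes(num_parameters, bytes_per_parameter=2):
    """Rough size of the weights alone; 2 bytes per parameter assumes 16-bit weights."""
    return num_parameters * bytes_per_parameter


def format_gib(num_bytes):
    """Render a byte count in GiB with two decimals."""
    return f"{num_bytes / 1024**3:.2f} GiB"


if __name__ == "__main__":
    # Hypothetical location of the saved (merged) model; replace with your own.
    model_dir = Path("path/to/merged_model")
    # Assumption: roughly 7 billion parameters for a "7B" model.
    print("expected (weights only):", format_gib(expected_weight_bytes(7_000_000_000)))
    if model_dir.is_dir():
        print("actual on disk:", format_gib(directory_size_bytes(model_dir)))
    else:
        print(f"{model_dir} does not exist; set model_dir to your saved model directory")
```

If the size on disk is far below the weights-only estimate, the directory probably does not contain the full merged weights, which would be consistent with the "small size" symptom described above.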