Comments (5)
Yes, the goal of the project is to end up with weights trained from scratch. The license on this repo covers the source code only; we don't distribute the checkpoints from Meta. So yes, users need to obtain them through the form and can use them only for research purposes, as stated in the agreement there. This is temporary until there are checkpoints trained from scratch using the Apache-licensed code.
from lit-llama.
@Rock-Anderson on top of what @awaelchli said:
"if we're still loading official Llama weights (under GPL License)"
Note that the LLaMA weights from Meta are not GPL-licensed. They are released under a custom Meta license:
"To maintain integrity and prevent misuse, we are releasing our model under a noncommercial license focused on research use cases. Access to the model will be granted on a case-by-case basis to academic researchers; those affiliated with organizations in government, civil society, and academia; and industry research laboratories around the world. People interested in applying for access can find the link to the application in our research paper."
(from https://ai.facebook.com/blog/large-language-model-llama-meta-ai/)
Got it, thanks for the explanation, and for contributing this repo.
The README says "truly open-source" at the beginning and then jumps to the section where we load the official LLaMA weights, so I was a bit confused.
Maybe adding that loading-weights and conversion to Lit-Llama for inference is only for research purposes, would help.
But I guess that is meant to be understandable.
Thanks anyway, closing this.
"Maybe adding that loading-weights and conversion to Lit-Llama for inference is only for research purposes, would help."
@lantiga Do we want to change the wording in the "use the model" section to mention "for research purposes only" as suggested?
Yes, for sure; we want there to be no misunderstandings. I’ll change it shortly. Thank you, @Rock-Anderson, for the input.