Comments (5)
Thank you for your interest in CatLIP. Our checkpoints are ready for use, and we would greatly appreciate your assistance in converting them to HuggingFace format.
from corenet.
Hey hey @sacmehta - I hope you are doing well. We've now uploaded all the models on the Hugging Face Hub under the corenet-community org: https://huggingface.co/corenet-community
Should we move this under the Apple org?
Hi @sacmehta! We can certainly help with that :)
In parallel, please note that you can still upload the weights in native format to the Hub, there's no requirement for checkpoints to follow any particular format or be compatible with any given library! If you upload them, people can easily download and use them with your own inference code (or with MLX, if they're compatible). This would allow the community to test them immediately :)
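For reference, uploading native-format weights is a short script with the `huggingface_hub` client; this is a minimal sketch, and the repo id and local folder below are placeholders, not actual CoreNet values:

```python
from huggingface_hub import HfApi

def upload_native_checkpoint(local_dir: str, repo_id: str) -> None:
    """Upload a folder of native checkpoint files as-is.

    `repo_id` (e.g. "corenet-community/some-model") and `local_dir`
    are placeholders; the Hub requires no particular file format.
    """
    api = HfApi()
    # Create the model repo if it doesn't exist yet (no-op otherwise).
    api.create_repo(repo_id, repo_type="model", exist_ok=True)
    # Push every file in the folder; large files go through LFS automatically.
    api.upload_folder(folder_path=local_dir, repo_id=repo_id, repo_type="model")
```

Running this requires a Hub token (e.g. via `huggingface-cli login`) with write access to the target namespace.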
Thanks @Vaibhavs10 and @pcuenca. Really appreciate your help in creating the corenet-community page + converting OpenELM models to CoreML.
In my opinion, it is good to have these models under corenet-community so that people outside Apple can also contribute to it (similar to the MLX community).
I have a suggestion regarding the organization of the models. Currently, they're structured as corenet-community/place365-512x512-vit-huge, which lacks clarity regarding their origin (the CoreNet project) and intended tasks. Perhaps renaming them (e.g., corenet-community/catlip/image_classification/place365-512x512-vit-huge) and mirroring the structure of the CoreNet projects folder would enhance user understanding. This adjustment would also make it easier for future research efforts focused on improving specific models (say, ViT) on a specific task (say, image classification on Places365). What do you think?
Thanks for the comments @sacmehta!
Hub repositories do not support arbitrary hierarchy. Similar to GitHub, they are structured as a namespace (corenet-community in this case), and then a flat list of repos under that namespace. We could potentially create a repo per task and place all models for that task in the same repo. In our experience, however, we've found that this is more confusing for users, leads to worse discoverability and makes it more difficult for you to collect usage stats. In general, we recommend the one-model-per-repo approach.
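Since nesting isn't supported, one option is to flatten the suggested hierarchy into a single repo name under the namespace. A minimal sketch (the helper name is hypothetical, and the naming scheme is only one possible convention):

```python
def flat_repo_name(namespace: str, project: str, task: str, model: str) -> str:
    """Encode a project/task hierarchy in a flat repo name, since Hub repo
    ids are just `namespace/name` with no further nesting allowed."""
    return f"{namespace}/{'-'.join([project, task, model])}"

print(flat_repo_name("corenet-community", "catlip",
                     "image-classification", "place365-512x512-vit-huge"))
# → corenet-community/catlip-image-classification-place365-512x512-vit-huge
```

This keeps the one-model-per-repo layout while still making the project and task visible in the repo name itself.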
I do understand your sentiment that something like corenet-community/vit-large is a bit too opaque. Part of the problem could be solved by populating all the model cards with tags and other searchable metadata fields. For example, we could potentially have a corenet library name just as there's an MLX one. As another example, see how a model like mlx-community/Llama-3-8B-Instruct-1048k-8bit also contains metadata for the task it supports (Text Generation), the file format (Safetensors), language, and other details. We could also use longer names (instead of just vit-large) for additional clarity.
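For context, the tags and metadata mentioned above live in YAML front matter at the top of each repo's README.md. A sketch with illustrative values (a corenet library_name would first need to be registered with the Hub, and the tags below are examples, not final choices):

```yaml
---
# Illustrative model-card metadata; values are examples only.
library_name: corenet            # hypothetical; would require Hub registration
pipeline_tag: image-classification
tags:
  - catlip
  - vit
datasets:
  - places365
---
```

These fields drive the filters and search on the Hub, so models become discoverable by task, library, and dataset even with short repo names.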
The following practices can also be used to communicate your intended model structure:
- An organization card, where you can add tables with the different model families and links to the individual models. For example, the meta-llama organization simply enumerates the model families, but we could be much more detailed in the case of CoreNet.
- The use of collections to group related models together. For example, I created a collection for the Core ML versions of OpenELM when I uploaded them, and the official Apple organization has a few collections set up.
As for the target organization, I think it depends on the message you want to convey and the goals you have. Using the apple org sends the message that these are Apple-sanctioned, official assets. Using the community approach is perfectly fine, and beneficial if you expect or encourage community contributions, as you said. What type of contributions do you expect from the community? We can set up a process where people can easily get accepted, similar to how it works for the MLX community, and we can help communicate your goals to incentivize engagement.
Related Issues (17)
- 'freeze_modules_based_on_opts()' is freezing module parameters twice HOT 4
- torchtext version issue HOT 6
- Why use interpreted language ? HOT 3
- where is the `corenet-train` entrypoint ? HOT 1
- /bin/bash: corenet-train: command not found HOT 1
- Instruct Template HOT 1
- A HF/Docker/Modal reproducible training/inference example
- Corenet detect M2 GPU HOT 1
- Do OpenELM's training datasets contain copyrighted material?
- Request for Access to OpenELM Training Logs HOT 2
- NameError: name 'corenet' is not defined OpenELM Parameter-Efficient Finetuning (PEFT) HOT 2
- When are you going to license as MIT or other FOSS license?
- Streaming HuggingFace Datasets
- Import "corenet.internal.cli.entrypoints" could not be resolved (Pylance: reportMissingImports)
- How OpenELM 1.1B and larger LLMs are initialized during pre-training
- where can I get checkpoints HOT 1