Comments (11)
That would be awesome, thanks.
Also, it comes at a good time, as I made several changes to the package and wrapped several other models for MLJ.
Just one thing: why don't you use `DocStringExtensions`? It also has a nice template feature that looks great for making documentation homogeneous.
from betaml.jl.
I have now implemented a standard docstring for all MLJ models. Please feel free to correct and amend it.
from betaml.jl.
Hi @ablaom and @sylvaticus,
The MLJ interface and implementation for clustering models is a bit rough. Some models like `KMeans` inherit from `MLJ.Probabilistic` and I can take the `mode(output)` as the final result. Other models like `GMMClusterer` do not inherit from `MLJ.Probabilistic` and I cannot know beforehand that they are probabilistic in order to call `mode(output)`.
What is the correct generic code to perform clustering with all available models in MLJ? How do I make the final assignment of integer labels to samples in a model-agnostic manner?
from betaml.jl.
I haven't looked properly, but isn't it the opposite regarding `KMeans` and `GMMClusterer`?
from betaml.jl.
Yes, sorry for mixing up the models. `KMeans` is the deterministic one.
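For readers with the same question, here is a minimal sketch of one model-agnostic way to get hard labels. It is not something proposed in this thread: `hard_labels` is a hypothetical helper, and the approach assumes that probabilistic clusterers return a vector of `UnivariateFinite` distributions from `predict`, which may not hold for every model or package version.

```julia
using MLJ

# Hypothetical helper (not part of MLJ or BetaML): dispatch on the element
# type of the prediction instead of on the model's supertype.

# Probabilistic output: take the most likely class of each distribution.
hard_labels(yhat::AbstractVector{<:UnivariateFinite}) = mode.(yhat)

# Anything else is assumed to already be a vector of hard labels.
hard_labels(yhat::AbstractVector) = yhat

# Stand-in for the output of a probabilistic clusterer:
# two observations (rows) over the two classes "a" and "b" (columns).
probabilistic_output = UnivariateFinite(["a", "b"], [0.2 0.8; 0.6 0.4], pool=missing)

# Stand-in for the output of a deterministic clusterer.
deterministic_output = [1, 2, 1]

labels_from_probs = hard_labels(probabilistic_output)
labels_from_ints = hard_labels(deterministic_output)
```

With a fitted machine this would be called as `hard_labels(predict(mach, X))`. Dispatching on the output keeps the caller independent of whether a given clusterer subtypes `MLJ.Probabilistic`.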
from betaml.jl.
Thanks for that! It seems some MLJ-specific elements (in particular, Examples) are missing, but this is a great improvement. (The full spec is here). Unfortunately, as BetaML uses DocStringExtensions.jl, this may be a bit awkward to integrate. Probably, we just append the missing stuff in the standard way. The ordering of elements will be a bit different, but I don't think that matters too much.
If @josephsdavid gets time, he can take a look at it.
from betaml.jl.
Hi @ablaom, how can I check how the standard doc "renders" in the use case you have in mind? It's not a problem to manually implement the docstring without `DocStringExtensions`...
Conversely, I am not sure about the examples. In the end they all follow the same API, so perhaps the examples should be in the API description rather than in each individual model. I mean, I don't see much added value in them.
from betaml.jl.
> Hi @ablaom, how can I check how the standard doc "renders" in the use case you have in mind
It should suffice to query the ordinary doc string for the model, as in `julia> @doc BetaML.Trees.RandomForestRegressor`. It's possible the string won't get correctly recorded in the MLJ model registry, but only if you overload the trait `MLJModelInterface.docstring` (which has `descr` as alias). But I checked your code and I don't think that is the case.
The ultimate check is to do `using MLJModels; doc("RandomForestRegressor", pkg="BetaML")` but that won't work until the MLJ model registry is updated. After that the docstring is available to the MLJ user without loading code (e.g., in a search such as `models("Clusterer")` to find all models with "Clusterer" in their docstring).
> Conversely, I am not sure about the examples. In the end they all follow the same API, so perhaps the examples should be in the API description rather than in each individual model. I mean, I don't see much added value in them.
I beg to differ. I think beginners really like an example for whatever model they decided to try out. Many of them don't have the depth of understanding required to generalize one example to the broader context. And some models have specific features worth demonstrating. It's okay to autogenerate these examples when they are basically the same each time (we did somewhere...). And the example can be pretty basic. You might find the "Generating synthetic data" section of the manual helpful. And there are built-in datasets loaded with `@load_iris`, `@load_boston`, `@load_crabs` (binary) and `@load_reduced_ames`. There are now quite a few packages with MLJ-compliant docstrings that you can mimic.
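As an illustration of how basic such an example can be, here is a minimal sketch built on `@load_iris`. It assumes MLJ and BetaML are both installed in the active environment and that BetaML's `DecisionTreeClassifier` is available through the MLJ model registry; the choice of model is only for illustration.

```julia
using MLJ

# Built-in dataset: X is a table of features, y a categorical target.
X, y = @load_iris

# Load the model type through the MLJ registry (assumes BetaML is installed).
Tree = @load DecisionTreeClassifier pkg=BetaML verbosity=0
model = Tree()

# Bind the model to the data and train it.
mach = machine(model, X, y)
fit!(mach, verbosity=0)

# This classifier is probabilistic, so `predict` returns distributions;
# `mode` picks the most likely class for each observation.
yhat = predict(mach, X)
labels = mode.(yhat)
```

An example of this shape differs between models mostly in the model name and, where relevant, in the dataset, which is what makes autogenerating it practical.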
from betaml.jl.
Hello, I have added an Example section to all MLJ models.
from betaml.jl.
Wow. That's great. Ping me when you have a new release tagged with the changes and I'll update the MLJ model registry (and this issue can be closed).
from betaml.jl.
Done (v0.9.3).
The documentation of the MLJ-interfaced models also appears in the BetaML documentation, e.g. here.
Still, if @juliohm has time to review the documentation, that would be great.
from betaml.jl.