Comments (13)
It seems to me that the renaming of the named parameter for feature names of `wrap`, which @sylvaticus introduced with commit 86a003f, is causing some confusion. In DecisionTree.jl the named parameter for this purpose is called `featurenames`. In BetaML it somehow became `feature_names`, and then with the above-mentioned commit `features_names` (an additional "s"). But the documentation for `wrap` still says `featurenames`, and in the example above `feature_names` is used. I.e. the `InfoNode` created by `wrap` in the example has the list of names in an attribute called `feature_names`, but `printnode` is looking for an attribute called `features_names`.

So we have every possible combination and a bit of chaos 😀.

My suggestion is to go back to `featurenames`, in order to be consistent with DecisionTree.jl (and the documentation).
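The silent failure mode can be reproduced in isolation. A minimal sketch, assuming the names are kept in a `NamedTuple` (as when passing keyword arguments); this is an illustration of the key mismatch, not the actual `InfoNode` internals:

```julia
# Illustration only: the names are stored under one key but looked up
# under another, so the lookup finds nothing instead of erroring loudly.
stored = (feature_names = ["dim1", "dim2"],)   # key written by wrap in the example

haskey(stored, :feature_names)    # true  - the key that was written
haskey(stored, :features_names)   # false - the key printnode looks for
```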
from betaml.jl.
Sorry, I missed the original comment notification. I'll look into this tomorrow...
Thanks @ablaom for reporting and @roland-KA for the deep research into the cause of the issue. I followed your suggestion and just reset it to `featurenames`. This should be in the newly released v0.9.6.
As this issue shows, it is quite easy to run into trouble when using `wrap`, so I'm thinking about adding a parameter check to each `wrap` implementation that verifies that only the keywords `featurenames` and `classnames` are used. It could throw an `ArgumentError` if something is wrong.

@ablaom, @sylvaticus, what's your opinion about this?
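A minimal sketch of what such a check could look like (the helper function and constant are hypothetical names for illustration, not part of BetaML's API):

```julia
# Hypothetical helper: validate the keywords passed to wrap before using them.
const ALLOWED_WRAP_KEYWORDS = (:featurenames, :classnames)

function check_wrap_kwargs(info::NamedTuple)
    unknown = setdiff(keys(info), ALLOWED_WRAP_KEYWORDS)
    isempty(unknown) || throw(ArgumentError(
        "Unknown keyword(s) $(unknown); only $(ALLOWED_WRAP_KEYWORDS) are supported."))
    return nothing
end

check_wrap_kwargs((featurenames = ["dim1", "dim2"],))   # passes silently
# check_wrap_kwargs((features_names = ["dim1"],))       # would throw ArgumentError
```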
To extend the answer of @roland-KA, this works:

```julia
julia> using BetaML

julia> X = [1.8 2.5; 0.5 20.5; 0.6 18; 0.7 22.8; 0.4 31; 1.7 3.7];

julia> y = 2 .* X[:,1] .- X[:,2] .+ 3;

julia> mod = DecisionTreeEstimator(max_depth=10)
DecisionTreeEstimator - A Decision Tree model (unfitted)

julia> ŷ = fit!(mod,X,y);

julia> hcat(y,ŷ)
6×2 Matrix{Float64}:
   4.1    3.4
 -16.5  -17.45
 -13.8  -13.8
 -18.4  -17.45
 -27.2  -27.2
   2.7    3.4

julia> println(mod)
DecisionTreeEstimator - A Decision Tree regressor (fitted on 6 records)
Dict{String, Any}("job_is_regression" => 1, "fitted_records" => 6, "max_reached_depth" => 4, "avg_depth" => 3.25, "xndims" => 2)
*** Printing Decision Tree: ***

1. BetaML.Trees.Question{Float64}(2, 18.0)
--> True :
        1.2. BetaML.Trees.Question{Float64}(2, 31.0)
        --> True : -27.2
        --> False:
                1.2.3. BetaML.Trees.Question{Float64}(2, 20.5)
                --> True : -17.450000000000003
                --> False: -13.8
--> False: 3.3999999999999995

julia> wmod = wrap(mod,featurenames=["dim1","dim2"])
A wrapped Decision Tree

julia> import AbstractTrees: print_tree

julia> print_tree(wmod)
dim2 >= 18.0?
├─ dim2 >= 31.0?
│  ├─ -27.2
│  │
│  └─ dim2 >= 20.5?
│     ├─ -17.450000000000003
│     │
│     └─ -13.8
│
└─ 3.3999999999999995
```

(I modified the docstring to cover `print_tree`.)
@roland-KA Thanks for looking into this and for the diagnosis.
Sounds like a good idea.
Mmm. I'm still pretty confused. Now I don't get any nice print out at all, just this:

```julia
julia> wrapped_tree = Trees.wrap(raw_tree, (featurenames=DF.names(X),))
A wrapped Decision Tree
```

Same if I use `feature_names`.
I understood that the `wrap` function was intended for plotting only, not for printing.
The decision tree is already printed in full (but without feature names) when the `DecisionTreeEstimator` is explicitly printed, but I may have misunderstood the needs. If there is a need to get the tree printed as well as plotted, perhaps at this point it is better if I add another parameter `featurenames` directly in the estimator constructor... what do you think?
@sylvaticus you are right, the `wrap` function was intended for plotting only. But the plot recipe also uses `AbstractTrees.printnode` (which is implemented together with each `wrap` version). And the `AbstractTrees.print_tree` function is based on `printnode`. So it is also possible to print a text-based version of the tree using `print_tree`.
> Mmm. I'm still pretty confused. Now I don't get any nice print out at all, just this:
>
> ```julia
> julia> wrapped_tree = Trees.wrap(raw_tree, (featurenames=DF.names(X),))
> A wrapped Decision Tree
> ```
>
> Same if I use `feature_names`.

@ablaom How did you print the text-based version? Using `AbstractTrees.print_tree`?

`show` doesn't use the `wrap` logic; so just `print`ing the `wrapped_tree` won't show the feature names.
@sylvaticus @roland-KA Thanks for the detailed explanations. I must have been sloppy with my first post and dropped the `print_tree`. I apologise for not checking this more carefully - very bad form.
No problem, we are here to clarify and explain things 🤓