
Comments (4)

miranov25 avatar miranov25 commented on August 15, 2024

To include new regressors and classifiers, the MLpipeline code has to be restructured:
https://alice.its.cern.ch/jira/browse/ATO-459

Current version (to be deprecated)

  • design influenced by TMVA; does not scale
  • the fitter/regressor is created inside the fit function, based on names and options
    • method parameters are defined in Register_Method
    • the model is created during the fit method
    • many if branches; does not scale

New version (to be implemented)

  • models (regression and quantile regression wrappers) are constructed by the user
  • wrappers implement additional common functionality
  • models are registered in Register_model
  • registered models are reused for fits
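
A minimal sketch of the design listed above, not the RootInteractive implementation. The names `ModelRegistry`, `register_model`, `fit_all` and the toy `MeanRegressor` are hypothetical; only the idea (the user builds the model, registers it, and the pipeline reuses it for fits) comes from the list.

```python
class MeanRegressor:
    """Toy user-constructed model: predicts the mean of the training targets."""

    def __init__(self):
        self.mean_ = None

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class ModelRegistry:
    """Hypothetical stand-in for the restructured pipeline."""

    def __init__(self):
        self.models = {}

    def register_model(self, name, model):
        # The model is built by the user; the registry only stores it.
        if name in self.models:
            raise KeyError(f"model '{name}' already registered")
        self.models[name] = model

    def fit_all(self, X, y):
        # No per-method "if" chain: every registered object exposes fit().
        for model in self.models.values():
            model.fit(X, y)

    def predict(self, name, X):
        return self.models[name].predict(X)


registry = ModelRegistry()
registry.register_model("mean", MeanRegressor())
registry.fit_all([[0.0], [1.0], [2.0]], [1.0, 2.0, 3.0])
prediction = registry.predict("mean", [[5.0]])
```

Because the registry only relies on `fit` and `predict`, adding a new regressor or classifier would not require touching the pipeline code itself.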

from rootinteractive.

miranov25 avatar miranov25 commented on August 15, 2024

Reference - GradientBoostingRegressor https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingRegressor.html

Quantiles are obtained at training time by using an appropriate cost function:

loss{‘ls’, ‘lad’, ‘huber’, ‘quantile’}, default=‘ls’

Loss function to be optimized. ‘ls’ refers to least squares regression. ‘lad’ (least absolute deviation) is a highly robust loss
function solely based on order information of the input variables. ‘huber’ is a combination of the two. ‘quantile’ allows quantile regression (use alpha to specify the quantile).
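
The ‘quantile’ loss quoted above is commonly called the pinball loss. Below is a minimal pure-Python sketch of it (the function names are hypothetical, not scikit-learn API). It checks on a toy sample that the constant minimizing the mean loss for a given alpha is the corresponding quantile of the sample.

```python
def pinball_loss(y_true, y_pred, alpha):
    """Quantile (pinball) loss for one observation."""
    diff = y_true - y_pred
    # Under-prediction is weighted by alpha, over-prediction by (1 - alpha).
    return alpha * diff if diff >= 0 else (alpha - 1.0) * diff


def mean_pinball_loss(ys, constant, alpha):
    """Mean pinball loss of a constant prediction over the sample ys."""
    return sum(pinball_loss(y, constant, alpha) for y in ys) / len(ys)


ys = list(range(1, 100))  # toy sample: 1, 2, ..., 99
alpha = 0.9
# Scan the sample values as candidate constants and keep the one with minimal loss.
best = min(ys, key=lambda c: mean_pinball_loss(ys, c, alpha))
```

For this sample `best` lands near the upper end of the range, which is what training with alpha=0.9 is expected to reproduce locally for each input.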

Other References

https://towardsdatascience.com/regression-prediction-intervals-with-xgboost-428e0a018b
https://heartbeat.fritz.ai/5-regression-loss-functions-all-machine-learners-should-know-4fb140e9d4b0
https://www.evergreeninnovations.co/blog-quantile-loss-function-for-machine-learning/


miranov25 avatar miranov25 commented on August 15, 2024

Deep quantile regression:

Based on the cost function discussed in:

https://alice.its.cern.ch/jira/browse/ATO-459?focusedCommentId=253647&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-253647

The scalar version is described there: one quantile per neural network.

Quantile vector implementation in a Jupyter notebook:
https://github.com/strongio/quantile-regression-tensorflow/blob/master/Quantile%20Loss.ipynb


miranov25 avatar miranov25 commented on August 15, 2024

Quantile regression interface:

In general, quantiles should be defined before fitting (this is not needed in scikit-garden, but skgarden is no longer supported).

BDTs and neural nets should be constructed knowing which quantiles are needed:

  • BDTs
    • an array of regressors has to be created, one per quantile
  • Deep neural net options
    • Array of neural nets, one created for each quantile:
      • slower
      • inconsistent: quantile predictions may not be sorted
      • bigger variance
    • One neural net for all quantile predictions
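
To make the two options above concrete, here is a pure-Python sketch with hypothetical function names: a combined loss that sums one pinball term per requested quantile (the kind of objective a single multi-output net could be trained on), and a check for the "not sorted" problem that independently fitted models can show.

```python
def multi_quantile_loss(y_true, y_preds, quantiles):
    """Sum of pinball losses, one term per requested quantile."""
    total = 0.0
    for q, y_pred in zip(quantiles, y_preds):
        diff = y_true - y_pred
        # max() picks q*diff for under-prediction and (q-1)*diff for over-prediction.
        total += max(q * diff, (q - 1.0) * diff)
    return total


def has_quantile_crossing(y_preds):
    """True if predictions for increasing quantiles are not sorted."""
    return any(lo > hi for lo, hi in zip(y_preds, y_preds[1:]))


quantiles = [0.1, 0.5, 0.9]
consistent = [1.0, 2.0, 3.0]  # sorted predictions for q = 0.1, 0.5, 0.9
crossed = [1.0, 2.5, 2.0]     # median prediction above the 0.9 prediction
loss = multi_quantile_loss(2.0, consistent, quantiles)
```

A check like `has_quantile_crossing` could be run on the per-quantile outputs before they are used downstream, whichever option is chosen.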

Proposed interface:

  • init
  • fit
  • predict(+index)
  • appendStatPandas(options)
    • append statistics to the pandas data frame
    • by default, all options
  • RMS estimators based on the quantiles
    • some approximation has to be done
  • appendStatTree ?
    • append statistics to the tree for later use
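
A rough sketch of how the proposed interface might look, under several assumptions that are not in the proposal: the class and method names (`QuantileRegressionWrapper`, `rms_estimate`, `append_stat`) are hypothetical, a plain dict of lists stands in for the pandas data frame, the per-quantile model is a toy, and the RMS approximation assumes a Gaussian shape (quantile distance divided by the matching standard-normal quantile distance).

```python
from statistics import NormalDist


class ConstantQuantileModel:
    """Toy model: predicts the empirical quantile of the training targets."""

    def __init__(self, quantile):
        self.quantile = quantile
        self.value_ = None

    def fit(self, X, y):
        ordered = sorted(y)
        # Nearest-rank style index into the sorted targets.
        index = min(int(self.quantile * len(ordered)), len(ordered) - 1)
        self.value_ = ordered[index]
        return self

    def predict(self, X):
        return [self.value_ for _ in X]


class QuantileRegressionWrapper:
    """Hypothetical wrapper following the init / fit / predict(+index) outline."""

    def __init__(self, quantiles, model_factory):
        # Quantiles are fixed at construction time, before any fit.
        self.quantiles = list(quantiles)
        self.models = [model_factory(q) for q in self.quantiles]

    def fit(self, X, y):
        for model in self.models:
            model.fit(X, y)
        return self

    def predict(self, X, index):
        # index selects which of the configured quantiles to predict.
        return self.models[index].predict(X)

    def rms_estimate(self, X, index_low, index_high):
        # Gaussian approximation (an assumption of this sketch).
        z = NormalDist()
        dz = z.inv_cdf(self.quantiles[index_high]) - z.inv_cdf(self.quantiles[index_low])
        low, high = self.predict(X, index_low), self.predict(X, index_high)
        return [(h - l) / dz for l, h in zip(low, high)]

    def append_stat(self, columns, X, prefix="q"):
        # `columns` is a plain dict of lists standing in for a pandas data frame.
        for i, q in enumerate(self.quantiles):
            columns[f"{prefix}{q}"] = self.predict(X, i)
        columns["rms"] = self.rms_estimate(X, 0, len(self.quantiles) - 1)
        return columns


X = [[float(i)] for i in range(100)]
y = [float(i) for i in range(100)]
wrapper = QuantileRegressionWrapper([0.16, 0.5, 0.84], ConstantQuantileModel).fit(X, y)
table = wrapper.append_stat({}, X[:2])
```

Passing a factory rather than a fitted model keeps the "one regressor per quantile" construction inside the wrapper; a single multi-output net could replace the `models` list without changing the public methods.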

