nndl-errata's People

Contributors

fkdosilovic

nndl-errata's Issues

0/1 loss function

The 0/1 loss function $L_i^{(0/1)}$ is defined in Equation (1.7) on page 9. My understanding of a "0/1 loss function" is that its only non-zero value should be 1. However, when $y_i\ne{\rm sign}(\overline W\cdot\overline{X_i})$, the value of $L_i^{(0/1)}$ is 2 instead of 1. Should we therefore change Equation (1.7) to $$L_i^{(0/1)}=\frac{1}{4}(y_i-{\rm sign}(\overline W\cdot\overline{X_i}))^2=\frac{1}{2}(1-y_i\cdot{\rm sign}(\overline W\cdot\overline{X_i}))?$$
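The claim can be checked by enumerating the four possible combinations of label and predicted sign. The sketch below is only an illustration: `loss_as_reported` assumes the printed Equation (1.7) is the proposed formula with its coefficient doubled (inferred from the reported value of 2, not copied from the book), and all function names are hypothetical.

```python
# Labels and predicted signs both take values in {-1, +1}.

def loss_as_reported(y, s):
    # Assumed form of the printed equation: (1/2) * (y - s)^2
    return 0.5 * (y - s) ** 2

def loss_proposed_square(y, s):
    # Proposed correction, squared form: (1/4) * (y - s)^2
    return 0.25 * (y - s) ** 2

def loss_proposed_product(y, s):
    # Proposed correction, product form: (1/2) * (1 - y * s)
    return 0.5 * (1 - y * s)

for y in (-1, 1):
    for s in (-1, 1):
        print(y, s,
              loss_as_reported(y, s),
              loss_proposed_square(y, s),
              loss_proposed_product(y, s))
```

On a mismatch ($y_i\ne s$) the assumed printed form evaluates to 2, while both proposed forms evaluate to 1 and agree with each other in all four cases.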

Update of Tikhonov regularization, and in turn the Tikhonov regularizer

Equation 1.33 on page 26 gives the update formula for the perceptron model with Tikhonov regularization. The sum term of that equation suggests that the step size $\alpha$ here is the same as the one used in Equation 1.6 on page 8. The standard gradient descent update for the perceptron (for a single training example) is $\overline W\Leftarrow\overline W-\alpha'\nabla_W L_i$, which is also given in the first line of page 10. However, a careful check shows that, if the two equations are to be consistent, the $\alpha$ in the first line of page 10 cannot be the $\alpha$ in Equation 1.6 on page 8. That is why I use a prime in the update formula above. Specifically, $\alpha'=2\alpha$, because the error $E(\overline X)$ is twice as large as $y$ when a prediction error occurs. So, written with the $\alpha$ of Equation 1.6, the gradient descent update for the perceptron is $$\overline W\Leftarrow\overline W-2\alpha\nabla_W L_i.\tag{1}$$
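The relation $\alpha'=2\alpha$ can be verified numerically on a single misclassified example. This is a minimal sketch under stated assumptions: the error-based update is taken to be $\overline W\Leftarrow\overline W+\alpha E(\overline X)\overline X$ with $E(\overline X)=y-{\rm sign}(\overline W\cdot\overline X)$, the perceptron criterion is taken to be $L_i=\max\{-y_i(\overline W\cdot\overline{X_i}),0\}$, and ${\rm sign}(0)$ is treated as $+1$. The function names and the sample values are hypothetical.

```python
import math

def sign(z):
    # Convention assumed here: sign(0) = +1
    return 1 if z >= 0 else -1

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def update_error_form(W, X, y, alpha):
    # W <- W + alpha * E(X) * X, with E(X) = y - sign(W . X)
    E = y - sign(dot(W, X))
    return [w + alpha * E * x for w, x in zip(W, X)]

def update_gradient_form(W, X, y, alpha_prime):
    # W <- W - alpha' * grad L_i, where the gradient of the assumed
    # perceptron criterion is -y * X on a misclassified point, else 0
    misclassified = sign(dot(W, X)) != y
    grad = [-y * x if misclassified else 0.0 for x in X]
    return [w - alpha_prime * g for w, g in zip(W, grad)]

# Hypothetical misclassified example: W . X = -1.5, but y = +1
W, X, y, alpha = [0.5, -1.0], [1.0, 2.0], 1, 0.1

by_error = update_error_form(W, X, y, alpha)
by_gradient_2a = update_gradient_form(W, X, y, 2 * alpha)
by_gradient_a = update_gradient_form(W, X, y, alpha)

match_2a = all(math.isclose(a, b) for a, b in zip(by_error, by_gradient_2a))
match_a = all(math.isclose(a, b) for a, b in zip(by_error, by_gradient_a))
print(match_2a, match_a)
```

Under these assumptions the two updates coincide only when the gradient form uses a step size of $2\alpha$.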

Now, as Section 1.4.1.1 of the text says, Tikhonov regularization for the perceptron adds the penalty $\lambda||\overline W||^2$ to the loss function, which here is the perceptron criterion. To apply gradient descent, we need the gradient of this term with respect to $\overline W$, which is $2\lambda\overline W$. Since the penalty is part of the total loss function, its gradient is multiplied by the coefficient $-2\alpha$ from $(1)$, giving $-2\alpha\cdot(2\lambda\overline W)=-4\alpha\lambda\overline W$. After factoring out $\overline W$, its coefficient in Equation 1.33 should be $(1-4\alpha\lambda)$. To be explicit, Equation 1.33 should read $$\overline W\Leftarrow\overline W(1-4\alpha\lambda)+\alpha\sum\limits_{\overline X\in S} E(\overline X)\overline X.\tag{2}$$

However, Equation 1.33 is used in later chapters of the book. So instead of making the change in $(2)$, we can keep Equation 1.33 unchanged by replacing $\lambda$ with $\frac{\lambda}{4}$ in the penalty, which cancels the factor of 4. With this adjustment, the Tikhonov regularizer becomes $$\frac{\lambda}{4}||\overline W||^2,$$ which is the change I propose.
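Both versions of the regularized update can be checked with one gradient step on a single misclassified example, so that the set $S$ contains one point. This is a sketch under stated assumptions: the step uses the coefficient $2\alpha$ from $(1)$, the perceptron criterion is taken to be $\max\{-y(\overline W\cdot\overline X),0\}$, ${\rm sign}(0)$ is treated as $+1$, and Equation 1.33 is assumed to have the shrink factor $(1-\alpha\lambda)$, as implied by the cancellation argument above. All names and sample values are hypothetical.

```python
import math

def sign(z):
    # Convention assumed here: sign(0) = +1
    return 1 if z >= 0 else -1

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def gradient_step(W, X, y, alpha, reg_coeff):
    # One step W <- W - 2*alpha * grad(L_i + reg_coeff * ||W||^2).
    # Loss gradient: -y * X if misclassified, else 0.
    # Penalty gradient: 2 * reg_coeff * W.
    misclassified = sign(dot(W, X)) != y
    return [
        w - 2 * alpha * ((-y * x if misclassified else 0.0) + 2 * reg_coeff * w)
        for w, x in zip(W, X)
    ]

def closed_form(W, X, y, alpha, shrink):
    # W <- W * (1 - shrink) + alpha * E(X) * X, with E(X) = y - sign(W . X)
    E = y - sign(dot(W, X))
    return [w * (1 - shrink) + alpha * E * x for w, x in zip(W, X)]

def close(a, b):
    return all(math.isclose(p, q) for p, q in zip(a, b))

# Hypothetical misclassified example and hyperparameters
W, X, y = [0.5, -1.0], [1.0, 2.0], 1
alpha, lam = 0.1, 0.3

# Penalty lam * ||W||^2 reproduces the (1 - 4*alpha*lam) factor of (2).
matches_eq2 = close(gradient_step(W, X, y, alpha, lam),
                    closed_form(W, X, y, alpha, 4 * alpha * lam))

# Penalty (lam / 4) * ||W||^2 reproduces the assumed (1 - alpha*lam) factor.
matches_eq133 = close(gradient_step(W, X, y, alpha, lam / 4),
                      closed_form(W, X, y, alpha, alpha * lam))

print(matches_eq2, matches_eq133)
```

With the penalty $\lambda||\overline W||^2$ the step matches $(2)$, and with $\frac{\lambda}{4}||\overline W||^2$ it matches the assumed form of Equation 1.33, which is the cancellation the proposal relies on.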
