
exercise's People

Contributors

2017alan, jingjing-gong, schizophreni, xpqiu


exercise's Issues

Some suggestions for chap2_linear_regression

  • The least-squares formula in the book seems slightly off; below is the form I believe is correct:
    $$w^{*}=(X^{T}X)^{-1}X^{T}y$$

Also, for the second polynomial in the gradient-descent part, I tried many approaches but could not get it to converge well. Any guidance from those more experienced would be appreciated.
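
For reference, a minimal NumPy sketch of the closed-form (normal-equation) solution in the formula above; the toy data and variable names here are illustrative only, not taken from the exercise:

import numpy as np

# Toy data: illustrative only, not the data from the exercise.
X = np.random.randn(100, 3)               # design matrix, one row per sample
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.1 * np.random.randn(100)

# Normal-equation solution w* = (X^T X)^{-1} X^T y.
w_star = np.linalg.solve(X.T @ X, X.T @ y)

# np.linalg.lstsq is the numerically preferred equivalent.
w_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)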

Here are my homework answers (git repo)

Homework answers

Dear Prof. Qiu,
Thank you very much for your hard work. Now that the book is published, I will certainly read it right away.
May I ask whether these exercises come with reference answers?
Thank you!

Some errors and suggestions in the chap_1 warmup exercises (August 29, 2019)

  • Wording: The first sentence of exercise 7 is unclear; suggest rewording it to "put all elements of the last two rows of array a from exercise 5 into c" (see the slicing sketch after this list)
  • Hint error: In exercise 7, the hint should be changed to "a[1:3,:]"
  • Typo: In exercise 10, "从新" should be "重新" (anew)
  • Imprecise answer: In exercise 11, the answer should be "int32 or int64"
  • Typo: In exercise 12, "数据类洗净" should be "数据类型" (data type)
  • Extra commas: In exercises 16, 17, and 18, the comma after "中的" should be removed
  • Mixed Chinese/English parentheses: In part (2) of exercise 19, a Chinese full-width parenthesis should be changed to an English one
  • Hint error: In exercise 23, the axis parameter should be passed inside argmax()
  • Markdown issue: The answer cell for exercise 23 should be a code cell rather than a markdown cell
  • Description order: In exercise 24, in "y=x*x, x = np.arange(0, 100, 0.1)", suggest swapping the order so x is defined before y
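
As referenced in the first item, a minimal sketch of the slicing the hint points to; the 3x4 shape used for a is only an assumption, since the actual array in exercise 5 may differ:

import numpy as np

# Assumed shape for a; exercise 5's actual array may differ.
a = np.arange(12).reshape(3, 4)

c = a[1:3, :]      # the hint from the erratum: rows 1 and 2, i.e. the last two rows
c_alt = a[-2:, :]  # equivalent, and robust to the number of rows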

A small suggestion for chap4_ simple neural network

  • At the beginning of tf2.0-exercise, when comparing the values,

softmax(test_data).numpy()==tf.nn.softmax(test_data, axis=-1).numpy()

comparing floating-point numbers for strict equality like this does not seem appropriate for values that are only approximately equal.

(softmax(test_data).numpy()-tf.nn.softmax(test_data, axis=-1).numpy())**2<0.001

Switching to something like this, in line with the sigmoid check below it, feels more appropriate. What does the author think?
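
A minimal sketch of the tolerance-based comparison being suggested; np.allclose is a common alternative to the hand-written squared-difference check. The softmax function and test_data below are placeholders standing in for the exercise's own definitions:

import numpy as np
import tensorflow as tf

def softmax(x):
    # Reference implementation used only for this comparison; not the exercise's code.
    x = x - np.max(x, axis=-1, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=-1, keepdims=True)

test_data = np.random.normal(size=[10, 5]).astype(np.float32)

# Element-wise tolerance check instead of exact float equality.
ok = np.allclose(softmax(test_data),
                 tf.nn.softmax(test_data, axis=-1).numpy(),
                 atol=1e-6)
print(ok)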

Here is my homework

Some problems with TensorFlow model building found in chap5_CNN

As far as I can see, the author adds @tf.function in front of every call function, as in

    @tf.function
    def call(self, x):
        h1 = self.l1_conv(x) 
        h1_pool = self.pool(h1) 
        h2 = self.l2_conv(h1_pool)
        h2_pool = self.pool(h2) 
        flat_h = self.flat(h2_pool)
        dense1 = self.dense1(flat_h)
        logits = self.dense2(dense1)
        probs = tf.nn.softmax(logits, axis=-1)
        return probs

but this seems to cause problems when using Dropout and BatchNormalization.
The specific error is as follows (Dropout):

Epoch 1/10
      1/Unknown - 0s 490ms/step
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
…… (a glorious ellipsis: traceback omitted) ……
_SymbolicException: Inputs to eager execution function cannot be Keras symbolic tensors, but found [<tf.Tensor 'keras_learning_phase:0' shape=() dtype=bool>]

The specific error is as follows (BatchNormalization):

---------------------------------------------------------------------------
InaccessibleTensorError                   Traceback (most recent call last)
<ipython-input-23-12a7599f911a> in <module>
      4 train_ds, test_ds = cifar10_dataset()
      5 model.is_train=True
----> 6 model.fit(train_ds, epochs=10)
      7 model.is_train=False
      8 model.evaluate(test_ds)
…… (a glorious ellipsis: traceback omitted) ……
InaccessibleTensorError: The tensor 'Tensor("batch_normalization_24/batch_normalization_24_trainable:0", dtype=bool)' cannot be accessed here: it is defined in another function or code block. Use return values, explicit Python locals or TensorFlow collections to access it. Defined in: FuncGraph(name=call, id=2038740490784); accessed from: FuncGraph(name=keras_graph, id=2033355272936).

After a full day of trial and error (do I get an extra drumstick 🍖 for that?), I found the culprit is this @tf.function: just remove it and everything runs fine. The code is as follows (Dropout):

    def __init__(self):
        super(myConvModel, self).__init__()
        self.l1_conv = Conv2D(filters=32, 
                              kernel_size=(5, 5), 
                              activation='relu', padding='same')
        
        self.l2_conv = Conv2D(filters=64, 
                              kernel_size=(3, 3), 
                              activation='relu',padding='same')
        
        self.pool = MaxPooling2D(pool_size=(2, 2), strides=2)
        self.l3_conv = Conv2D(filters=128, 
                              kernel_size=(3, 3), 
                              activation='relu', padding='same')
        
        self.l4_conv = Conv2D(filters=128, 
                              kernel_size=(3, 3), 
                              activation='relu',padding='same')
        self.drop=Dropout(0.5)
        self.flat = Flatten()
        self.dense1 = layers.Dense(100, activation='tanh')
        self.dense2 = layers.Dense(10)
    
    def call(self, x):
        h1 = self.drop(self.l1_conv(x))
        h1_pool = self.pool(h1) 
        h2 = self.drop(self.l2_conv(h1_pool))
        h2_pool = self.pool(h2) 
        h3 = self.drop(self.l3_conv(h2_pool))
        h4 = self.drop(self.l4_conv(h3))
        h4_pool=self.pool(h4)
        flat_h = self.flat(h4_pool)
        dense1 = self.dense1(flat_h)
        logits = self.dense2(dense1)
        probs = tf.nn.softmax(logits, axis=-1)
        return probs
Epoch 1/10
200/200 [==============================] - 4s 20ms/step - loss: 1.9084 - accuracy: 0.2942
Epoch 2/10
200/200 [==============================] - 3s 16ms/step - loss: 1.5522 - accuracy: 0.4316
Epoch 3/10
200/200 [==============================] - 3s 16ms/step - loss: 1.4182 - accuracy: 0.4845
Epoch 4/10
200/200 [==============================] - 3s 16ms/step - loss: 1.3322 - accuracy: 0.5161
Epoch 5/10
200/200 [==============================] - 3s 16ms/step - loss: 1.2656 - accuracy: 0.5444
Epoch 6/10
200/200 [==============================] - 3s 17ms/step - loss: 1.1927 - accuracy: 0.5668
Epoch 7/10
200/200 [==============================] - 3s 17ms/step - loss: 1.1255 - accuracy: 0.5969
Epoch 8/10
200/200 [==============================] - 3s 17ms/step - loss: 1.0688 - accuracy: 0.6187
Epoch 9/10
200/200 [==============================] - 3s 17ms/step - loss: 1.0302 - accuracy: 0.6305
Epoch 10/10
200/200 [==============================] - 3s 16ms/step - loss: 0.9981 - accuracy: 0.6439
1/1 [==============================] - 1s 574ms/step - loss: 1.1820 - accuracy: 0.5783
[1.1819568872451782, 0.5783]

The code is as follows (BatchNormalization):

    def __init__(self):
        super(myConvModel, self).__init__()
        self.l1_conv = Conv2D(filters=32, 
                              kernel_size=(5, 5), 
                              activation='relu', padding='same')
        
        self.l2_conv = Conv2D(filters=64, 
                              kernel_size=(3, 3), 
                              activation='relu',padding='same')
        
        self.pool = MaxPooling2D(pool_size=(2, 2), strides=2)
        self.l3_conv = Conv2D(filters=128, 
                              kernel_size=(3, 3), 
                              activation='relu', padding='same')
        
        self.l4_conv = Conv2D(filters=128, 
                              kernel_size=(3, 3), 
                              activation='relu',padding='same')
        self.bn1=BatchNormalization()
        self.bn2=BatchNormalization()
        self.bn3=BatchNormalization()
        self.bn4=BatchNormalization()
        self.flat = Flatten()
        self.dense1 = layers.Dense(100, activation='tanh')
        self.dense2 = layers.Dense(10)
    def call(self, x):
        h1 = self.bn1(self.l1_conv(x))
        h1_pool = self.pool(h1) 
        h2 = self.bn2(self.l2_conv(h1_pool))
        h2_pool = self.pool(h2) 
        h3 = self.bn3(self.l3_conv(h2_pool))
        h4 = self.bn4(self.l4_conv(h3))
        h4_pool=self.pool(h4)
        flat_h = self.flat(h4_pool)
        dense1 = self.dense1(flat_h)
        logits = self.dense2(dense1)
        probs = tf.nn.softmax(logits, axis=-1)
        return probs
Epoch 1/10
200/200 [==============================] - 4s 20ms/step - loss: 1.5309 - accuracy: 0.4478
Epoch 2/10
200/200 [==============================] - 3s 15ms/step - loss: 1.1861 - accuracy: 0.5785
Epoch 3/10
200/200 [==============================] - 3s 15ms/step - loss: 0.9970 - accuracy: 0.6466
Epoch 4/10
200/200 [==============================] - 3s 15ms/step - loss: 0.8535 - accuracy: 0.6977
Epoch 5/10
200/200 [==============================] - 3s 15ms/step - loss: 0.7295 - accuracy: 0.7469
Epoch 6/10
200/200 [==============================] - 3s 15ms/step - loss: 0.6173 - accuracy: 0.7850
Epoch 7/10
200/200 [==============================] - 3s 15ms/step - loss: 0.5188 - accuracy: 0.8222
Epoch 8/10
200/200 [==============================] - 3s 15ms/step - loss: 0.4174 - accuracy: 0.8595
Epoch 9/10
200/200 [==============================] - 3s 16ms/step - loss: 0.3266 - accuracy: 0.8929
Epoch 10/10
200/200 [==============================] - 3s 16ms/step - loss: 0.2286 - accuracy: 0.9290
1/1 [==============================] - 1s 708ms/step - loss: 1.1636 - accuracy: 0.6567
[1.163588285446167, 0.6567]

In addition, here is the official example code.
As you can see, there is no @tf.function decorator before call. My personal guess is that the decorator breaks something, so layers whose behavior is controlled by the training flag (different behavior during training and inference) run into trouble and training fails. That is only my shallow understanding; if the author has a definitive answer, I would love to hear it. Thanks!
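
For what it's worth, a minimal sketch of the convention the TensorFlow/Keras subclassing guides describe for such layers: call accepts an explicit training argument and forwards it to Dropout (and likewise BatchNormalization), and the model is left undecorated so that model.fit handles graph compilation itself. This is a sketch of that convention under those assumptions, not a confirmed fix for the exact error above; the class and layer sizes are illustrative:

import tensorflow as tf
from tensorflow.keras import layers

class MyConvModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.conv = layers.Conv2D(32, (3, 3), activation='relu', padding='same')
        self.pool = layers.MaxPooling2D((2, 2))
        self.drop = layers.Dropout(0.5)
        self.flat = layers.Flatten()
        self.out = layers.Dense(10)

    def call(self, x, training=False):
        # Forward the training flag so Dropout only drops units during training.
        h = self.pool(self.conv(x))
        h = self.drop(h, training=training)
        return tf.nn.softmax(self.out(self.flat(h)), axis=-1)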

On simplifying the softmax forward/backward pass in the numpy tutorial: the original code has overflow/underflow problems, and the backward pass can be simplified further

  1. The softmax in the original code does not guard against overflow/underflow, i.e., it does not subtract the per-row maximum before exponentiating (see the stable-softmax sketch after this block).
  2. On the softmax backward pass: because the original code has no overflow/underflow handling, exp can blow up, and training fails even with a slightly larger learning rate. For the backward pass, one can first take the log of the softmax and then differentiate that log-form with respect to the input.
    Below is the code I derived; the commented-out part is the original code, which needs the exp values and a matrix multiplication.
    The simplified code only needs the softmax output, reducing both memory use and computation.
    s = self.mem['out']
    """
    sisj = np.matmul(np.expand_dims(s,axis=2), np.expand_dims(s, axis=1)) # (N, c, c)
    g_y_exp = np.expand_dims(grad_y, axis=1)
    tmp = np.matmul(g_y_exp, sisj) #(N, 1, c)
    tmp = np.squeeze(tmp, axis=1)
    tmp = -tmp+grad_y*s
    """
    tmp = np.multiply(grad_y, s)
    grad_x = tmp - np.multiply(s, np.sum(tmp, axis=-1, keepdims=True))
    return grad_x
    The theoretical result is as follows, with $x$ the input, $s=\mathrm{softmax}(x)$, and $l$ the loss:
    $$\frac{\partial l}{\partial x_i}=s_i\left(\frac{\partial l}{\partial s_i}-\sum_j s_j\frac{\partial l}{\partial s_j}\right)$$
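
A minimal self-contained sketch combining both points: a max-subtracted forward pass and the simplified backward pass. The self.mem['out'] cache mirrors the snippet above, while the class name and forward/backward method names are assumptions about the surrounding layer interface:

import numpy as np

class Softmax:
    """Row-wise softmax with a numerically stable forward and simplified backward pass."""
    def __init__(self):
        self.mem = {}

    def forward(self, x):
        # Subtract the per-row max so exp never overflows; the result is unchanged.
        shifted = x - np.max(x, axis=-1, keepdims=True)
        e = np.exp(shifted)
        out = e / np.sum(e, axis=-1, keepdims=True)
        self.mem['out'] = out
        return out

    def backward(self, grad_y):
        # grad_x_i = s_i * (grad_y_i - sum_j s_j * grad_y_j), using only the cached output.
        s = self.mem['out']
        tmp = grad_y * s
        return tmp - s * np.sum(tmp, axis=-1, keepdims=True)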

Some problems in the exercises

Problem: In exercise 23 of the Chapter 1 (Introduction) exercises, the x from exercise 13 should not be reused, because with that array you cannot see any difference when the axis argument changes.
Suggestion: use x = np.array([[1,2],[4,3]]), as in the sketch below.
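
A quick illustration of why this array makes the axis argument visible, assuming the exercise is about argmax (as the chap_1 erratum above suggests):

import numpy as np

x = np.array([[1, 2],
              [4, 3]])

print(np.argmax(x, axis=0))  # [1 1]: index of the max in each column
print(np.argmax(x, axis=1))  # [1 0]: index of the max in each row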

Some suggestions for chap5_CNN

  • In the provided example tutorial_mnist_conv-keras-sequential.ipynb, I found that if the activation of the last fully connected layer is tanh, I cannot reproduce the stable convergence shown in the example. Was it perhaps not set up properly in the first place?

Here is my code

Request to set up a column on Kesci (科赛网) to make the code easy to run

Hello, I am Gao Peng from Kesci (科赛网). We are a vendor of a Jupyter Notebook distribution. Could we put your course code into a column on Kesci so that students can run and reproduce it with one click, similar to PracticeAI? This is a community feature. Beyond that, we can also provide an academic edition for universities that supports assignment submission, management, and grading; it is already being used by Tsinghua's data science course. Would you allow us to create a column for your course code, and perhaps go further and provide the assignment-management service as well?

Looking for exercise answers

Thank you, Prof. Qiu, for the book; reading it has deepened my understanding of deep learning. May I ask whether answers are available for the exercises in each chapter?

Asking for help with problems encountered in chap4_ simple neural network!

I ran into some puzzles both in the TensorFlow exercise and in the two-layer neural network implemented in numpy.

In the TensorFlow one, possibly because of parameter initialization, training takes much longer than in the original exercise template, and the prediction is 0 at the start, which I don't understand. There are also many warnings along the way that I don't know how to resolve.

As for the numpy two-layer network, it runs fine, but the results are strange: the model does not converge as smoothly as the template. I don't know why; I hope the instructor, or any expert who sees this, can help. Many thanks!

Here is my code
