ai-system's Issues

textbook 8.3

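  • Could 8.3.5 be moved to 8.7 and titled "8.7 - Experiments"?
  • Figure 8.3.6 looks blurry and may need to be redrawn.
  • The references run to 7,625 characters, which is too long.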

textbook 11

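  • 11.1 Model compression: could you implement a simple algorithm that compresses an off-the-shelf model and compares inference results before and after? If the training process is too hard, write an inference-only version. Data quantization, sparsification, knowledge distillation, lightweight design, tensor decomposition: pick any one as the example.
  • Do not use four-level headings (a.b.c.d); three levels at most. Use xxxxx to denote the fourth level.
  • In figures such as Figure 11-1-1, the points and the text overlap, so they are hard to make out without looking closely, and the figure has no grid. Please ensure the quality of similar figures.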
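
As one possible shape for the inference-only example requested for 11.1, here is a minimal sketch of symmetric per-tensor int8 quantization applied to a single linear layer. It is an illustration under stated assumptions, not the textbook's method: the function names (quantize_int8, dequantize, linear), the random weights, and the layer size are all made up for the demo.

```python
import random

def quantize_int8(weights):
    """Symmetric per-tensor quantization: w is approximated by scale * q, q in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Map the integer codes back to floating point."""
    return [scale * v for v in q]

def linear(weights, x):
    """A single-output linear layer without bias: y = w . x."""
    return sum(w * xi for w, xi in zip(weights, x))

# Hypothetical "pretrained" weights and one input vector.
random.seed(0)
weights = [random.uniform(-1.0, 1.0) for _ in range(64)]
x = [random.uniform(-1.0, 1.0) for _ in range(64)]

q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

y_float = linear(weights, x)    # inference with the original weights
y_quant = linear(restored, x)   # inference with the quantized weights
max_weight_err = max(abs(a - b) for a, b in zip(weights, restored))

print("output before:", y_float)
print("output after: ", y_quant)
print("max weight error:", max_weight_err, "(bounded by scale / 2 =", scale / 2, ")")
```

Each weight is stored as one signed 8-bit code plus a shared scale, and the rounding error per weight is at most half the scale, which is what the before/after comparison makes visible.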

textbook 3 and 4

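  • Personally, I feel Chapters 3 and 4 could be merged, judging by the continuity of their content. Another reason is that both chapters are fairly short.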

basiclab lab1 mnist_tensorboard.py has some problems

My environment: CUDA Toolkit 10.0, PyTorch 1.5.0, TensorFlow 1.15.0.
When I run python mnist_tensorboard.py, the following error occurs:

2021-03-21 20:15:01.978418: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2021-03-21 20:15:04.781186: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same
Error occurs, No graph saved
Traceback (most recent call last):
  File "mnist_tensorboard.py", line 199, in <module>
    main()
  File "mnist_tensorboard.py", line 182, in main
    writer.add_graph(model, images)
  File "D:\Program_Files\Anaconda3\envs\ai-system-learn\lib\site-packages\torch\utils\tensorboard\writer.py", line 707, in add_graph
    self._get_file_writer().add_graph(graph(model, input_to_model, verbose))
  File "D:\Program_Files\Anaconda3\envs\ai-system-learn\lib\site-packages\torch\utils\tensorboard\_pytorch_graph.py", line 291, in graph
    raise e
  File "D:\Program_Files\Anaconda3\envs\ai-system-learn\lib\site-packages\torch\utils\tensorboard\_pytorch_graph.py", line 285, in graph
    trace = torch.jit.trace(model, args)
  File "D:\Program_Files\Anaconda3\envs\ai-system-learn\lib\site-packages\torch\jit\__init__.py", line 875, in trace
    check_tolerance, _force_outplace, _module_class)
  File "D:\Program_Files\Anaconda3\envs\ai-system-learn\lib\site-packages\torch\jit\__init__.py", line 1027, in trace_module
    module._c._create_method_from_trace(method_name, func, example_inputs, var_lookup_fn, _force_outplace)
  File "D:\Program_Files\Anaconda3\envs\ai-system-learn\lib\site-packages\torch\nn\modules\module.py", line 548, in __call__
    result = self._slow_forward(*input, **kwargs)
  File "D:\Program_Files\Anaconda3\envs\ai-system-learn\lib\site-packages\torch\nn\modules\module.py", line 534, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "mnist_tensorboard.py", line 61, in forward
    x = self.conv1(x)
  File "D:\Program_Files\Anaconda3\envs\ai-system-learn\lib\site-packages\torch\nn\modules\module.py", line 548, in __call__
    result = self._slow_forward(*input, **kwargs)
  File "D:\Program_Files\Anaconda3\envs\ai-system-learn\lib\site-packages\torch\nn\modules\module.py", line 534, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "D:\Program_Files\Anaconda3\envs\ai-system-learn\lib\site-packages\torch\nn\modules\conv.py", line 349, in forward
    return self._conv_forward(input, self.weight)
  File "D:\Program_Files\Anaconda3\envs\ai-system-learn\lib\site-packages\torch\nn\modules\conv.py", line 346, in _conv_forward
    self.padding, self.dilation, self.groups)
RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same

I think something is wrong with TensorBoard. How can I fix this?
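
The RuntimeError above says the input is a CPU tensor (torch.FloatTensor) while the conv weights are on the GPU (torch.cuda.FloatTensor), which suggests the batch passed to writer.add_graph was never moved to the model's device. A minimal sketch of the usual remedy follows, assuming the script places the model on a device with .to(device). The tiny network and the variable names are stand-ins, not the lab's actual code.

```python
import torch
import torch.nn as nn

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in for the lab's network; the real model in mnist_tensorboard.py may differ.
model = nn.Sequential(nn.Conv2d(1, 8, kernel_size=3), nn.ReLU()).to(device)

# A batch as a DataLoader would typically yield it: on the CPU.
images = torch.randn(4, 1, 28, 28)

# Move the batch to the model's device before any forward pass or graph tracing.
images = images.to(device)
output = model(images)

# With both on the same device, a call such as writer.add_graph(model, images)
# should no longer hit the input/weight type mismatch.
print(output.shape, images.device)
```

If this is the cause, the change in the script would be limited to moving images to the device before the add_graph call at line 182 of the traceback.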

openpai 1.2 error: "No worker node is detected."

Running bash quick-start-service.sh -m ~/master.csv -w ~/worker.csv -c ~/config.yaml fails with "No worker node is detected." On investigation, the error is raised while executing the script /contrib/kubespray/script/openpai-generator.py (line 304). Is there a way to fix this?

textbook 10

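  • Take Figure 10.1.1 as an example: it has no a, s, r symbols, so readers cannot match it to the explanation that follows.
  • The Gt formula in 10.1.2 has an expanded form; writing it out would make it easier to understand.
  • As I recall, $\gamma$ is in (0,1], not [0,1).
  • For formulas such as $\pi (a|s)=p(a_t=a|s_t=s)$, the standard notation (in my view) is $\pi (a|s)=p(A_t=a|S_t=s)$.
  • The references in 10.1 are too long, out of proportion to the main text.
  • 10.1.2 may need a figure of three overlapping circles to show the relationship among the three.
  • Please put 10.1.1 and 10.1.2 in a single md file.
  • Why does 10.2.2 contain figures with four-level index numbers? For example, Figure 10.2.2.1 could be changed to 10.2.1.
  • Consider merging 10.2.2 and 10.2.3 into 10.3, so that in length and content it matches 10.2.1 (which becomes 10.2).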
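
For the expanded form of the return mentioned for 10.1.2, one common way to write it is sketched below. It assumes the indexing convention in which the reward following time step t is written $R_{t+1}$ and the discount factor is $\gamma$; the textbook's own notation may differ.

$$
G_t = R_{t+1} + \gamma R_{t+2} + \gamma^2 R_{t+3} + \cdots = \sum_{k=0}^{\infty} \gamma^k R_{t+k+1} = R_{t+1} + \gamma G_{t+1}
$$

The last equality is the recursive form: the return at time t is the next reward plus the discounted return from time t+1 onward.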

Content in Security

Dear Xian and Peichen,

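Could we add an overview to the security chapter and then expand it into each section? For example, the two figures below from the course could each be worked into the content, to give a view of the whole picture.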
image

image

lab 6,7,9

Which chapters should these correspond to? Can they be merged into the main text?

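title update? Security and Privacy chapter

  1. Principle: consistency and conciseness

Serving -> inference; reduce how often the word "问题" (issues) appears

  2. Title changes
Original
12. AI Security and Privacy
12.1 Intrinsic AI Security and Privacy
12.1.1 Intrinsic Security Issues of Deep Neural Networks
12.1.2 Intrinsic Privacy Issues of Deep Neural Networks
12.2 AI Training Security and Privacy
12.2.1 Security Issues in Deep Learning Training
12.2.2 Privacy Issues in Deep Learning Training
12.2.3 Federated Learning and Its Privacy Issues in Training
12.3 AI Serving Security and Privacy
12.3.1 Security Issues in Deep Learning Serving
12.3.2 User Privacy Issues in Deep Learning Serving
12.3.3 Model Privacy Issues in Deep Learning Serving

->

12. AI Security and Privacy
12.1 Intrinsic AI Security and Privacy
12.1.1 Security Issues of Deep Neural Networks
12.1.2 Privacy Issues of Deep Neural Networks
12.2 AI Training Security and Privacy
12.2.1 Training System Security
12.2.2 Training System Privacy
12.2.3 Federated Learning Privacy
12.3 AI Inference Security and Privacy
12.3.1 Inference System Security
12.3.2 Inference System User Privacy
12.3.3 Inference System Privacy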

textbook 8.1

The main text is 6,000 characters and the references are 4,000, so the ratio is unbalanced. Shorten the latter.

textbook 1.1

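Markdown file

  • Lines 27~36: where a second-level indented item has no sibling sub-items, drop the leading bullet, e.g. "- Google, Baidu, ..." -> "\tGoogle, Baidu, ..."
  • Lines 49~58: do not use both bullets and the numbering (1)(2)(3).
  • The standard form for a figure number is "图 1.1.1" (Figure 1.1.1).
  • 1.1.3: the heading "神经网络的基本理论在深度学习前已基本奠定" (the basic theory of neural networks was largely established before deep learning) is a bit wordy; it could be "神经网络基本理论的奠定" (the establishment of basic neural network theory).
  • Lines 70-90: these could use indented bullets.
  • Line 92: change the comma to an enumeration comma (、).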

questions on lab6

Dear teachers, we are having trouble with our lab 6 homework and are stuck on one step. The problem is shown in the picture. Could you please help us? Thanks a lot!

textbook 6

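  • Table 6-5-1: if there is not enough horizontal room, switch it to a vertical layout; otherwise it cannot be printed.
  • Table 6-3-1: the header does not need ** for bold; it is bold by default. The table contents should preferably be centered or right-aligned.
  • Parallel algorithms: is there a corresponding lab? If not, could a simple one be implemented to illustrate them?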

textbook 1.2

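  • Line 20: add a line break at the beginning.
  • Table 1-2-1: the standard form is "表 1.2.1" (Table 1.2.1), and the caption should be placed above the table.
  • Lines 96-106: second-level indented items do not need bullets.
  • Line 115: write "10的8次方倍" (10 to the 8th power times) directly as $10^8$.
  • Figure 1.2.3: the dragonfly, mouse, and human brain in the upper right are not explained. And is there a 3 on the V100?
  • Lines 131 and 137: why are they inconsistent with lines 57, 75, and 90?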

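Problem encountered while deploying OpenPai

Hello, teacher. In the second step of deploying openpai (deploying Kubernetes), after running sudo bash quick-start-kubespray.sh -m ~/master.csv -w ~/worker.csv -c ~/config.yaml, the master and worker machines ran into a package source conflict, as shown in the figure below.

image

Could you advise how to resolve this?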

add GammaRegressor

Usually only the training changes, so it is just a matter of using the same converter for a new model.

textbook 12

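  • The figures are distorted; do not set width and height.
  • 12.1 Differentially private SGD: could you give a concrete data example and write some code that runs it?
  • 12.2.1 Data privacy protection: could you build a simple prototype, implemented in code, to illustrate it?
  • 12.2.2 Model privacy protection: could you implement a simple watermarking technique to illustrate it?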
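
For the runnable example requested for differentially private SGD in 12.1, here is a minimal sketch. It shows the two core steps as commonly described, clipping each per-example gradient and adding Gaussian noise to the clipped sum, on a toy linear regression. Everything concrete here is an assumption made for the demo: the synthetic data, the hyperparameters, and the function names clip and dp_sgd_step. No privacy accounting (epsilon, delta) is computed.

```python
import math
import random

def clip(grad, max_norm):
    """Scale a per-example gradient so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    factor = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * factor for g in grad]

def dp_sgd_step(w, batch, lr, max_norm, noise_multiplier, rng):
    """One noisy, clipped gradient step for least-squares regression y ~ w . x."""
    summed = [0.0] * len(w)
    for x, y in batch:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        grad = [2.0 * err * xi for xi in x]            # per-example gradient
        for i, g in enumerate(clip(grad, max_norm)):   # clip before summing
            summed[i] += g
    # Gaussian noise with standard deviation noise_multiplier * max_norm,
    # added once per coordinate to the sum of clipped gradients.
    noisy = [s + rng.gauss(0.0, noise_multiplier * max_norm) for s in summed]
    return [wi - lr * n / len(batch) for wi, n in zip(w, noisy)]

# Synthetic data from a known linear model (illustrative only).
rng = random.Random(0)
true_w = [2.0, -1.0]
data = []
for _ in range(256):
    x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    y = sum(t * xi for t, xi in zip(true_w, x)) + rng.gauss(0, 0.05)
    data.append((x, y))

w = [0.0, 0.0]
for step in range(200):
    batch = rng.sample(data, 32)
    w = dp_sgd_step(w, batch, lr=0.1, max_norm=1.0, noise_multiplier=1.0, rng=rng)

print("true weights:   ", true_w)
print("learned weights:", w)
```

Clipping bounds how much any single example can move the weights, and the noise scale is tied to that bound; a larger noise_multiplier gives stronger privacy at the cost of a noisier result.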

video

Great job! Will there be an open video course?

An Error in Lecture 3 - Computation frameworks for DNN

On page 19, in the calculation of the gradient of L(x), there is probably an error in the second part, for sin(exp(x)+exp(x)^2).

Should the gradient be cos(exp(x)+exp(x)^2)(exp(x)+2exp(x)^2)? That is, there seems to be one unnecessary exp in the answer.
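
The proposed expression can be checked numerically. The sketch below compares it against a central finite difference, assuming only the term sin(exp(x)+exp(x)^2) quoted above; the full L(x) from the slide is not reproduced here, and the function names are illustrative.

```python
import math

def f(x):
    """The term under discussion: sin(exp(x) + exp(x)^2)."""
    e = math.exp(x)
    return math.sin(e + e ** 2)

def proposed_grad(x):
    """Chain rule: cos(exp(x) + exp(x)^2) * (exp(x) + 2 * exp(x)^2)."""
    e = math.exp(x)
    return math.cos(e + e ** 2) * (e + 2 * e ** 2)

def numeric_grad(func, x, h=1e-6):
    """Central finite-difference approximation of the derivative."""
    return (func(x + h) - func(x - h)) / (2 * h)

for x in (-1.0, 0.0, 0.5):
    print(x, proposed_grad(x), numeric_grad(f, x))
```

The two columns agree to within finite-difference error at each test point, which supports the chain-rule form with the inner derivative exp(x) + 2exp(x)^2.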

Thanks

openpai k8s error

Starting kubernetes...
setup k8s cluster

PLAY [localhost] *******************************************************************************************************************************************************************************************
[WARNING]: Could not match supplied host pattern, ignoring: bastion

PLAY [bastion[0]] ******************************************************************************************************************************************************************************************
skipping: no hosts matched

PLAY [k8s-cluster:etcd] ************************************************************************************************************************************************************************************
included: /home/openpai/pai-deploy/kubespray/roles/bootstrap-os/tasks/bootstrap-debian.yml for stu-276, iair279, stu-282

PLAY [k8s-cluster:etcd] ************************************************************************************************************************************************************************************

TASK [kubernetes/preinstall : Stop if access_ip is not pingable] *******************************************************************************************************************************************
changed: [iair279]
changed: [stu-276]
changed: [stu-282]
included: /home/openpai/pai-deploy/kubespray/roles/container-engine/docker/tasks/set_facts_dns.yml for stu-276, iair279, stu-282
[WARNING]: flush_handlers task does not support when conditional

TASK [download : prep_download | Create staging directory on remote node] **********************************************************************************************************************************
changed: [stu-276]
changed: [iair279]
changed: [stu-282]
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/prep_kubeadm_images.yml for stu-276
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_file.yml for stu-276

TASK [download : download_file | Create dest directory on node] ********************************************************************************************************************************************
changed: [stu-276]
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/extract_file.yml for stu-276
[WARNING]: noop task does not support when conditional
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_container.yml for stu-276
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_file.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_file.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_file.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_file.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_container.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_container.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_container.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_container.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_container.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_container.yml for stu-276
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_container.yml for stu-276
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_container.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_container.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_container.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_container.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_container.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/download_container.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/set_docker_image_facts.yml for stu-276
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/check_pull_required.yml for stu-276

TASK [download : download_file | Create dest directory on node] ********************************************************************************************************************************************
changed: [stu-282]
changed: [iair279]
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/extract_file.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/extract_file.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/extract_file.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/extract_file.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/set_docker_image_facts.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/check_pull_required.yml for stu-276, iair279, stu-282

TASK [download : download_container | Download image if required] ******************************************************************************************************************************************
changed: [stu-276 -> 192.168.1.187]
changed: [iair279 -> 192.168.1.187]
changed: [stu-282 -> 192.168.1.187]
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/set_docker_image_facts.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/check_pull_required.yml for stu-276, iair279, stu-282

TASK [download : download_container | Download image if required] ******************************************************************************************************************************************
changed: [stu-276 -> 192.168.1.187]
changed: [stu-282 -> 192.168.1.187]
changed: [iair279 -> 192.168.1.187]
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/set_docker_image_facts.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/check_pull_required.yml for stu-276, iair279, stu-282

TASK [download : download_container | Download image if required] ******************************************************************************************************************************************
changed: [stu-276 -> 192.168.1.187]
changed: [iair279 -> 192.168.1.187]
changed: [stu-282 -> 192.168.1.187]
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/set_docker_image_facts.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/check_pull_required.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/set_docker_image_facts.yml for stu-276, iair279, stu-282
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/check_pull_required.yml for stu-276, iair279, stu-282

TASK [download : download_container | Download image if required] ******************************************************************************************************************************************
changed: [iair279 -> 192.168.1.187]
changed: [stu-276 -> 192.168.1.187]
changed: [stu-282 -> 192.168.1.187]
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/set_docker_image_facts.yml for stu-276
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/check_pull_required.yml for stu-276

TASK [download : download_container | Download image if required] ******************************************************************************************************************************************
changed: [stu-276 -> 192.168.1.187]
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/set_docker_image_facts.yml for stu-276
included: /home/openpai/pai-deploy/kubespray/roles/download/tasks/check_pull_required.yml for stu-276
FAILED - RETRYING: download_container | Download image if required (4 retries left).
FAILED - RETRYING: download_container | Download image if required (3 retries left).
FAILED - RETRYING: download_container | Download image if required (2 retries left).
FAILED - RETRYING: download_container | Download image if required (1 retries left).

TASK [download : download_container | Download image if required] ******************************************************************************************************************************************
fatal: [stu-276 -> 192.168.1.187]: FAILED! => {"attempts": 4, "changed": true, "cmd": ["/usr/bin/docker", "pull", "k8s.gcr.io/cluster-proportional-autoscaler-amd64:1.6.0"], "delta": "0:00:15.027078", "end": "2021-05-11 19:29:07.932611", "msg": "non-zero return code", "rc": 1, "start": "2021-05-11 19:28:52.905533", "stderr": "Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)", "stderr_lines": ["Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)"], "stdout": "", "stdout_lines": []}

NO MORE HOSTS LEFT *****************************************************************************************************************************************************************************************

PLAY RECAP *************************************************************************************************************************************************************************************************
iair279 : ok=204 changed=7 unreachable=0 failed=0 skipped=192 rescued=0 ignored=0
localhost : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
stu-276 : ok=295 changed=8 unreachable=0 failed=1 skipped=258 rescued=0 ignored=0
stu-282 : ok=204 changed=7 unreachable=0 failed=0 skipped=192 rescued=0 ignored=0

Issues in chapter 10

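Hi Hui, please add some abstract-style questions to draw readers in. For example: how RL is introduced, which traditional problems it recalls, and some words on its importance and motivation. Please refer to chapter 13.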

image

Deploying OpenPAI

Screenshot from 2021-05-03 21-40-14
Screenshot from 2021-05-03 21-41-44
Teacher, I got this error when running this command. How can I resolve it?

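Part of the content is missing in chapter 2

Dear Xiaowu,

It seems Chapter 2 is missing the content below. Will it be added in a later version?

2.2 Fundamentals of Deep Learning Systems
2.2.1 Representation of Deep Learning Computation
2.2.2 Compilation Frameworks and Intermediate Representations
2.2.3 Runtime and Hardware
2.2.4 Distributed Execution
2.2.5 Performance Optimization of Deep Learning Systems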

figure broken link in 12.3

textbook 9

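  • Automated machine learning: is this the NNI work we did? I saw lab 8. Will lab 8 be printed in the book? If not, I suggest adding the content of lab 8 to Chapter 9 to flesh it out.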

textbook 13.2

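  • 13.2.4: each line has only a few words and nothing forms a coherent paragraph, so it is hard to understand what the author wants to express.
  • 13.2.3: GP, MPI, EI, and UCB are all good points to expand on. Even assuming readers already know these concepts (which I think is unlikely), the details should be explained in the context of concrete application scenarios. That is how a book is written.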
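
To make the 13.2.3 acquisition functions concrete, here is a minimal sketch of their standard closed forms given a GP posterior mean and standard deviation at a candidate point. It assumes a maximization setting, reads MPI as maximum probability of improvement, and uses illustrative names and parameters (xi, kappa, and the candidate values) that do not come from the textbook.

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probability_of_improvement(mu, sigma, best, xi=0.0):
    """P(f(x) > best + xi) under a Gaussian posterior N(mu, sigma^2)."""
    if sigma == 0.0:
        return 1.0 if mu > best + xi else 0.0
    return norm_cdf((mu - best - xi) / sigma)

def expected_improvement(mu, sigma, best, xi=0.0):
    """E[max(f(x) - best - xi, 0)] under a Gaussian posterior N(mu, sigma^2)."""
    if sigma == 0.0:
        return max(mu - best - xi, 0.0)
    z = (mu - best - xi) / sigma
    return (mu - best - xi) * norm_cdf(z) + sigma * norm_pdf(z)

def upper_confidence_bound(mu, sigma, kappa=2.0):
    """Optimistic estimate: posterior mean plus kappa standard deviations."""
    return mu + kappa * sigma

# Hypothetical posterior (mean, std) at three candidate configurations.
candidates = {"a": (0.80, 0.05), "b": (0.75, 0.20), "c": (0.60, 0.40)}
best_so_far = 0.78

for name, (mu, sigma) in candidates.items():
    print(name,
          round(probability_of_improvement(mu, sigma, best_so_far), 4),
          round(expected_improvement(mu, sigma, best_so_far), 4),
          round(upper_confidence_bound(mu, sigma), 4))
```

A scenario such as hyperparameter tuning fits this shape: each candidate is a configuration, mu and sigma come from the GP surrogate, and the next trial is the candidate that maximizes the chosen acquisition function. The three functions trade off exploitation (high mu) against exploration (high sigma) differently.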

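2021_USTC_JointPhD - AI Systems Practice - Project Submission

This issue serves as the submission point for results of the 2021_USTC_JointPhD AI Systems Practice project.
Project deadline: May 26
Please submit your results under this issue before 17:00 Beijing time on May 27, 2021.

Submission steps:

  1. Create a public repository
    Please use your personal GitHub account to create a public repository named MSRA-USTC-AISystemProject-2021
  2. Upload your results
    Please organize your results in the directory structure below, remove any private personal information, and upload them to the public repository created in step 1
# Lab 1
|-- lab 1
    # Put all images in the images directory
    |-- images
        |-- image1.png
        |-- image2.jpg
    # Put all code in the src directory
    |-- src
        |-- code1.py
        |-- code2.ipynb
    # Put all other files in the resources directory
    |-- resources
    # In README.md, describe the contents of each file, your experiment procedure, and your results
    |-- README.md
# Lab 6
|-- lab 6
    |-- ...
  3. Submit your results
    Reply to this issue in the following format, taking care to protect your personal privacy
1. Email address provided at registration: yourmail[at]your.domain
2. Personal repository URL: https://github.com/yourname/yourrepositories
3. Additional information (optional)

If you have any other questions, please feel free to contact the teaching assistants.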
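
For anyone who wants to create the required layout in one go, a small helper along these lines may be convenient. It is only a sketch: the function name scaffold and the choice of labs are illustrative, and it writes into a temporary directory rather than a real repository.

```python
from pathlib import Path
import tempfile

def scaffold(root, labs):
    """Create images/, src/, resources/ and a README.md stub for each lab under root."""
    for lab in labs:
        lab_dir = Path(root) / lab
        for sub in ("images", "src", "resources"):
            (lab_dir / sub).mkdir(parents=True, exist_ok=True)
        readme = lab_dir / "README.md"
        if not readme.exists():
            readme.write_text("# " + lab + "\n", encoding="utf-8")

# Demo in a throwaway directory; point root at your cloned repository instead.
root = Path(tempfile.mkdtemp())
scaffold(root, ["lab 1", "lab 6"])
for path in sorted(root.rglob("*")):
    print(path.relative_to(root))
```

Note that git does not track empty directories, so images, src, and resources only appear in the pushed repository once they contain files.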
