sklearn-doc-zh's Introduction

Chinese translation of the official scikit-learn (sklearn) documentation


sklearn 0.21.3 Chinese documentation | sklearn 0.21.3 Chinese examples | sklearn official English site


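Introduction

sklearn (scikit-learn) is a machine learning toolkit for Python.

  1. Simple and efficient tools for data mining and data analysis
  2. Accessible to everybody and reusable in various contexts
  3. Built on NumPy, SciPy, and matplotlib
  4. Open source and commercially usable under the BSD license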

Sites built by the organization

Sites run by third-party webmasters

  • Address A: xxx (leave a comment and we will add it to the list)

Other notes

Download

Docker

docker pull apachecn0/sklearn-doc-zh
docker run -tid -p <port>:80 apachecn0/sklearn-doc-zh
# Open http://localhost:<port> to view the documentation

PYPI

pip install sklearn-doc-zh
sklearn-doc-zh <port>
# Open http://localhost:<port> to view the documentation

NPM

npm install -g sklearn-doc-zh
sklearn-doc-zh <port>
# Open http://localhost:<port> to view the documentation

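Table of contents

Previous versions

How to build and use a previous version:

  • Unzip the 0.19.x.zip archive
  • Copy the image resources from master/img into the 0.19.x folder
  • Run the normal gitbook build; sh run_website.sh can be used for this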
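
The first two steps above could be scripted roughly as follows. This is a minimal sketch, not the project's own tooling: the function name prepare_legacy_docs, the assumption that the archive unpacks directly into the destination folder, and the img subfolder name inside it are all illustrative guesses. The final gitbook build is left as a manual step.

```python
import shutil
import zipfile
from pathlib import Path


def prepare_legacy_docs(archive: Path, img_source: Path, dest: Path) -> Path:
    """Unpack a legacy docs archive and copy shared images next to it.

    Hypothetical helper: the layout (images under dest / "img") is an
    assumption, not something the project documents.
    """
    dest.mkdir(parents=True, exist_ok=True)
    # Step 1: unzip the archive (e.g. 0.19.x.zip) into the destination folder.
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
    # Step 2: copy the image resources (e.g. master/img) into that folder.
    img_target = dest / "img"
    shutil.copytree(img_source, img_target, dirs_exist_ok=True)  # Python 3.8+
    # Step 3 (not automated here): run the gitbook build, e.g. sh run_website.sh
    return img_target
```

A call might look like prepare_legacy_docs(Path("0.19.x.zip"), Path("master/img"), Path("0.19.x")), with the paths adjusted to wherever the repository is checked out.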

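Contribution guide

To keep improving translation quality, we have launched a translation, proofreading, and note-organizing campaign with several proofreading projects. Contributors who proofread a chapter can claim a reward of 2 to 4 yuan per thousand characters. See the activity list for ongoing proofreading activities. For more details, contact 飞龙 (Q562826179, V: wizardforcel).

DOCX: an initiative for open sharing of research records

We actively support the open-source research initiative (DOCX). Open source today covers more than source code: it also includes datasets, models, tutorials, and experiment records. We are also exploring open-source schemes and licenses for other categories of work.

We hope everyone learns about this initiative, connects it with their own interests, and contributes what they can. Every small contribution adds up to the open-source ecosystem as a whole.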

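Project leads

Format: GitHub + QQ

Round 1 (2017-09-29)

Round 2 (2019-06-29)

-- Requirements for project leads (you are welcome to help with the Chinese edition of sklearn):

  • Loves open source and enjoys showing off
  • Has used sklearn for a long time (at least 0.5 years) and has submitted >= 3 pull requests
  • Has time to fix page bugs and respond to user issues promptly
  • Trial period: 2 months
  • Contact: 片刻 529815144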

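Contributors

[0.19.X] Contributor list

Suggestions and feedback

Project license

  • Many people have contacted us recently about content licensing.
  • Open source means knowledge should be spread and iterated on (not locked away from reposting).
  • Publishing on GitHub as open source and then forbidding reposts would be absurd.
  • Commercial use is prohibited; follow the license terms and credit the source URL. Important: you do not need to email us for permission.
  • Projects under the ApacheCN account that have no license are treated as CC BY-NC-SA 4.0.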

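A friendly reminder:

  • To anyone thinking of copying the project and maintaining it separately:
  • I have been there myself, but that kind of enthusiasm lasts only a few months before it runs out.
  • It wastes your effort, and it also means fewer people ever see your translation. That is a pity, don't you think?
  • My suggestion: fork, then send pull requests to https://github.com/apachecn/sklearn-doc-zh
  • So why choose ApacheCN?
  • Because we translate for the fun of it and for bragging rights, nothing more.
  • If you like it, you can join or even lead this project; there are no requirements on education or background.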

Sponsor us

WeChat & Alipay

sklearn-doc-zh's People

Contributors

barrycg, cfgbd, creling, dansyu, jiangzhonglian, junxnonex, kaiwangkw, lingrenkong, loopyme, lovelybuggies, mahaoyang, mba1398, qinhanmin2014, vprincekin, wizardforcel, yeyun1999, zhangxinlong633, zhangzhuobys, zhouchunpong

sklearn-doc-zh's Issues

2.3.10.6. Calinski-Harabasz Index

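The original documentation uses sklearn.metrics.calinski_harabasz_score, as follows:
import numpy as np
from sklearn import metrics
from sklearn.cluster import KMeans
# X: feature matrix defined earlier in the guide (not shown in this issue)
kmeans_model = KMeans(n_clusters=3, random_state=1).fit(X)
labels = kmeans_model.labels_
metrics.calinski_harabasz_score(X, labels)
The Chinese version is missing one letter in this name.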
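
For reference, the score that calinski_harabasz_score reports is the ratio of between-cluster dispersion to within-cluster dispersion, scaled by (n - k) / (k - 1). The sketch below recomputes that ratio in plain Python for a tiny hand-made dataset. The function name ch_index and the sample points are illustrative assumptions, and this is not scikit-learn's implementation.

```python
from collections import defaultdict


def ch_index(points, labels):
    """Calinski-Harabasz index: [B / (k - 1)] / [W / (n - k)].

    B is the between-cluster dispersion and W the within-cluster dispersion,
    both measured as sums of squared Euclidean distances.
    Assumes k >= 2 clusters and W > 0 (no division-by-zero handling).
    """
    n = len(points)
    dim = len(points[0])
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    k = len(clusters)

    def mean(rows):
        return [sum(r[d] for r in rows) / len(rows) for d in range(dim)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = mean(points)
    between = 0.0
    within = 0.0
    for members in clusters.values():
        centroid = mean(members)
        between += len(members) * sq_dist(centroid, overall)
        within += sum(sq_dist(p, centroid) for p in members)
    return (between / (k - 1)) / (within / (n - k))


# Two tight clusters that lie far apart: B = 100, W = 4, n = 4, k = 2.
X_demo = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
labels_demo = [0, 0, 1, 1]
score = ch_index(X_demo, labels_demo)  # (100 / 1) / (4 / 2) = 50.0
```

A higher value indicates denser, better-separated clusters, which is why the tight, distant pair above scores well.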

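2.3.2. K-means: 流行 -> 流形?

The Chinese text reads: "惯性假设簇是凸(convex)的和各项同性(isotropic),这并不是总是对的。它对 细长的簇或具有不规则形状的流行反应不佳。" (Inertia assumes that clusters are convex and isotropic, which is not always the case. It responds poorly to elongated clusters or manifolds with irregular shapes.) Should 流行 ("popular") here be 流形 ("manifold")?

A suggested rewording for another sentence: "并行化(Parallelization)通常以内存的代价(cost of memory)加速计算(在这种情况下,需要存储多个质心副本,每个作业(job)使用一个副本)。"
->
"并行化(Parallelization)通常以内存为代价(cost of memory)来加速计算(在这种情况下,需要存储多个质心副本,每个作业(job)使用一个副本)。" (Parallelization generally speeds up computation at the cost of memory: in this case, multiple copies of the centroids need to be stored, one for each job.)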

Error when building the offline HTML

Building the offline HTML fails. The steps were:

git clone https://github.com/apachecn/scikit-learn-doc-zh.git

cd scikit-learn-doc-zh/doc/zh/

make html

The error output is as follows:

# These two lines make the build a bit more lengthy, and the
# the embedding of images more robust
rm -rf _build/html/_images
#rm -rf _build/doctrees/
sphinx-build -b html -d _build/doctrees   . _build/html/stable
Running Sphinx v1.6.6
making output directory...
loading pickled environment... not yet created
[autosummary] generating autosummary for: about.rst, data_transforms.rst, datasets/covtype.rst, datasets/index.rst, datasets/kddcup99.rst, datasets/labeled_faces.rst, datasets/mldata.rst, datasets/olivetti_faces.rst, datasets/rcv1.rst, datasets/twenty_newsgroups.rst, ..., tutorial/statistical_inference/index.rst, tutorial/statistical_inference/model_selection.rst, tutorial/statistical_inference/putting_together.rst, tutorial/statistical_inference/settings.rst, tutorial/statistical_inference/supervised_learning.rst, tutorial/statistical_inference/unsupervised_learning.rst, tutorial/text_analytics/working_with_text_data.rst, unsupervised_learning.rst, user_guide.rst, whats_new.rst
/Users/heng/anaconda3/lib/python3.6/site-packages/sklearn/cross_validation.py:41: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20.
  "This module will be removed in 0.20.", DeprecationWarning)
/Users/heng/anaconda3/lib/python3.6/site-packages/sklearn/grid_search.py:42: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. This module will be removed in 0.20.
  DeprecationWarning)
/Users/heng/anaconda3/lib/python3.6/site-packages/sklearn/learning_curve.py:22: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the functions are moved. This module will be removed in 0.20
  DeprecationWarning)
WARNING: [autosummary] failed to import 'sklearn.metrics.dcg_score': no module named sklearn.metrics.dcg_score
WARNING: [autosummary] failed to import 'sklearn.metrics.ndcg_score': no module named sklearn.metrics.ndcg_score
[autosummary] generating autosummary for: /Users/heng/Desktop/scikit-learn-doc-zh/doc/zh/modules/generated/sklearn.base.BaseEstimator.rst, /Users/heng/Desktop/scikit-learn-doc-zh/doc/zh/modules/generated/sklearn.base.ClassifierMixin.rst, /Users/heng/Desktop/scikit-learn-doc-zh/doc/zh/modules/generated/sklearn.base.ClusterMixin.rst, /Users/heng/Desktop/scikit-learn-doc-zh/doc/zh/modules/generated/sklearn.base.RegressorMixin.rst, /Users/heng/Desktop/scikit-learn-doc-zh/doc/zh/modules/generated/sklearn.base.TransformerMixin.rst, /Users/heng/Desktop/scikit-learn-doc-zh/doc/zh/modules/generated/sklearn.base.clone.rst, /Users/heng/Desktop/scikit-learn-doc-zh/doc/zh/modules/generated/sklearn.calibration.CalibratedClassifierCV.rst, /Users/heng/Desktop/scikit-learn-doc-zh/doc/zh/modules/generated/sklearn.calibration.calibration_curve.rst, /Users/heng/Desktop/scikit-learn-doc-zh/doc/zh/modules/generated/sklearn.cluster.AffinityPropagation.rst, /Users/heng/Desktop/scikit-learn-doc-zh/doc/zh/modules/generated/sklearn.cluster.AgglomerativeClustering.rst, ..., /Users/heng/Desktop/scikit-learn-doc-zh/doc/zh/modules/generated/sklearn.utils.sparsefuncs.incr_mean_variance_axis.rst, /Users/heng/Desktop/scikit-learn-doc-zh/doc/zh/modules/generated/sklearn.utils.sparsefuncs.inplace_column_scale.rst, /Users/heng/Desktop/scikit-learn-doc-zh/doc/zh/modules/generated/sklearn.utils.sparsefuncs.inplace_row_scale.rst, /Users/heng/Desktop/scikit-learn-doc-zh/doc/zh/modules/generated/sklearn.utils.sparsefuncs.inplace_swap_column.rst, /Users/heng/Desktop/scikit-learn-doc-zh/doc/zh/modules/generated/sklearn.utils.sparsefuncs.inplace_swap_row.rst, /Users/heng/Desktop/scikit-learn-doc-zh/doc/zh/modules/generated/sklearn.utils.sparsefuncs.mean_variance_axis.rst, /Users/heng/Desktop/scikit-learn-doc-zh/doc/zh/modules/generated/sklearn.utils.validation.check_is_fitted.rst, /Users/heng/Desktop/scikit-learn-doc-zh/doc/zh/modules/generated/sklearn.utils.validation.check_symmetric.rst, 
/Users/heng/Desktop/scikit-learn-doc-zh/doc/zh/modules/generated/sklearn.utils.validation.column_or_1d.rst, /Users/heng/Desktop/scikit-learn-doc-zh/doc/zh/modules/generated/sklearn.utils.validation.has_fit_parameter.rst
Generating gallery
________________________________________________________________________________
Example directory /Users/heng/Desktop/scikit-learn-doc-zh/doc/zh/../examples does not have a README.txt file
Skipping this directory
________________________________________________________________________________

Exception occurred:
  File "/Users/heng/Desktop/scikit-learn-doc-zh/doc/zh/sphinxext/sphinx_gallery/gen_gallery.py", line 179, in generate_gallery_rst
    .format(examples_dir))
FileNotFoundError: Main example directory /Users/heng/Desktop/scikit-learn-doc-zh/doc/zh/../examples does not have a README.txt file. Please write one to introduce your gallery.
The full traceback has been saved in /var/folders/tq/yl5cpg0n0wn7st6njdlv4v100000gn/T/sphinx-err-phnhi5pn.log, if you want to report the issue to the developers.
Please also report this if it was a user error, so that a better error message can be provided next time.
A bug report can be filed in the tracker at <https://github.com/sphinx-doc/sphinx/issues>. Thanks!
make: *** [html] Error 1
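
The FileNotFoundError at the end of the log is raised because the examples directory has no README.txt. One possible stopgap, assuming a placeholder gallery header is acceptable, is to create that file before running make html. The helper name ensure_gallery_readme and the placeholder title are hypothetical. This only satisfies the existence check and does not restore the missing example scripts or their images.

```python
from pathlib import Path

# Placeholder content with a reStructuredText title (an assumption about
# what a minimal gallery header should contain).
PLACEHOLDER = "Examples\n========\n\nPlaceholder gallery header.\n"


def ensure_gallery_readme(examples_dir: Path) -> Path:
    """Create examples_dir/README.txt if it does not exist yet."""
    examples_dir.mkdir(parents=True, exist_ok=True)
    readme = examples_dir / "README.txt"
    if not readme.exists():
        # An existing README is left untouched.
        readme.write_text(PLACEHOLDER, encoding="utf-8")
    return readme


# Example call, using the relative path from the log (run from doc/zh):
# ensure_gallery_readme(Path("../examples"))
```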

TODO

  • Add translations for the examples section

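Broken image links

I followed the readme to build a local static HTML version.
First, this version of the docs only works with sklearn 0.19.0. Newer versions raise errors, and 0.20 raises even more, probably because of the many API changes. So I downgraded to the specified version, 0.19.

make html then reported that it needs a gallery README.txt file. The cloned repository does not contain one, so I copied an arbitrary one. After that the static pages could be built, but with a large number of warnings that all seem to be image-related. Browsing the generated docs, I found that most image links are broken.

Could you provide a proper gallery README.txt file so that the image links resolve correctly?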

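[0.19.X] Contributor list

Translators (every one of them an expert~):

Reviewers (every one of them an expert~) (the list is not complete yet; contributors are welcome to edit it)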

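1.4.2 Regression: typo in the third paragraph

The text reads: "支持向量分类有三种不同的实现形式: SVR, NuSVR 和 LinearSVR." (There are three different implementations of support vector classification: SVR, NuSVR and LinearSVR.)

It should be:
"支持向量回归有三种不同的实现形式: SVR, NuSVR 和 LinearSVR." (There are three different implementations of support vector regression: SVR, NuSVR and LinearSVR.)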

The TSNE example code raises an error

import numpy as np
from sklearn.manifold import TSNE
X = np.array([[0, 0, 0], [0, 1, 1], [1, 0, 1], [1, 1, 1]])
X_embedded = TSNE(n_components=2, learning_rate='auto', init='random').fit_transform(X)

Error: TypeError: ufunc 'multiply' did not contain a loop with signature matching types dtype('<U32') dtype('<U32') dtype('<U32')
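
One plausible cause, not confirmed in this issue, is an installed scikit-learn release whose TSNE does not yet accept the string 'auto' for learning_rate, so the string ends up in a numeric multiplication. Under that assumption, a workaround is to pass the number that newer releases document for 'auto', namely max(N / early_exaggeration / 4, 50). The helper name auto_learning_rate is hypothetical.

```python
def auto_learning_rate(n_samples: int, early_exaggeration: float = 12.0) -> float:
    """Numeric stand-in for learning_rate='auto'.

    Uses the documented heuristic max(N / early_exaggeration / 4, 50),
    with 12.0 as the default early_exaggeration.
    """
    return max(n_samples / early_exaggeration / 4.0, 50.0)


# For the 4-sample array in this issue the heuristic falls back to the floor of 50.
lr = auto_learning_rate(4)

# Hypothetical usage with the snippet from the issue (not executed here):
# X_embedded = TSNE(n_components=2, learning_rate=lr, init='random').fit_transform(X)
```

Upgrading scikit-learn to a release that accepts 'auto' would be the other obvious route, if the environment allows it.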

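Link question

While proofreading I found that some links point to the original documentation. Should we leave them that way, or should they point to the Chinese documentation instead?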

Running sklearn-doc-zh on the command line raises an error. How can this be fixed?

C:\XXX>sklearn-doc-zh 8889
Traceback (most recent call last):
  File "c:\XXX\anaconda3\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\XXX\anaconda3\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\XXX\anaconda3\Scripts\sklearn-doc-zh.exe\__main__.py", line 7, in <module>
  File "c:\XXX\anaconda3\lib\site-packages\SklearnDocZh\__main__.py", line 17, in main
    svr = ThreadingHTTPServer(addr, SimpleHTTPRequestHandler)
  File "c:\XXX\anaconda3\lib\socketserver.py", line 452, in __init__
    self.server_bind()
  File "c:\XXX\anaconda3\lib\http\server.py", line 140, in server_bind
    self.server_name = socket.getfqdn(host)
  File "c:\XXX\anaconda3\lib\socket.py", line 756, in getfqdn
    hostname, aliases, ipaddrs = gethostbyaddr(name)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa4 in position 3: invalid start byte
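
The traceback ends in socket.getfqdn(), which the standard library's HTTPServer.server_bind calls to fill in server_name; on this machine the lookup appears to return a host name that cannot be decoded as UTF-8. A possible workaround, sketched below under that assumption, is a server subclass that skips the lookup. The class name NoFqdnHTTPServer and the helper make_server are hypothetical and are not part of the sklearn-doc-zh package.

```python
import socketserver
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


class NoFqdnHTTPServer(ThreadingHTTPServer):
    """HTTP server that does not resolve its own host name on bind."""

    def server_bind(self):
        # Bind the socket as TCPServer does, but skip socket.getfqdn().
        socketserver.TCPServer.server_bind(self)
        host, port = self.server_address[:2]
        self.server_name = host
        self.server_port = port


def make_server(port: int, host: str = "127.0.0.1") -> NoFqdnHTTPServer:
    """Build a server that serves the current working directory."""
    return NoFqdnHTTPServer((host, port), SimpleHTTPRequestHandler)


# Usage (blocks until interrupted), e.g. for port 8889 as in the report:
# make_server(8889).serve_forever()
```

Renaming the computer to an ASCII-only host name may also avoid the error, though that is untested here.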

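Overall progress, v0.21.3 (proofreading)

Claiming rules

Do not change file names when submitting, because the file names correspond to the links in the original documentation!

Comment format: nickname + QQ + chapter

Nickname: indicates that this user's claim has expired and others may take over the task.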

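Chapter | Reviewer | Progress
Installing scikit-learn | |
User Guide | - | -
1. Supervised learning | @LovelyBuggies | 100%
1.1. Generalized Linear Models | @qinhanmin2014 | 100%
1.2. Linear and Quadratic Discriminant Analysis | @VPrincekin, @LovelyBuggies | 100%
1.3. Kernel ridge regression | @qinhanmin2014 | 100%
1.4. Support Vector Machines | @qinhanmin2014 | 100%
1.5. Stochastic Gradient Descent | @qinhanmin2014 | 100%
1.6. Nearest Neighbors | @qinhanmin2014 | 100%
1.7. Gaussian Processes | @LingrenKong |
1.8. Cross decomposition | @qinhanmin2014 | 100%
1.9. Naive Bayes | @qinhanmin2014 | 100%
1.10. Decision Trees | @wanruixiang |
1.11. Ensemble methods | @qinhanmin2014 | 100%
1.12. Multiclass and multilabel algorithms | |
1.13. Feature selection | |
1.14. Semi-supervised learning | @LovelyBuggies | 100%
1.15. Isotonic regression | |
1.16. Probability calibration | |
1.17. Neural network models (supervised) | @LovelyBuggies | 100%
2. Unsupervised learning | |
2.1. Gaussian mixture models | @barrycg | 100%
2.2. Manifold learning | @barrycg | 100%
2.3. Clustering | @barrycg | 100%
2.4. Biclustering | @barrycg | 100%
2.5. Decomposing signals in components (matrix factorization problems) | @barrycg | 100%
2.6. Covariance estimation | @barrycg | 100%
2.7. Novelty and outlier detection | @barrycg | 100%
2.8. Density estimation | @barrycg | 100%
2.9. Neural network models (unsupervised) | @barrycg, @LovelyBuggies | 100%
3. Model selection and evaluation | |
3.1. Cross-validation: evaluating estimator performance | @P3n9W31 |
3.2. Tuning the hyper-parameters of an estimator | @P3n9W31 |
3.3. Model evaluation: quantifying the quality of predictions | @P3n9W31 |
3.4. Model persistence | @P3n9W31 |
3.5. Validation curves: plotting scores to evaluate models | @P3n9W31 |
4. Inspection | |
4.1. Partial dependence plots | |
5. Dataset transformations | @VPrincekin | 100%
5.1. Pipeline and FeatureUnion: combining estimators | @VPrincekin | 100%
5.2. Feature extraction | @VPrincekin | 100%
5.3. Preprocessing data | @VPrincekin |
5.4. Imputation of missing values | @VPrincekin |
5.5. Unsupervised dimensionality reduction | @VPrincekin |
5.6. Random projection | |
5.7. Kernel approximation | |
5.8. Pairwise metrics, affinities and kernels | |
5.9. Transforming the prediction target (y) | |
6. Dataset loading utilities | |
6.1. General dataset API | |
6.2. Toy datasets | |
6.3. Real world datasets | |
6.4. Sample generators | |
6.5. Loading other datasets | |
7. Computing with scikit-learn | |
7.1. Strategies to scale computationally: bigger data | |
7.2. Computational performance | |
7.3. Parallelism, resource management, and configuration | |
Tutorials | |
An introduction to machine learning with scikit-learn | |
A tutorial on statistical learning for scientific data processing | |
Statistical learning: the setting and the estimator object in scikit-learn | |
Supervised learning: predicting an output variable from high-dimensional observations | @LovelyBuggies | 100%
Model selection: choosing estimators and their parameters | |
Unsupervised learning: seeking representations of the data | |
Putting it all together | |
Finding help | |
Working with text data | |
Choosing the right estimator (estimator.md) | |
External resources, videos and talks | |
API reference | |
Frequently asked questions | |
Timeline | |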
