Comments (53)

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
[deleted comment]

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
[deleted comment]

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
10. The publishing feature should support file uploads.
11. Email notifications should support attachments.

Original comment by [email protected] on 19 Sep 2012 at 2:08

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
12. Add a filter interface and a filter plugin to the crawl rules.

Original comment by [email protected] on 20 Sep 2012 at 3:34

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
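13. Loop matching: when crawl rules are not allowed to be empty and some of the rules for one record match nothing, no further records can be matched. The logic needs to change so that matching continues with the next record from the last matched position.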
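A minimal sketch of the behavior requested in item 13, not NewCrawler's actual matcher. It assumes each rule is a regular expression with one capture group, that the first rule marks the start of a record, and that a record ends where the next match of the first rule begins. The function name `match_records` and the record layout are hypothetical.

```python
import re

def match_records(text, rules):
    """Match records in a loop without stopping at an incomplete record.

    rules: ordered list of (field_name, compiled_regex); the first rule is
    assumed to mark the start of each record (an assumption of this sketch).
    """
    first_field, first_pattern = rules[0]
    records = []
    pos = 0
    while True:
        start = first_pattern.search(text, pos)
        if start is None:
            break  # no more records
        # The record's region ends where the next record starts.
        nxt = first_pattern.search(text, start.end())
        region_end = nxt.start() if nxt else len(text)

        record = {first_field: start.group(1)}
        cursor = start.end()
        complete = True
        for field, pattern in rules[1:]:
            m = pattern.search(text, cursor, region_end)
            if m is None:
                complete = False  # empty match not allowed: drop this record
                break
            record[field] = m.group(1)
            cursor = m.end()
        if complete:
            records.append(record)
        # Resume from the last matched position instead of giving up.
        pos = cursor if cursor > pos else pos + 1
    return records


rules = [
    ("name", re.compile(r"name: (\w+)")),
    ("price", re.compile(r"price: (\d+)")),
]
text = "name: A price: 1\nname: B\nname: C price: 3"
result = match_records(text, rules)
```

Here record B has no price, so it is skipped, and record C is still matched.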

Original comment by [email protected] on 6 Nov 2012 at 9:27

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
14. A pagination rule's index number can be the same as a crawl rule's index number.

Original comment by [email protected] on 9 Nov 2012 at 9:01

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
15. Updating a newly added crawl rule produces an error.

Original comment by [email protected] on 9 Nov 2012 at 9:10

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
[deleted comment]

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
16. Account management: language and time zone settings.

Original comment by [email protected] on 26 Jan 2013 at 3:25

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
17. When the value of a select is changed, the select on every page should be updated.

Original comment by [email protected] on 26 Jan 2013 at 3:27

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
18. Add "tag combination" to the crawl rules.

Original comment by [email protected] on 15 Apr 2013 at 5:59

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
19. Redesign logging: (1) store logs in memory (GAE); (2) store logs in the DB.
20. Allow the crawl rate of each site to be set from the front end.

Original comment by [email protected] on 20 Jul 2013 at 8:57

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
21. Statistics for the task queue, crawled URLs (daily), and collected data (daily).
22. View logs from the front end.

Original comment by [email protected] on 10 Aug 2013 at 1:29

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
[deleted comment]

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
23. Change how the crawl rule area is displayed when the data source is another tag.

Original comment by [email protected] on 29 Nov 2013 at 6:59

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
24. Improve asynchronous loading of tabs.
25. Improve cookie settings for nested crawling.

Original comment by [email protected] on 18 Feb 2014 at 1:40

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
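26. Implement viewing the run status of crawl rules in scheduled tasks.
27. When an XPath is read, allow it to be added directly to the crawl rules.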

Original comment by [email protected] on 13 Oct 2014 at 8:01

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
[deleted comment]

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
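29. Add querying by storage date to the collected data list page.
30. Count statistics should stay in sync with the statistics type field.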

Original comment by [email protected] on 13 Oct 2014 at 8:32

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
31. When a crawl test matches no data, indicate which rule failed to match.

Original comment by [email protected] on 16 Oct 2014 at 8:36

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
32. Site Management > HTTP request configuration: the window cannot be opened.

Original comment by [email protected] on 17 Oct 2014 at 1:03

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
33. Layout problem with field merging in crawl rules.

Original comment by [email protected] on 23 Oct 2014 at 3:27

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
34. JS dependency analysis fails.

Original comment by [email protected] on 23 Oct 2014 at 3:30

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
35. When load throws an exception, close the loading indicator.

Original comment by [email protected] on 23 Oct 2014 at 6:36

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
36. The start index is wrong when querying on the data list page.

Original comment by [email protected] on 23 Oct 2014 at 8:18

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
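37. Add execution logs for scheduled tasks.
38. Add enqueue statistics and completion statistics to the "Automatic data collection" scheduled task.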

Original comment by [email protected] on 6 Nov 2014 at 3:10

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
39. Change the site encoding "auto-detect" option so that the encoding is detected on every fetch.

Original comment by [email protected] on 6 Nov 2014 at 3:15

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
40. Change the class names used by the XPath extraction tool to avoid class conflicts.

Original comment by [email protected] on 19 Nov 2014 at 3:46

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
41. Add outerHTML, innerHTML, and innerTEXT properties to XPath matching.
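To illustrate what the three properties in item 41 would return for a matched node, here is a rough sketch using Python's standard `xml.etree.ElementTree` as a stand-in. It is not NewCrawler's implementation; the helper names `outer_html`, `inner_html`, and `inner_text` are hypothetical, and the input is assumed to be well-formed markup.

```python
import copy
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

def outer_html(elem):
    """Serialize the element itself, without the text that follows it."""
    clone = copy.copy(elem)  # shallow copy so the original tail is untouched
    clone.tail = None
    return ET.tostring(clone, encoding="unicode")

def inner_html(elem):
    """Serialize only the element's content: leading text plus children."""
    parts = [escape(elem.text or "")]
    # Serializing a child also emits the text that follows it (its tail).
    parts.extend(ET.tostring(child, encoding="unicode") for child in elem)
    return "".join(parts)

def inner_text(elem):
    """Concatenate all text inside the element, dropping the markup."""
    return "".join(elem.itertext())


root = ET.fromstring("<div><p>Hello <b>world</b>!</p> after</div>")
node = root.find(".//p")  # ElementTree supports a limited XPath subset
```

For the `<p>` node above, `outer_html` keeps the enclosing `<p>` tags, `inner_html` drops them, and `inner_text` drops all tags.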

Original comment by [email protected] on 25 Nov 2014 at 3:20

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
[deleted comment]

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
43. Add crawl queue management, such as deleting, stopping, and running queues.

Original comment by [email protected] on 28 Nov 2014 at 3:37

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
44. Auto-refresh the data in the statistics feature.

Original comment by [email protected] on 9 Dec 2014 at 6:57

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
45. Export to CSV.

Original comment by [email protected] on 9 Dec 2014 at 7:09

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
[deleted comment]

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
47. Offer the collector as a service: expose a crawl API that supports both asynchronous and synchronous responses.

Original comment by [email protected] on 16 Dec 2014 at 1:44

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
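48. Add a "maximum crawl queue count" setting to Site Management; when it is empty or less than 1, no limit applies. When a scheduled task runs "Automatic data collection", it checks the number of unfinished tasks for the current site and does not start this crawl task if the limit is exceeded. This avoids starting too many tasks and exhausting system resources.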
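The check described in item 48 could look roughly like the sketch below. The function and parameter names are hypothetical. The item says the task is skipped when the limit is "exceeded", so this sketch still allows a start when the unfinished count equals the limit; that boundary choice is an assumption.

```python
from typing import Optional

def should_start_crawl(unfinished_tasks: int, max_queue: Optional[int]) -> bool:
    """Decide whether the scheduler may start a new crawl task for a site."""
    # Empty (None) or less than 1 means the site has no limit.
    if max_queue is None or max_queue < 1:
        return True
    # Skip this run only when the unfinished count exceeds the limit.
    return unfinished_tasks <= max_queue
```

A scheduler would call this once per site before enqueueing the "Automatic data collection" task and simply skip the run when it returns `False`.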

Original comment by [email protected] on 16 Dec 2014 at 7:00

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
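49. Improve the web front end:
1. Optimize response time: CDN acceleration, multi-node synchronization (smart DNS acceleration).
2. Use a queuing mechanism for GAE online installation.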

Original comment by [email protected] on 24 Dec 2014 at 7:49

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
50. Queue SYNC_FULL needs logic to handle CPU timeouts.

Original comment by [email protected] on 14 Jan 2015 at 1:33

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
51. Batch URL adding, change:
Separate multiple URLs with '|$|'
to
Separate multiple URLs with a line break or '|$|'
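As an illustration of item 51, the following sketch splits a batch input on either a line break or the `'|$|'` separator. The function name `split_urls` is hypothetical, and skipping blank entries is an assumption rather than something the item specifies.

```python
import re
from typing import List

# A line break (LF or CRLF) or the literal '|$|' separates URLs.
_SEPARATOR = re.compile(r"\r?\n|\|\$\|")

def split_urls(raw: str) -> List[str]:
    """Split a batch input into individual URLs, dropping blank entries."""
    return [part.strip() for part in _SEPARATOR.split(raw) if part.strip()]


urls = split_urls("http://a.example/1|$|http://a.example/2\nhttp://a.example/3\r\n")
```

Both separators can be mixed in the same input, so existing `'|$|'` lists keep working.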

Original comment by [email protected] on 16 Jan 2015 at 8:14

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
52. Implement password recovery.

Original comment by [email protected] on 19 Jan 2015 at 8:31

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
53. Global server selection for newcrawler.com.

Original comment by [email protected] on 19 Jan 2015 at 8:33

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
54. Data publishing rules: hide them by default and add a button to show them.

Original comment by [email protected] on 19 Jan 2015 at 8:36

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
[deleted comment]

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
55. Quick start: add visual rule creation.
56. Add a data query API that provides JSON and CSV formats.
57. Crawler pool configuration: implement load balancing.

Original comment by [email protected] on 12 Mar 2015 at 3:15

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
58. Show a loading image during asynchronous queries.

Original comment by [email protected] on 12 Mar 2015 at 3:26

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
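59. Allow a "trigger fetch exception" setting to be configured for each site.
   After a page is fetched, check whether its content contains exception text (such as an anti-crawler CAPTCHA prompt). If it does, the system throws a fetch exception and, by default, retries the fetch once.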
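One way the check in item 59 might be structured is sketched below. `FetchError`, `fetch_with_check`, and the `fetch` callable are all hypothetical names; the sketch assumes the per-site setting is a list of marker strings and that one retry is the default, as the item states.

```python
class FetchError(Exception):
    """Raised when a fetched page contains configured exception text."""

def fetch_with_check(fetch, url, error_markers, retries=1):
    """Fetch a page and fail if it contains any configured exception text.

    fetch: callable taking a URL and returning the page body as a string.
    error_markers: strings configured for the site, e.g. a CAPTCHA prompt.
    """
    last_error = None
    for _ in range(retries + 1):  # first attempt plus the retries
        body = fetch(url)
        hit = next((m for m in error_markers if m in body), None)
        if hit is None:
            return body
        last_error = FetchError(f"{url}: page contains exception text {hit!r}")
    raise last_error
```

Passing the fetch function in as a parameter keeps the check independent of any particular HTTP client.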

Original comment by [email protected] on 25 May 2015 at 8:57

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
60. Add a custom crawl rate.

Original comment by [email protected] on 26 May 2015 at 1:18

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
61. Verify that the locale in the cookie matches the language currently selected in the system.

Original comment by [email protected] on 29 May 2015 at 1:45

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
62. Crawler statistics are not taking effect.

Original comment by [email protected] on 29 May 2015 at 1:45

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
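63. Allow a default crawl rate to be configured for a crawler.
64. Callback check time. Description: the collector calls the crawler asynchronously. If the crawler fails to return a result for some reason, the URL must be crawled again. The callback check time defines how long the crawler may go without returning before a re-crawl is triggered.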
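The callback check time in item 64 can be pictured as a timeout on outstanding asynchronous calls. The sketch below is an assumption about how such bookkeeping might work, not the collector's real code; the class `PendingCalls` and its methods are hypothetical.

```python
import time

class PendingCalls:
    """Track URLs handed to a crawler and find those that never called back."""

    def __init__(self, callback_timeout: float):
        self.callback_timeout = callback_timeout  # seconds to wait for a result
        self._dispatched = {}  # url -> time the crawler was called

    def dispatch(self, url, now=None):
        """Record that the URL was sent to a crawler."""
        self._dispatched[url] = time.monotonic() if now is None else now

    def on_callback(self, url):
        """The crawler returned a result, so the URL is no longer pending."""
        self._dispatched.pop(url, None)

    def overdue(self, now=None):
        """Return URLs whose crawler has been silent for the full timeout."""
        now = time.monotonic() if now is None else now
        return [u for u, t in self._dispatched.items()
                if now - t >= self.callback_timeout]
```

A periodic job would call `overdue()` and dispatch each returned URL again, which restarts its timer.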

Original comment by [email protected] on 25 Jun 2015 at 3:48

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
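65. Compare versions after login and show a prominent notice when an update is needed.
66. Log viewer: right-align length and change its unit to KB; widen the lastmodified column.
67. Add password authentication for remote access to the crawler.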

Original comment by [email protected] on 25 Jun 2015 at 9:14

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
68. Link "Help" on the login page to the wiki.

Original comment by [email protected] on 29 Jun 2015 at 6:05

from newcrawler.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 29, 2024
69. Add a terms of service page.

Original comment by [email protected] on 29 Jun 2015 at 6:06

from newcrawler.
