discuz's Introduction

Discuz Forum Crawler

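  • Collects thread posts and comments from Discuz forums, including paginated and nested comments
  • Target fields: thread ID, user ID, user nickname, post content, and comment content
  • This crawler only collects data; it does no statistical analysis
  • This is a rough project put together in a hurry; if you need it for long-term use, you should rewrite it yourself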
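
The handling of paginated comments can be pictured as a loop that requests pages until one comes back empty. The following is only a minimal sketch under assumptions, not the project's actual code: `collect_comments`, `fetch_page`, and `max_pages` are hypothetical names, and the stand-in fetcher replaces real HTTP requests to a forum.

```python
from typing import Callable, Dict, List

Comment = Dict[str, str]


def collect_comments(fetch_page: Callable[[int], List[Comment]],
                     max_pages: int = 1000) -> List[Comment]:
    """Gather comments page by page until a page comes back empty.

    fetch_page is a hypothetical callable that returns the comments
    found on one page; max_pages guards against endless loops.
    """
    comments: List[Comment] = []
    for page in range(1, max_pages + 1):
        batch = fetch_page(page)
        if not batch:
            break
        comments.extend(batch)
    return comments


# Stand-in for a real HTTP fetch: two pages of comments, then nothing.
_fake_pages = {
    1: [{"uid": "1120010", "nickname": "李玉郎", "content": "first"}],
    2: [{"uid": "602341", "nickname": "lishiminv", "content": "second"}],
}


def fake_fetch(page: int) -> List[Comment]:
    return _fake_pages.get(page, [])


all_comments = collect_comments(fake_fetch)
print(len(all_comments))
```

Passing the fetcher in as an argument keeps the loop independent of any particular HTTP library, so it can be exercised without network access.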

Usage

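# DiscuzSpider is the spider class provided by this project
spider = DiscuzSpider()
# The argument is a numeric ID, presumably the thread ID (tid)
data = spider.parse(3846582)
print(data)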

Data Format

{
    "tid": 1669412,
    "title": "★蓝色石器时代游戏风格模板 FOR 7.2★",
    "uid": "970218",
    "nickname": "njynjy",
    "content": "\n\n 本帖最后由 njynjy 于 2010-5-18 18:38 编辑 \n\n蓝色石器时代游戏风格模板 FOR 7.2希望大家给 石器时代 http://www.53sa.com/  做个友情链接,我会继续发布第二套模板。如果您做好了链接,可以短消息我,我会给你第二套模板。\r\n本套模板图:\n\r\n下载地址:\n\r\n第二套模板(如果您做好了友情链接,可以短消息我,我会给你第二套模板。):\n\n",
    "comments": [
        {
            "uid": "1120010",
            "nickname": "李玉郎",
            "content": "\n\n\r\n沙发 沙发!\r\n还是第2个好看,期待着\n\n\n\n\n"
        },
        {
            "uid": "602341",
            "nickname": "lishiminv",
            "content": "\n\n\r\n前排支持。\n\n\n\n\n"
        }
    ]
}
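
The scraped `content` fields carry a lot of leftover whitespace, and the comments are nested under the opening post. A minimal sketch of post-processing this structure follows, assuming the result is a plain dict shaped like the sample above. `clean_text`, `flatten_thread`, and the `is_comment` key are hypothetical helpers for illustration, not part of the project, and the sample content is shortened.

```python
import json


def clean_text(text: str) -> str:
    """Collapse runs of whitespace (newlines, carriage returns, spaces)."""
    return " ".join(text.split())


def flatten_thread(thread: dict) -> list:
    """Return one row per post: the opening post first, then each comment."""
    rows = [{
        "tid": thread["tid"],
        "uid": thread["uid"],
        "nickname": thread["nickname"],
        "content": clean_text(thread["content"]),
        "is_comment": False,
    }]
    for comment in thread.get("comments", []):
        rows.append({
            "tid": thread["tid"],
            "uid": comment["uid"],
            "nickname": comment["nickname"],
            "content": clean_text(comment["content"]),
            "is_comment": True,
        })
    return rows


# Abbreviated copy of the sample record shown above.
sample = {
    "tid": 1669412,
    "title": "★蓝色石器时代游戏风格模板 FOR 7.2★",
    "uid": "970218",
    "nickname": "njynjy",
    "content": "\n\n 本帖最后由 njynjy 于 2010-5-18 18:38 编辑 \n\n",
    "comments": [
        {"uid": "1120010", "nickname": "李玉郎",
         "content": "\n\n\r\n沙发 沙发!\r\n还是第2个好看\n\n\n\n\n"},
        {"uid": "602341", "nickname": "lishiminv",
         "content": "\n\n\r\n前排支持。\n\n\n\n\n"},
    ],
}

rows = flatten_thread(sample)
# ensure_ascii=False keeps the Chinese text readable in the output.
print(json.dumps(rows, ensure_ascii=False, indent=2))
```

Flat rows of this kind are easier to write to CSV or a database than the nested record, which fits the crawler's stated scope of collecting data and leaving analysis to other tools.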
