wechatdownload's Introduction

wechatDownload

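A download tool for WeChat Official Account articles

This repository is no longer maintained. Thank you for using it.

Preface

I assume everyone who comes to GitHub is a friendly, technically minded person. Before opening an issue, make sure you have installed the certificate correctly by following the instructions below. In the issue, describe your environment in detail (OS version, software version, database version, etc.) and the problem you ran into, and attach the log (Settings Center -> Open Log Location).

Reference: How To Ask Questions The Smart Way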

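Project Introduction

Tech Stack

Electron + TypeScript + Vue 3

How It Works

Fetching the article list of a WeChat Official Account requires 3 special parameters:

  • __biz: the ID of the Official Account
  • uin: the ID of the WeChat user
  • key: purpose unknown

These 3 parameters are captured through an HTTP proxy; everything after that is ordinary web scraping.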
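
As a rough illustration of the capture step, the sketch below pulls the three parameters out of a request URL that a proxy has observed. The function name `extractWechatParams` and the `WechatParams` shape are hypothetical and not taken from the project; the query parameter names follow the `profile_ext` URL that appears in the issue logs further down, and the sample values are placeholders.

```typescript
// Hypothetical container for the three parameters (not the project's own type).
interface WechatParams {
  biz: string;
  uin: string;
  key: string;
}

// Returns the parameters if all three are present in the URL, otherwise null.
function extractWechatParams(capturedUrl: string): WechatParams | null {
  let query: URLSearchParams;
  try {
    query = new URL(capturedUrl).searchParams;
  } catch {
    // Not a parseable absolute URL.
    return null;
  }
  const biz = query.get("__biz");
  const uin = query.get("uin");
  const key = query.get("key");
  if (!biz || !uin || !key) {
    return null;
  }
  return { biz, uin, key };
}

// Placeholder values only; a real URL would carry session-specific data.
const sampleUrl =
  "https://mp.weixin.qq.com/mp/profile_ext?action=getmsg&f=json&__biz=EXAMPLE_BIZ&key=EXAMPLE_KEY&uin=EXAMPLE_UIN&offset=0";
console.log(extractWechatParams(sampleUrl));
```

Returning `null` when any parameter is missing lets the caller keep listening for further requests instead of failing on the first unrelated URL.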

Usage

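  • Single article download

    Enter the article link and click the download button.

    This mode does not require logging in to WeChat, so it cannot fetch comments or QQ Music audio embedded in the article. If you need either of these, use batch download or monitored download.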

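  • Batch download

    1. Install the certificate before first use.

      • Automatic installation (Windows only)

        Requires administrator privileges (right-click the application icon -> Run as administrator)

        Settings Center → Install Certificate

      • Manual installation

        Settings Center → Open Certificate Path → open the rootCA.crt file

    2. The desktop version of WeChat must be installed

    3. Click the batch download button to start listening for Official Account data

    4. In desktop WeChat, open an article from the Official Account you want to download

    5. Return to WechatDownload; a confirmation dialog will pop up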

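  • Monitored download

    1. The desktop version of WeChat must be installed

    2. In WechatDownload, click the monitored download button (the button changes color)

    3. In desktop WeChat, open the articles you want to download (you can open several)

    4. Return to WechatDownload and click the monitored download button again to start downloading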

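  • Save to MySQL

    Run the SQL statements in /doc/mysql.sql to create the tables first

  • Thread configuration

    Interval: in milliseconds. With an interval of 500, single-thread mode finishes downloading one article, waits 500 ms, then continues. Multi-thread mode starts an asynchronous download every 500 ms without waiting for the previous article to finish.

    Batch size: with a batch size of 10, 10 articles are downloaded asynchronously at the same time; once these 10 finish, the next 10 start.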

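  • Filter rules

    Keyword filtering on title and author is currently supported

    {
        "title": {
            "include": ["include keyword 1", "include keyword 2"],
            "exclude": ["exclude keyword 1", "exclude keyword 2"]
        },
        "auth": {
            "include": ["include keyword 1", "include keyword 2"],
            "exclude": ["exclude keyword 1", "exclude keyword 2"]
        }
    }

    For example, to require that the author is 张三 (Zhang San) and the title contains 好人 ("good person"), use

    {
        "title": {
            "include": ["好人"]
        },
        "auth": {
            "include": ["张三"]
        }
    }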
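
  • Generate Epub

    An Epub e-book can be generated from HTML files, so first use batch download to save the Official Account articles locally, then generate the Epub.

    The parameters are:

    • File name: required. For example, entering test produces a test.epub file

    • Folder: required. The folder containing the saved HTML files, which is the data source for the Epub

    • Cover image: the cover image for the Epub file; jpg and png are supported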
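
The interval and batch-size behaviour described under thread configuration can be sketched roughly as follows. This is a minimal illustration, not the project's scheduler: `DownloadFn`, `splitIntoBatches`, `downloadSingle` and `downloadMulti` are hypothetical names, and the way the interval and the batch size combine in multi-thread mode (staggered starts inside a batch, then waiting for the whole batch) is an assumption.

```typescript
// Hypothetical signature for "download one article"; not taken from the project.
type DownloadFn = (url: string) => Promise<void>;

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Split the article list into groups of at most `batchLimit` items.
function splitIntoBatches<T>(items: T[], batchLimit: number): T[][] {
  const size = Math.max(1, Math.floor(batchLimit));
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Single-thread mode: finish one article, wait `intervalMs`, then continue.
async function downloadSingle(
  urls: string[],
  download: DownloadFn,
  intervalMs: number
): Promise<void> {
  for (const url of urls) {
    await download(url);
    await sleep(intervalMs);
  }
}

// Multi-thread mode (assumed combination): start a download every
// `intervalMs` without waiting for the previous one, then wait for the
// whole batch to settle before starting the next batch.
// Returns the number of downloads that succeeded.
async function downloadMulti(
  urls: string[],
  download: DownloadFn,
  intervalMs: number,
  batchLimit: number
): Promise<number> {
  let succeeded = 0;
  for (const batch of splitIntoBatches(urls, batchLimit)) {
    const running: Promise<boolean>[] = [];
    for (const url of batch) {
      // Attach handlers immediately so a failed download never goes unhandled.
      running.push(download(url).then(() => true, () => false));
      await sleep(intervalMs);
    }
    const results = await Promise.all(running);
    succeeded += results.filter(Boolean).length;
  }
  return succeeded;
}
```

With the values from the text, `downloadMulti(urls, download, 500, 10)` would start one article every 500 ms and pause after every tenth until that group has finished.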
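
The filter rule JSON shown earlier could be applied along these lines. The README does not spell out the matching semantics, so this sketch assumes that any `exclude` keyword rejects an article and that, when an `include` list is given, at least one of its keywords must appear; all type and function names here are hypothetical.

```typescript
// Shapes mirror the JSON in the filter rules section; the names are illustrative.
interface KeywordRule {
  include?: string[];
  exclude?: string[];
}

interface FilterRule {
  title?: KeywordRule;
  auth?: KeywordRule;
}

interface ArticleMeta {
  title: string;
  author: string;
}

function matchesRule(text: string, rule?: KeywordRule): boolean {
  if (!rule) {
    return true; // no rule for this field means no restriction
  }
  const include = rule.include ?? [];
  const exclude = rule.exclude ?? [];
  // Assumption: a single excluded keyword is enough to reject.
  if (exclude.some((keyword) => text.includes(keyword))) {
    return false;
  }
  // Assumption: an empty include list accepts everything; otherwise one hit suffices.
  return include.length === 0 || include.some((keyword) => text.includes(keyword));
}

// An article passes only if both its title and its author satisfy their rules.
function passesFilter(article: ArticleMeta, filter: FilterRule): boolean {
  return (
    matchesRule(article.title, filter.title) &&
    matchesRule(article.author, filter.auth)
  );
}

// The README's example: author is 张三 and title contains 好人.
const exampleRule: FilterRule = {
  title: { include: ["好人"] },
  auth: { include: ["张三"] },
};
console.log(passesFilter({ title: "好人好事", author: "张三" }, exampleRule));
```

If the real tool requires every `include` keyword to match, `some` in the last line of `matchesRule` would become `every`.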

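Features

Whatever appears in the Settings Center is supported.

  • Choose the download range
  • Convert web pages to HTML, Markdown, or PDF
  • Save the page source to MySQL (only effective when the download source is network)
  • Download images and audio locally
  • Add the original article link and metadata (author, time, Official Account name)
  • Skip articles that already exist
  • Download comments
  • Download source (this option only affects batch download):
      • Network: fetch articles from the WeChat API
      • Database: if "Save to MySQL" is enabled, the database stores each article's page source. To convert that source to HTML or Markdown later, set the download source to database. (The WeChat API gets restricted if used heavily.)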

Run from Source & Build

Install

$ npm install

Debug

$ npm run dev

Build

# For Windows
$ npm run build:win

# For macOS
$ npm run build:mac

# For Linux
$ npm run build:linux

Special Thanks

Thanks to JetBrains for providing an open-source development license

wechatdownload's People

Contributors

xiaoguyu

wechatdownload's Issues

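Single-article download works, but batch download reports "Failed to get article list, error message: no session"

wechatDownload version: 1.4.8
OS: Windows 11 64-bit
WeChat version: 3.9.8.25 (currently the latest)

The application shows the following messages:

Download source is network
Proxy started successfully, preparing batch download...
Please open any article from the Official Account you want to batch download in WeChat
Don't be lazy, articles that are already open don't count...
Batch download timed out, no Official Account article detected!
Article detected, please confirm whether to batch download the Official Account it belongs to
Failed to get article list, error message: no session
Batch download complete, 0 articles in total, took 0.24 seconds

Log file:
2024-1-16.log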

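Request: add "Save as Word" and embedding images in HTML

I'd like a "Save as Word" feature and the option to embed images in the HTML. I plan to download articles from some high-quality Official Accounts and compile them into an e-book. As PDFs, once merged and copied to a phone, they are inconvenient to read, so I'd like Word output that can be converted to Epub, or a direct HTML-to-Epub conversion. Thanks.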

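The latest version crashes! Crashes!

With the latest version, when an Official Account has many articles and I leave it downloading automatically, the application closes by itself after a while for no apparent reason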

[2024-01-16 16:52:27.841] [info] [ 'setting', 'firstRun', false ]
[2024-01-16 16:52:27.843] [info] [ 'setting', 'dlSource', 'web' ]
[2024-01-16 16:52:27.844] [info] [ 'setting', 'threadType', 'multi' ]
[2024-01-16 16:52:27.845] [info] [ 'setting', 'dlInterval', 500 ]
[2024-01-16 16:52:27.845] [info] [ 'setting', 'batchLimit', 10 ]
[2024-01-16 16:52:27.846] [info] [ 'setting', 'dlHtml', 0 ]
[2024-01-16 16:52:27.846] [info] [ 'setting', 'dlMarkdown', 1 ]
[2024-01-16 16:52:27.847] [info] [ 'setting', 'dlPdf', 1 ]
[2024-01-16 16:52:27.848] [info] [ 'setting', 'dlMysql', 0 ]
[2024-01-16 16:52:27.848] [info] [ 'setting', 'dlAudio', 0 ]
[2024-01-16 16:52:27.849] [info] [ 'setting', 'dlImg', 1 ]
[2024-01-16 16:52:27.849] [info] [ 'setting', 'skinExist', 1 ]
[2024-01-16 16:52:27.850] [info] [ 'setting', 'saveMeta', 1 ]
[2024-01-16 16:52:27.851] [info] [ 'setting', 'classifyDir', 1 ]
[2024-01-16 16:52:27.851] [info] [ 'setting', 'sourceUrl', 1 ]
[2024-01-16 16:52:27.852] [info] [ 'setting', 'dlComment', 0 ]
[2024-01-16 16:52:27.852] [info] [ 'setting', 'dlCommentReply', 0 ]
[2024-01-16 16:52:27.853] [info] [ 'setting', 'dlScpoe', 'all' ]
[2024-01-16 16:52:27.853] [info] [ 'setting', 'tmpPath', 'C:\\Users\\ADMINI~1\\AppData\\Local\\Temp\\wechatDownload' ]
[2024-01-16 16:52:27.854] [info] [ 'setting', 'savePath', 'F:\\Program Files (x86)\\公众号文章保存' ]
[2024-01-16 16:52:27.854] [info] [ 'setting', 'caPath', 'C:\\Users\\Administrator\\.anyproxy\\certificates' ]
[2024-01-16 16:52:27.855] [info] [ 'setting', 'mysqlHost', 'localhost' ]
[2024-01-16 16:52:27.855] [info] [ 'setting', 'mysqlPort', 3306 ]

This is the application log. The program crashed, and leftover processes were still running.

Batch download crashes midway

I pulled the code and ran it, and saw the error below; I'm not sure whether it is what causes the crash

D:\sc\github\wechatDownload\node_modules\brotli\build\encode.js:3
1<process.argv.length?process.argv[1].replace(/\/g,"/"):"unknown-program");b.arguments=process.argv.slice(2);"undefined"!==typeof module&&(module.exports=b);process.on("uncaughtException",function(a){if(!(a instanceof y))throw a;});b.inspect=function(){return"[Emscripten Module object]"}}else if(x)b.print||(b.print=print),"undefined"!=typeof printErr&&(b.printErr=printErr),b.read="undefined"!=typeof read?read:function(){throw"no read() available (jsc?)";},b.readBinary=function(a){if("function"===

                                                                                ^

TypeError [Error]: Cannot read properties of undefined (reading 'title')
at Service.objToArticle (D:\sc\github\wechatDownload\out\main\service-478a4708.js:887:32)
at downList (D:\sc\github\wechatDownload\out\main\worker-2ab1927a.js:626:13)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async downList (D:\sc\github\wechatDownload\out\main\worker-2ab1927a.js:657:5)
at async downList (D:\sc\github\wechatDownload\out\main\worker-2ab1927a.js:657:5)
at async downList (D:\sc\github\wechatDownload\out\main\worker-2ab1927a.js:657:5)
at async downList (D:\sc\github\wechatDownload\out\main\worker-2ab1927a.js:657:5)
at async downList (D:\sc\github\wechatDownload\out\main\worker-2ab1927a.js:657:5)
at async batchDownloadFromWeb (D:\sc\github\wechatDownload\out\main\worker-2ab1927a.js:532:3)
at async MessagePort. (D:\sc\github\wechatDownload\out\main\worker-2ab1927a.js:81:7)

Node.js v18.16.1

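Anti-crawling mechanism notes

The Official Account's anti-crawling mechanism was triggered; collection stopped!

Suggestion: set a reasonable download interval

Suggestion: choose a suitable download range and download in batches by interval


When batch-downloading a single Official Account, the download stops as soon as this prompt appears. How can I resume from where the last download left off?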

On Mac, after packaging the app directly, selecting an article and clicking download crashes it

Uncaught Exception:
Error [ERR_UNHANDLED_ERROR]: Unhandled error. ('AxiosError: read ECONNRESET\n' +
' at AxiosError.from (/wechatDownload/dist/mac/wechatDownload.app/Contents/Resources/app.asar/node_modules/axios/dist/node/axios.cjs:789:14)\n' +
' at RedirectableRequest.handleRequestError (/wechatDownload/dist/mac/wechatDownload.app/Contents/Resources/app.asar/node_modules/axios/dist/node/axios.cjs:2744:25)\n' +
' at RedirectableRequest.emit (node:events:527:28)\n' +
' at eventHandlers. (/wechatDownload/dist/mac/wechatDownload.app/Contents/Resources/app.asar/node_modules/follow-redirects/index.js:14:24)\n' +
' at ClientRequest.emit (node:events:527:28)\n' +
' at TLSSocket.socketErrorListener (node:_http_client:454:9)\n' +
' at TLSSocket.emit (node:events:527:28)\n' +
' at emitErrorNT (node:internal/streams/destroy:157:8)\n' +
' at emitErrorCloseNT (node:internal/streams/destroy:122:3)\n' +
' at process.processTicksAndRejections (node:internal/process/task_queues:83:21) {\n' +
" syscall: 'read',\n" +
" code: 'ECONNRESET',\n" +
' errno: -54,\n' +
' config: {\n' +
' transitional: {\n' +
' silentJSONParsing: true,\n' +
' forcedJSONParsing: true,\n' +
' clarifyTimeoutError: false\n' +
' },\n' +
" adapter: [ 'xhr', 'http' ],\n" +
' transformRequest: [ [Function: transformRequest] ],\n' +
' transformResponse: [ [Function: transformResponse] ],\n' +
' timeout: 0,\n' +
" xsrfCookieName: 'XSRF-TOKEN',\n" +
" xsrfHeaderName: 'X-XSRF-TOKEN',\n" +
' maxContentLength: -1,\n' +
' maxBodyLength: -1,\n' +
' env: { FormData: [Function], Blob: null },\n' +
' validateStatus: [Function: validateStatus],\n' +
' headers: AxiosHeaders {\n' +
" Accept: 'application/json, text/plain, /',\n" +
" 'User-Agent': 'axios/1.2.2',\n" +
" 'Accept-Encoding': 'gzip, compress, deflate, br'\n" +
' },\n' +
" method: 'get',\n" +
" url: 'https://mp.weixin.qq.com/s/4NLAPpg17z96SXiI1XYEWg',\n" +
' data: undefined\n' +
' },\n' +
' request: <ref *1> Writable {\n' +
' _writableState: WritableState {\n' +
' objectMode: false,\n' +
' highWaterMark: 16384,\n' +
' finalCalled: false,\n' +
' needDrain: false,\n' +
' ending: false,\n' +
' ended: false,\n' +
' finished: false,\n' +
' destroyed: false,\n' +
' decodeStrings: true,\n' +
" defaultEncoding: 'utf8',\n" +
' length: 0,\n' +
' writing: false,\n' +
' corked: 0,\n' +
' sync: true,\n' +
' bufferProcessing: false,\n' +
' onwrite: [Function: bound onwrite],\n' +
' writecb: null,\n' +
' writelen: 0,\n' +
' afterWriteTickInfo: null,\n' +
' buffered: [],\n' +
' bufferedIndex: 0,\n' +
' allBuffers: true,\n' +
' allNoop: true,\n' +
' pendingcb: 0,\n' +
' constructed: true,\n' +
' prefinished: false,\n' +
' errorEmitted: false,\n' +
' emitClose: true,\n' +
' autoDestroy: true,\n' +
' errored: null,\n' +
' closed: false,\n' +
' closeEmitted: false,\n' +
' [Symbol(kOnFinished)]: []\n' +
' },\n' +
' _events: [Object: null prototype] {\n' +
' response: [Function: handleResponse],\n' +
' error: [Function: handleRequestError],\n' +
' socket: [Function: handleRequestSocket]\n' +
' },\n' +
' _eventsCount: 3,\n' +
' _maxListeners: undefined,\n' +
' _options: {\n' +
' maxRedirects: 21,\n' +
' maxBodyLength: Infinity,\n' +
" protocol: 'https:',\n" +
" path: '/s/4NLAPpg17z96SXiI1XYEWg',\n" +
" method: 'GET',\n" +
' headers: [Object: null prototype],\n' +
' agents: [Object],\n' +
' auth: undefined,\n' +
' beforeRedirect: [Function: dispatchBeforeRedirect],\n' +
' beforeRedirects: [Object],\n' +
" hostname: 'mp.weixin.qq.com',\n" +
" port: '',\n" +
' agent: undefined,\n' +
' nativeProtocols: [Object],\n' +
" pathname: '/s/4NLAPpg17z96SXiI1XYEWg'\n" +
' },\n' +
' _ended: true,\n' +
' _ending: true,\n' +
' _redirectCount: 0,\n' +
' _redirects: [],\n' +
' _requestBodyLength: 0,\n' +
' _requestBodyBuffers: [],\n' +
' _onNativeResponse: [Function (anonymous)],\n' +
' _currentRequest: ClientRequest {\n' +
' _events: [Object: null prototype],\n' +
' _eventsCount: 7,\n' +
' _maxListeners: undefined,\n' +
' outputData: [],\n' +
' outputSize: 0,\n' +
' writable: true,\n' +
' destroyed: false,\n' +
' _last: true,\n' +
' chunkedEncoding: false,\n' +
' shouldKeepAlive: false,\n' +
' maxRequestsOnConnectionReached: false,\n' +
' _defaultKeepAlive: true,\n' +
' useChunkedEncodingByDefault: false,\n' +
' sendDate: false,\n' +
' _removedConnection: false,\n' +
' _removedContLen: false,\n' +
' _removedTE: false,\n' +
' _contentLength: 0,\n' +
' _hasBody: true,\n' +
" _trailer: '',\n" +
' finished: true,\n' +
' _headerSent: true,\n' +
' _closed: false,\n' +
' socket: [TLSSocket],\n' +
" _header: 'GET /s/4NLAPpg17z96SXiI1XYEWg HTTP/1.1\r\n' +\n" +
" 'Accept: application/json, text/plain, /\r\n' +\n" +
" 'User-Agent: axios/1.2.2\r\n' +\n" +
" 'Accept-Encoding: gzip, compress, deflate, br\r\n' +\n" +
" 'Host: mp.weixin.qq.com\r\n' +\n" +
" 'Connection: close\r\n' +\n" +
" '\r\n',\n" +
' _keepAliveTimeout: 0,\n' +
' _onPendingData: [Function: nop],\n' +
' agent: [Agent],\n' +
' socketPath: undefined,\n' +
" method: 'GET',\n" +
' maxHeaderSize: undefined,\n' +
' insecureHTTPParser: undefined,\n' +
" path: '/s/4NLAPpg17z96SXiI1XYEWg',\n" +
' _ended: false,\n' +
' res: [IncomingMessage],\n' +
' aborted: false,\n' +
' timeoutCb: null,\n' +
' upgradeOrConnect: false,\n' +
' parser: null,\n' +
' maxHeadersCount: null,\n' +
' reusedSocket: false,\n' +
" host: 'mp.weixin.qq.com',\n" +
" protocol: 'https:',\n" +
' _redirectable: [Circular *1],\n' +
' [Symbol(kCapture)]: false,\n' +
' [Symbol(kNeedDrain)]: false,\n' +
' [Symbol(corked)]: 0,\n' +
' [Symbol(kOutHeaders)]: [Object: null prototype]\n' +
' },\n' +
" _currentUrl: 'https://mp.weixin.qq.com/s/4NLAPpg17z96SXiI1XYEWg',\n" +
' [Symbol(kCapture)]: false\n' +
' },\n' +
' cause: Error: read ECONNRESET\n' +
' at TLSWrap.onStreamRead (node:internal/stream_base_commons:217:20) {\n' +
' errno: -54,\n' +
" code: 'ECONNRESET',\n" +
" syscall: 'read'\n" +
' }\n' +
'}')
at new NodeError (node:internal/errors:372:5)
at Worker.emit (node:events:516:17)
at [kOnErrorMessage] (node:internal/worker:290:10)
at [kOnMessage] (node:internal/worker:301:37)
at MessagePort. (node:internal/worker:202:57)
at [nodejs.internal.kHybridDispatch] (node:internal/event_target:643:20)
at exports.emitMessage (node:internal/per_context/messageport:23:28)

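Batch download timed out, no Official Account article detected!

I ran into a problem with the batch download feature. I had clearly opened an Official Account article (in WeChat, not in a browser), and I opened it only after clicking the "Batch Download" button.
I later found that:
when opening the article makes WeChat's built-in browser warn that the page is not secure and I have to click to continue, wechatDownload does detect the article.
However, that rarely happens, so detection usually fails. What can I do?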

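Suggestion: don't create one folder per article

Suggestion: instead of one folder per article, use the Official Account name as the folder when batch downloading, so the exported HTML or PDF files are clear at a glance and easy to work with.
If you batch download several Official Accounts and export all HTML or PDF files into one folder, they are hard to tell apart and organize.
When an Official Account publishes updates, does everything need to be downloaded again?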

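Request: increase the download interval

Downloads run too fast, and everything after a certain point is a verification page.
Of the more than a thousand items downloaded, roughly 900 are empty.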

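Question about batch download combined with download source

Description:
With download source set to database plus monitored download, articles download normally and are stored in the database.

With download source set to database plus batch download, the data already in the database is downloaded automatically; it does not download the way batch download with the network source does.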

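Does macOS require building it yourself?

The tags don't provide a macOS installer, but the README mentions the build command for Mac.

Can I take that to mean macOS users should simply build it themselves?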

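WeChat account banned

Probably because I used batch download or comment download, WeChat suddenly reported that my account was blocked for using a third-party tool... Fortunately, after ticking a few rules, I could log in to WeChat normally again. I no longer dare to use the tool.
I'm posting this to point out the possible risk.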

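Monitored download not working

I click "Monitored Download" and then open a WeChat Official Account article in desktop WeChat. The messages below appear, and clicking "Monitored Download" again reports that fetching the article failed

Proxy started successfully, preparing batch download...

Please open the articles you want to download in WeChat; you can open multiple articles

Failed to fetch article

Versions: WeChat 3.9.6.33, wechatdownload 1.4.4

Log: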
[2023-09-26 10:06:07.375] [info] [ 'setting', 'savePath', 'D:\wechatdownload微信公众号文章' ]
[2023-09-26 10:06:07.375] [info] [ 'setting', 'caPath', 'C:\Users\corebug\.anyproxy\certificates' ]
[2023-09-26 10:06:07.376] [info] [ 'setting', 'mysqlHost', 'localhost' ]
[2023-09-26 10:06:07.377] [info] [ 'setting', 'mysqlPort', 3306 ]
[2023-09-26 10:14:28.711] [info] [ 'setting', 'firstRun', false ]
[2023-09-26 10:14:28.714] [info] [ 'setting', 'dlSource', 'web' ]
[2023-09-26 10:14:28.715] [info] [ 'setting', 'threadType', 'multi' ]
[2023-09-26 10:14:28.716] [info] [ 'setting', 'dlInterval', 500 ]
[2023-09-26 10:14:28.717] [info] [ 'setting', 'batchLimit', 10 ]
[2023-09-26 10:14:28.718] [info] [ 'setting', 'dlHtml', 1 ]
[2023-09-26 10:14:28.719] [info] [ 'setting', 'dlMarkdown', 1 ]
[2023-09-26 10:14:28.719] [info] [ 'setting', 'dlPdf', 1 ]
[2023-09-26 10:14:28.720] [info] [ 'setting', 'dlMysql', 0 ]
[2023-09-26 10:14:28.721] [info] [ 'setting', 'dlAudio', 1 ]
[2023-09-26 10:14:28.721] [info] [ 'setting', 'dlImg', 1 ]
[2023-09-26 10:14:28.722] [info] [ 'setting', 'skinExist', 1 ]
[2023-09-26 10:14:28.723] [info] [ 'setting', 'saveMeta', 1 ]
[2023-09-26 10:14:28.723] [info] [ 'setting', 'sourceUrl', 1 ]
[2023-09-26 10:14:28.724] [info] [ 'setting', 'dlComment', 1 ]
[2023-09-26 10:14:28.724] [info] [ 'setting', 'dlCommentReply', 1 ]
[2023-09-26 10:14:28.725] [info] [ 'setting', 'dlScpoe', 'seven' ]
[2023-09-26 10:14:28.725] [info] [
'setting',
'tmpPath',
'C:\Users\corebug\AppData\Local\Temp\wechatDownload'
]
[2023-09-26 10:14:28.726] [info] [ 'setting', 'savePath', 'D:\wechatdownload微信公众号文章' ]
[2023-09-26 10:14:28.726] [info] [ 'setting', 'caPath', 'C:\Users\corebug\.anyproxy\certificates' ]
[2023-09-26 10:14:28.727] [info] [ 'setting', 'mysqlHost', 'localhost' ]
[2023-09-26 10:14:28.728] [info] [ 'setting', 'mysqlPort', 3306 ]
[2023-09-26 10:18:21.249] [info] [ '触发检查更新' ]
[2023-09-26 10:18:21.250] [info] [ { code: 2, msg: '正在检查更新……' } ]
[2023-09-26 10:18:32.643] [info] [ { code: 4, msg: '现在使用的就是最新版本,不用更新' } ]

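Batch download: reports a download timeout, no Official Account article detected

Windows WeChat version 3.9.7.29, wechatDownload version v1.4.4

I'm using batch download with the certificate installed. In the download settings I only changed the download range to the last 7 days and touched nothing else. Following the README, it does not work.

Messages:
Download source is network

Proxy started successfully, preparing batch download...

Please open any article from the Official Account you want to batch download in WeChat

Don't be lazy, articles that are already open don't count...

Batch download timed out, no Official Account article detected!

My question: is this related to the WeChat version? If so, which WeChat version does your README assume?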

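Batch download failed, error message: unknown error

Download source is network

Proxy started successfully, preparing batch download...

Please open any article from the Official Account you want to batch download in WeChat

Don't be lazy, articles that are already open don't count...

Article detected, please confirm whether to batch download the Official Account it belongs to

Failed to get article list, error message: unknown error

Batch download complete, 0 articles in total, took 0.23 seconds

log: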

[2023-10-10 23:06:41.477] [error] [
'获取文章列表失败',
'https://mp.weixin.qq.com/mp/profile_ext?action=getmsg&f=json&count=10&is_ok=1&__biz=MzU2NzEwMDc1MA==&key=b8982b21b1d079659f80a14401c1b32b429cf05b13e72337bac3433c27616930ae5e3392a663bb64d47b651c07f7c037e0c5979884e982f77a79c29b249fb2e67062712ccad41d630f372fd166b8a8936fa96b18a9225e7fef6675b473c74ae292c937a5e11a51b58edf28037d4236803b38b75edc0ab0c7dde703d3b63dda0b&uin=MTU3MzY3MjU1&pass_ticket=8sZkL3P6GeLFMmugNRCJkZoEyi9wot%2BZeWxcGeozHWO7uSg%2BIupMFEGr4hwF2Roy&offset=0',
{ ret: -6, errmsg: 'unknown error', home_page_list: [] }
]
[2023-10-10 23:06:41.479] [info] [ 'resp', 2, '获取文章列表失败,错误信息:unknown error', null ]
[2023-10-10 23:06:41.480] [info] [ 'resp', 4, '批量下载完成,共0篇文章,耗时0.23秒', null ]
[2023-10-10 23:06:41.480] [info] [ 'resp', 5, '', null ]

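Batch download

In debug mode on Mac: click batch download, open an Official Account article in WeChat, return to the app, and it times out without detecting the article

Cannot fetch historical articles

Starting with version 1.4, it can no longer fetch all articles. Single-thread or multi-thread, by date or all, the result is the same: only 20 articles are downloaded

Are there plans to expose an API?

Are there plans to expose an API? For example, pass an Official Account link via an HTTP GET request and get back a JSON response containing the Markdown text, or have the article downloaded to a local directory.

Batch download with "include comments and replies" selected often fails to collect comments; downloading only two or three articles at a time doesn't help, and neither does switching WeChat accounts

Some Official Account articles have comment sections with content more valuable than the article itself, so I select "include comments and replies". Last month, batch downloads of one or two hundred articles collected all comments normally, but after updating to 1.5.1, comments are often missed even with the option selected. At first I thought it was anti-crawler throttling, so I downloaded just a few articles at a time, which didn't help. Changing the network environment didn't help either. Thinking my WeChat account had been restricted, I logged in with other accounts, with no effect (I even registered a brand-new WeChat account, and with it collection immediately showed "Failed to get article list, error message: unknown error"; I don't know why).

I hope the maintainer can fix or explain this. Thanks!!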
