douyinlivewebfetcher's Introduction

👋  Hey there! I'm Saermart, from China, an amateur programming enthusiast.

💡  I like to explore new technologies and develop software solutions and quick hacks.
🌱  I'm currently learning more about reverse engineering mobile apps.
💬  Feel free to reach out to me for some interesting discussion.
✉️  You can shoot me an email at [email protected]! I'll try to respond as soon as I can.

🛠  Interested

Python  JavaScript  Java  C  C++  Django  Flask  HTML  CSS  Git  GitHub  Markdown
Visual Studio Code 

douyinlivewebfetcher's Issues

WebSocket error

WebSocket connected.
WebSocket error: Connection to remote host was lost.
WebSocket error: DouyinLiveWebFetcher._wsOnClose() takes 2 positional arguments but 4 were given
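
The TypeError in this log is consistent with how recent websocket-client releases invoke `on_close`: the callback receives the app object plus a close status code and a close message, so a bound method has to accept three arguments besides `self`. A minimal sketch, assuming the project's handler was declared with only `ws`; the class below is a trimmed stand-in for illustration, not the project's actual code:

```python
class DouyinLiveWebFetcher:
    """Stand-in for the project's class; only the close handler is shown."""

    def __init__(self):
        self.closed_with = None

    def _wsOnClose(self, ws, close_status_code=None, close_msg=None):
        # The defaults keep the handler compatible with callers that pass only `ws`.
        self.closed_with = (close_status_code, close_msg)
        print(f"WebSocket connection closed. code={close_status_code} msg={close_msg}")


# Both calling conventions are accepted:
fetcher = DouyinLiveWebFetcher()
fetcher._wsOnClose(object())
fetcher._wsOnClose(object(), 1006, "Connection to remote host was lost.")
```

This only stops the handler itself from raising; it does not explain why the server dropped the connection in the first place.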

WebSocket connected, then "WebSocket error: Connection to remote host was lost."

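Yesterday the Douyin scraper was still working. This morning I found it had turned into this:
INFO:websocket:Websocket connected
WebSocket connected.
WebSocket error: Connection to remote host was lost.
ERROR:websocket:Connection to remote host was lost. - goodbye
ERROR:websocket:error from callback DouyinLiveWebFetcher._wsOnClose() missing 1 required positional argument: 'close_base'
WebSocket error: DouyinLiveWebFetcher._wsOnClose() missing 1 required positional argument: 'close_base'
I also sent you a private message on Bilibili.
I had changed some of the code, but downloading the original project on another computer gives the same result, so it is probably not my changes. It is probably not the IP either: routing through a VPS in TUN mode makes no difference. It's baffling. The code worked yesterday, and this morning it can no longer connect to the server. Is the problem on my side or in the project?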

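Cannot scrape live chat messages (danmaku); I'm also looking for a fix. It probably requires obtaining a logged-in state and using it to connect to the WebSocket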

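I raised this before, and it was deleted.
The problem is hard to reproduce. The condition is: when opening a Douyin live stream in a browser on the current computer triggers a login pop-up, chat messages cannot be scraped.
Why does the login pop-up appear? What is known: this is a live-streaming company with a single optical modem, and many of its computers probably have Douyin live streams open, so perhaps the IP has been flagged? Restarting the modem did not help, though. Every other computer in the company also gets the forced login pop-up when opening Douyin live in a browser.
Normally a browser can open a Douyin live stream without logging in and still show the chat messages on the right. That is also the principle the scraper relies on, since current scrapers all work without logging in.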

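Connection dropped

When the live room has had no activity for 20 seconds, this appears:
WebSocket error: Connection to remote host was lost.
WebSocket error: DouyinLiveWebFetcher._wsOnClose() takes 2 positional arguments but 4 were given
The program then simply exits. How can this be fixed?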

How do I resolve a certificate error?

WebSocket error: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1125)

WebSocket error: ("hostname 'webcast3-ws-web-lq.douyin.com' doesn't match either of '*.alicdn.com', '*.cmos.greencompute.org', 'cmos.greencompute.org', 'm.intl.taobao.com', '*.mobgslb.tbcache.com', 'alikunlun.com', '*.alikunlun.com', 's.tbcdn.cn', '*.django.t.taobao.com', 'alicdn.com'",) WebSocket connection closed.

WebSocket error: ("hostname 'webcast3-ws-web-lq.douyin.com' doesn't match either of '.alicdn.com', '.cmos.greencompute.org', 'cmos.greencompute.org', 'm.intl.taobao.com', '.mobgslb.tbcache.com', 'alikunlun.com', '.alikunlun.com', 's.tbcdn.cn', '*.django.t.taobao.com', 'alicdn.com'",)
WebSocket connection closed.

The error is as shown above.

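Connection gets closed

Hello. I can connect, but the server sends me no messages and then closes the connection. Does this mean cookies are required, and if so, which ones? I still get disconnected even when I send my login cookies.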

Great work

Worked on the first try. I hope it keeps being updated.

After fetching for a long time, I finally got blocked. How can this be avoided?

WebSocket error: Handshake status 200 OK -+-+- {'server': 'Tengine', 'content-length': '0', 'connection': 'keep-alive', 'date': 'Thu, 20 Jun 2024 11:20:03 GMT', 'handshake-msg': 'DEVICE_BLOCKED', 'handshake-status': '415', 'server-timing': 'inner; dur=83', 'x-tt-trace-host': '01fdc36f8a940d17f868057b0a967d4625fe855bea22c6de623ad77d9a68c3c79b9e93c17211f50599c1d411b2acaaca743c3091e388bb93a975e4624c52c43ba66e00d49cdb0cb0d1bc001dda0e93cccdec03dcdb6ccbb0ff2b522572cedbb46f', 'x-tt-trace-tag': 'id=03;cdn-cache=miss;type=dyn', 'x-tt-trace-id': '00-240620192003A11B6919904367DA19FA-26267C3448A6C175-00', 'x-tt-logid': '20240620192003A11B6919904367DA19FA', 'via': 'vcache16.cn5907[204,0]', 'timing-allow-origin': '*', 'eagleid': '717de42417188824036548184e'} -+-+- b''

WebSocket error: empty or no certificate,

WebSocket error: empty or no certificate, match_hostname needs a SSL socket or SSL context with either CERT_OPTIONAL or CERT_REQUIRED
I'm getting an error saying a certificate is required. Have you run into this, and how did you solve it?
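
For narrowing down certificate errors like the ones in this and the earlier certificate issue, one diagnostic option is to switch off verification temporarily. The "match_hostname needs a SSL socket or SSL context with either CERT_OPTIONAL or CERT_REQUIRED" message typically shows up when certificate verification is disabled while hostname checking is still on, so both have to be turned off together. A minimal sketch, assuming the project connects through websocket-client and that `ws` is its `WebSocketApp` instance (an assumption); the helper name is hypothetical. Disabling verification is insecure and is meant for troubleshooting only:

```python
import ssl


def insecure_sslopt():
    """Return a websocket-client `sslopt` dict that skips certificate and
    hostname checks. For diagnosis only; do not leave this on."""
    return {"cert_reqs": ssl.CERT_NONE, "check_hostname": False}


# Usage with websocket-client (not executed here; `ws` is assumed to be
# the project's WebSocketApp instance):
#   ws.run_forever(sslopt=insecure_sslopt())
print(insecure_sslopt())
```

If the connection works with these options, the cause is likely on the network path (for example a proxy presenting its own certificate) rather than in the fetcher code.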

A signature is still required

WebSocket error: Handshake status 200 OK -+-+- {'server': 'volc-dcdn', 'content-length': '0', 'connection': 'keep-alive', 'date': 'Tue, 16 Jul 2024 02:37:54 GMT', 'handshake-msg': 'DEVICE_BLOCKED', 'handshake-status': '415', 'server-timing': 'inner; dur=82, cdn-cache;desc=MISS, origin;dur=152, edge;dur=16, cdn-cache;desc=MISS', 'x-tt-trace-host': '015322e150330143e811c6e384c1c52d3dabf2fe1edb92ad4f76973fb1dec99c5c91addba49c26ec48c2485fb4f0a8d6adb01d2bc568bc5ecc087fa687e58d4e9af654b43c86d22e91654d588ee924b1d9d3fede0328cfca54ec5feb21d8650142747b5ed983671a8166cb0328bd2eb7236cb12f1a6ee5598f085e720da7ed1acf', 'x-tt-trace-tag': 'id=5', 'x-tt-trace-id': '00-b967537f03010f813ad52b5ae44c18ef-b967537f03010f81-01', 'x-tt-logid': '2024071610375450081B126C82623DFC5F', 'via': 'n157-149-136.whmp.Creative,n113-219-164-185.czct02-container.Creative', 'x-request-ip': '171.83.32.190', 'x-dsa-trace-id': '1721097474652c809ec71a6178d1e7ecfc743ae947', 'x-dsa-origin-status': '200'} -+-+- b''
WebSocket closed. <websocket._app.WebSocketApp object at 0x000001C31533F560>
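An address copied directly from the browser works, but the signature value has to match the concatenated parameters; once anything is modified, the URL is rejected. How can the signature value be obtained?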

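It breaks after a few calls. A suggestion: at the point where the cookie is obtained, pop up a window that opens the Douyin live page, let a person log in by scanning the QR code, and then read the cookie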

【X】Request the live url error: HTTPSConnectionPool(host='live.douyin.com', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:997)')))
【X】Request the live room url error: HTTPSConnectionPool(host='live.douyin.com', port=443): Max retries exceeded with url: /878471479841 (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:997)')))
WebSocket error: Handshake status 502 Bad Gateway -+-+- {'server': 'Tengine', 'content-type': 'text/html', 'content-length': '550', 'connection': 'keep-alive', 'date': 'Sun, 18 Feb 2024 08:40:52 GMT', 'proxy-status': '0000201502301102', 'x-tt-trace-host': '0111df544d7f71817403aab46f317c3546cde959325b16508359e5cdf0b8791756437b8a0b7319d471f85a93f642d0f8906a6e9b1a327f3a0c683f595ad15cce94', 'x-tt-trace-tag': 'id=03;cdn-cache=miss;type=dyn', 'x-tt-trace-id': '00-240218164052FEEC55BB30F0B2A703A7-48D3C66468F664CE-00', 'x-tt-logid': '20240218164052FEEC55BB30F0B2A703A7', 'x-alicdn-da-ups-status': 'endOs,0,502', 'via': 'ens-cache21.cn6725[87,0]', 'timing-allow-origin': '*', 'eagleid': '1bde141917082456519776383e'} -+-+- b'\r\n<title>502 Bad Gateway</title>\r\n\r\n502 Bad Gateway\r\nTLB\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n'
WebSocket error: DouyinLiveWebFetcher._wsOnClose() takes 2 positional arguments but 4 were given

It seems to have stopped working again recently

WebSocket error: Handshake status 200 OK -+-+- {'server': 'volc-dcdn', 'content-length': '0', 'connection': 'keep-alive', 'date': 'Fri, 12 Jul 2024 07:35:48 GMT', 'handshake-msg': 'DEVICE_BLOCKED', 'handshake-status': '415', 'server-timing': 'inner; dur=52, cdn-cache;desc=MISS, origin;dur=149, edge;dur=0', 'x-tt-trace-host': '015f3ca947c964de55f9c7f9afccd8b7ec0708188bd3a93f77d1839f4321259bbec690b4fadfc54b2660e681112654d0806f1f2e1c56b9742eb9a7497f8a48c5ea6e8c46f02b493f80c5bc58d85385d5cc6288e4b453ebebd70848b4ebcd4c4ac0', 'x-tt-trace-tag': 'id=5', 'x-tt-trace-id': '00-a5dea10203010d10215c71a9d0b518ef-a5dea10203010d10-01', 'x-tt-logid': '20240712153548F12E4865E84EFB2CBAB4', 'via': 'n180-097-246-148.jsxzct02-container.Creative', 'x-request-ip': '14.154.11.222', 'x-dsa-trace-id': '172076974807628c31eac450ad0aab39c16b5c737a', 'x-dsa-origin-status': '200'} -+-+- b''

Keeps throwing errors at runtime

Why does the following sometimes appear while the program is running? Has anyone else run into this?
WebSocket error: Connection to remote host was lost.
WebSocket error: DouyinLiveWebFetcher._wsOnClose() takes 2 positional arguments but 4 were given

An update is needed

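Douyin seems to have changed its data today, and the captured values are inaccurate. For example, a preset quantity such as 1314 hearts (小心心) is captured as just 1, and a heart combo returns data twice.
Is there a way to recognize combo gifts? Right now the capture returns counts like 111, 112, 113, so adding them up produces the sum of a running series instead of the real count.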
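
One way to handle counts that arrive as 111, 112, 113 is to treat each value as a running total for the combo and add only the difference from the last value seen. A minimal sketch, assuming the reported numbers really are cumulative within one combo and that some key identifies the combo (for example sender plus gift plus a combo group id); the function and key names are hypothetical and are not taken from the project:

```python
def combo_increments(events):
    """Turn cumulative combo counts into per-event increments.

    `events` is an iterable of (combo_key, cumulative_count) pairs.
    Returns a list of (combo_key, increment) pairs in the same order.
    """
    last_seen = {}
    increments = []
    for key, total in events:
        previous = last_seen.get(key, 0)
        # Repeated or out-of-order totals contribute nothing.
        delta = max(total - previous, 0)
        last_seen[key] = max(previous, total)
        increments.append((key, delta))
    return increments


events = [("user1:heart", 111), ("user1:heart", 112), ("user1:heart", 113)]
total = sum(delta for _, delta in combo_increments(events))
print(total)  # 113, instead of 111 + 112 + 113
```

Whether the server resets the counter between combos is not known here, so the choice of combo key would need to be checked against real traffic.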

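WSS error 200

It had been running fine, but today it suddenly stopped working with a WSS error.
I don't know whether the account has been flagged by risk control or the verification method has changed.

The error message is below.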

aiohttp.client_exceptions.WSServerHandshakeError: 200, message='Invalid response status', url=URL('wss://webcast3-ws-web-lq.douyin.com/webcast/im/push/v2/?app_name=douyin_web&version_code=180800&webcast_sdk_version=1.3.0&update_version_code=1.3.0&compress=gzip&internal_ext=internal_src:dim%7Cwss_push_room_id:7382522869470808883%7Cwss_push_did:7382522869470808883%7Cdim_log_id:202302171547011A160A7BAA76660E13ED%7Cfetch_time:1676620021641%7Cseq:1%7Cwss_info:0-1676620021641-0-0%7Cwrds_kvs:WebcastRoomStatsMessage-1676620020691146024_WebcastRoomRankMessage-1676619972726895075_AudienceGiftSyncData-1676619980834317696_HighlightContainerSyncData-2&cursor=t-1676620021641_r-1_d-1_u-1_h-1&host=https://live.douyin.com&aid=6383&live_id=1&did_rule=3&debug=false&endpoint=live_pc&support_wrds=1&im_path=/webcast/im/fetch/&user_unique_id=7382522869470808883&device_platform=web&cookie_enabled=true&screen_width=1440&screen_height=900&browser_language=zh&browser_platform=MacIntel&browser_name=Mozilla&browser_version=5.0%20(Macintosh;%20Intel%20Mac%20OS%20X%2010_15_7)%20AppleWebKit/537.36%20(KHTML,%20like%20Gecko)%20Chrome/110.0.0.0%20Safari/537.36&browser_online=true&tz_name=Asia/Shanghai&identity=audience&room_id=7382522869470808883&heartbeatDuration=0&signature=00000000')

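Progress update on the signature parameter

webmssdk.es5.js is the main script containing frontierSign, the method that generates the signature. Running it in Node.js requires filling in the browser environment, which mostly means adding whatever turns out to be missing. While debugging the signature parameter, I found that the main entry point of frontierSign is _0x5c2014, which can be exported globally and called directly. In issue #45 someone provided a link that points out one thing to watch for: in the browser the value of _0x6caf['envcode'] is 1, whereas in Node it is 129, so that also has to be patched in.

Running the script sign.js through PyExecJS works as expected.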

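Is there a different approach to connecting to the WebSocket?

For example, have the browser use a SOCKS5 proxy that points at the program, so that the program only has to unpack the protobuf payloads. Then the environment, login, and so on would not need to be handled at all?
The browser would probably need to be kept active periodically, though. A JS script that fires a mouse event every half hour should be enough, and then none of the environment or cookie checks would need special handling.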

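Does anyone know how to capture the chat messages that float across the screen?

def _parseChatMsg(self, payload):
    '''Chat message'''
    message = ChatMessage().parse(payload)
    user_name = message.user.nick_name
    user_id = message.user.id
    content = message.content

content does not seem to include the data for messages that float across the screen.
Could someone point me in the right direction?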

Found a bug

When the gift quantity is 10, it displays 10. Could you fix this?

WebSocket error: Connection to remote host was lost

WebSocket error: Connection to remote host was lost.
WebSocket error: DouyinLiveWebFetcher._wsOnClose() takes 2 positional arguments but 4 were given
The code has not been changed at all. It runs for a while and then this happens, every time I run it.

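Problems receiving gift packets

1. Gift packets arrive as two identical packets sent one after the other, so the same gift is received twice. A static list is needed to check whether a gift packet has already been received.
2. When a viewer taps repeatedly to send a gift combo, gift packets are received repeatedly and the count ends up too high. This is similar to the previous problem, but the packets are not the same one: their packet ids differ.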
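
The "static list" from point 1 can be sketched as a bounded set of recently seen packet ids, so that memory does not grow during a long stream. This is an illustration under the assumption that each gift packet carries a stable, hashable id; the class name and the capacity are hypothetical, and it does not address point 2, where the ids differ:

```python
from collections import OrderedDict


class SeenMessages:
    """Remember the most recent message ids and report whether an id is new."""

    def __init__(self, capacity=10000):
        self.capacity = capacity
        self._seen = OrderedDict()

    def is_new(self, msg_id):
        if msg_id in self._seen:
            # Refresh the entry so frequently repeated ids stay remembered.
            self._seen.move_to_end(msg_id)
            return False
        self._seen[msg_id] = True
        if len(self._seen) > self.capacity:
            # Evict the least recently seen id to keep memory bounded.
            self._seen.popitem(last=False)
        return True


seen = SeenMessages()
for packet_id in ["a1", "a1", "b2"]:
    if seen.is_new(packet_id):
        print("handle gift packet", packet_id)
```

A gift handler would call `is_new` with the packet id before counting the gift and skip the packet when it returns False.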

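Is there a way to solve this? Cannot fetch data from a merchant account's live room

My account is a merchant account, which counts as a private account, so users who are not logged in cannot scrape any data.
Opening the live room link in a browser that is not logged in to Douyin shows: "The server is having a moment, click refresh to retry" (服务器开小差了,点击刷新重试).
Opening the live room in a browser that is logged in to Douyin displays it normally.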

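No heartbeat check

When nobody is sending chat messages in the live room,
the WebSocket times out and disconnects.
When someone in the room sends a message again later, it is no longer received.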
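
One common way to cope with idle disconnects like this is to wrap the blocking connection call in a reconnect loop with exponential backoff. A minimal sketch, assuming the fetcher exposes some blocking call that returns or raises when the socket drops; `connect`, the retry limits, and the helper name are all hypothetical:

```python
import time


def run_with_reconnect(connect, max_attempts=5, base_delay=1.0,
                       max_delay=30.0, sleep=time.sleep):
    """Call `connect()` repeatedly; it should block until the connection drops.

    Waits between attempts, doubling the delay each time up to `max_delay`.
    Returns the number of attempts made.
    """
    attempts = 0
    delay = base_delay
    while attempts < max_attempts:
        attempts += 1
        try:
            connect()
        except Exception as exc:
            print(f"connection attempt {attempts} failed: {exc}")
        if attempts < max_attempts:
            sleep(delay)
            delay = min(delay * 2, max_delay)
    return attempts


# Usage with websocket-client (not executed here; `ws` is assumed to be the
# project's WebSocketApp instance):
#   run_with_reconnect(lambda: ws.run_forever(ping_interval=10, ping_timeout=5))
```

The `ping_interval` and `ping_timeout` arguments make websocket-client send WebSocket ping frames; whether the Douyin server treats those as a sufficient heartbeat has not been verified here.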
