Comments (15)

nciefeiniu commented on July 30, 2024

@yilu1015 Is the ciphertext parameter correct?

from wenshu.

nciefeiniu commented on July 30, 2024

import requests


headers = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:76.0) Gecko/20100101 Firefox/76.0",
    "Cookie": "HM4hUBT0dDOn80S=_wkz59snzaPmdO69oJWw7RwKvOLLkQX0DikwmBGlkQPmxpSSOx12K0bMQsZbsAnM; HM4hUBT0dDOn80T=4Cy7lu21LZTzgJixkyHmmEWZbRc8ka8p5n4j3VjY4QGDG1SvaYh_7s905F2vIqvUGdeqkJsKzN4nn207l2ZD5vCAYFgItnHeaHE9BgfeeFdxdrkPoybXDL1RJ7ZP_5WTlOs5R7awSBB_ft9xbGTXkYY4Yk3Cg4H5_iirToB6gyrJi67k95Ce8R.uGobThrdX2fAuiZF2ME1Wi9uIefdYS9UEajx44DAw2oi3R7X6o7XKmuyrMkU7h1DSW3I5XrUYu3wrrpNRSiTZoFndIDsOuiA9iKs2RnTnS3.v9Gi34m_msrGtVPkMlqjZxrXHzsjfKtO7; SESSION=ff5520e2-66e0-4bce-998e-02062e95b414"

}


res = requests.post(url="http://wenshu.court.gov.cn/website/parse/rest.q4w", data={
    "docId": "83451b69d9ff46b6af96abeb00d51326",
    "ciphertext": "110010+1000110+1100100+110100+1001001+1001101+1001010+1100001+1100100+1000100+1110111+110011+1001011+110010+1001100+1110101+110101+1110110+1101000+1101011+110000+1010010+1101101+1001111+110010+110000+110010+110000+110000+110111+110000+110110+1110100+110110+1101111+110010+1101001+1110011+1010011+110010+110100+1101110+1001000+1000011+1000100+1110110+1110111+1110001+110111+1000110+1001110+1110110+110100+1000001+111101+111101",
    "cfg": "com.lawyee.judge.dc.parse.dto.SearchDataDsoDTO@docInfoSearch",
    "__RequestVerificationToken": "SnhEAA5fkrhLG4Yqhv6ySDvi"
}, headers=headers)


print(res.text)

It works when I test it, no problem. @yilu1015

The response is as follows:

{"code":1,"description":null,"secretKey":"YuNfjorc70mO1Cllf6Isxf2B","result":"","success":true}
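The ciphertext value posted above is a run of binary numbers joined by "+". Read as character codes, each number maps to one ASCII character, so the field can be turned back into a plain string for inspection. The helpers below are a minimal sketch of that observation only: the names decode_ciphertext and encode_ciphertext are hypothetical, and how the site derives the underlying string is not covered in this thread.

```python
def decode_ciphertext(ciphertext: str) -> str:
    """Turn '+'-separated binary numbers back into the characters they encode."""
    return "".join(chr(int(chunk, 2)) for chunk in ciphertext.split("+"))


def encode_ciphertext(plain: str) -> str:
    """Inverse of decode_ciphertext: one binary number per character."""
    return "+".join(format(ord(ch), "b") for ch in plain)


# First four groups of the ciphertext from the request above.
sample = "110010+1000110+1100100+110100"
print(decode_ciphertext(sample))  # 2Fd4
```

Decoding the full value this way makes it easy to compare a generated ciphertext against one captured from the browser.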

yilu1015 commented on July 30, 2024

> @yilu1015 Is the ciphertext parameter correct?

It should be fine; I used it to fetch the entry information successfully. But this is all the request returns:

{'code': 1, 'description': None, 'secretKey': 'c6LrFHW57hQQraFRWLcgLcFh', 'result': '7DMzlEH7ahk=', 'success': True}

yilu1015 commented on July 30, 2024

@nciefeiniu Is there a method for setting the parameters? I read #4 at the time and assumed it wasn't needed.

nciefeiniu commented on July 30, 2024

@yilu1015 See #13 (comment)

huangsiyuan924 commented on July 30, 2024

I also get a 200 response but no full text. Has the original poster solved this?

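yilu1015 commented on July 30, 2024

Sorry, I have been busy with other projects for the past two weeks and have not looked into it carefully. Are you fetching the full text from the app version or the web version? Feel free to refer to #13.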

huangsiyuan924 commented on July 30, 2024

Already solved. There is one more problem, though: PyQt5 can get the cookie, but fetching it a second time in a row makes the process exit immediately with "Process finished with exit code -1073741819 (0xC0000005)". Have you run into this too?
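The two numbers in that exit message are the same value written two ways: the signed 32-bit integer and its unsigned hexadecimal form. 0xC0000005 is the Windows status code for an access violation, which points at a crash in native code rather than a Python exception. A small sketch of the conversion (exit_code_to_hex is a hypothetical helper name):

```python
def exit_code_to_hex(exit_code: int) -> str:
    """Reinterpret a signed 32-bit exit code as unsigned and format it as hex."""
    return f"0x{exit_code & 0xFFFFFFFF:08X}"


print(exit_code_to_hex(-1073741819))  # 0xC0000005
```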

yilu1015 commented on July 30, 2024

Oh? Where was the problem? After reading the expert's answer I thought it was a cookie issue, and since it looked like it needed pyppeteer + asyncio set up, I had not tried it yet. So in the end it was a problem with how the request was set up? Thanks for the guidance!

huangsiyuan924 commented on July 30, 2024

The cookie obtained through PyQt was fine. My problem was an extra comma in the queryCondition field of the form data.
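A stray comma of that kind is easy to catch before sending if the field is parsed first. The sketch below assumes queryCondition is a JSON-encoded string, which this thread does not confirm; check_json_field is a hypothetical helper and the key and value shown are placeholders.

```python
import json


def check_json_field(raw: str):
    """Return (ok, message); ok is False when raw is not valid JSON."""
    try:
        json.loads(raw)
    except json.JSONDecodeError as exc:
        return False, f"{exc.msg} at position {exc.pos}"
    return True, ""


# Building the string with json.dumps cannot produce a trailing comma.
good = json.dumps([{"key": "example", "value": "demo"}])
# Hand-written string with an extra comma before the closing bracket.
bad = '[{"key": "example", "value": "demo"},]'

print(check_json_field(good)[0])  # True
print(check_json_field(bad)[0])   # False
```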

yilu1015 commented on July 30, 2024

Thanks for the tip. Below is the request data for my POST call; it looks fine to me. How did you set up your form data?

As for the PyQt exit problem, I see the same thing. I am still testing full-text retrieval, so for now I just restart the Jupyter kernel. How to handle it in real use is something I am also hoping the experts can advise on.

data = {
    'docID': '97e53a7245264aaeacd4abde01272f72',
    'ciphertext': make_ciphertext(),
    'cfg': 'com.lawyee.judge.dc.parse.dto.SearchDataDsoDTO@docInfoSearch',
    '__RequestVerificationToken': verification_token()
}


headers = {
    "Accept": "application/json, text/javascript, */*; q=0.01",
    "Accept-Encoding": "gzip, deflate, br",
    "Accept-Language": "zh-CN,zh;q=0.9",
    "Host": "wenshu.court.gov.cn",
    "Origin": "https://wenshu.court.gov.cn",
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36",
    "content-type": "application/x-www-form-urlencoded; charset=UTF-8",
    "X-Requested-With": "XMLHttpRequest",
    "cookie": cookie_string
}

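Restarting the Jupyter kernel helps with the PyQt exit problem because every restart gives the Qt code a brand-new process. The same effect might be had by running each cookie fetch in its own short-lived Python process. This is an assumption about a workaround, not a diagnosed fix: run_in_fresh_process is a hypothetical helper, and the snippet string stands in for whatever PyQt5 cookie code is actually used.

```python
import subprocess
import sys


def run_in_fresh_process(code: str, timeout: float = 60.0):
    """Run a Python snippet in a new interpreter; return (exit code, stdout)."""
    completed = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return completed.returncode, completed.stdout.strip()


# Placeholder for the real cookie-fetching code, which would print the cookie.
FETCH_COOKIE_SNIPPET = "print('SESSION=placeholder')"

for attempt in range(2):
    exit_code, cookie = run_in_fresh_process(FETCH_COOKIE_SNIPPET)
    print(attempt, exit_code, cookie)
```

If the child process crashes, the parent only sees a non-zero exit code and can retry, instead of being taken down with it.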

huangsiyuan924 commented on July 30, 2024

The only difference I can see is that the c in the cookie key of your headers is lowercase; on the website it is capitalized.

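nciefeiniu commented on July 30, 2024

@yilu1015 Bro, about this problem of yours...

I had some time today, so I took a look.

This approach still works for scraping the data right now.

The reason you cannot get the detailed data is that the data sent with your request is wrong!

data = {
    'docId': '199a3ed2137846f1bf17ac1d01116358'  # note the capitalization: docId, not docID
}

I stared at it for a long time without spotting the mistake. Capturing the traffic showed it right away.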
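A mismatch like docID versus docId is hard to see by eye but simple to check mechanically. The sketch below compares the keys of a form-data dict against the key names used in the working request earlier in this thread; find_key_case_mismatches is a hypothetical helper, and the ciphertext and token values are placeholders.

```python
# Key names taken from the working request posted earlier in the thread.
EXPECTED_KEYS = ("docId", "ciphertext", "cfg", "__RequestVerificationToken")


def find_key_case_mismatches(data, expected=EXPECTED_KEYS):
    """Return (sent, expected) pairs for keys that differ only in letter case."""
    by_lower = {name.lower(): name for name in expected}
    mismatches = []
    for key in data:
        wanted = by_lower.get(key.lower())
        if wanted is not None and wanted != key:
            mismatches.append((key, wanted))
    return mismatches


data = {
    "docID": "97e53a7245264aaeacd4abde01272f72",
    "ciphertext": "placeholder",
    "cfg": "com.lawyee.judge.dc.parse.dto.SearchDataDsoDTO@docInfoSearch",
    "__RequestVerificationToken": "placeholder",
}
print(find_key_case_mismatches(data))  # [('docID', 'docId')]
```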

nciefeiniu commented on July 30, 2024

@yilu1015

[image]
[image]

hujisong commented on July 30, 2024

I would like to ask a naive question. I used your method to get the visit count shown on the wenshu home page:

res = requests.post(url="http://wenshu.court.gov.cn/website/parse/rest.q4w", data={
    "cfg": "com.lawyee.judge.dc.parse.dto.SearchDataDsoDTO@wsCountSearch",
    "__RequestVerificationToken": "Vy3UDgRWHtqQdQG14quguqDm"
}, headers=header01)

Here, cfg and header01 were both taken from the XHR request, but I never get any data back; the server returns a 405 error. I am clearly using POST, yet the error message is: Request method 'GET' not supported

Here is the result of running it:
<!doctype html><title>HTTP Status 405 – Method Not Allowed</title><style type="text/css">
H1 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:22px;}
H2 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:16px;}
H3 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:14px;}
BODY {font-family:Tahoma,Arial,sans-serif;color:black;background-color:white;}
B {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;}
P {font-family:Tahoma,Arial,sans-serif;background:white;color:black;font-size:12px;}
A {color : black;}A.name {color : black;}.line {height: 1px; background-color: #525D76; border: none;}
</style>

HTTP Status 405 – Method Not Allowed


Type Status Report

Message Request method 'GET' not supported

Description The method received in the request-line is known by the origin server but not supported by the target resource.


Apache Tomcat/8.0.53

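The thread does not say why a POST came back as "GET not supported". One possible explanation, offered as an assumption only, is that the http:// URL answered with a redirect: HTTP clients commonly reissue a redirected POST as a GET on a 301, 302, or 303 response. The self-contained sketch below uses only the standard library and a throwaway local server (RedirectDemoHandler and the /old and /new paths are hypothetical) to show the mechanism; it does not contact the real site.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class RedirectDemoHandler(BaseHTTPRequestHandler):
    """Local demo server: POST /old is redirected to /new with a 301."""

    def do_POST(self):
        # Read the request body so the socket is left in a clean state.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        if self.path == "/old":
            self.send_response(301)
            self.send_header("Location", "/new")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            self._reply("POST")

    def do_GET(self):
        self._reply("GET")

    def _reply(self, text):
        # Echo back the HTTP method the server actually received.
        body = text.encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def method_seen_after_redirect() -> str:
    """POST to /old and report which method reached the final URL."""
    server = HTTPServer(("127.0.0.1", 0), RedirectDemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/old"
        request = urllib.request.Request(url, data=b"cfg=demo", method="POST")
        # An empty ProxyHandler keeps the request on localhost.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(request, timeout=10) as response:
            return response.read().decode()
    finally:
        server.shutdown()
        server.server_close()


print(method_seen_after_redirect())  # GET
```

With requests, res.history lists any redirects that were followed, and passing allow_redirects=False returns the first response as is, which would confirm or rule out this explanation for the real endpoint.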
