wenshu_spider's People

Contributors

sixs

wenshu_spider's Issues

docid decryption problem

Hello. Regarding the JS used for docid decryption: the run fails at the following code:
key = ctx2.call("EvalKey", js1, js2)
key = re.findall(r'"([0-9a-z]{32})"', key)[0]

When the first line executes, key comes back as None, so the second line raises an error.
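
A minimal sketch of guarding the extraction step, assuming the JS call can return None or a string without a quoted key. The helper name extract_key is hypothetical and not part of the project; it only stops the crash and does not explain why EvalKey returned None:

```python
import re
from typing import Optional

# Same pattern as in the report: a quoted run of 32 lowercase hex-like characters.
KEY_PATTERN = re.compile(r'"([0-9a-z]{32})"')


def extract_key(raw: Optional[str]) -> Optional[str]:
    """Return the 32-character key found in raw, or None if there is none."""
    if not raw:
        # Covers both None and an empty string returned by the JS call.
        return None
    match = KEY_PATTERN.search(raw)
    return match.group(1) if match else None
```

It could be called as key = extract_key(ctx2.call("EvalKey", js1, js2)), after which the caller checks for None before continuing.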

vjkl5 problem

While paging through results during a crawl, a "remind key" response comes up frequently. Have you looked into this?

The returned id is encrypted

{'id': 'FcOOwrkBBDEIBMKwwpZ4w4xgQsOMw5N/ScK3FyvDkcKDD8OHOV1GeSwZw5YRw7XDuMKIEgnDqwDCh8KKwoHCoMKsw54EIB5dK8O+KsKKwpTCp8KEXsKpWD57E8Ohw4bCt8ObN8OaXF0SV2XDmkHDpm4fasK3bMKNw6PCjjzDlMKsRMOSHTbCnsKBwqxfckjDnsKcw43CgMOnVUbDrCXDhVNbwpHDrcKTw5HCgsOSw6U+w4fCqXHCv3k1Y8Kqw7/CosO0AMKuc13Dnw8=', 'name': '林山与覃世松以及田东县桂松酒精有限责任公司股东出资纠纷申请再审民事裁定书', 'type': '2', 'date': '2014-02-27', 'number': '(2013)民申字第1102号', 'court': '最高人民法院'}

The id field in the returned data is now encrypted. Has the author looked into this?

List request problem

Every request I send comes back with remind key. How can I get the list page content?

"remind key"

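Running this spider against the Wenshu site (China Judgments Online) now returns remind key almost every time. I checked and compared the following steps: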
print("get guid")
guid = get_guid()
print(guid)
print("get number")
number = get_number(guid, logger)
print(number)
print("get vjkl5")
vjkl5 = get_vjkl5(guid, number, Param, logger)
print(vjkl5)
print("get vl5x")
vl5x = get_vl5x(vjkl5, logger)
print(vl5x)
The flow of these functions does not seem to have changed. Does anyone know what is going on?
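
A minimal sketch of detecting the "remind key" reply and retrying, assuming the reply body contains that literal text. fetch_once is a hypothetical callable that would redo the guid, number, vjkl5 and vl5x steps above and return the list page body. If the site has changed how vl5x is computed, retrying will not help; this only makes the failure explicit:

```python
import time
from typing import Callable


def is_remind_key(body: str) -> bool:
    """True when the server answered with the "remind key" notice instead of data."""
    return "remind key" in body.lower()


def fetch_with_retry(
    fetch_once: Callable[[], str],
    max_attempts: int = 3,
    delay_seconds: float = 0.0,
) -> str:
    """Call fetch_once until it returns something other than "remind key"."""
    for attempt in range(1, max_attempts + 1):
        body = fetch_once()
        if not is_remind_key(body):
            return body
        if attempt < max_attempts:
            # Pause before regenerating the parameters and trying again.
            time.sleep(delay_seconds)
    raise RuntimeError(f"still got 'remind key' after {max_attempts} attempts")
```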

Getting vjkl5

vjkl5 = req1.cookies["vjkl5"]
vjkl5 can no longer be retrieved this way. How can this be solved?
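
A minimal sketch of reading the cookie without a bare KeyError, assuming the cookie container behaves like a mapping (the cookie jar on a requests response does). The helper name read_vjkl5 is hypothetical. It does not bring the cookie back if the site no longer sets it, but it reports which cookies were actually received:

```python
from typing import Mapping


def read_vjkl5(cookies: Mapping[str, str]) -> str:
    """Return the vjkl5 cookie value, or raise an error listing what was received."""
    value = cookies.get("vjkl5")
    if not value:
        received = ", ".join(sorted(cookies.keys())) or "(none)"
        raise LookupError(f"vjkl5 cookie not set; cookies received: {received}")
    return value
```

It could be called as vjkl5 = read_vjkl5(req1.cookies).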

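CAPTCHA

Hello. The site's CAPTCHA has changed and now contains letters. How should the code be changed?

CAPTCHA updated, checkcode module no longer works

The CAPTCHA has been updated to 5 characters of letters and digits with noise dots.

In addition, the function that computes vl5x has also been updated. How can the original vl5x code be recovered from the obfuscated code?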

download

Did you forget to upload the content.html file used by the save module?

CAPTCHA error

I am using Python 3.6.1 (64-bit). After importing the project into PyCharm and running it, I get the following error:

Page 1

CAPTCHA encountered
CAPTCHA recognized as: 1
CAPTCHA incorrect
CAPTCHA recognized as: 1

Page 1

Traceback (most recent call last):
  File "G:/coding/wenshu_spider-master/court.py", line 393, in <module>
    get_data(Param,Page,Order,Direction)
  File "G:/coding/wenshu_spider-master/court.py", line 252, in get_data
    json_data = json.loads(return_data)
  File "C:\Users\Andrew\AppData\Local\Programs\Python\Python36\lib\json\__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "C:\Users\Andrew\AppData\Local\Programs\Python\Python36\lib\json\decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Users\Andrew\AppData\Local\Programs\Python\Python36\lib\json\decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
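
"Expecting value: line 1 column 1 (char 0)" means the response body was empty or did not start with a JSON value, for example an HTML or CAPTCHA page. A minimal sketch of guarding the json.loads call so the raw body can be inspected; the helper name parse_list_response is hypothetical:

```python
import json
from typing import Any, Optional


def parse_list_response(return_data: str) -> Optional[Any]:
    """Parse the list page body as JSON; return None and show the body if it is not JSON."""
    try:
        return json.loads(return_data)
    except json.JSONDecodeError:
        # Show the start of the body so the real response can be identified.
        print("Response is not JSON; first 200 characters:", repr(return_data[:200]))
        return None
```

The caller can then treat None as a signal to solve the CAPTCHA again or to retry the request instead of crashing.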

DocID issue

运行后decrypt_id中返回的js是这样的 hidescript=String.fromCharCode(+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[],+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!+[]+!

Please help. Thanks!

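Three problems

1. The CAPTCHA stopped being digits-only two or three months ago.
2. DocID decryption sometimes fails with:
execjs._exceptions.ProgramError: Error: Malformed UTF-8 data
3. The detail page sometimes cannot be opened from the docid; a MmEwMD parameter has been added.
These problems are somewhat tricky.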
