Comments (2)
I wasn't able to open the website on my personal computer using Safari, but for some reason it works over Tor and when I cURL it from an AWS EC2 instance. The code below already gets the information as JSON:
import requests

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; rv:109.0) Gecko/20100101 Firefox/115.0',
    'Accept': 'application/json, text/plain, */*',
    'Accept-Language': 'en-US,en;q=0.5',
    # 'Accept-Encoding': 'gzip, deflate, br',
    'Referer': 'https://web.diputados.gob.mx/',
    'Content-Type': 'application/json',
    'Origin': 'https://web.diputados.gob.mx',
    'Connection': 'keep-alive',
    'Sec-Fetch-Dest': 'empty',
    'Sec-Fetch-Mode': 'cors',
    'Sec-Fetch-Site': 'same-site',
}

json_data = {
    'operationName': None,
    'variables': {},
    'query': '{\n allDiputados {\n Oid\n Nombre\n PrimerApellido\n SegundoApellido\n NombreCompleto\n Estado\n Partido\n Distrito\n Legislacion\n PrimerApellido\n CabeceraMunicipal\n Suplente\n id_dip\n IdDiputado\n Correo\n Telefono\n TipoEleccion\n Licencia\n __typename\n }\n}\n',
}

response = requests.post('https://micrositios.diputados.gob.mx:4001/graphql', headers=headers, json=json_data)
I can write the crawler and test it on a server, but I'm not sure it will run on the OpenSanctions infrastructure. What do you think, @jbothma?
This is the returned data: https://pastebin.com/7DhAcwWt
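For anyone picking this up, here is a rough sketch of how the response body could be turned into a list of records. It assumes the server follows the usual GraphQL response shape, with results under `data` keyed by the queried field (`allDiputados`) and failures in a top-level `errors` list. I have not checked this against the paste above, and `extract_diputados` is just a name made up for illustration.

```python
from typing import Any


def extract_diputados(payload: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the deputy records from a decoded GraphQL response body.

    Assumes the conventional GraphQL layout:
    {"data": {"allDiputados": [...]}, "errors": [...]}.
    """
    # GraphQL servers commonly report failures in an "errors" list even
    # when the HTTP status is 200, so check it explicitly.
    if payload.get("errors"):
        raise ValueError(f"GraphQL errors: {payload['errors']}")

    data = payload.get("data") or {}
    # A missing or null field yields an empty list instead of raising.
    return data.get("allDiputados") or []
```

With the request above this would be called as `extract_diputados(response.json())`.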
from crawler-planning.
Great! Nice research.
It seems to work for me from the UK.
I'd say go ahead and implement it.
See if it works with only the accept and content-type headers.
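A minimal sketch of that check, assuming the endpoint accepts a shortened query with a subset of the fields (untested against the live server). The names `MINIMAL_HEADERS`, `build_payload` and `fetch_with_minimal_headers` are made up for illustration.

```python
GRAPHQL_URL = "https://micrositios.diputados.gob.mx:4001/graphql"

# Only the two headers suggested above; all browser-specific ones are dropped.
MINIMAL_HEADERS = {
    "Accept": "application/json, text/plain, */*",
    "Content-Type": "application/json",
}

# Assumption: the server accepts a query for a subset of the fields.
QUERY = "{ allDiputados { NombreCompleto Estado Partido } }"


def build_payload(query: str = QUERY) -> dict:
    """Build the same request body shape as the original snippet."""
    return {"operationName": None, "variables": {}, "query": query}


def fetch_with_minimal_headers(timeout: float = 30.0) -> dict:
    """POST the query with the reduced header set and return the decoded JSON."""
    # Imported here so the helpers above can be used without the dependency.
    import requests

    response = requests.post(
        GRAPHQL_URL,
        headers=MINIMAL_HEADERS,
        json=build_payload(),
        timeout=timeout,
    )
    # Raise on 4xx/5xx so a blocked request is not mistaken for success.
    response.raise_for_status()
    return response.json()
```

Calling `fetch_with_minimal_headers()` from the machine where the crawler will run shows whether the extra headers matter there. Since `requests` sets `Content-Type: application/json` on its own when `json=` is used, that header could probably be dropped as well.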