Comments (4)
It looks like it's worth extracting the CSV or Excel link from the HTML, in case the file link changes while the landing page URL stays the same.
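A minimal sketch of that link discovery, assuming the landing page exposes the file as a plain `<a href>` ending in `.csv`, `.xls` or `.xlsx` (not verified against the live page). `FileLinkParser` and `find_file_links` are hypothetical names, and only the Python standard library is used:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FileLinkParser(HTMLParser):
    """Collect hrefs of <a> tags that point at CSV or Excel files."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Ignore any query string when checking the file extension.
        path = href.lower().split("?")[0]
        if path.endswith((".csv", ".xls", ".xlsx")):
            # Resolve relative links against the landing page URL.
            self.links.append(urljoin(self.base_url, href))


def find_file_links(html, base_url):
    """Return absolute URLs of all CSV/Excel links found in the HTML."""
    parser = FileLinkParser(base_url)
    parser.feed(html)
    return parser.links
```

If the page lists more than one file, the caller would still need a rule for picking the right one, for example preferring `.csv` over Excel.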
from crawler-planning.
Here is the link to the CSV:
https://mzv.gov.cz/file/5296648/Vnitrostatni_sankcni_seznam.csv
Let me know if you want me to write an importer for the CSV and the code to find the CSV link on the page (I'm not sure this is a good idea, as it seems that each addendum to the sanctions list will have its own page and CSV file).
@jbothma please advise
So it looks like https://mzv.gov.cz/jnp/cz/o_ministerstvu/otevrena_data/vnitrostatni_sankcni_seznam/index.html, which is where the CSV link on https://mzv.gov.cz/jnp/cz/zahranicni_vztahy/sankcni_politika/sankcni_seznam_cr/vnitrostatni_sankcni_seznam.html points, was created in August and updated in December, while the file was updated in March. The page and the file are evidently updated independently, so I think the safest approach is to write a crawler that finds the file link at https://mzv.gov.cz/jnp/cz/o_ministerstvu/otevrena_data/vnitrostatni_sankcni_seznam/index.html, downloads the file, then extracts the entities from that CSV.
The earliest sanction was in April of last year, so there's a chance the page linking to it was different back then. Hopefully they delete the old page if they create a new one, so that we find out.
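The download and extraction steps could be sketched roughly as follows. This is only an illustration using the standard library: the encoding (`utf-8-sig`), the delimiter, and the helper names `fetch_text` and `iter_rows` are assumptions, and the actual column layout of the CSV has not been checked here.

```python
import csv
import io
from urllib.request import urlopen


def fetch_text(url, encoding="utf-8-sig"):
    """Download a file and decode it (the encoding is an assumption)."""
    with urlopen(url, timeout=30) as resp:
        return resp.read().decode(encoding)


def iter_rows(csv_text, delimiter=","):
    """Yield one dict per non-empty CSV row, with whitespace stripped."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    for row in reader:
        cleaned = {
            key.strip(): (value or "").strip()
            for key, value in row.items()
            if key is not None  # drop overflow cells beyond the header
        }
        # Skip rows where every cell is empty.
        if any(cleaned.values()):
            yield cleaned
```

Each yielded dict would then be mapped to an entity. Failing loudly when no file link is found on the landing page would make a moved or deleted page visible right away.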
Okay, taking this one.