
Comments (14)

pudo commented on September 23, 2024

Here's a CSV download link that we can use in the crawler: https://docs.google.com/spreadsheets/d/e/2PACX-1vT17qv7NxgWJnqmJJiGTncmAQeWI2QKW9Z92CZOXxWJi071xJr5V8CxtnB3AxgFkFZLCg2eGgBizxXs/pub?output=csv
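For illustration, a published sheet like this can be read with nothing but the standard library. This is a minimal sketch, not the crawler's actual code: `parse_rows` and `fetch_rows` are hypothetical helper names, and the real crawler would presumably use the project's own fetch helpers instead of `urllib`.

```python
import csv
import io
from urllib.request import urlopen


def parse_rows(text: str) -> list[dict[str, str]]:
    """Parse CSV text into one dict per row, keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def fetch_rows(url: str) -> list[dict[str, str]]:
    """Download a published sheet (an `output=csv` link) and parse it."""
    with urlopen(url) as response:
        # utf-8-sig drops a byte-order mark if the export includes one.
        return parse_rows(response.read().decode("utf-8-sig"))
```

Calling `fetch_rows` with the link above should yield one dict per spreadsheet row; the column names depend on how the sheet ends up being formatted.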

from crawler-planning.

pudo commented on September 23, 2024

Let's not make Sanction here. I would just stuff the text into entity.add('program', ...) and tag it as debarment.


pudo commented on September 23, 2024

The US just started sanctioning these; we should do a one-off extract of this list.


pudo commented on September 23, 2024

Here's a Google Sheet with the data, send me an access request as it still needs to be formatted quite a bit: https://docs.google.com/spreadsheets/d/1KszxKHQ6VTkMCQfjBkaPace5DdbDv7e9gIFmJrPtyy0/edit#gid=833768193


pudo commented on September 23, 2024

Let's use the topic & collection debarment


dhdaines commented on September 23, 2024

Hi! I could do this one - I assume you have generated the spreadsheet semi-automatically from the PDF? Do we include it in the repository or continue to host it on Google Sheets?


pudo commented on September 23, 2024

We have a whole folder of these and just fetch them as CSVs. Let me see if I can do this from mobile.


pudo commented on September 23, 2024

Nope. Gotta do it tomorrow :/


dhdaines commented on September 23, 2024

No problem!


dhdaines commented on September 23, 2024

Great, I'll make a preliminary crawler. Not sure exactly how to create the Sanction entities for this - authority is OHCHR? Do we have a reference for the US sanctioning these (some interesting names in there...)?


dhdaines commented on September 23, 2024

> Let's not make Sanction here. I would just stuff the text into entity.add('program', ...) and tag it as debarment.

Two questions:

  1. Is the Program entity documented anywhere? I can't find any details on it in the data dictionary. Is it just free text?
  2. For companies that have been removed from the list, it would be useful to add them with an end date, but without a Sanction we can't do that. Should I simply omit them from the list?


jbothma commented on September 23, 2024

Regarding entity.add('program', ...): that refers to the program property of Thing and its descendants. Search for Thing:program at https://www.opensanctions.org/reference/#schema.LegalEntity

Regarding companies removed from the database but included in the spreadsheet, I'm not sure how to treat them in this case. @pudo?

From the PDF:

  1. Of the 112 business enterprises included in the 2020 database report A/HRC/43/71,
    OHCHR found reasonable grounds for the removal of 15 business enterprises on basis that they
    were ceasing or were no longer involved in one or more of the listed activities in the Occupied
    Palestinian Territory, according to the standard described above. They were, as a result,
    removed from the updated database set out in Section A below.

Section A is the table of companies whose Section value is "A. Business enterprises no longer involved in listed activities". From the number of entries, it looks like Section A contains those that were indeed removed from the database and are just listed here for clarity.

With sanctions it seems we list them as long as they're on the official list but stop adding the sanction topic. With debarments we usually continue listing them, with an end date in the Sanction entity.


pudo commented on September 23, 2024

I'd vote for dropping them. I don't think we have very good grounds for listing them if they're delisted around the first time we mention them.


dhdaines commented on September 23, 2024

> regarding entity.add('program', that's referring to the program property of Thing and its descendants. Search for Thing:program at https://www.opensanctions.org/reference/#schema.LegalEntity

Yes, I saw it there ... but I can't find a description of what it actually means, either there or in the FollowTheMoney ontology. For the moment I am putting the section name, e.g. "B. Business enterprises involved..." or "C. Business enterprises involved as parent companies".

I will just drop those from section A, makes sense to me.
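As a rough sketch of the approach agreed on in this thread (drop Section A, put the section name in `program`, tag as debarment), the per-row logic could look like the following. The column names `Name` and `Section` and the helper `row_to_record` are assumptions for illustration, and a plain dict stands in for the entity; in the crawler each key would presumably become an `entity.add(...)` call.

```python
# Hypothetical column names; the real sheet's headers may differ.
NAME_COLUMN = "Name"
SECTION_COLUMN = "Section"


def row_to_record(row):
    """Return a dict of properties for one row, or None to skip the row."""
    name = (row.get(NAME_COLUMN) or "").strip()
    section = (row.get(SECTION_COLUMN) or "").strip()
    if not name or not section:
        return None
    # The section letter is whatever precedes the first period,
    # e.g. "B" in "B. Business enterprises involved...".
    letter = section.split(".")[0].strip().upper()
    # Section A lists enterprises removed from the database: drop them.
    if letter == "A":
        return None
    # The section name goes into `program` as free text.
    return {"name": name, "program": section, "topics": "debarment"}
```

Rows from Section A return `None` and can be filtered out by the caller, so no end date (and no Sanction entity) is needed.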

