legalagreements's Introduction

MIT Human Dynamics Lab

Repository for Computable Contracts Research and Development

Extracting Contracts from EDGAR Filings

A couple of searches that reveal the most recent filings with contracts:

Open Knowledge Foundation:

Good Links to Gather Context:

GAAP Material Contracts tags:

element id="us-gaap_BankruptcyClaimsDescriptionOfMaterialContractsAssumedOrAssigned"

  • Line 1149: <xs:element id='us-gaap_BankruptcyClaimsDescriptionOfMaterialContractsAssumedOrAssigned' name='BankruptcyClaimsDescriptionOfMaterialContractsAssumedOrAssigned' nillable='true' substitutionGroup='xbrli:item' type='xbrli:stringItemType' xbrli:periodType='duration' />

  • Line 1151: <xs:element id='us-gaap_BankruptcyClaimsAmountOfClaimsOnMaterialContractsRejected' name='BankruptcyClaimsAmountOfClaimsOnMaterialContractsRejected' nillable='true' substitutionGroup='xbrli:item' type='xbrli:monetaryItemType' xbrli:balance='credit' xbrli:periodType='instant' />

  • Line 1157: <xs:element id='us-gaap_BankruptcyClaimsDescriptionOfMaterialContractsRejected' name='BankruptcyClaimsDescriptionOfMaterialContractsRejected' nillable='true' substitutionGroup='xbrli:item' type='xbrli:stringItemType' xbrli:periodType='duration' />
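The tag lines above were found by reading the taxonomy schema by hand. A minimal sketch of doing the same lookup programmatically follows, assuming the schema is available as text. The function name `find_elements`, the `needle` parameter, and the `ExampleUnrelatedItem` element in the sample are hypothetical and only there for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of XML Schema; xs:element expands to this qualified name.
XS = "{http://www.w3.org/2001/XMLSchema}"


def find_elements(xsd_text, needle="MaterialContracts"):
    """Return (name, type) pairs for xs:element declarations whose name contains needle."""
    root = ET.fromstring(xsd_text)
    return [
        (el.get("name"), el.get("type"))
        for el in root.iter(XS + "element")
        if needle in (el.get("name") or "")
    ]


# Tiny stand-in schema: one element copied from the list above, plus one
# made-up element (ExampleUnrelatedItem) that should not match.
SAMPLE = """<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'
    xmlns:xbrli='http://www.xbrl.org/2003/instance'>
  <xs:element id='us-gaap_BankruptcyClaimsDescriptionOfMaterialContractsRejected' name='BankruptcyClaimsDescriptionOfMaterialContractsRejected' nillable='true' substitutionGroup='xbrli:item' type='xbrli:stringItemType' xbrli:periodType='duration' />
  <xs:element id='example_ExampleUnrelatedItem' name='ExampleUnrelatedItem' nillable='true' substitutionGroup='xbrli:item' type='xbrli:stringItemType' xbrli:periodType='duration' />
</xs:schema>"""

matches = find_elements(SAMPLE)
for name, xsd_type in matches:
    print(name, xsd_type)
```

Matching on a substring of the element name is a deliberately loose choice; a stricter filter could compare against the exact ids listed above.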

Also See Property Type Tags:

  • xs:element name="property" type="PROPERTY_TYPE"

To download all XBRL documents:

  1. Go to ftp://ftp.sec.gov/edgar/full-index/
  2. Crawl ftp://ftp.sec.gov/edgar/full-index/20XX/QTRX/xbrl.idx (for example, ftp://ftp.sec.gov/edgar/full-index/2015/QTR1/xbrl.idx)
  3. Filter the lines in each xbrl.idx by the form type you are interested in. For example, a 10-K entry looks like this:
     1000180|SANDISK CORP|10-K|2015-02-10|edgar/data/1000180/0001000180-15-000013.txt
  4. Download ftp://ftp.sec.gov/ followed by the path in the last field of the line (for example, ftp://ftp.sec.gov/edgar/data/1000180/0001000180-15-000013.txt)
  5. Extract the XBRL document, which begins with the <?xml declaration
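The filtering and extraction steps above can be sketched in Python as follows. This is a rough illustration rather than a tested crawler. The function names and the `base_url` parameter are hypothetical, the second index row is invented for contrast, and the assumption that each XML document sits between `<TEXT>` and `</TEXT>` (optionally wrapped in `<XBRL>` tags) should be checked against real filings. `base_url` is a parameter so a different host can be substituted if the FTP server is unavailable.

```python
import re


def filter_index(idx_text, form_type="10-K", base_url="ftp://ftp.sec.gov/"):
    """Yield (cik, company, date_filed, url) for index rows of the given form type."""
    for line in idx_text.splitlines():
        parts = line.strip().split("|")
        # Data rows have exactly five pipe-separated fields; the column
        # header has five too, but its form-type field never matches.
        if len(parts) != 5 or parts[2] != form_type:
            continue
        cik, company, _form, date_filed, path = parts
        yield cik, company, date_filed, base_url + path


# Assumed layout: <TEXT> [<XBRL>] <?xml ... [</XBRL>] </TEXT>
XML_DOC = re.compile(
    r"<TEXT>\s*(?:<XBRL>\s*)?(<\?xml.*?)\s*(?:</XBRL>\s*)?</TEXT>",
    re.DOTALL | re.IGNORECASE,
)


def extract_xml_documents(filing_text):
    """Return each embedded document that starts with an XML declaration."""
    return XML_DOC.findall(filing_text)


SAMPLE_INDEX = """CIK|Company Name|Form Type|Date Filed|Filename
--------------------------------------------------------------------------------
1000180|SANDISK CORP|10-K|2015-02-10|edgar/data/1000180/0001000180-15-000013.txt
1000180|SANDISK CORP|10-Q|2015-05-01|edgar/data/1000180/hypothetical-10q.txt
"""

SAMPLE_FILING = (
    "<DOCUMENT>\n<TYPE>EX-101.INS\n<TEXT>\n<XBRL>\n"
    '<?xml version="1.0"?>\n<xbrl></xbrl>\n'
    "</XBRL>\n</TEXT>\n</DOCUMENT>\n"
)

rows = list(filter_index(SAMPLE_INDEX))
docs = extract_xml_documents(SAMPLE_FILING)
print(rows)
print(docs)
```

Each URL produced by `filter_index` still has to be downloaded before `extract_xml_documents` can be applied; the download step is left out so the sketch runs without network access.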

To Scrape EDGAR Contracts into Hadoop

SEC EDGAR Oil Contracts Finder: https://github.com/pudo/edgar-oil-contracts (the scope can be widened to cover all contracts, not only oil-related ones)

A hacky way to download the documents:

  1. Go to ftp://ftp.sec.gov/edgar/daily-index/
  2. Find a file named company.XXXXXXXX.idx (where XXXXXXXX is the full date)
  3. Download the file with wget ftp://ftp.sec.gov/edgar/daily-index/<PATH_to_file>/company.XXXXXXXX.idx
  4. To process it, convert it to CSV:
    a. In a terminal, navigate to the folder that contains company.XXXXXXXX.idx
    b. sed -E 's/ {2,}/;/g' company.XXXXXXXX.idx > company.csv
    c. You now have a CSV file whose separator character is ; instead of , (the columns are padded with runs of spaces, so only runs of two or more spaces are replaced, which keeps company names that contain single spaces intact)
  5. To download more than one document, pass the last field of each data line to wget:
    a. for i in $(cut -d';' -f5 company.csv | grep '^edgar/'); do wget "ftp://ftp.sec.gov/$i"; done
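For anyone who prefers to avoid the shell pipeline, the conversion in step 4 can be sketched in Python. This assumes that company.idx pads its columns with at least two spaces and that every data row ends with a path starting with edgar/. The function names are hypothetical, and the sample row is hand-built from the SanDisk example above rather than copied from a real daily index.

```python
import csv
import io
import re


def idx_to_rows(idx_text):
    """Split fixed-width company.idx lines into five-field rows, skipping headers."""
    rows = []
    for line in idx_text.splitlines():
        # Columns are separated by runs of two or more spaces, so single
        # spaces inside company names are left alone.
        fields = re.split(r" {2,}", line.strip())
        if len(fields) == 5 and fields[4].startswith("edgar/"):
            rows.append(fields)
    return rows


def rows_to_csv(rows):
    """Serialize rows as semicolon-separated text, matching the sed output above."""
    buf = io.StringIO()
    csv.writer(buf, delimiter=";", lineterminator="\n").writerows(rows)
    return buf.getvalue()


# Hand-built sample: a column header, a separator, and one data row.
SAMPLE_IDX = (
    "Company Name                  Form Type   CIK         Date Filed  File Name\n"
    "---------------------------------------------------------------------------\n"
    "SANDISK CORP                  10-K        1000180     2015-02-10  "
    "edgar/data/1000180/0001000180-15-000013.txt\n"
)

rows = idx_to_rows(SAMPLE_IDX)
csv_text = rows_to_csv(rows)
print(csv_text, end="")
```

The last field of each row is the path to append to the server address, the same value the wget loop reads with cut.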
