Comments (5)

manusimidt commented on May 26, 2024

Thank you for your issue!

No, the HTTP cache is not optional at this time. Even if you have already downloaded the instance file and/or the files of the extension taxonomy, the parser must still download all taxonomies, and their files, that are imported by the XBRL instance file.

For SEC submissions this includes, for example, the US-GAAP taxonomy, the DEI taxonomy and the SRT taxonomy.

These standard taxonomies can be pretty huge (e.g. US-GAAP 2020 has about 18 MB of XML files), so caching is required when parsing multiple submissions. You don't want to download the same standard taxonomy again and again for each of your 1000 submissions.
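To illustrate why a cache pays off here, the following is a minimal sketch of a download-once cache. It is not how `HttpCache` in py-xbrl is implemented; the function name `cached_fetch`, the hashed file naming, and the injected `download` callable are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path
from typing import Callable


def cached_fetch(url: str, cache_dir: Path, download: Callable[[str], bytes]) -> bytes:
    """Return the content behind url, calling download() only on a cache miss.

    Hypothetical helper: `download` is any callable that takes a URL and
    returns the response body as bytes.
    """
    # Hash the URL so that every URL maps to a safe, unique file name.
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    cached_file = cache_dir / key

    # Cache hit: serve the file from disk without touching the network.
    if cached_file.exists():
        return cached_file.read_bytes()

    # Cache miss: download once and store the result for later calls.
    content = download(url)
    cache_dir.mkdir(parents=True, exist_ok=True)
    cached_file.write_bytes(content)
    return content
```

With a helper like this, a standard taxonomy file shared by 1000 submissions would be downloaded once and read from disk the other 999 times.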

I got your example running with the following code:

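import logging

from xbrl_parser.cache import HttpCache
from xbrl_parser.instance import parse_xbrl

logging.basicConfig(level=logging.INFO)
cache: HttpCache = HttpCache('./cache/')

# parse from path; the third argument is the base URL of the submission
instance_path = './data/TSLA/tsla-10k_20201231_htm.xml'
inst1 = parse_xbrl(instance_path, cache, 'https://www.sec.gov/Archives/edgar/data/1318605/000156459021004599/')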

Currently you have to define the base URL of the submission, because the taxonomy schema is imported with a relative path in the instance file.
For example:

<link:schemaRef xlink:href="./tsla-20201231.xsd" xlink:type="simple"/>
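As a rough illustration of why the base URL matters, the sketch below reads the `schemaRef` elements from an instance document and resolves their relative `href` values against a base URL using only the standard library. The function name `schema_ref_urls` and the trimmed sample document are assumptions; this is not py-xbrl's own code.

```python
import xml.etree.ElementTree as ET
from typing import List
from urllib.parse import urljoin

# Standard XBRL and XLink namespace URIs.
LINK_NS = "http://www.xbrl.org/2003/linkbase"
XLINK_NS = "http://www.w3.org/1999/xlink"


def schema_ref_urls(instance_xml: str, base_url: str) -> List[str]:
    """Resolve every schemaRef href in the instance against base_url."""
    root = ET.fromstring(instance_xml)
    refs = root.iter(f"{{{LINK_NS}}}schemaRef")
    return [urljoin(base_url, ref.attrib[f"{{{XLINK_NS}}}href"]) for ref in refs]


# Trimmed-down instance document (assumption: real filings contain far more).
SAMPLE = """<xbrli:xbrl
    xmlns:xbrli="http://www.xbrl.org/2003/instance"
    xmlns:link="http://www.xbrl.org/2003/linkbase"
    xmlns:xlink="http://www.w3.org/1999/xlink">
  <link:schemaRef xlink:href="./tsla-20201231.xsd" xlink:type="simple"/>
</xbrli:xbrl>"""

BASE_URL = "https://www.sec.gov/Archives/edgar/data/1318605/000156459021004599/"

print(schema_ref_urls(SAMPLE, BASE_URL))
```

Without a base URL, `./tsla-20201231.xsd` alone gives no indication of where the schema lives, which is why the third argument was needed.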

But you are correct, this is very inconvenient if you have already downloaded the files of the extension taxonomy. The parser should at least try to find the schema file in the current directory or in the directory of the instance file you want to parse.

I will implement this in the next few days.
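The local lookup described above could be sketched roughly as follows. This is only an illustration of the idea, not the code that ended up in py-xbrl; the function name `resolve_schema` and its parameters are hypothetical.

```python
from pathlib import Path
from typing import Optional
from urllib.parse import urljoin, urlparse


def resolve_schema(href: str, instance_path: str, base_url: Optional[str] = None) -> str:
    """Return a local path or a URL for the schema referenced by href.

    Hypothetical helper: prefers a file next to the instance document and
    falls back to the base URL only when no local copy exists.
    """
    # Absolute URLs (e.g. standard taxonomies) need no resolution at all.
    if urlparse(href).scheme in ("http", "https"):
        return href

    # First try the directory that contains the instance file.
    local_candidate = Path(instance_path).parent / href
    if local_candidate.is_file():
        return str(local_candidate.resolve())

    # No local copy: a base URL is required to resolve the relative href.
    if base_url is None:
        raise FileNotFoundError(
            f"{href} not found next to {instance_path} and no base URL given"
        )
    return urljoin(base_url, href)
```

With this order of checks, a user who has already downloaded the extension taxonomy next to the instance file would not need to pass a base URL.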

from py-xbrl.

manusimidt commented on May 26, 2024

I will do some further testing and documentation and then upload a new package version to PyPI in the next 2-3 days.

jamiehannaford commented on May 26, 2024

The parser should at least try to find the schema file in the current directory or the instance file you want to parse.

Awesome, thank you. It'd be great if the parser could find the schema locally. Great job on the project, I'm finding it super helpful!

manusimidt commented on May 26, 2024

It should now work with the new package version 1.2.0.
I used the following code to get your example running:

import logging

from xbrl_parser.cache import HttpCache
from xbrl_parser.instance import parse_xbrl

logging.basicConfig(level=logging.INFO)
cache: HttpCache = HttpCache('./../cache/')
# cache.set_headers({'From': '', 'User-Agent': 'py-xbrl/1.1.4'})

# parse from path; no base URL is passed here
instance_path = './data/TSLA/10-k/20201231/tsla-10k_20201231_htm.xml'
inst1 = parse_xbrl(instance_path, cache)
print(inst1)

I also tested it on ~100 other SEC EDGAR submissions, both XBRL and iXBRL, and it worked pretty reliably.
Nevertheless, I would appreciate your feedback on whether it works for you.

jamiehannaford commented on May 26, 2024

Thanks @manusimidt. I'll try using it this weekend and reopen if I have any issues.
