
Doesn't seem to work with local XSD files

manusimidt commented on May 26, 2024

Doesn't seem to work with local XSD files

from py-xbrl.

Comments (5)

manusimidt commented on May 26, 2024

Thank you for your issue!

No, the HTTP cache is not optional at this time. Even if you have already downloaded the instance file and/or the files of the extension taxonomy, the parser must still download all the taxonomies (and their files) that are imported by the XBRL instance file.

For SEC submissions this includes, for example, the US-GAAP taxonomy, the DEI taxonomy and the SRT taxonomy.

These standard taxonomies can be pretty large (e.g. US-GAAP 2020 has about 18 MB of XML files), so caching is required when parsing multiple submissions. You don't want to download the same standard taxonomy again and again for each of your 1000 submissions.
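To illustrate why a cache pays off here, this is a minimal sketch of a download-once disk cache. It is not the actual `HttpCache` implementation; the names `cache_path_for`, `download` and `fetch_cached` are hypothetical and only show the general idea of mapping each URL to a local file and skipping the download when that file already exists.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlopen


def cache_path_for(url: str, cache_dir: str) -> Path:
    """Map a URL to a file path below cache_dir (host name + URL path)."""
    parts = urlparse(url)
    return Path(cache_dir) / parts.netloc / parts.path.lstrip('/')


def download(url: str) -> bytes:
    """Default downloader: fetch the file over HTTP."""
    with urlopen(url) as response:
        return response.read()


def fetch_cached(url: str, cache_dir: str, downloader=download) -> Path:
    """Return the local copy of url, downloading it only on a cache miss."""
    target = cache_path_for(url, cache_dir)
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(downloader(url))
    return target
```

With something like this, the second and every later submission that imports the same standard taxonomy file reads it from disk instead of downloading it again.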

I got your example running with the following code:

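import logging

from xbrl_parser.cache import HttpCache
from xbrl_parser.instance import parse_xbrl

logging.basicConfig(level=logging.INFO)
# Downloaded taxonomy files are stored in this directory and reused.
cache: HttpCache = HttpCache('./cache/')

# Parse from a local path; the third argument is the base URL of the submission.
instance_path = './data/TSLA/tsla-10k_20201231_htm.xml'
inst1 = parse_xbrl(instance_path, cache, 'https://www.sec.gov/Archives/edgar/data/1318605/000156459021004599/')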

Currently you have to pass the base URL of the submission, because the taxonomy schema is imported with a relative path in the instance file.
For example:

<link:schemaRef xlink:href="./tsla-20201231.xsd" xlink:type="simple"/>
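As a rough illustration of why the base URL is needed, the relative `xlink:href` can be resolved against it with the standard library's `urljoin`. This only shows how relative URL resolution works in general; it is an assumption that the parser does something equivalent internally.

```python
from urllib.parse import urljoin

# Base URL of the submission and the relative reference from the instance file.
base_url = 'https://www.sec.gov/Archives/edgar/data/1318605/000156459021004599/'
schema_href = './tsla-20201231.xsd'

# With a trailing slash the schema is looked up inside the submission folder.
schema_url = urljoin(base_url, schema_href)
print(schema_url)
# https://www.sec.gov/Archives/edgar/data/1318605/000156459021004599/tsla-20201231.xsd

# Without the trailing slash the last path segment is treated as a file name
# and replaced, so the result points one level too high.
wrong_url = urljoin(base_url.rstrip('/'), schema_href)
print(wrong_url)
# https://www.sec.gov/Archives/edgar/data/1318605/tsla-20201231.xsd
```

So if you pass a base URL by hand, keeping the trailing slash matters.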

But you are correct, this is very inconvenient if you have already downloaded the files of the extension taxonomy. The parser should at least try to find the schema file in the current directory or in the directory of the instance file you want to parse.

I will implement this in the next few days.
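The local lookup described above could be sketched roughly as follows. This is not the code that ended up in the package; `resolve_schema_ref` is a hypothetical helper, and the order of the checks (absolute URL, then the instance file's directory, then the base URL) is an assumption made for the example.

```python
from pathlib import Path
from typing import Optional
from urllib.parse import urljoin, urlparse


def resolve_schema_ref(href: str, instance_path: str,
                       base_url: Optional[str] = None) -> str:
    """Hypothetical helper: find the schema referenced by an instance file."""
    # An absolute URL needs no resolution.
    if urlparse(href).scheme in ('http', 'https'):
        return href
    # Prefer a copy that sits next to the instance file on disk.
    local_candidate = Path(instance_path).parent / href
    if local_candidate.is_file():
        return str(local_candidate.resolve())
    # Otherwise fall back to the remote submission folder, if one is known.
    if base_url is not None:
        return urljoin(base_url, href)
    raise FileNotFoundError(
        f'Schema {href!r} not found next to {instance_path!r} and no base URL given')
```

Checking the local directory first means an already downloaded extension taxonomy is used as-is, while the base URL stays available as a fallback.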


manusimidt commented on May 26, 2024

I will do some further testing and documentation and then upload a new package version to PyPI in the next 2-3 days.


jamiehannaford commented on May 26, 2024

The parser should at least try to find the schema file in the current directory or in the directory of the instance file you want to parse.

Awesome, thank you. It'd be great if the parser could find the schema locally. Great job on the project, I'm finding it super helpful!


manusimidt commented on May 26, 2024

It should now work with the new package version 1.2.0.
I used the following code to get your example running:

from xbrl_parser.instance import parse_xbrl
from xbrl_parser.cache import HttpCache
import logging
logging.basicConfig(level=logging.INFO)
cache: HttpCache = HttpCache('./../cache/')
# cache.set_headers({'From': '', 'User-Agent': 'py-xbrl/1.1.4'})

# parse from path
instance_path = './data/TSLA/10-k/20201231/tsla-10k_20201231_htm.xml'
inst1 = parse_xbrl(instance_path, cache)
print(inst1)

I also tested it on ~100 other SEC EDGAR submissions, both XBRL and iXBRL, and it worked pretty reliably.
Nevertheless, I would be happy if you could let me know whether it works for you.


jamiehannaford commented on May 26, 2024

Thanks @manusimidt. I'll try it this weekend and reopen if I have any issues.
