Comments (8)
Using the URL above, the library did not work.
Using the URL https://www.sec.gov/Archives/edgar/data/0001365135/000155837021005716/wu-20210331x10q.htm
worked, but no DocumentPeriodEndDate was found.
Using the URL of the XBRL instance itself, https://www.sec.gov/Archives/edgar/data/1365135/000155837021005716/wu-20210331x10q_htm.xml
did find the DocumentPeriodEndDate:
inst: XbrlInstance = xbrlParser.parse_instance('https://www.sec.gov/Archives/edgar/data/1365135/000155837021005716/wu-20210331x10q_htm.xml')
for i in inst.facts:
    if i.concept.name == 'DocumentPeriodEndDate':
        print(i.concept)
        print(i.value)
out:
DocumentPeriodEndDate
2021-03-31
Sorry if you were looking for any information as to why, but I hope this helps.
from py-xbrl.
Interesting: parsing the XML version extracts the DocumentPeriodEndDate:
https://www.sec.gov/Archives/edgar/data/1365135/000155837021005716/wu-20210331x10q_htm.xml
But parsing the original iXBRL .htm doesn't:
https://www.sec.gov/Archives/edgar/data/1365135/000155837021005716/wu-20210331x10q.htm
Sounds like a parsing error.
manusimidt told me we should prefer the iXBRL .htm (the original filing) over the XML extracted by the SEC.
Yes, looks like a parsing error. The iXBRL Instance Document certainly contains the DocumentPeriodEndDate fact.
<ix:nonNumeric format="ixt:datemonthdayyearen"
               contextRef="Duration_1_1_2021_To_3_31_2021_Pi9QpSqr-0e0RF1F9GgqSg"
               name="dei:DocumentPeriodEndDate"
               id="Narr_VydiiUz0MUOCCrLq-p-Mpw">
    <b style="font-weight:bold;">
        March 31, 2021
    </b>
</ix:nonNumeric>
I think the fact is not parsed because it contains additional HTML elements (the bold tag).
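The hypothesis above can be illustrated with the standard library alone. This is only a sketch of the suspected failure mode, not py-xbrl's actual parsing code: the wrapper `<div>`, the namespace URI declared on it, and the shortened `contextRef` are assumptions added so that the fragment parses on its own.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI and wrapper element, added only so the fragment parses.
IX_NS = "http://www.xbrl.org/2013/inlineXBRL"
SNIPPET = f"""
<div xmlns:ix="{IX_NS}">
<ix:nonNumeric name="dei:DocumentPeriodEndDate" contextRef="c1">
<b style="font-weight:bold;">
March 31, 2021
</b>
</ix:nonNumeric>
</div>
"""

root = ET.fromstring(SNIPPET)
fact = root.find(f"{{{IX_NS}}}nonNumeric")

# .text only covers the text before the first child element,
# which here is just the line break in front of <b>.
direct_text = (fact.text or "").strip()

# The date itself is stored on the nested <b> element.
nested_text = (fact.find("b").text or "").strip()

print(repr(direct_text))
print(repr(nested_text))
```

A reader that looks only at the fact element's own `.text` would therefore see an empty value for this fact.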
Yes, this is the issue:
Lines 416 to 417 in a2aca03
I will implement a fix.
In bs4 we have these:
.text is recursive (what we want); there should be an equivalent for it in etree
.string only covers one given item (it wouldn't go into the bold tag)
Yes, that's correct.
With bs4 it is really easy to extract the text recursively for a given element.
I could not find an equivalent for it in etree. Please let me know if you find a solution.
I am currently implementing a function that extracts the text recursively, but I don't know if that is the best way to do it.
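For what it's worth, the standard library's `Element.itertext()` walks an element and all of its descendants, so a recursive helper can stay very small. The sketch below is one possible shape for such a function, not the fix that ended up in py-xbrl; the name `recursive_text` and the choice to collapse whitespace are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def recursive_text(element: ET.Element) -> str:
    """Return the text of `element` and all its descendants (hypothetical helper)."""
    # itertext() yields element.text, then every descendant's text and tail,
    # in document order.
    raw = "".join(element.itertext())
    # Collapse line breaks and indentation left over from the markup.
    return " ".join(raw.split())


elem = ET.fromstring("<fact>\n<b>\nMarch 31, 2021\n</b>\n</fact>")
result = recursive_text(elem)
print(result)
```

Collapsing whitespace is a design choice: it suits short values such as dates, but a caller that needs the original spacing would return `raw` instead.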
Doc says:
xml.etree.ElementTree.tostring(element, encoding="us-ascii", method="xml", *, short_empty_elements=True)
"Generates a string representation of an XML element, including all subelements."
-> so it should be recursive too, might need playing with parameters.
* short_empty_elements was added in Python 3.4
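The parameter that matters here is probably `method`. With the default `method="xml"` the result still contains the tags, while `method="text"` returns only the text of the element and its subelements. A small sketch on made-up input (the `<fact>` element is a stand-in, not taken from the filing):

```python
import xml.etree.ElementTree as ET

elem = ET.fromstring("<fact><b>March 31</b>, 2021</fact>")

# Default serialization keeps the markup.
as_markup = ET.tostring(elem, encoding="unicode")

# method="text" drops the tags and keeps text and tails of all subelements.
as_text = ET.tostring(elem, encoding="unicode", method="text")

print(as_markup)
print(as_text)
```

One caveat: `method="text"` also appends the tail of the element itself, so text that follows the closing tag of the fact inside its parent would be included as well.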
Guys, does the parser still work for you with iXBRL URLs?
I am using py-2.0.7, and
https://www.sec.gov/Archives/edgar/data/1365135/000155837021005716/wu-20210331x10q_htm.xml (XBRL) works.
https://www.sec.gov/ix?doc=/Archives/edgar/data/1365135/000155837021005716/wu-20210331x10q.htm (iXBRL viewer URL, with the ? and = characters) doesn't work at all:
PermissionError: [Errno 1] Operation not permitted
And https://www.sec.gov/Archives/edgar/data/1365135/000155837021005716/wu-20210331x10q.htm (clean iXBRL URL) does not work either:
ParseError: not well-formed (invalid token): line 9, column 1106
It worked for you at least partially (without DocumentPeriodEndDate), so I wonder what the reason is.