Ofx file format parser for Python

Home Page: http://sites.google.com/site/ofxparse/

License: MIT License

ofxparse's Introduction

ofxparse

ofxparse is a parser for Open Financial Exchange (.ofx) format files. OFX files are available from almost any online banking site, so they work well if you want to pull together your finances from multiple sources. Online trading accounts also provide account statements in OFX files.

There are three different types of OFX files, called BankAccount, CreditAccount and InvestmentAccount files. This library has been tested with real-world samples of all three types. If you find a file that does not work with this library, please consider contributing the file so ofxparse can be improved. See the Help! section below for directions on how to do this.

Example Usage

Here's a sample program

import codecs

from ofxparse import OfxParser

with codecs.open('file.ofx') as fileobj:
    ofx = OfxParser.parse(fileobj)

# The OFX object

ofx.account               # An Account object

# AccountType
# (Unknown, Bank, CreditCard, Investment)

# Account

account = ofx.account
account.account_id        # The account number
account.number            # The account number (deprecated -- returns account_id)
account.routing_number    # The bank routing number
account.branch_id         # Transit ID / branch number
account.type              # An AccountType object
account.statement         # A Statement object
account.institution       # An Institution object

# InvestmentAccount(Account)

account.brokerid          # Investment broker ID
account.statement         # An InvestmentStatement object

# Institution

institution = account.institution
institution.organization
institution.fid

# Statement

statement = account.statement
statement.start_date          # The start date of the transactions
statement.end_date            # The end date of the transactions
statement.balance             # The money in the account as of the statement date
statement.available_balance   # The money available from the account as of the statement date
statement.transactions        # A list of Transaction objects

# InvestmentStatement

statement = account.statement
statement.positions           # A list of Position objects
statement.transactions        # A list of InvestmentTransaction objects

# Transaction

for transaction in statement.transactions:
  transaction.payee
  transaction.type
  transaction.date
  transaction.user_date
  transaction.amount
  transaction.id
  transaction.memo
  transaction.sic
  transaction.mcc
  transaction.checknum

# InvestmentTransaction

for transaction in statement.transactions:
  transaction.type
  transaction.tradeDate
  transaction.settleDate
  transaction.memo
  transaction.security      # A Security object
  transaction.income_type
  transaction.units
  transaction.unit_price
  transaction.commission
  transaction.fees
  transaction.total
  transaction.tferaction

# Positions

for position in statement.positions:
  position.security       # A Security object
  position.units
  position.unit_price
  position.market_value

# Security

security = transaction.security
# or
security = position.security
security.uniqueid
security.name
security.ticker
security.memo

Help!

Sample .ofx and .qfx files are very useful. If you want to help us out, please edit all identifying information from the file and then email it to jseutter dot ofxparse at gmail dot com.

Development

Prerequisites:

# Ubuntu
sudo apt-get install python-beautifulsoup python-nose python-coverage-test-runner

# Python 3 (pip)
pip install BeautifulSoup4 six lxml nose coverage

# Python 2 (pip)
pip install BeautifulSoup six nose coverage

The six package is required for Python 2.x compatibility.

Tests: Simply running the nosetests command should run the tests.

nosetests

If you don't have nose installed, the following might also work:

python -m unittest tests.test_parse

Test Coverage Report:

coverage run -m unittest tests.test_parse

# text report
coverage report

# html report
coverage html
firefox htmlcov/index.html

Homepage

Homepage: https://sites.google.com/site/ofxparse

Source: https://github.com/jseutter/ofxparse

License

ofxparse is released under an MIT license. See the LICENSE file for the actual license text. The basic idea is that if you can use Python to do what you are doing, you can also use this library.

ofxparse's Issues

ofxparse shows up as ofx_parse, not ofxparse

There are some leftover parts from when the package was named ofx_parse. Clean up.

Error parsing TRNAMT (should also accept commas)

Hello,

I have an OFX file generated by a Brazilian bank (Santander) - here we use "," as the decimal separator and the OFX have mixed "," and "." as decimal separators on the TRNAMT values (probably this difference occurs based on which integration they have, like the debit card or ATM systems).

Here is a sample:

<STMTTRN>
    <TRNTYPE>OTHER
    <DTPOSTED>20150706000000[-3:GMT]
    <TRNAMT>            -12.80
    <FITID>00451065
    <CHECKNUM>00451065
    <PAYEEID>0
    <MEMO>DEBITO VISA ELECTRON BRASIL        05/07 CAFE ALBRECHT L
</STMTTRN>
<STMTTRN>
    <TRNTYPE>OTHER
    <DTPOSTED>20150706000000[-3:GMT]
    <TRNAMT>           -200,00
    <FITID>00239183
    <CHECKNUM>00239183
    <PAYEEID>0
    <MEMO>SAQUE NO BANCO 24 HORAS
</STMTTRN>

I thought it was an error on the file (so Santander should fix it), but looking into the OFX specification I found that it's actually supported:

3.2.5 Transaction Amount <TRNAMT>
Format: Amount
Open Financial Exchange uses <TRNAMT> in any request or response that reports the total amount of an individual transaction.
Usage: Bank statement download, investment statement download, payments

And:

Amount: Amounts that do not represent whole numbers (for example, 540.32), must include a decimal point or comma to indicate the start of the fractional amount. Amounts should not include any punctuation separating thousands, millions, and so forth. The maximum value accepted depends on the client.

The fix needed is pretty simple -- just change the function that tries to parse the transaction amount on https://github.com/jseutter/ofxparse/blob/master/ofxparse/ofxparse.py#L824

Note: I had the same error on the ofxtools library.
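A minimal sketch of the kind of normalization this issue asks for, not the library's actual code. `parse_amount` is a hypothetical helper: it strips the padding, treats a comma as the decimal separator, and hands the result to `decimal.Decimal`. It assumes, as the quoted spec text says, that amounts never contain thousands separators.

```python
from decimal import Decimal, InvalidOperation


def parse_amount(raw):
    """Parse an OFX amount that uses either '.' or ',' as the decimal mark.

    Hypothetical helper for illustration; not part of ofxparse's API.
    """
    text = raw.strip()
    # The spec forbids thousands separators, so any comma must be the
    # decimal mark and can be swapped for a point.
    normalized = text.replace(",", ".")
    try:
        return Decimal(normalized)
    except InvalidOperation:
        raise ValueError("Invalid transaction amount: %r" % raw)


print(parse_amount("            -12.80"))
print(parse_amount("           -200,00"))
```

Returning a `Decimal` rather than a `float` keeps the cents exact, which matters when amounts are later summed or compared.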

[easy] PEP8 code style cleanup

This is purely a pedantic issue and not a bug, but the flake8 report on this code is not so great. I would like to clean it up so that it is easier to find bugs (e.g. so that the only output left is the actual bug).

travis.yml should not specify BeautifulSoup for Python 2.7

travis.yml specifies installing BeautifulSoup rather than BeautifulSoup4 for Python 2.*

BeautifulSoup4 supports Python 2.7 but no longer supports earlier versions.

LICENSE and tests aren't included in tarballs

You can probably just add a MANIFEST.in like this:

include LICENSE AUTHORS
recursive-include tests *.py *.ofx

[medium] Application script: Convert an OFX file to a .json file

I think it would be nice to include some scripts that do useful things, so users don't have to keep doing the same thing over and over.
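One possible shape for such a script, as a sketch only. `transactions_to_json` is a hypothetical function that accepts any objects exposing the transaction attributes listed in the README (`id`, `type`, `date`, `amount`, `payee`, `memo`); a `SimpleNamespace` stands in for a parsed transaction so the example runs without an OFX file.

```python
import datetime
import json
from decimal import Decimal
from types import SimpleNamespace

# Attribute names taken from the README's Transaction listing.
FIELDS = ("id", "type", "date", "amount", "payee", "memo")


def _encode(value):
    """Turn values the json module cannot serialize into strings."""
    if isinstance(value, (datetime.date, datetime.datetime)):
        return value.isoformat()
    if isinstance(value, Decimal):
        return str(value)
    return value


def transactions_to_json(transactions):
    """Serialize transaction-like objects to a JSON array (hypothetical)."""
    rows = [
        {name: _encode(getattr(txn, name, None)) for name in FIELDS}
        for txn in transactions
    ]
    return json.dumps(rows, indent=2, sort_keys=True)


# Stand-in for one entry of statement.transactions.
sample = SimpleNamespace(
    id="00451065", type="other", date=datetime.datetime(2015, 7, 6),
    amount=Decimal("-12.80"), payee="CAFE ALBRECHT", memo="")
print(transactions_to_json([sample]))
```

Amounts are written as strings so that no precision is lost on the way through JSON.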

New release?

Hi,

The latest released version of ofxparse is 0.14 (https://pypi.python.org/pypi/ofxparse), from 2013.

The latest version of ledger-autosync (gitlab.com/egh/ledger-autosync) uses some unreleased features in ofxparse (namely, #85) for improved support of investment transactions.

It would be great if there could be a new version of ofxparse released so that users installing ledger-autosync via pip or as a Debian/Ubuntu package could use this feature.

Thanks for all your work on this project.

HELP NEEDED: Build infrastructure

Ofxparse could benefit from having some infrastructure to do things like run tests and create releases. This issue is for discussing what tools are available and the merits of each. My own knowledge is not up to date.

Pretty much needed features:

I would like a tool that gives more immediate feedback to people when they create PRs, as the code is still fresh in their minds. I would like it to run the tests as well as flake8.
The tool should be a service where I don't have to keep a server running. I have set up servers for this in the past, but keeping them running has been more work than I was willing to commit.

Some nice to haves, but not critical:

The solution would run on someone else's infrastructure, as maintaining this by myself has been painful to the point where I don't maintain it.
Be able to create releases for both python 3 and python 2.7. This doesn't have to be automatic
Generate test coverage stats

What kind of solutions appear to be viable?

Investment transactions don't have a type

There's no way to distinguish between a buy and a sell, for example.

Special char breaks parsing ?

Hi,

I use the OFX format to deal with financial data. I found this library to automate parsing, so this is my first use of the package.

I try

>>> import os
>>> from ofxparse import OfxParser
>>> file = open(os.path.expanduser('~/20131107.ofx'))
>>> ofx = OfxParser.parse(file)

but I have this traceback

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.3/site-packages/ofxparse/ofxparse.py", line 330, in parse
    ofx_file = OfxPreprocessedFile(file_handle)
  File "/usr/lib/python3.3/site-packages/ofxparse/ofxparse.py", line 141, in __init__
    super(OfxPreprocessedFile,self).__init__(fh)
  File "/usr/lib/python3.3/site-packages/ofxparse/ofxparse.py", line 71, in __init__
    self.read_headers()
  File "/usr/lib/python3.3/site-packages/ofxparse/ofxparse.py", line 77, in read_headers
    head_data = head_data[:head_data.find(six.b('<'))]
TypeError: Can't convert 'bytes' object to str implicitly

Has anybody else ever had this bug?

Regards,
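The traceback suggests that the header is searched with a bytes pattern while the file was opened in text mode. The sketch below reproduces that mismatch with in-memory streams. `read_header` is a hypothetical stand-in written for this illustration, not the library's `read_headers`.

```python
import io


def read_header(fh):
    """Return the bytes that precede the first '<' in an OFX stream.

    Hypothetical stand-in for illustration; it searches with a bytes
    pattern, so the stream must yield bytes.
    """
    data = fh.read()
    end = data.find(b"<")  # raises TypeError when data is a str
    return data if end == -1 else data[:end]


raw = b"OFXHEADER:100\r\nDATA:OFXSGML\r\n\r\n<OFX></OFX>"

# Binary stream, comparable to open(path, 'rb'): works.
print(read_header(io.BytesIO(raw)))

# Text stream, comparable to open(path): mixing str and bytes fails.
try:
    read_header(io.StringIO(raw.decode("ascii")))
except TypeError as exc:
    print("text mode fails:", exc)
```

Under that reading, opening the file with `open(path, 'rb')` would be the first thing to try.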

Can not add own field in Transaction Object

I want to add one more field, 'refnum', to the Transaction object. It would mostly be needed to store and parse the reference number of each transaction.

I have tried to add it myself but failed.

Does anyone have any idea how to do this?

[hard] Document how to make a release

Signons incomplete

Would love support for the missing language, dates, fi, and intu.bid fields.

<SIGNONMSGSRSV1>
  <SONRS>
    <STATUS>
      <CODE>0
      <SEVERITY>INFO
    </STATUS>
    <DTSERVER>20140416194035.208
    <LANGUAGE>ENG
    <DTPROFUP>20050531050000.000
    <FI>
      <ORG>U.S. Bank
      <FID>1402
    </FI>
    <INTU.BID>1402
  </SONRS>
</SIGNONMSGSRSV1>
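To show which values are being requested, here is a rough sketch that pulls the leaf fields out of the sign-on block with a regular expression. `signon_fields` is a hypothetical helper, not how ofxparse parses sign-ons, and it assumes OFX 1.x SGML where each leaf value follows its opening tag on the same line.

```python
import re

SIGNON = """
<SONRS>
  <STATUS>
    <CODE>0
    <SEVERITY>INFO
  </STATUS>
  <DTSERVER>20140416194035.208
  <LANGUAGE>ENG
  <DTPROFUP>20050531050000.000
  <FI>
    <ORG>U.S. Bank
    <FID>1402
  </FI>
  <INTU.BID>1402
</SONRS>
"""

# A leaf element is an opening tag followed by text on the same line;
# container tags such as <STATUS> are followed directly by a newline.
LEAF = re.compile(r"<([A-Za-z0-9_.]+)>([^<\r\n]+)")


def signon_fields(sgml):
    """Map lower-cased leaf tag names to their stripped text values."""
    return {name.lower(): value.strip() for name, value in LEAF.findall(sgml)}


fields = signon_fields(SIGNON)
for key in ("language", "dtprofup", "org", "fid", "intu.bid"):
    print(key, "=", fields[key])
```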

[medium] Use coveralls.io to generate test coverage stats

Some pull requests come in without tests, like pull #54, which are easily broken later. Test coverage stats would be a huge help to identifying which parts of the code are lacking tests, so this situation can be rectified.

http://wiki.python.org/moin/CodeCoverage is a good place to start looking. I believe Coverage and Figleaf were both good libraries and Figleaf builds on top of Coverage.

Things needed:

When a special command is used, the tests are run and stats are gathered.
A coverage report, can be html or plaintext, can be generated from the stats
A list of packages that this functionality depends on
Update the README file with this information

Use OpenJade to improve parsing/remove the need to BeautifulSoup?

Got a tip from a user that OpenJade could be used to upgrade the OFX file to XML, then standard Python XML tools could parse the OFX file. Sounds really promising, look into it!

http://openjade.sourceforge.net/

Pros:

Remove BeautifulSoup dependency
Would enable this library to run on Python 3
Probably less effort than a from-scratch OFX format parser

Cons:

Adds a dependency on a C++ program. Creates cross-platform issues

Pypi is 0.5 but github is still 0.4

ofxparse/__init__.py is still at 0.4 but pypi has a version of 0.5.

The date on pypi is prior to a whole slew of commits in github so I'm kind of confused.

Thanks!

Traceback parsing Investment file

From email:

PFA a sample file for investment account. When I try to parse the file I get the following error :

Traceback (most recent call last):
  File "", line 1, in
  File "ofxparse.py", line 140, in parse
    ofx_obj.account = cls_.parseInvstmtrs(invstmtrs_ofx)
  File "ofxparse.py", line 163, in parseInvstmtrs
    account = InvestmentAccount()
  File "ofxparse.py", line 79, in __init__
    super(InvestmentAccount, self).__init__(self)
TypeError: __init__() takes exactly 1 argument (2 given)
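The last frame points at the cause: `super(...)` already returns a bound object, so passing `self` again supplies one argument too many. A minimal sketch of the corrected call follows, using simplified stand-ins for the two classes (the attribute names come from the README; everything else is assumed).

```python
class Account(object):
    """Simplified stand-in for the library's Account class."""

    def __init__(self):
        self.statement = None


class InvestmentAccount(Account):
    """Simplified stand-in for the library's InvestmentAccount class."""

    def __init__(self):
        # super(...) binds self already. Writing ".__init__(self)" here
        # would pass two arguments to a one-argument method and raise
        # the TypeError shown in the traceback.
        super(InvestmentAccount, self).__init__()
        self.brokerid = ""


account = InvestmentAccount()
print(account.brokerid == "", account.statement is None)
```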

Add hook for travis-ci

You can set up travis-ci by following these directions: https://travis-ci.org/getting_started

Basically, just need to sign in to travis-ci with github, give them some permissions over your repositories, and then flip a switch over there.

BeautifulSoup with "xml" gets ofx file ("The ofx file is empty!")

This is similar to issue #76

In ofxparse.py,

def soup_maker(fh):
    skip_headers(fh)
    try:
        from bs4 import BeautifulSoup
        soup = BeautifulSoup(fh, "xml")
        for tag in soup.findAll():
            tag.name = tag.name.lower()
    except ImportError:
        from BeautifulSoup import BeautifulStoneSoup
        soup = BeautifulStoneSoup(fh)
    return soup

"xml" parser returns "The ofx file is empty!".
If I choose "lxml" instead of "xml", the tests fail for suncorp.ofx. This is because suncorp.ofx has CDATA which cannot be parsed correctly by lxml in current condition.

If I choose "html.parser", everything works as expected.

I have tested the above behavior on Mac OS X with Python 2.7.

Time for a new release ?

Hi,
I wish to parse the ledger balance date in production, however this info is extracted from OFX files only since 8331cc7 (which was committed only 2 months after 0.14 release, such bad luck).

Do you plan a new release anytime soon ?

And thanks for your work on ofxparse, it's very useful.

Dies on attempt to parse ofx 2.11 documents with <?xml declaration

I suspect this is beautiful soup being rubbish.

Here is my version of beautiful soup

python -c 'import BeautifulSoup; print BeautifulSoup.__version__'
3.2.1

Here's a patch that "fixes" the issue, but I don't fully
understand what's going on. It also illustrates that
beautiful soup is getting fed the whole document.

--- a/ofxparse/ofxparse.py
+++ b/ofxparse/ofxparse.py
@@ -191,8 +191,12 @@ class OfxPreprocessedFile(OfxFile):
                 tag_name = re.findall(r'(?i)<([a-z0-9_\.]+)>', token)[0]
                 if tag_name.upper() not in closing_tags:
                     last_open_tag = tag_name
-            new_fh.write(token)
+
+            if not is_processing_tag:
+                new_fh.write(token)
+
         new_fh.seek(0)
+        print new_fh.getvalue()
         self.fh = new_fh

Here is a sanitized document that exhibits the behaviour

<?xml version="1.0" encoding="US-ASCII"?>
<?OFX OFXHEADER="200" VERSION="200" SECURITY="NONE" OLDFILEUID="NONE" NEWFILEUID="NONE"?>
<!-- Converted from: QIF -->
<!-- Date format was: DD/MM/YY -->
<OFX>
  <SIGNONMSGSRSV1>
    <SONRS>
      <STATUS>
        <CODE>0</CODE>
        <SEVERITY>INFO</SEVERITY>
        <MESSAGE>SUCCESS</MESSAGE>
      </STATUS>
      <DTSERVER>20151230</DTSERVER>
      <LANGUAGE>ENG</LANGUAGE>
      <FI>
        <ORG>UNKNOWN</ORG>
        <FID>UNKNOWN</FID>
      </FI>
    </SONRS>
  </SIGNONMSGSRSV1>
  <CREDITCARDMSGSRSV1>
    <CCSTMTTRNRS>
      <TRNUID>0</TRNUID>
      <STATUS>
        <CODE>0</CODE>
        <SEVERITY>INFO</SEVERITY>
        <MESSAGE>SUCCESS</MESSAGE>
      </STATUS>
      <CCSTMTRS>
        <CURDEF>USD</CURDEF>
        <CCACCTFROM>
          <ACCTID>UNKNOWN</ACCTID>
        </CCACCTFROM>
        <BANKTRANLIST>
          <DTSTART>20151203</DTSTART>
          <DTEND>20151230</DTEND>
          <STMTTRN>
            <TRNTYPE>DEBIT</TRNTYPE>
            <DTPOSTED>20151230</DTPOSTED>
            <TRNAMT>-3.45</TRNAMT>
            <FITID>UNKNOWN-CREDITCARD-20151230-3--3.45</FITID>
            <NAME>TESCO-STORES 2610</NAME>
          </STMTTRN>
        </BANKTRANLIST>
        <LEDGERBAL>
          <BALAMT>UNKNOWN</BALAMT>
          <DTASOF>20151230</DTASOF>
        </LEDGERBAL>
        <AVAILBAL>
          <BALAMT>UNKNOWN</BALAMT>
          <DTASOF>20151230</DTASOF>
        </AVAILBAL>
      </CCSTMTRS>
    </CCSTMTTRNRS>
  </CREDITCARDMSGSRSV1>
</OFX>

Currency on transactions

I needed to parse CURRENCY tags as well (#100), so I've modified the transaction parser to retrieve transaction currencies other than CURDEF. mralbu/ofxparse@591b6fe
I've added an ofx file to fixtures and a test function for it as well.
Do you think this solution is acceptable? Would you accept it as a pull request?

pip install doesn't work

$ pip install ofxparse==0.6
Downloading/unpacking ofxparse==0.6
  Downloading ofxparse-0.6.tar.gz
  Running setup.py egg_info for package ofxparse
    Traceback (most recent call last):
      File "<string>", line 14, in <module>
      File "/home/matt/vpy/ofx2/build/ofxparse/setup.py", line 3, in <module>
        import ofxparse
      File "ofxparse/__init__.py", line 1, in <module>
        from ofxparse import OfxParser, AccountType, Account, Statement, Transaction
      File "ofxparse/ofxparse.py", line 1, in <module>
        from BeautifulSoup import BeautifulStoneSoup
    ImportError: No module named BeautifulSoup
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):

  File "<string>", line 14, in <module>

  File "/home/matt/vpy/ofx2/build/ofxparse/setup.py", line 3, in <module>

    import ofxparse

  File "ofxparse/__init__.py", line 1, in <module>

    from ofxparse import OfxParser, AccountType, Account, Statement, Transaction

  File "ofxparse/ofxparse.py", line 1, in <module>

    from BeautifulSoup import BeautifulStoneSoup

ImportError: No module named BeautifulSoup

----------------------------------------
Command python setup.py egg_info failed with error code 1 in /home/matt/vpy/ofx2/build/ofxparse
Storing complete log in /home/matt/.pip/pip.log

This was in a freshly-made virtualenv. If I get a chance, I'll submit a patch.

BRANCHID support

Is there support for the BRANCHID tag? (inside BANKACCTFROM)

I couldn't find it anywhere, either inspecting the parse result or looking at the sources. Am I missing something? Ended up using this workaround:

class MyOfxParser(OfxParser):
    @classmethod
    def parseStmtrs(cls_, stmtrs_list, accountType):
        ret = OfxParser.parseStmtrs(stmtrs_list, accountType)
        for account, stmtrs_ofx in zip(ret, stmtrs_list):
            branchid_tag = stmtrs_ofx.find('branchid')
            if hasattr(branchid_tag, 'contents'):
                account.branch_id = branchid_tag.contents[0].strip()
        return ret

Thanks in advance

FutureWarning on Python 3.5

Running the tests on Python 3.5 results in this warning:

$ python setup.py test 2>&1 | grep Future
testForFourAccounts (tests.test_parse.TestAccountInfoAggregation) ... /Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/re.py:203: FutureWarning: split() requires a non-empty pattern match.

The issue is almost certainly at ofxparse.py:103 where re.split(six.b('\r?\n?'), head_data) is invoked.
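The pattern `\r?\n?` can match the empty string because both parts are optional, which is what the warning objects to. One possible replacement, sketched below, uses alternatives that each consume at least one byte. `split_header_lines` is a hypothetical name, not a function in ofxparse.

```python
import re


def split_header_lines(head_data):
    """Split raw header bytes into non-empty lines (hypothetical helper)."""
    # Each alternative consumes at least one byte, so the pattern can
    # never match the empty string.
    lines = re.split(rb"\r\n|\r|\n", head_data)
    # A trailing line ending leaves an empty last item; drop empties.
    return [line for line in lines if line]


header = b"OFXHEADER:100\r\nDATA:OFXSGML\r\nVERSION:102\r\n"
print(split_header_lines(header))
```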

No currency on transactions

Hi Guys,

I have stumbled upon some credit card statements with international purchases. They contain currency tags (CURRATE inside CURRENCY), but the parser doesn't look at those, nor do the Transaction objects have those attributes. I will be happy to send a pull request if you point me in the right direction on how to do this.

Ideally, it would be nice to have a way to return the converted value instead of the original one.

README file() example fails in py3

Since the file() function is gone in Python3, the example in the README doesn't work. Neither does the obvious substitution of open(), as at least some qfx files have binary characters (i.e., I had to open(file.qfx, 'rb'), and then it worked). In general, the README could use some work so that it can be understood by non-expert programmers. I may take a crack at it and offer a pull later, but I wanted to get this in now, in case someone had already done it and hadn't done a pull.

--jh--

[medium] Application script: Convert an OFX file to a CSV file

I think it would be nice to include some scripts that do useful things, so users don't have to keep doing the same thing over and over.
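A sketch of what the core of such a script might look like, using only the standard library. `write_transactions_csv` is a hypothetical function; the column names are the transaction attributes from the README, and a `SimpleNamespace` stands in for a parsed transaction so the example runs on its own.

```python
import csv
import datetime
import io
from decimal import Decimal
from types import SimpleNamespace

# Attribute names taken from the README's Transaction listing.
COLUMNS = ("date", "amount", "payee", "memo", "id")


def write_transactions_csv(transactions, out):
    """Write one CSV row per transaction-like object (hypothetical)."""
    writer = csv.writer(out)
    writer.writerow(COLUMNS)
    for txn in transactions:
        writer.writerow([getattr(txn, name, "") for name in COLUMNS])


# Stand-in for one entry of statement.transactions.
sample = SimpleNamespace(
    date=datetime.date(2015, 7, 6), amount=Decimal("-200.00"),
    payee="SAQUE NO BANCO 24 HORAS", memo="", id="00239183")

buf = io.StringIO()
write_transactions_csv([sample], buf)
print(buf.getvalue())
```

In a real script, `out` would be a file opened with `newline=""`, as the `csv` module documentation recommends.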

Beautiful soup with lxml gets ofx file ("The ofx file is empty!")

Tested on OS X 10.10 and Ubuntu 12.04.

I see that we switched to lxml in bd544ba, but even the test introduced in that merge fails for me. Looking deeper, beautiful soup is returning an empty document.

ofxparse module should have a version number attribute

This is something I look for when I have a module installed. ofxparse should support this as well.

Make code PEP 8 compliant

It would be nice to have the code match the PEP 8 style guide. This would mean a version 1.0 release, so it may or may not be feasible.

Empty tags

My bank exports files with empty tags here and there, for example:
<FI><ORG/><FID/></FI>

The empty tag syntax is valid XML but the parser doesn't like it. I can remove empty tags of course, but then I have to process all my bank statements. It would be much better if ofxparse sees the empty tag format and ignores the tags.
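Until the parser handles this, a preprocessing step along these lines might serve as a workaround. It is only a sketch: `strip_empty_tags` is a hypothetical helper, and it assumes the empty elements carry no attributes.

```python
import re

# Matches self-closing elements such as <ORG/> or <FID />.
EMPTY_TAG = re.compile(r"<[A-Za-z0-9_.]+\s*/>")


def strip_empty_tags(text):
    """Remove self-closing empty elements from OFX text (hypothetical)."""
    return EMPTY_TAG.sub("", text)


print(strip_empty_tags("<FI><ORG/><FID/></FI>"))
```

The cleaned text could then be wrapped in an in-memory file object and passed to the parser, so the bank statements on disk stay untouched.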

Beautiful Soup 4 no longer supports Python versions < 2.7

Current releases of Beautiful Soup 4 have dropped support for Python 2.6 and earlier. Since this is a major dependency, suggest changing ofxparse to match.

Flawed tests in test_parse.py?

I've written an extension to Beautiful Soup 4 (bs4) which I believe is much more robust at handling ofx formatted files than any of the various options currently available in Beautiful Soup 4. When I test it in ofxparse using nosetests, it passes all tests except testThatParseStmtrsReturnsAnAccount and testThatReturnedAccountAlsoHasAStatement.

Both contain the line:

account = OfxParser.parseStmtrs(stmtrs.find('stmtrs'), AccountType.Bank)[0]

In looking at the definition of parseStmtrs, it appears the first argument (stmtrs_list) should be a list of stmtrs Tags rather than the single Tag returned by stmtrs.find('stmtrs'). When I change the line to:

account = OfxParser.parseStmtrs(stmtrs.find_all('stmtrs'), AccountType.Bank)[0],

the tests work fine. I believe this is because stmtrs.find_all('stmtrs') returns a list of stmtrs Tags, as expected by parseStmtrs.

Is this a case of a malformed test lurking in the repository, or is it simply my thoughts that are malformed?

Document how to run unit tests.

Multiple account

Hi,

I may be mistaken, but my bank lets me select multiple accounts when downloading the OFX file.

The result contains several STMTRS elements.

It would be nice if ofxparse supported this :)

Thanks

[medium] Parse dates in %d%m%y format

Parsing the HSBC Brasil OFX file fails because the BANKTRANLIST :: DTSTART tag is in %d%m%y format.

The error happens on ofxparse.py line 396. I fixed locally by changing to this code:

        try:
            return datetime.datetime.strptime(
                ofxDateTime[:8], '%Y%m%d') - timeZoneOffset
        except ValueError:
            return datetime.datetime.strptime(
                ofxDateTime[:6], '%d%m%y') - timeZoneOffset

I don't know if this is the right fix, but I wanted to share the problem.

Thanks,

Diogo.
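A self-contained variant of the same idea, offered as a sketch under assumptions. It picks the format from the number of leading digits instead of relying on an exception, and it assumes the short form is a bare six-digit value. `parse_ofx_date` is a hypothetical name, and the time and time-zone suffix are ignored here.

```python
import datetime
import re


def parse_ofx_date(value):
    """Parse YYYYMMDD... or a bare DDMMYY value (hypothetical helper)."""
    # Keep only the leading run of digits, dropping any '[-3:GMT]' part.
    digits = re.match(r"\d*", value.strip()).group(0)
    if len(digits) == 6:
        # Assumed short form: day, month, two-digit year.
        return datetime.datetime.strptime(digits, "%d%m%y")
    # Standard OFX form: the first eight digits are YYYYMMDD.
    return datetime.datetime.strptime(digits[:8], "%Y%m%d")


print(parse_ofx_date("20150706000000[-3:GMT]"))
print(parse_ofx_date("060715"))
```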

BeautifulSoup warning

Recent versions of BeautifulSoup issue this warning:

c:\program files\python 3.5\lib\site-packages\bs4\__init__.py:166: UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("html.parser"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

To get rid of this warning, change this:

 BeautifulSoup([your markup])

to this:

 BeautifulSoup([your markup], "html.parser")

  markup_type=markup_type))

OfxParserException : Invalid Transaction Amount

When you want to parse an OFX file from a Belgian bank (e.g. FORTIS),
you have an 'Invalid Transaction Amount' exception from Ofx parser.

The transaction amounts are represented with a comma as decimal point.

Client Traceback (most recent call last):
File "/media/data/openerp/server/openerp/addons/web/http.py", line 204, in dispatch
response["result"] = method(self, *_self.params)
File "/media/data/openerp/server/openerp/addons/web/controllers/main.py", line 1129, in call_button
action = self._call_kw(req, model, method, args, {})
File "/media/data/openerp/server/openerp/addons/web/controllers/main.py", line 1117, in _call_kw
return getattr(req.session.model(model), method)(_args, **kwargs)
File "/media/data/openerp/server/openerp/addons/web/session.py", line 42, in proxy
result = self.proxy.execute_kw(self.session._db, self.session._uid, self.session._password, self.model, method, args, kw)
File "/media/data/openerp/server/openerp/addons/web/session.py", line 30, in proxy_method
result = self.session.send(self.service_name, method, *args)
File "/media/data/openerp/server/openerp/addons/web/session.py", line 103, in send
raise xmlrpclib.Fault(openerp.tools.ustr(e), formatted_info)

Server Traceback (most recent call last):
File "/media/data/openerp/server/openerp/addons/web/session.py", line 89, in send
return openerp.netsvc.dispatch_rpc(service_name, method, args)
File "/media/data/openerp/server/openerp/netsvc.py", line 292, in dispatch_rpc
result = ExportService.getService(service_name).dispatch(method, params)
File "/media/data/openerp/server/openerp/service/web_services.py", line 626, in dispatch
res = fn(db, uid, _params)
File "/media/data/openerp/server/openerp/osv/osv.py", line 190, in execute_kw
return self.execute(db, uid, obj, method, *args, *_kw or {})
File "/media/data/openerp/server/openerp/osv/osv.py", line 132, in wrapper
return f(self, dbname, _args, *_kwargs)
File "/media/data/openerp/server/openerp/osv/osv.py", line 199, in execute
res = self.execute_cr(cr, uid, obj, method, _args, *_kw)
File "/media/data/openerp/server/openerp/osv/osv.py", line 187, in execute_cr
return getattr(object, method)(cr, uid, _args, *_kw)
File "/media/data/openerp/addons/account_statement_base_import/wizard/import_statement.py", line 108, in import_statement
context=context
File "/media/data/openerp/addons/account_statement_base_import/statement.py", line 150, in statement_import
result_row_list = parser.parse(file_stream)
File "/media/data/openerp/addons/account_statement_base_import/parser/parser.py", line 148, in parse
self._parse(args, *kwargs)
File "/media/data/openerp/addons/account_statement_ofx_import/parser/ofx_parser.py", line 69, in parse
ofx = ofxparse.OfxParser.parse(file(ofx_file.name))
File "/usr/local/lib/python2.7/dist-packages/ofxparse-0.14-py2.7.egg/ofxparse/ofxparse.py", line 345, in parse
ofx_obj.accounts += cls.parseStmtrs(stmtrs_ofx, AccountType.Bank)
File "/usr/local/lib/python2.7/dist-packages/ofxparse-0.14-py2.7.egg/ofxparse/ofxparse.py", line 646, in parseStmtrs
account.statement = cls.parseStatement(stmtrs_ofx)
File "/usr/local/lib/python2.7/dist-packages/ofxparse-0.14-py2.7.egg/ofxparse/ofxparse.py", line 736, in parseStatement
cls.parseTransaction(transaction_ofx))
File "/usr/local/lib/python2.7/dist-packages/ofxparse-0.14-py2.7.egg/ofxparse/ofxparse.py", line 792, in parseTransaction
six.u("Invalid Transaction Amount: '%s'") % amt_tag.contents[0])
OfxParserException: Invalid Transaction Amount: '-72,00'

Can't handle ofx 2.11 documents with <?xml declarations

The parser complains about the document being empty.

This was true in 9d7b66e41ffb7857f94c4d691c67bf38ae03538f

I suspect that this is beautiful soup being rubbish.
My experiences with beautiful soup have in general
not been very positive. Is there an equivalent
of lxml.etree.HTML for buggy xml?
(Sorry about the F.U.D :/ - I'm too lazy to back
up my statements with facts)

For reference my beautiful soup version is 3.2.1

Anyway, the following patch seemed to fix the issue
but I don't fully understand what's going on, and
I've reached my quota of yak shaving for today...

--- a/ofxparse/ofxparse.py
+++ b/ofxparse/ofxparse.py
@@ -191,8 +191,12 @@ class OfxPreprocessedFile(OfxFile):
                 tag_name = re.findall(r'(?i)<([a-z0-9_\.]+)>', token)[0]
                 if tag_name.upper() not in closing_tags:
                     last_open_tag = tag_name
-            new_fh.write(token)
+
+            if not is_processing_tag:
+                new_fh.write(token)
+
         new_fh.seek(0)
+        # Without the is_processing_tag, this shows
+        # that the *full* document is fed into BeautifulSoup
+        print new_fh.getvalue()
         self.fh = new_fh

Here's a sanitized document that consistently exhibits the bug

<?xml version="1.0" encoding="US-ASCII"?>
<?OFX OFXHEADER="200" VERSION="200" SECURITY="NONE" OLDFILEUID="NONE" NEWFILEUID="NONE"?>
<!-- Converted from: QIF -->
<!-- Date format was: DD/MM/YY -->
<OFX>
  <SIGNONMSGSRSV1>
    <SONRS>
      <STATUS>
        <CODE>0</CODE>
        <SEVERITY>INFO</SEVERITY>
        <MESSAGE>SUCCESS</MESSAGE>
      </STATUS>
      <DTSERVER>20151230</DTSERVER>
      <LANGUAGE>ENG</LANGUAGE>
      <FI>
        <ORG>UNKNOWN</ORG>
        <FID>UNKNOWN</FID>
      </FI>
    </SONRS>
  </SIGNONMSGSRSV1>
  <CREDITCARDMSGSRSV1>
    <CCSTMTTRNRS>
      <TRNUID>0</TRNUID>
      <STATUS>
        <CODE>0</CODE>
        <SEVERITY>INFO</SEVERITY>
        <MESSAGE>SUCCESS</MESSAGE>
      </STATUS>
      <CCSTMTRS>
        <CURDEF>USD</CURDEF>
        <CCACCTFROM>
          <ACCTID>UNKNOWN</ACCTID>
        </CCACCTFROM>
        <BANKTRANLIST>
          <DTSTART>20151203</DTSTART>
          <DTEND>20151230</DTEND>
          <STMTTRN>
            <TRNTYPE>DEBIT</TRNTYPE>
            <DTPOSTED>20151230</DTPOSTED>
            <TRNAMT>-3.45</TRNAMT>
            <FITID>UNKNOWN-CREDITCARD-20151230-3--3.45</FITID>
            <NAME>TESCO-STORES 2610</NAME>
          </STMTTRN>
        </BANKTRANLIST>
        <LEDGERBAL>
          <BALAMT>UNKNOWN</BALAMT>
          <DTASOF>20151230</DTASOF>
        </LEDGERBAL>
        <AVAILBAL>
          <BALAMT>UNKNOWN</BALAMT>
          <DTASOF>20151230</DTASOF>
        </AVAILBAL>
      </CCSTMTRS>
    </CCSTMTTRNRS>
  </CREDITCARDMSGSRSV1>
</OFX>

ofxparse and unseen SIC/MCC codes

The ofxparser uses a dictionary of known sic/mcc codes. However, when it encounters an unseen sic code it causes an exception. I think that should fall under the fail_fast argument since it isn't a major issue.
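The behaviour requested here could look roughly like the sketch below. The lookup table is a two-entry illustration and `describe_mcc` is a hypothetical function; only the `fail_fast` idea comes from the issue.

```python
# Tiny illustrative table; a real list of codes is far longer.
KNOWN_MCC = {
    "5411": "Grocery Stores, Supermarkets",
    "5812": "Eating Places, Restaurants",
}


def describe_mcc(code, fail_fast=True):
    """Look up a merchant category code (hypothetical helper).

    Unknown codes raise only when fail_fast is true; otherwise an
    empty description is returned so parsing can continue.
    """
    try:
        return KNOWN_MCC[code]
    except KeyError:
        if fail_fast:
            raise ValueError("Unknown MCC code: %r" % code)
        return ""


print(describe_mcc("5411"))
print(repr(describe_mcc("9999", fail_fast=False)))
```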

[medium] ofxparse wiki pages

Want to have the wiki pages contain documentation on ofxparse.

ofxparse homepage

Want to have a homepage for ofxparse that welcomes people. github appears pretty unfriendly.

[easy] README file does not show attributes of transactions

The readme file does not show attributes of transactions. The transaction.payee thing is confusing for users because it is the "name" field in the ofx file. At least document it in the readme and maybe make it available as a name attribute on the Transaction object as well.

[easy] FITIDs in test fixtures should be unique

Our test fixtures are useful to other projects as a test corpus from several different banking institutions. I got this message from someone:

achiang oh, one more thing, i think some of the anonymization in the test fixtures went a little too far. the FTIDs are supposed to be unique afaict, but they are not, at least not in the vanguard one
achiang check out the fidelity ofx file i supplied as an example
achiang yeah, nothing in ofxparse really cares about those FITIDs, but consumers of the library probably do. check out my https://github.com/achiang/spymark as an example, where i have a test that ensures no duplicate transactions were imported

Apparently the FITIDs in the test files are not unique, which violates the OFX spec. Can someone go through and manually fix files with this issue? (/tests/fixtures/*.ofx)
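A small checker could help find the affected fixtures before fixing them by hand. The sketch below scans raw OFX text for repeated FITID values. `duplicate_fitids` is a hypothetical helper and the sample text is made up for the illustration.

```python
import re
from collections import Counter

# Captures the value after <FITID>, up to the next tag or line end.
FITID = re.compile(r"<FITID>([^<\r\n]+)", re.IGNORECASE)


def duplicate_fitids(ofx_text):
    """Return the sorted FITID values that occur more than once."""
    counts = Counter(value.strip() for value in FITID.findall(ofx_text))
    return sorted(fitid for fitid, seen in counts.items() if seen > 1)


sample = (
    "<STMTTRN><FITID>00451065\n</STMTTRN>\n"
    "<STMTTRN><FITID>00239183\n</STMTTRN>\n"
    "<STMTTRN><FITID>00451065\n</STMTTRN>\n"
)
print(duplicate_fitids(sample))
```

Run over each file in the fixtures directory, a non-empty result would mark a file that needs new identifiers.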

ofxparse relies on BeautifulSoup
BeautifulSoup4 supports both python 2.6+ and 3. This would mean dropping support for python 2.4 and 2.5. It is getting to be a pain to maintain build machines for python 2.4 and 2.5.
I want to release a new version of ofxparse where the method names and stuff are pep8 compliant. This would also be a good time to switch over to bs4 and support python 3.
start with a bs4 branch in git.

The PyPi page looks like crap

It looks like the information page on PyPi is missing line endings or something. Fix.