Personal Finance library for the Computationally Curious.
`pfcompute` is a Python library for computationally exploring personal finance. The library provides a computing environment with data structures, common equations and algorithms, and convenience functions to calculate, analyze, and visualize existing personal finance data. It is not meant to be a full-service financial app or to replace any software used to manage personal finances. It does not collect data. However, once data is provided, it allows far more freedom in processing and analyzing financial data than typical personal finance apps offer.
While this library may be used inside other software, it is intended to be used interactively with software such as Jupyter. This type of environment, along with the library, provides a quick and easy way to explore complex data, relations, and algorithms.
Since `pfcompute` does not aggregate or collect its own data, the data must be provided by some other means (e.g. mint, personalcapital, ynab, yodlee, a bank, etc.). It is left up to the user to obtain this data however they wish. The library assumes three main data sources exist:
- Accounts
- Transactions
- Paychecks
They can really be in any format (`xls(x)`, `csv`, `json`, `pdf`) as long as the user writes some method to interpret the data. The data format and some collection ideas are described below.
The following checklist describes the current (`[x]`) and future (`[ ]`) capabilities of `pfcompute` and what is currently (:pencil:) being worked on:
- Data input
- Import convenience functions
- Overridable data cleaning functions
- Generic paycheck import from folder of `pdf`s (user must write internal parser)
- User defines account & transaction categories
- Generate financial statements
- Balance sheet
- Income statement
- Cashflow statement
- Account Summary
- Net worth analysis
- Net worth calculator
- Growth calculator & analyzer
- Milestone status
- Performance metrics (DTI, Margin/Savings Rate, Income to Net, etc.)
- Account Forecasting
- Autoregressive Integrated Moving Average (ARIMA) modeling
- Autoregressive Integrated Moving Average (ARIMA) forecasting
- Sum of Square Error (SSE) distribution modeling
- :pencil: Monte Carlo forecasting
- Assumption Based Forecasting
- Assumption (savings rate, expense, growth) modeler
- Investment Forecasting
- Index (e.g. Vanguard) based asset data
- Asset allocation correlation
- Asset allocation modeling
- Monte Carlo Asset forecasting
- Visualization
- Time series plotting
- Probability Distribution Function (PDF) plotting
- :pencil: Sankey Money Flow Diagrams
- Report Generation
- Annual
- Monthly
- `html`
- `pdf`
- Example Average Data (Useful for Comparisons)
- Extracted personal finance data from FRED
- :pencil: Create semi-random personal finance models
import pfcompute as pf
View the notebooks to see more detail.
`pfcompute` expects each set of data as a pandas `DataFrame` with the following format:
- Accounts

  DataFrame of account balances with a Date index and (Category, Account Name) multi-level columns:

  |        | Cash          |              | Investment  |            | Credit   | Loan      |
  | ------ | ------------- | ------------ | ----------- | ---------- | -------- | --------- |
  | Date   | Ally Checking | Ally Savings | LendingClub | Betterment | BofA     | Student   |
  | Oct 15 | 1000.00       | 5000.00      | 10000.00    | 30000.00   | -1000.00 | -10000.00 |

  The date could be any period; however, the library currently assumes month-end dates and one row per date.
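As an illustration of the Accounts layout, the following minimal sketch builds the example row by hand with plain pandas. It does not use any `pfcompute` function. The month-end date `2015-10-31` is an assumption (the table only says "Oct 15"), and the net worth and per-category sums are just one way such a frame could be summarized.

```python
import pandas as pd

# Column labels are (Category, Account Name) pairs, matching the Accounts table.
columns = pd.MultiIndex.from_tuples(
    [
        ("Cash", "Ally Checking"),
        ("Cash", "Ally Savings"),
        ("Investment", "LendingClub"),
        ("Investment", "Betterment"),
        ("Credit", "BofA"),
        ("Loan", "Student"),
    ],
    names=["Category", "Account Name"],
)

# One row per month-end date. The exact date is an assumption for illustration.
index = pd.DatetimeIndex(["2015-10-31"], name="Date")

accounts = pd.DataFrame(
    [[1000.00, 5000.00, 10000.00, 30000.00, -1000.00, -10000.00]],
    index=index,
    columns=columns,
)

# Liabilities are stored as negative balances, so a plain row sum gives net worth.
net_worth = accounts.sum(axis=1)

# Subtotal per category (transpose, group on the Category level, transpose back).
by_category = accounts.T.groupby(level="Category").sum().T

print(net_worth)
print(by_category)
```

Storing credit and loan balances as negative numbers, as the example table does, keeps the net worth calculation to a single sum.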
- Transactions

  DataFrame of transaction details with a Date index and field name columns:

  | Date       | Amount   | Account       | Category | Label |
  | ---------- | -------- | ------------- | -------- | ----- |
  | 2015-10-15 | 2000.00  | Ally Checking | Paycheck | {}    |
  | 2015-10-15 | -1000.00 | Ally Checking | Rent     | {}    |

  There can be multiple transactions per date.
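A minimal sketch of the Transactions layout, again using plain pandas rather than any `pfcompute` call. The `Label` values are kept as the `{}` placeholder shown in the example rows, and the monthly pivot by category is only an assumed example of what such a frame makes easy.

```python
import pandas as pd

# Date index with one column per field; several rows may share the same date.
transactions = pd.DataFrame(
    {
        "Amount": [2000.00, -1000.00],
        "Account": ["Ally Checking", "Ally Checking"],
        "Category": ["Paycheck", "Rent"],
        "Label": ["{}", "{}"],  # placeholder exactly as shown in the example rows
    },
    index=pd.DatetimeIndex(["2015-10-15", "2015-10-15"], name="Date"),
)

# Total amount per calendar month and category, with categories as columns.
monthly = (
    transactions
    .groupby([transactions.index.to_period("M"), "Category"])["Amount"]
    .sum()
    .unstack(fill_value=0.0)
)

# Net cash flow per month is the sum across categories.
net_flow = monthly.sum(axis=1)

print(monthly)
print(net_flow)
```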
- Paychecks

  DataFrame of paycheck details with a Date index and field name columns:

  | Date       | Total Gross | Tax    | Pre Tax Deductions | Post Tax Deductions | Net Pay |
  | ---------- | ----------- | ------ | ------------------ | ------------------- | ------- |
  | 2015-10-15 | 2000.00     | 500.00 | 500.00             | 0.0                 | 1000.00 |

  There can be multiple paychecks per date.
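A minimal sketch of the Paychecks layout with a simple sanity check. The check assumes that net pay equals gross pay minus tax and both kinds of deductions. That relation holds for the example row but is an assumption here, not a rule stated by the library.

```python
import pandas as pd

paychecks = pd.DataFrame(
    {
        "Total Gross": [2000.00],
        "Tax": [500.00],
        "Pre Tax Deductions": [500.00],
        "Post Tax Deductions": [0.0],
        "Net Pay": [1000.00],
    },
    index=pd.DatetimeIndex(["2015-10-15"], name="Date"),
)

# Assumed relation: Net Pay = Total Gross - Tax - Pre Tax - Post Tax deductions.
expected_net = (
    paychecks["Total Gross"]
    - paychecks["Tax"]
    - paychecks["Pre Tax Deductions"]
    - paychecks["Post Tax Deductions"]
)

# Flag rows that differ by more than half a cent, which usually signals a parsing error.
mismatch = (expected_net - paychecks["Net Pay"]).abs() > 0.005

print(paychecks[mismatch])
```

A check like this can be handy after writing a custom paycheck parser, since a misread field tends to break the balance.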
Optional data sets include: Credit Limits and Miscellaneous. These can be useful to keep track of but may require additional effort to record.
- Credit Limits: DataFrame of account limits with a Date index and (Category, Account Name) multi-level columns, similar to the Accounts data.
- Miscellaneous: DataFrame of whatever miscellaneous data is applicable, with a Date index and Field/Account columns, similar to the Accounts data.
Account balances can be kept in a Google Sheets spreadsheet of month-end account values.
Mint transactions can be "downloaded" by running the following code in the developer console of an authenticated browser session (only tested on Chrome):
// Constants
var PAGE_SIZE = 100; // transactions returned per page
var url = 'https://wwws.mint.com/app/getJsonData.xevent?accountId=0&offset={}&task=transactions,txnfilters&rnd=###';

// Accumulated results and paging state
var transactions = [];
var offset = 0;

// Download each page of transactions from Mint.com
function getNextData() {
  console.log('offset: ' + offset);
  jQuery.getJSON(url.replace('{}', offset), function (rsp) {
    const data = rsp['set'][0]['data'];
    transactions.push.apply(transactions, data);
    offset += PAGE_SIZE;
    // A full page means more transactions may remain, so fetch the next one
    if (data.length === PAGE_SIZE) {
      getNextData();
    }
  });
}
getNextData();
This JavaScript object will need to be copied into a text file and saved (e.g. `transaction.json`).
You can copy a console variable with the following code, then paste (`cmd+v` or `ctrl+v`) into a text editor:
// Send to clipboard
copy(transactions)
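Once the copied object is saved (e.g. as `transaction.json`), it still has to be turned into the Transactions format. The sketch below shows one possible approach with the standard `json` module and pandas. The field names in `FIELD_MAP` and the sample records are hypothetical, since the actual Mint field names are not documented here, so inspect the saved file and adjust the mapping. `transactions_from_json` is an illustrative helper, not part of `pfcompute`.

```python
import json

import pandas as pd

# Hypothetical mapping from exported field names to the Transactions columns.
# The real export may use different names; adjust after inspecting the file.
FIELD_MAP = {
    "date": "Date",
    "amount": "Amount",
    "account": "Account",
    "category": "Category",
}


def transactions_from_json(text, field_map=FIELD_MAP):
    """Parse a JSON array of transaction records into a Date-indexed DataFrame."""
    records = json.loads(text)
    frame = pd.DataFrame.from_records(records).rename(columns=field_map)
    frame["Date"] = pd.to_datetime(frame["Date"])
    frame["Amount"] = frame["Amount"].astype(float)
    return frame.set_index("Date").sort_index()


# Made-up sample standing in for the contents of the saved file.
sample = json.dumps(
    [
        {"date": "2015-10-15", "amount": 2000.0, "account": "Ally Checking", "category": "Paycheck"},
        {"date": "2015-10-15", "amount": -1000.0, "account": "Ally Checking", "category": "Rent"},
    ]
)

transactions = transactions_from_json(sample)
print(transactions)
```

For a real file, the text could be read with `open("transaction.json").read()` and passed to the same function.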
Paychecks are the most difficult one... You will have to get your hands dirty and roll your own custom implementation. Assuming you get a `pdf` paycheck, you must create a `pdf` parser. The library has an example of this and provides a framework to make it easier.
Good Luck :)
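To give a feel for what such a parser involves, here is a minimal sketch that pulls amounts out of text already extracted from a paycheck PDF (for instance with `pdfminer`). It is not the framework shipped with the library. The labels, the one-field-per-line layout, and the `parse_paycheck_text` helper are all assumptions for illustration, and real pay stubs will need their own patterns.

```python
import re

# Hypothetical labels, chosen to match the Paychecks columns; real stubs differ.
FIELDS = [
    "Total Gross",
    "Tax",
    "Pre Tax Deductions",
    "Post Tax Deductions",
    "Net Pay",
]


def parse_paycheck_text(text, fields=FIELDS):
    """Return {field: amount} for each label found at the start of a line."""
    result = {}
    for field in fields:
        # Label at line start, optional colon and dollar sign, amount at line end.
        pattern = rf"^{re.escape(field)}\s*:?\s*\$?(-?[\d,]+\.\d{{2}})\s*$"
        match = re.search(pattern, text, flags=re.MULTILINE)
        if match:
            result[field] = float(match.group(1).replace(",", ""))
    return result


# Made-up text standing in for the output of a PDF text extractor.
sample_text = """Total Gross: $2,000.00
Tax: 500.00
Pre Tax Deductions: 500.00
Post Tax Deductions: 0.00
Net Pay: $1,000.00
"""

parsed = parse_paycheck_text(sample_text)
print(parsed)
```

Anchoring each label to the start of a line keeps `Tax` from matching inside `Pre Tax Deductions`, which is the kind of detail that tends to bite in hand-rolled parsers.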
Generally, and for code-to-be, `pfcompute` is intended to be used within the Anaconda distribution and its packages, plus a few other libraries (i.e. `tabulate`, `pdfminer`).
Specifically, it currently uses:
- matplotlib
- numpy
- pandas
- pdfminer
- scipy
- statsmodels