simfin-tutorials's Introduction

SimFin - Simple financial data for Python

SimFin makes it easy to obtain and use financial and stock-market data in Python. It automatically downloads share-prices and fundamental data from the SimFin server, saves the data to disk for future use, and loads the data into Pandas DataFrames.

Installation

pip install simfin

More detailed installation instructions can be found below.

API-Key

To download data from SimFin you need an API-key, which you can get for free by registering on simfin.com. Once you have registered, you can find your API-key here. The free datasets contain less data than the paid SimFin+ datasets, and some datasets are only available to SimFin+ users. Visit SimFin for a comparison of the free and paid data versions.

Example

Once the simfin package has been installed and you have your API-key, the following Python program will automatically download all Income Statements for US companies and print the Revenue and Net Income for Microsoft.

import simfin as sf
from simfin.names import *

# Set your API-key for downloading data.
# Replace YOUR_API_KEY with your actual API-key.
sf.set_api_key('YOUR_API_KEY')

# Set the local directory where data-files are stored.
# The dir will be created if it does not already exist.
sf.set_data_dir('~/simfin_data/')

# Load the annual Income Statements for all companies in the US.
# The data is automatically downloaded if you don't have it already.
df = sf.load_income(variant='annual', market='us')

# Print all Revenue and Net Income for Microsoft (ticker MSFT).
print(df.loc['MSFT', [REVENUE, NET_INCOME]])

This produces the following output:

                  Revenue   Net Income
Report Date
2008-06-30   6.042000e+10  17681000000
2009-06-30   5.843700e+10  14569000000
2010-06-30   6.248400e+10  18760000000
2011-06-30   6.994300e+10  23150000000
2012-06-30   7.372300e+10  16978000000
2013-06-30   7.784900e+10  21863000000
2014-06-30   8.683300e+10  22074000000
2015-06-30   9.358000e+10  12193000000
2016-06-30   9.115400e+10  20539000000
2017-06-30   9.657100e+10  25489000000
2018-06-30   1.103600e+11  16571000000
2019-06-30   1.258430e+11  39240000000
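
Once the data is in a Pandas DataFrame, ordinary Pandas operations apply to it. The following is a minimal sketch that rebuilds three rows of the output above by hand (so it runs without downloading anything) and derives the net profit margin and the year-over-year Revenue growth. The variable names df_msft, net_margin and revenue_growth are only illustrative and are not part of the simfin package.

```python
import pandas as pd

# Three rows copied by hand from the output above.
df_msft = pd.DataFrame(
    {'Revenue': [9.657100e+10, 1.103600e+11, 1.258430e+11],
     'Net Income': [25489000000, 16571000000, 39240000000]},
    index=pd.to_datetime(['2017-06-30', '2018-06-30', '2019-06-30']))
df_msft.index.name = 'Report Date'

# Net profit margin = Net Income / Revenue.
net_margin = df_msft['Net Income'] / df_msft['Revenue']

# Year-over-year Revenue growth. The first row has no predecessor,
# so its growth is NaN.
revenue_growth = df_msft['Revenue'].pct_change()

print(pd.DataFrame({'Net Margin': net_margin,
                    'Revenue Growth': revenue_growth}))
```

With the real data, the same two lines of arithmetic could presumably be applied to the result of df.loc['MSFT', [REVENUE, NET_INCOME]] from the example above.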

We can also load the daily share-prices and plot the closing share-price for Microsoft (ticker MSFT):

# Load daily share-prices for all companies in the USA.
# The data is automatically downloaded if you don't have it already.
df_prices = sf.load_shareprices(market='us', variant='daily')

# Plot the closing share-prices for ticker MSFT.
df_prices.loc['MSFT', CLOSE].plot(grid=True, figsize=(20,10), title='MSFT Close')

This produces the following image:

(Image: Share-price for MSFT)

Documentation

Installation (Detailed Instructions)

The best way to install simfin and use it in your own project is to use a virtual environment. To create one, write the following in a Linux terminal:

virtualenv simfin-env

You can also use Anaconda instead of a virtualenv:

conda create --name simfin-env python=3

Then you can install the simfin package inside that virtual environment:

source activate simfin-env
pip install simfin

If the last command fails, or if you want to install the latest development version from this GitHub repository, then you can run the following instead:

pip install git+https://github.com/simfin/simfin.git

Now try putting the above example in a file called test.py and run it:

python test.py

When you are done working on the project you can deactivate the virtualenv:

source deactivate

Development

If you want to modify your own version of the simfin package, then you should clone the GitHub repository to your local disk, using this command in a terminal:

git clone https://github.com/simfin/simfin.git

This will create a directory named simfin on your disk. Then you need to create a new virtual environment, where you install your local copy of the simfin package using these commands:

conda create --name simfin-dev python=3
source activate simfin-dev
cd simfin
pip install --editable .

You should now be able to edit the files inside the simfin directory and use them whenever you have a Python module that imports the simfin package, while you have the virtual environment simfin-dev active.

Testing

Two kinds of tests are provided with the simfin package:

Unit Tests

Unit-tests ensure the various functions of the simfin package can run without raising exceptions. The unit-tests generally do not test whether the data is valid. These tests are mainly used by developers when they make changes to the simfin package.

The unit-tests are run with the following commands from the root directory of the simfin package:

source activate simfin-env
pytest

Data Tests

Data-tests ensure the bulk-data downloaded from the SimFin servers is valid. These tests are mainly used by SimFin's database admin to ensure the data is always valid, but the end-user may also run these tests to ensure the downloaded data is valid.

First you need to install nbval, which enables support for Jupyter Notebooks in the pytest framework. This is not automatically installed with the simfin package, so as to keep the number of dependencies minimal for normal users of simfin. To install nbval run the following commands:

source activate simfin-env
pip install nbval

Then you can run the following command from the root directory of the simfin package to execute both the unit-tests and data-tests:

pytest --nbval-lax

The following command only runs the data-tests:

pytest --nbval-lax -v tests/test_bulk_data.ipynb

More Tests

The tutorials provide more realistic use-cases of the simfin package, and they can also be run and tested automatically using pytest. See the tutorials' README for details.

Credits

The database is created by SimFin. The Python API and download system were originally designed and implemented by Hvass Labs. Further development of the Python API is done by SimFin and the community.

License (MIT)

This is published under the MIT License, which allows very broad use for both academic and commercial purposes.

You are very welcome to modify and use this source-code in your own project. Please keep a link to the original repository.

simfin-tutorials's People

Contributors

dependabot[bot], hvass-labs, shivam-nirhali, thf24

simfin-tutorials's Issues

Masking on specific date

Hello,

Sorry if this is a basic question! I have a MultiIndex df similar to the one in the Screener tutorial. I want to add a date mask for when Date == current_date, where current_date is, for example, current_date = datetime.datetime(2020, 7, 28).

Similarly, I'd also like to do the same but with a Ticker filter too. So Date == current_date and Ticker == 'AAPL' for example.

When I try setting this up I am struggling with different errors. I've tried a few things. Could you point me in the right direction please? Many thanks!

Here's what I've tried:


date_limit = datetime.now() - timedelta(days=90)
mask_date_limit = (df_all_signals.reset_index(DATE)[DATE] == date_limit)
mask = (df_all_signals[CURRENT_RATIO] > mask_current_ratio)
mask &= (df_all_signals[ROE] > mask_roe)
mask &= mask_date_limit


---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
<ipython-input-32-32feb6bad999> in <module>()
     13 mask &= (df_all_signals[ROA] > mask_roa)
     14 mask &= (df_all_signals[ROE] > mask_roe)
---> 15 mask &= mask_date_limit

9 frames
/usr/local/lib/python3.6/dist-packages/pandas/core/indexes/base.py in _join_level(self, other, level, how, return_indexers, keep_order)
   3692         if not right.is_unique:
   3693             raise NotImplementedError(
-> 3694                 "Index._join_level on non-unique index is not implemented"
   3695             )
   3696 

NotImplementedError: Index._join_level on non-unique index is not implemented
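
A likely cause, judging from the traceback, is that reset_index(DATE) leaves a Series indexed by Ticker only, and that index is not unique, so Pandas cannot align it with the MultiIndex mask in the &= step. One way around this is to read the index levels with get_level_values, which gives a positional boolean array and needs no alignment. The sketch below uses a small hand-made DataFrame with made-up values; the level names 'Ticker' and 'Date' and the column names are assumptions standing in for the simfin name constants, and the thresholds are arbitrary.

```python
import pandas as pd

# Small stand-in for df_all_signals with a (Ticker, Date) MultiIndex.
# The values are made up purely for illustration.
index = pd.MultiIndex.from_product(
    [['AAPL', 'MSFT'], pd.to_datetime(['2020-07-27', '2020-07-28'])],
    names=['Ticker', 'Date'])
df_all_signals = pd.DataFrame(
    {'Current Ratio': [1.2, 1.5, 2.6, 2.9],
     'ROE': [0.30, 0.35, 0.10, 0.40]},
    index=index)

current_date = pd.Timestamp(2020, 7, 28)

# Read the index levels positionally, so no index alignment is needed.
dates = df_all_signals.index.get_level_values('Date')
tickers = df_all_signals.index.get_level_values('Ticker')

# Combine the signal masks with the date mask.
mask = (df_all_signals['Current Ratio'] > 1.4)
mask &= (df_all_signals['ROE'] > 0.2)
mask &= (dates == current_date)

# Same mask, additionally restricted to a single ticker.
mask_aapl = mask & (tickers == 'AAPL')

df_on_date = df_all_signals[mask]
df_aapl_on_date = df_all_signals[mask_aapl]
print(df_on_date)
print(df_aapl_on_date)
```

Note that datetime.now() - timedelta(days=90) carries a time-of-day component, so an equality test against dates stored at midnight would match nothing; something like pd.Timestamp.now().normalize() - pd.Timedelta(days=90) compares whole days instead.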

Tutorial 5 - 'If None then all tickers are used'

Hi there, thank you for these amazing tutorials! When I try to run tutorial 5 for all stocks (leaving the tickers list empty in cell 6), no tickers appear in my data later when I call df.head(). Am I doing something incorrectly here? Thanks

# List of tickers we want. If None then all tickers are used.
tickers = []
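
The comment in the cell says that None selects all tickers, whereas the cell as posted assigns an empty list. In Python these are different values, and code that checks for None would treat an empty list as "select nothing". The following is a minimal sketch of that distinction; select_tickers is a hypothetical helper written for illustration and is not taken from simfin or the tutorial, whose actual selection logic may differ.

```python
def select_tickers(all_tickers, tickers=None):
    """Return all tickers if `tickers` is None, else only the requested ones."""
    if tickers is None:
        # No filter was given, so keep everything.
        return list(all_tickers)
    # A list was given (possibly empty), so keep only its members.
    return [t for t in all_tickers if t in tickers]


available = ['AAPL', 'AMZN', 'MSFT']

print(select_tickers(available, tickers=None))      # all three tickers
print(select_tickers(available, tickers=[]))        # empty list
print(select_tickers(available, tickers=['MSFT']))  # only MSFT
```

If the tutorial follows this pattern, setting tickers = None instead of tickers = [] would be the thing to try.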

Function to download only specific companies' data

Hello! I would like to know if there is a way to download only specific company data. I want to review the financials of some companies (about 30) from time to time. It does not make sense to download the entire dataset just for that data.

Thanks in advance.
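
One approach that stays within what the README describes is to load the bulk dataset once (it is saved to disk for future use) and then keep only the tickers of interest from the resulting DataFrame. The sketch below shows the selection step on a small hand-made DataFrame with made-up values; the index level name 'Ticker' and the variable my_tickers are assumptions for illustration, and this does not reduce what is downloaded.

```python
import pandas as pd

# Hand-made stand-in for a loaded dataset with a (Ticker, Report Date) index.
# The Revenue values are placeholders, not real data.
index = pd.MultiIndex.from_product(
    [['AAPL', 'AMZN', 'MSFT'], pd.to_datetime(['2018-12-31', '2019-12-31'])],
    names=['Ticker', 'Report Date'])
df = pd.DataFrame({'Revenue': [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]}, index=index)

# The companies we want to review.
my_tickers = ['AAPL', 'MSFT']

# Keep only the rows whose Ticker level is in the list.
ticker_level = df.index.get_level_values('Ticker')
df_subset = df[ticker_level.isin(my_tickers)]
print(df_subset)
```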

Divide by zero error chapter 9

I'm getting the following error when trying to run this command.


%%time
df_fin_signals = hub.fin_signals(variant='daily')

RuntimeWarning: divide by zero encountered in log10
result = getattr(ufunc, method)(*inputs, **kwargs)
RuntimeWarning: invalid value encountered in log10
result = getattr(ufunc, method)(*inputs, **kwargs)
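
These messages are RuntimeWarnings rather than exceptions, so the command may still have completed. NumPy emits them when log10 is applied to zero (giving -inf) or to a negative number (giving NaN). Which signal triggers them inside hub.fin_signals is not shown in the output above, so the following is only a sketch of the mechanism on a hand-made array, with two common ways of handling it.

```python
import numpy as np

# Hand-made values: one positive, one zero, one negative.
values = np.array([100.0, 0.0, -5.0])

# Option 1: silence the warnings locally and accept -inf / NaN results.
with np.errstate(divide='ignore', invalid='ignore'):
    logs = np.log10(values)

# Option 2: replace non-positive values with NaN before taking the log,
# so the result holds NaN instead of -inf and no warning is raised.
safe_values = np.where(values > 0, values, np.nan)
safe_logs = np.log10(safe_values)

print(logs)
print(safe_logs)
```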

TypeError: expected str, bytes or os.PathLike object, not NoneType

I run this code and I am getting an error:

import simfin as sf
sf.set_api_key(api_key='free')
df_prices = sf.load_shareprices(variant='daily', market='us')

Here is the error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-11-d968c738d8e3> in <module>
     11 
     12 # %%time
---> 13 df_prices = sf.load_shareprices(variant='daily', market='us')

~\Miniconda3\lib\site-packages\simfin\load.py in load(dataset, variant, market, parse_dates, index, refresh_days)
    130 
    131     # Download file if it does not exist on local disk, or if it is too old.
--> 132     _maybe_download_dataset(**dataset_args, refresh_days=refresh_days)
    133 
    134     # Lambda function for converting strings to dates. Format: YYYY-MM-DD

~\Miniconda3\lib\site-packages\simfin\download.py in _maybe_download_dataset(refresh_days, **kwargs)
    281 
    282     # Full path for the local data-file.
--> 283     path = _path_dataset(**kwargs)
    284 
    285     # Full path for the downloaded file.

~\Miniconda3\lib\site-packages\simfin\paths.py in _path_dataset(**kwargs)
     66     """
     67     filename = _filename_dataset(extension='csv', **kwargs)
---> 68     path = os.path.join(get_data_dir(), filename)
     69     return path
     70 

~\Miniconda3\lib\ntpath.py in join(path, *paths)
     74 # Join two (or more) paths.
     75 def join(path, *paths):
---> 76     path = os.fspath(path)
     77     if isinstance(path, bytes):
     78         sep = b'\\'

TypeError: expected str, bytes or os.PathLike object, not NoneType
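
The traceback ends in os.path.join(get_data_dir(), filename) receiving None, which suggests that no data directory had been set. The README example calls sf.set_data_dir('~/simfin_data/') before loading any data, and the snippet in this issue does not, so adding that call before sf.load_shareprices is the first thing to try. The sketch below reproduces the same TypeError using only the standard library; build_path and the filename are hypothetical and are not simfin internals.

```python
import os


def build_path(data_dir, filename):
    """Join a data directory and a filename, failing clearly if no directory is set."""
    if data_dir is None:
        raise ValueError('Data directory is not set.')
    return os.path.join(os.path.expanduser(data_dir), filename)


# Joining None with a filename raises the same TypeError as in the traceback.
try:
    os.path.join(None, 'example.csv')
    error_message = None
except TypeError as e:
    error_message = str(e)
print(error_message)

# With a directory set, the path is built as expected.
path = build_path('~/simfin_data/', 'example.csv')
print(path)
```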
