eventstudy's Introduction

Event Study package

The Event Study package is an open-source Python project that facilitates financial event study analysis.

Install

$ pip install eventstudy

Documentation

You can read the full documentation here.

Work through the Get started section, which uses simple examples to show how to run an event study with the eventstudy package for a single event or a sample of events.

Read the API guide for more details on functions and their parameters.

eventstudy's People

Contributors

bkrayfield, lemairejean-baptiste

eventstudy's Issues

Import Returns

Hi, this may be a silly question, but it is not entirely clear to me.

When we import returns, do we import:

  1. the price of stock or
  2. percentage change from previous day or
  3. some other calculated return

I'm slightly confused because the returns csv files in the examples folder seem to contain the percent change and not the price.

Thanks!

Query on the computation of standard deviation of abnormal return AR in constant mean model

Hello,

I noticed that in the constant mean model, std(AR) is computed across the whole timeline (estimation and event window) rather than over the estimation window alone. My understanding from A. Craig MacKinlay's paper[1] and the EventStudyTools website[2] is that the estimation window alone should be used, which would also align with your implementation of the OLS model.

var = [np.var(residuals)] * event_window_size

If my observation is correct, is it okay if I contribute to the repo?

PS: Thank you for the great repo. You have taught me how to implement an event study myself.

Reference:
[1] A. Craig MacKinlay, "Event Studies in Economics and Finance"
[2] https://www.eventstudytools.com/significance-tests, Some Preliminaries section
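
The distinction raised in this issue can be sketched as follows. This is a minimal illustration, not the package's code: the function name `constant_mean_ar`, its arguments, and the choice of `ddof=1` are assumptions, and the returns are assumed to be ordered as estimation window followed by event window.

```python
import numpy as np


def constant_mean_ar(returns, estimation_size, event_window_size):
    """Hypothetical helper: abnormal returns under the constant mean model."""
    returns = np.asarray(returns, dtype=float)
    estimation = returns[:estimation_size]
    event = returns[estimation_size:estimation_size + event_window_size]

    # The expected return is the mean over the estimation window only.
    mean = estimation.mean()
    ar = event - mean

    # Variance of the estimation-window residuals only.
    # ddof=1 (one estimated parameter, the mean) is an assumption here.
    var_ar = np.var(estimation - mean, ddof=1)
    return ar, [var_ar] * event_window_size


# Toy data: four estimation days followed by two event days.
ar, var = constant_mean_ar([0.01, -0.01, 0.01, -0.01, 0.05, 0.05], 4, 2)
```

With this layout the large event-window returns do not inflate the variance estimate, which is the point of restricting it to the estimation window.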

Interface for data import

Thanks for the package!

Why does the import have to be from a file or an API? The user may already have a dataframe with the data (e.g. from the web), and it seems one must save it to a file before it can be imported.

Maybe there should be a method such as import_returns_from_dataframe()? I can help implement it, if it is a good idea for the project.

Also, is the decision to propagate the import across the whole class really worth it? Why does the class need to be aware of the data?

https://lemairejean-baptiste.github.io/eventstudy/api/eventstudy.Single.import_returns.html#eventstudy-single-import-returns

statistics variance

Dear,
I was wondering what formula you are using for the t-test and its variance.
I am using the Fama-French three-factor model with the linear regression function: residuals, df, var_res, model = Model(estimation_size, event_window_size, keep_model).OLS(X, Y). I saw something in the statistics code:
[image] and
[image]
but I do not completely understand this formulation.
Is the variance formula the same as the following, just written in a more 'Python' way?
[image]
with 4 being the number of parameters needed to estimate the abnormal return (Fama-French: one constant and three factors)?
Kind regards and thank you for the amazing package.
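
For reference, the textbook degrees-of-freedom correction the question describes can be sketched as below. Whether the package computes its variance exactly this way is not confirmed by anything in this thread; `residual_variance` is a hypothetical helper name.

```python
import numpy as np


def residual_variance(residuals, n_params):
    """Unbiased residual variance: sum of squared residuals / (n - k)."""
    residuals = np.asarray(residuals, dtype=float)
    dof = residuals.size - n_params  # degrees of freedom
    if dof <= 0:
        raise ValueError("need more observations than estimated parameters")
    return float(np.sum(residuals ** 2) / dof)


# Toy residuals; n_params=4 mirrors one constant plus three factors.
var_res = residual_variance([1.0, -1.0, 2.0, -2.0, 0.0, 0.0], n_params=4)

# A t-statistic for an abnormal return is then AR / sqrt(variance).
t_stat = 3.0 / np.sqrt(var_res)
```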

.results

Hi, thank you for the package!

I am having some trouble showing the results when calling event.results(decimals=[3,5,3,5,2,2]).

The plot is working properly as is the get_CAR_dist().

The error I am receiving is:
if not indexes and not raw_lengths:
TypeError: object of type 'map' has no len()

Do you have any idea where I might have gone wrong?

Thanks again //agarp
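
The error message itself points at a Python 3 behaviour: `map` returns a lazy iterator, which has no length. A likely cause (not confirmed here) is that a `map` object is handed to pandas somewhere without being converted to a list first. The snippet below only reproduces the underlying Python behaviour; it does not use the package.

```python
# In Python 3, map() returns an iterator, so len() fails on it.
values = map(lambda x: round(x, 3), [0.12345, 0.6789])
try:
    len(values)
    error = None
except TypeError as exc:
    error = str(exc)

# Wrapping the map in list() materialises the values and restores len().
as_list = list(map(lambda x: round(x, 3), [0.12345, 0.6789]))
```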

Segmented regressions with optional variables for identifying drivers of abnormal returns

Hey @LemaireJean-Baptiste ,

wouldn't it be helpful to add a method that extends the dataset with other similarly formatted variables (such as oil prices or firm size) so that a segmented regression can be run with the abnormal returns as the dependent variable? One could then analyze the corresponding coefficients and potentially draw conclusions about the varying impact of the included independent variables as key drivers.

In M&A research, the effect of firm acquisitions (or, more generally, certain events) on the shareholder value of the firm (or the stock price) is often analyzed by 1) determining whether statistically significant abnormal returns associated with a particular event can be established and 2) adopting a more medium- to long-term view by investigating the drivers of varying abnormal returns with a segmented regression.

What do you think? When I was using your module for step 1), I often wished that step 2) could be handled just as easily. This can be achieved with any Python package that allows specifying segmented regressions, but it would be considerably more comfortable for this use case to have everything in a single package.

Kind regards,
Andreas

Contribution to the computation of daily stock return, i.e. a preprocessing module

Hello,

I wonder if it's a good idea to extend this package to cover data preprocessing, i.e. transforming data into a ready-to-consume form before feeding it into this package.

To elaborate, while researching event studies I realized there are two different ways to compute the daily stock return:

  • $R_t = S_t/S_{t-1} - 1$, where $S_t$ is the stock value at day $t$ and $R_t$ is the corresponding day-to-day return.
  • $R_t = \log(S_t/S_{t-1})$, defined similarly but with good distributional characteristics.
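
The two definitions above can be sketched with NumPy as follows. The price series is made up for illustration.

```python
import numpy as np

# Made-up daily prices S_0 ... S_3.
prices = np.array([100.0, 102.0, 101.0, 105.0])

# Simple return: R_t = S_t / S_{t-1} - 1
simple_returns = prices[1:] / prices[:-1] - 1

# Log return: R_t = log(S_t / S_{t-1})
log_returns = np.log(prices[1:] / prices[:-1])
```

The two are linked by `log_returns = log(1 + simple_returns)`, so they are close for small daily moves. Both require strictly positive prices.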

Also, when I work on a use case where data can span both negative and positive values, I must develop a new variant of the first approach to handle corner cases involving 0. I'm new to this and am unsure whether an event study can be applied to this case.

Please let me know if this idea is good and if I can contribute to this feature.
Thank you :))

# positive and negative

Hi there,

in the multiple event study setting, is there any way of obtaining the number or percentage of positive and negative cases for the CAAR?

Included an example in the screenshot:

Screen Shot 2022-10-30 at 11 15 19 am

Cheers!!

Exporting Multiple Sets of Data

Hi,

I am trying to export my data to Excel but am having issues exporting multiple sets of data at once.

I ran multiple single events and want to export them at once, rather than manually having to export them individually. Intuitively, I have tried the following code but it does not work (I only get the data for the last event):

for event in results_full:
    event.to_excel('test1.xlsx')

I am looking to retrieve data for an aggregate set of events (multiple class) and data for each individual event that make up the aggregate set (single class). Is there an efficient way to do this?

Thank you!
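
The loop above writes every event to the same file name, so each call overwrites the previous one and only the last event survives. One possible workaround, sketched under the assumption that each event's results are available as a pandas DataFrame, is to stack them into one table and export once. `combine_results` and the toy tables are hypothetical.

```python
import pandas as pd


def combine_results(tables, labels):
    """Stack per-event result tables into one frame keyed by event label."""
    return pd.concat(tables, keys=labels, names=["event", "day"])


# Toy tables standing in for the per-event results (values are made up).
t1 = pd.DataFrame({"AR": [0.01, -0.02]}, index=[-1, 0])
t2 = pd.DataFrame({"AR": [0.00, 0.03]}, index=[-1, 0])
combined = combine_results([t1, t2], ["AAPL", "MSFT"])

# A single export then keeps every event, for example:
# combined.to_excel("all_events.xlsx")  # requires an Excel engine such as openpyxl
```

The alternative is one sheet per event through `pd.ExcelWriter`, which keeps the tables separate at the cost of a slightly longer loop.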

NaN output

Screen Shot 2022-09-17 at 3 20 44 pm

Hey!

Just wondering why I might be getting NaN output (see attached). For context, the package works for a smaller estimation window, i.e. approx. 30-40 days. But for some reason it all goes to NaN after a certain threshold. I checked whether there was enough data to estimate the model and whether there were any outliers, but that doesn't seem to be the case.

Other securities work fine!

Cheers

What are the requirements for the returns file?

Hey, first of all thank you for creating this amazing package, I like it a lot!
I am just struggling with the returns file. What are the requirements?

I created my csv file of stock prices with yahoofinancials, which produced an output like this:
[image]
But when I run the event study, I do not get the right output; I get an error:
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 3361, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 76, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 5198, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 5206, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'date'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/yanah/Documents/BEDRIJFSECONOMIE/Thesis/Results/Event1.py", line 5, in
    es.Single.import_returns('prices.csv', is_price=True, log_return=False, date_format= '%Y-%m-%d')
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/eventstudy/single.py", line 327, in import_returns
    data = read_csv(path, format_date=True, date_format=date_format)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/eventstudy/utils.py", line 112, in read_csv
    df[date_column] = pd.to_datetime(df[date_column], format=date_format)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/core/frame.py", line 3458, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/pandas/core/indexes/base.py", line 3363, in get_loc
    raise KeyError(key) from err
KeyError: 'date'

What am I doing wrong with my file?

Kind regards
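
The traceback shows the lookup `df[date_column]` failing with `KeyError: 'date'`, which suggests the file has no column named exactly `date`. A minimal sketch of one fix, assuming (this is a guess about the file) that the column is called `Date`, is to rename it before saving the csv. The prices below are placeholders.

```python
import io

import pandas as pd

# Stand-in for the downloaded file; the column name "Date" is an assumption.
raw_csv = io.StringIO("Date,AAPL\n2021-01-04,100.0\n2021-01-05,101.0\n")
prices = pd.read_csv(raw_csv)

# Rename to the lowercase name that the failing lookup expects.
prices = prices.rename(columns={"Date": "date"})

# Text that would be written back to prices.csv.
cleaned = prices.to_csv(index=False)
```

Printing `prices.columns` on the real file is a quick way to check what the date column is actually called.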

Validating returns_GAFAM.csv file

Hi. I tried to validate the AAPL returns in the returns_GAFAM.csv file and found differences on the dividend dates. It would appear that the returns are underestimated as if the dividends had been ignored.

From Yahoo Finance (pct_change of Adj Closing Price)
date        AAPL
2012-08-09   0.005703
2012-11-07  -0.038263
2013-02-07   0.029734
2013-05-09  -0.008724
2013-08-08  -0.001991

From returns_GAFAM.csv
date        AAPL
2012-08-09   0.001404
2012-11-07  -0.042635
2013-02-07   0.023767
2013-05-09  -0.015242
2013-08-08  -0.008538

Thank you.
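
The gap between the two sources can be computed directly from the figures quoted above. This is plain arithmetic on those ten numbers and does not use the package.

```python
# Returns quoted above, keyed by date.
yahoo = {
    "2012-08-09": 0.005703,
    "2012-11-07": -0.038263,
    "2013-02-07": 0.029734,
    "2013-05-09": -0.008724,
    "2013-08-08": -0.001991,
}
gafam = {
    "2012-08-09": 0.001404,
    "2012-11-07": -0.042635,
    "2013-02-07": 0.023767,
    "2013-05-09": -0.015242,
    "2013-08-08": -0.008538,
}

# Yahoo (adjusted close) minus returns_GAFAM.csv, per date.
gaps = {day: round(yahoo[day] - gafam[day], 6) for day in yahoo}
```

Every gap is positive and lies between roughly 0.4% and 0.7%, which is consistent with the csv returns missing a component on these dates.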

possibly some gain to accept data as pandas TimeSeries?

import numpy as np


def get_index_of_date(data, date: np.datetime64, n: int = 4):
    """Return the position of `date` in `data`, or of the first later date within `n` days."""
    # Assumes `data` is an array of unique dates (for example the index values).
    for _ in range(n + 1):
        index = np.where(data == date)[0]
        if len(index) > 0:
            return index[0]
        # Date not found: try the next calendar day.
        date = date + np.timedelta64(1, "D")
    # No row matches this date or any of the n days after it.
    return None
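
To illustrate the possible gain the title hints at, here is a sketch of the same lookup on a sorted pandas `DatetimeIndex`, where a single binary search replaces the day-by-day loop. The signature is an assumption made for this example and is not part of the package.

```python
import pandas as pd


def get_index_of_date(index: pd.DatetimeIndex, date, n: int = 4):
    """Position of `date`, or of the first later date at most `n` days after it."""
    date = pd.Timestamp(date)
    # Binary search: first position whose date is >= `date` (index must be sorted).
    pos = index.searchsorted(date)
    if pos < len(index) and index[pos] <= date + pd.Timedelta(days=n):
        return int(pos)
    # No row for this date or the n days after it.
    return None


trading_days = pd.DatetimeIndex(["2020-01-02", "2020-01-03", "2020-01-06"])
on_weekend = get_index_of_date(trading_days, "2020-01-04")  # falls forward to 2020-01-06
past_end = get_index_of_date(trading_days, "2020-01-07")    # nothing after this date
```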

Carhart Four-Factor Model

This is very helpful and easy to use, thank you!

I was wondering whether you could add the option of using the Carhart Four-Factor Model to model the returns. This is essentially the Fama-French Three-Factor Model with an additional factor, Momentum. Many research papers use this model so it would be extremely useful to have it.

Link to Carhart (1997): https://onlinelibrary.wiley.com/doi/10.1111/j.1540-6261.1997.tb03808.x

I am new to Python so am not sure how I could add this myself. Since it is similar to the Fama-French Three-Factor Model, I assume there is a way to edit that model by adding the Momentum data to it. One source of this data is https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html, but probably best to calculate using code.
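
As a rough sketch of what the request amounts to, the regression below fits a constant plus four factor loadings by ordinary least squares. It uses synthetic factor data and plain NumPy; `fit_factor_model` is a hypothetical helper, not part of the package.

```python
import numpy as np


def fit_factor_model(returns, factors):
    """OLS of returns on a constant plus the given factor columns."""
    y = np.asarray(returns, dtype=float)
    X = np.column_stack([np.ones(len(y)), np.asarray(factors, dtype=float)])
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coefs
    return coefs, residuals


# Synthetic stand-ins for the Mkt-RF, SMB, HML and Momentum columns.
rng = np.random.default_rng(0)
factors = rng.normal(0.0, 0.01, size=(250, 4))
true_coefs = np.array([0.0002, 1.1, 0.3, -0.2, 0.1])  # alpha and four loadings
returns = true_coefs[0] + factors @ true_coefs[1:]     # noise-free on purpose

coefs, residuals = fit_factor_model(returns, factors)
```

Dropping the last column gives the three-factor case, so the same helper covers both models.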

t-test for AAR

Dear Jean-Baptiste,

maybe a cool improvement idea would be to also include the t-test and p-values of H0: AAR = 0 next to the t-test and p-values of H0: CAAR = 0.

Best,
Constantin
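
One common way to test AAR = 0 is a cross-sectional t-test per event day, sketched below. It is only one of several possible tests and is not necessarily the one the package uses for CAAR; `aar_t_stats` is a hypothetical helper.

```python
import numpy as np


def aar_t_stats(abnormal_returns):
    """Cross-sectional t-statistics for AAR, one per day of the event window.

    `abnormal_returns` has shape (n_events, window_size).
    """
    ar = np.asarray(abnormal_returns, dtype=float)
    n_events = ar.shape[0]
    aar = ar.mean(axis=0)
    # Standard error of the mean across events (sample std, ddof=1).
    std_err = ar.std(axis=0, ddof=1) / np.sqrt(n_events)
    return aar, aar / std_err


# Two toy events over a two-day window.
aar, t_stats = aar_t_stats([[0.01, 0.00], [0.03, 0.04]])
```

p-values would follow from a Student t distribution with `n_events - 1` degrees of freedom.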

results of event study can't be shown in a table

Hi, I am a beginner in Python programming and I have a problem with the results table.
I would truly appreciate it if you could help me.
The following is my code:

event = es.Single.constant_mean(
    security_ticker='aapl',
    event_date=np.datetime64('2020-01-16'),
    event_window=(-2, +10),
    estimation_size=30,
    buffer_size=0,
)

event.results(decimals=[3, 5, 3, 5, 2, 2])

TypeError Traceback (most recent call last)

in
5 estimation_size= 30,
6 buffer_size= 0)
----> 7 event.results(decimals=[3,5,3,5,2,2])

D:\Anaconda3\lib\site-packages\eventstudy\single.py in results(self, asterisks, decimals)
194 asterisks_dict=asterisks_dict,
195 decimals=decimals,
--> 196 index_start=self.event_window[0],
197 )
198

D:\Anaconda3\lib\site-packages\eventstudy\utils.py in to_table(columns, asterisks_dict, decimals, index_start)
28 )
29
---> 30 df = pd.DataFrame.from_dict(columns)
31 df.index += index_start
32 return df

D:\Anaconda3\lib\site-packages\pandas\core\frame.py in from_dict(cls, data, orient, dtype, columns)
1136 raise ValueError('only recognize index or columns for orient')
1137
-> 1138 return cls(data, index=index, columns=columns, dtype=dtype)
1139
1140 def to_numpy(self, dtype=None, copy=False):

D:\Anaconda3\lib\site-packages\pandas\core\frame.py in init(self, data, index, columns, dtype, copy)
390 dtype=dtype, copy=copy)
391 elif isinstance(data, dict):
--> 392 mgr = init_dict(data, index, columns, dtype=dtype)
393 elif isinstance(data, ma.MaskedArray):
394 import numpy.ma.mrecords as mrecords

D:\Anaconda3\lib\site-packages\pandas\core\internals\construction.py in init_dict(data, index, columns, dtype)
210 arrays = [data[k] for k in keys]
211
--> 212 return arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
213
214

D:\Anaconda3\lib\site-packages\pandas\core\internals\construction.py in arrays_to_mgr(arrays, arr_names, index, columns, dtype)
49 # figure out the index, if necessary
50 if index is None:
---> 51 index = extract_index(arrays)
52 else:
53 index = ensure_index(index)

D:\Anaconda3\lib\site-packages\pandas\core\internals\construction.py in extract_index(data)
303 elif is_list_like(val) and getattr(val, 'ndim', 1) == 1:
304 have_raw_arrays = True
--> 305 raw_lengths.append(len(val))
306
307 if not indexes and not raw_lengths:

TypeError: object of type 'map' has no len()

the beginner examples cannot be used?

event = es.EventStudy.FamaFrench_3factor(....
AttributeError: module 'eventstudy' has no attribute 'EventStudy'
Even if I delete es, it still reports that FamaFrench_3factor does not exist.

DataMissingError

I imported files according to the sample CSVs in the repo. The debugger is reporting {'error_msg': "Some data are missing for (Mkt-RF) in 'FamaFrench''.", 'error_type': 'DataMissingError', 'event_date': numpy.datetime64('2009-02-27T00:00:00.000000'), 'market_ticker': 'SPY', 'security_ticker': 'APPLE'}. But when I open the actual csv, no data is missing from that row. This error only appears for some rows. If I ignore it and finish running, the output AR table is full of NaN.

I checked the dtypes after importing the files as a dataframe. All dtypes match the repo sample csv. All column names match. The only difference is the number of decimal places.

I also ran the code using the original sample csv from the repo. The code works fine. So it must be something with my file. Any ideas?

import numpy as np
import matplotlib.pyplot as plt
from eventstudy.single import Single
from eventstudy.multiple import Multiple

Single.import_returns('security_returns.csv')
Single.import_FamaFrench('factor_returns.csv')
release_10K = Multiple.from_csv(
    path = 'earnings_surprises.csv', # the path to the csv file created  
    event_study_model = Single.FamaFrench_3factor,
    event_window = (-5,+10),
    estimation_size = 200,
    buffer_size = 30,
    date_format = '%d/%m/%Y',
    ignore_errors = True
)

print(release_10K.results(decimals=[3,5,3,5,2,2]))

Single event study loop: one event on one company for multiple companies

Hey!
In the get started pdf, there is a tutorial on how to set up a multi event study for different events and companies using the eventstudy.Multiple.from_list. However, is there a possibility to make a loop for a single event study on different companies?
I want to study the CAR effect of one event A on company A, as well as on company B, C, D, E, ...
And how can I export all the results for every company to excel?
Sorry for the stupid question, I am not a python expert, but I have been looking for the solution for a while now.
Kind regards,
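
One way to set this up is to build one event entry per company, all sharing the same date, and feed that list to the multi-event tooling mentioned above. The sketch below only builds the list. The tickers are placeholders, and while the keys `security_ticker` and `event_date` mirror parameter names that appear elsewhere on this page, the exact format `Multiple.from_list` expects should be checked against the documentation.

```python
import numpy as np

# One event date shared by every company (placeholder date and tickers).
event_date = np.datetime64("2020-01-16")
tickers = ["AAA", "BBB", "CCC"]

# One entry per company, all pointing at the same event.
events = [
    {"security_ticker": ticker, "event_date": event_date}
    for ticker in tickers
]
```

Looping over `events` and running a single event study for each entry would give the per-company CAR.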

famafrench.csv

In famafrench.csv, is the column Mkt-RF the same as (MktRet_t - RF_t)? If so, why does the csv also contain an RF column?

In the Fama-French model, Ret_(i,t) = alpha + beta * (MktRet_t - RF_t) + gamma * SMB_t + theta * HML_t, where Ret_(i,t) is the return of company i on day t, MktRet_t is the market return on day t, RF_t is the risk-free rate on day t, SMB_t is the small-minus-big factor on day t, and HML_t is the high-minus-low factor on day t.
