deshpandenu / time-series-forecasting-of-amazon-stock-prices-using-neural-networks-lstm-and-gan-

This project analyzes Amazon stock data using Python. Feature extraction is performed, and ARIMA and Fourier series models are built. An LSTM with multiple features is used to predict stock prices, and sentiment analysis is then performed using news and Reddit sentiment. GANs are also used to predict stock data, with Amazon data taken from an API as the generator and CNNs as the discriminator.

Jupyter Notebook 100.00%

time-series-forecasting-of-amazon-stock-prices-using-neural-networks-lstm-and-gan-'s Issues

Gan prediction

Great work here Nupur. Just not sure why in notebook 5 you create an estimator but you don't use it to predict.

data leakage

Please note that there is a possibility of data leakage. The way the data are standardized is dangerous because it shifts future information back in time. Try deleting .shift(-num_historical_days) in your scaling method and you will see the results get worse.
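
The leakage described in this issue can be checked on toy data. The following is a minimal sketch, not the repository's code: the function names `normalize_leaky` and `normalize_causal` and the random-walk series are assumptions made for illustration. It changes only the last row of a series and compares how an earlier row is scaled under each scheme.

```python
import numpy as np
import pandas as pd


def normalize_leaky(df, n):
    # Same pattern as the scaling in the Predict class: the window statistics
    # are shifted n rows back, so row t is scaled with rows t+1 .. t+n.
    mean = df.rolling(n).mean().shift(-n)
    spread = df.rolling(n).max().shift(-n) - df.rolling(n).min().shift(-n)
    return ((df - mean) / spread).dropna()


def normalize_causal(df, n):
    # Illustrative alternative: statistics cover rows t-n+1 .. t only.
    mean = df.rolling(n).mean()
    spread = df.rolling(n).max() - df.rolling(n).min()
    return ((df - mean) / spread).dropna()


n = 5
generator = np.random.default_rng(0)
# Hypothetical price series (random walk), not real Amazon data.
base = pd.DataFrame({"close": 100 + np.cumsum(generator.normal(size=60))})

# Change only the very last observation.
changed = base.copy()
changed.loc[len(changed) - 1, "close"] += 50.0

# Row t lies n rows before the end, so its "future" window includes the last row.
t = len(base) - n - 1
leaky_diff = abs(normalize_leaky(base, n).loc[t, "close"]
                 - normalize_leaky(changed, n).loc[t, "close"])
causal_diff = abs(normalize_causal(base, n).loc[t, "close"]
                  - normalize_causal(changed, n).loc[t, "close"])

print("leaky value at row t changed: ", leaky_diff > 0)
print("causal value at row t changed:", causal_diff > 0)
```

If the scaled value of an earlier row moves when only a later observation changes, the scaling has used information that would not be available at prediction time.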

Prediction output

Hello @deshpandenu, this is a really cool project. I am trying to use the model for prediction but am unable to get it to produce output.

class Predict:
  def __init__(self, num_historical_days=20, days=10, pct_change=0, gan_model=f'{googlepath}/deployed_models/gan', cnn_modle=f'{googlepath}/deployed_models/cnn', xgb_model=f'{googlepath}/deployed_models/xgb'):
    self.data = []
    self.num_historical_days = num_historical_days
    self.gan_model = gan_model
    self.cnn_modle = cnn_modle
    self.xgb_model = xgb_model
    
    files = [f"{googlepath}stock_data/{f}" for f in os.listdir(f"{googlepath}stock_data")] 
    for file in files:
      
      print(file)
      df = pd.read_csv(file, index_col='timestamp', parse_dates=True)
      df = df[['open','high','low','close','volume']]
      # Data for new column labels that will use the pct_change of the closing data.
      # pct_change measures the change between the current and prior element. Map these into a 1x2
      # array to show whether pct_change is above or below our desired threshold.
            
      df = ((df -
            df.rolling(num_historical_days).mean().shift(-num_historical_days))
            /(df.rolling(num_historical_days).max().shift(-num_historical_days)
            -df.rolling(num_historical_days).min().shift(-num_historical_days)))
      df = df.dropna()
      self.data.append((file.split('/')[-1], df.iloc[0], df[200:200+num_historical_days].values))
      # Split the df into arrays of length num_historical_days and append
      # to data, i.e. array of df[curr - num_days : curr] -> a batch of values,
      # appending whether the price went up or down on the current day "i"
      # we are looking at.
      
      
  def gan_predict(self):
    tf.reset_default_graph()
    gan = GAN(num_features=5, num_historical_days=self.num_historical_days, generator_input_size=200, is_train=False)
    with tf.Session() as sess:
      sess.run(tf.global_variables_initializer())
      saver = tf.train.Saver()
      saver.restore(sess, self.gan_model)
      clf = joblib.load(self.xgb_model)
      for sym, date, data in self.data:
        features = sess.run(gan.features, feed_dict={gan.X:[data]})
        features = xgb.DMatrix(features)
        print('{} {} {}'.format(str(date).split(' ')[0], sym, clf.predict(features)[0][1] > 0.5))
        #predictions = np.array([x for x in gan_estimator.predict(p.gan_predict())])
        #print(predictions)

p = Predict()
p.gan_predict()

Questions:

How should I interpret the output? For example, is there a plot, time series, or forecast that shows the prediction?
After running the prediction, how do I interpret the output?
The prediction uses the GAN to extract features and XGBoost to predict. How can the CNN model be used for stacking?
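
On the first two questions, the print statement in gan_predict emits one line per file: the date, the file name, and whether the second entry of the predicted probability row exceeds 0.5. A minimal sketch of collecting such results into a table follows. The helper `summarize_predictions` and the toy rows are hypothetical, and what class 1 stands for depends on how the labels were encoded during training, which is not shown in this issue.

```python
import pandas as pd


def summarize_predictions(rows, threshold=0.5):
    """Turn (date, symbol, class_probabilities) tuples into a DataFrame."""
    records = []
    for date, symbol, probs in rows:
        records.append({
            "date": date,
            "symbol": symbol,
            # Probability assigned to class index 1; its meaning is an
            # assumption that depends on the training labels.
            "prob_class_1": float(probs[1]),
            "above_threshold": bool(probs[1] > threshold),
        })
    return pd.DataFrame(records)


# Made-up values for illustration only, not model output.
toy_rows = [
    ("2019-01-02", "AMZN.csv", [0.35, 0.65]),
    ("2019-01-03", "AMZN.csv", [0.80, 0.20]),
]
summary = summarize_predictions(toy_rows)
print(summary)
```

A table like this can be plotted or compared against realized price moves. It does not address the stacking question about the CNN model.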

Outdated Libraries - Error when running the example

Amazing project. Thanks for the contribution.

The code fails while running TrainXGBoost because it relies on outdated library behavior. XGBoost no longer accepts a list of NumPy arrays as input; the input always has to be a 2D array.

dmlc/xgboost#3970

Answer on passing a list to the XGBoost DMatrix:
@markli123 Hi, I actually found the cause of the weird behavior. See dmlc/xgboost#3970. The list gets silently converted into a scipy.sparse.csr_matrix, which causes unexpected behavior and wrong predictions. This is why we later disallowed converting a list into a DMatrix directly.
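
One way to work around this is to stack the list into a single 2D NumPy array before building the DMatrix. The sketch below assumes every sample has the same number of features. The helper name `to_2d_array` and the toy feature values are hypothetical and are not taken from TrainXGBoost.

```python
import numpy as np


def to_2d_array(features):
    """Stack a list of equally sized arrays into shape (n_samples, n_features)."""
    stacked = np.asarray(features, dtype=np.float32)
    # Flatten any extra dimensions so that each sample becomes one row.
    return stacked.reshape(len(features), -1)


# Hypothetical per-sample feature vectors collected in a Python list.
feature_list = [np.array([0.1, 0.2, 0.3]), np.array([0.4, 0.5, 0.6])]
X = to_2d_array(feature_list)
print(X.shape)

try:
    import xgboost as xgb  # optional here; only needed to build the DMatrix
    dmatrix = xgb.DMatrix(X)  # a 2D array is an accepted input
except ImportError:
    dmatrix = None
```

The reshape also covers the case where each list element has shape (1, n_features), as can happen when the arrays come from per-sample session runs.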

GAN

Hey,
I have a couple of questions:

What is the input (shape, content) of the generator and of the discriminator?
Do you have a special matrix to calculate the loss of the GAN?
What is the architecture of the generator and the discriminator?
Thank you

GAN

Hey Nupur,
I'm following your great notebooks :-) and in notebook five I came across two problems:

In gan_estimator you have the variable tfgan = tf.contrib.gan, which works only in TensorFlow 1.x (tf.contrib was removed in 2.0). Do you have a solution for TensorFlow 2.3? I could not find one.
In the Predict class, gan_predict contains the line tf.reset_default_graph(), which throws an error about nested graphs. If I remove that line, I get an error at the line predictions = np.array([x for x in gan_estimator.predict(p.gan_predict())]) saying that there are two variables with the same name.
Thank you

Which version of TensorFlow are you using?

GAN_implementation

Can anyone please share the data of this project?

Data source

Outstanding work
