I built an image captioning model using an LSTM encoder-decoder architecture on the Flickr8k dataset. The model is implemented in Keras, and the transformers library is used to download the 'bert-base-uncased' model, which encodes the generated captions so they can be compared to the ground truth.
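The description does not say how the BERT encodings are compared, so the following is only a minimal sketch of one common approach: mean-pool the token vectors of each caption and take the cosine similarity. The function names (`mean_pool`, `cosine_similarity`, `caption_score`) are hypothetical and not taken from the repository, and the toy 3-dimensional vectors stand in for real 'bert-base-uncased' outputs, which are 768-dimensional.

```python
import math


def mean_pool(token_vectors):
    """Average equal-length token vectors into a single caption vector."""
    n = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(vec[i] for vec in token_vectors) / n for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def caption_score(generated_tokens, reference_tokens):
    """Assumed metric: cosine similarity of mean-pooled token vectors."""
    return cosine_similarity(mean_pool(generated_tokens),
                             mean_pool(reference_tokens))


# Toy vectors standing in for per-token BERT embeddings (assumption).
generated = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
reference = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
unrelated = [[0.0, 0.0, 1.0]]

same_score = caption_score(generated, reference)
diff_score = caption_score(generated, unrelated)
```

Under this assumed metric, a generated caption whose pooled vector points in the same direction as the reference scores close to 1, and an orthogonal one scores 0. In practice the token vectors would come from the downloaded BERT model rather than being written by hand.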
armins03 / image-captioning-lstm