
image_captioning_with_transformers's Introduction

Mohamed Zarzoura


🔥 About Me:

  • I am interested in NLP, NLU, and multimodal learning based on language processing.
  • A final-year master's student in Language Technology at Gothenburg University.

💼 Projects:

  • 🔗 Image captioning using a transformer-based model, based on arXiv:2101.10804.
  • 🔗 Relation extraction using LSTMs on sequences and tree structures, based on 10.18653/v1/P16-1105.
  • 🔗 Embedding textual spatial language (ongoing), based on the Google paper arXiv:1807.01670.
  • 🔗 A bot that plays the word game Ghost with a user, implemented using Rasa.
  • 🔗 A bot assistant that helps a user interface with Zotero: the user can add papers and query the items in their database. The bot was implemented using a proprietary tool called TDM.

๐Ÿ› ๏ธ Languages and Tools:

  • Programming
  • ML/DL
  • NLP
  • NLU


image_captioning_with_transformers's People

Contributors

felipezeiser, zarzouram


image_captioning_with_transformers's Issues

Confusion about "vector_dir" when preparing the dataset

Thanks for your great work! I'm trying to reproduce your results but got stuck running create_dataset.py. I have read through the instructions on preparing the datasets, but I'm confused about setting "vector_dir", the directory containing the pre-trained embedding vector files. Where can I get these files or zips? Have I missed an important part of the instructions?
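For context: the ".{vector_dim}d" naming in the traceback of the next issue matches GloVe's file naming (e.g. glove.6B.zip, which unpacks to glove.6B.300d.txt), so the script appears to expect a GloVe-style archive inside vector_dir. Below is a minimal sketch of fetching one; the URL and file layout are assumptions, not the repo's documented setup:

    # Hedged sketch: place a GloVe-style zip in vector_dir.
    # The URL and names are assumptions inferred from the ".{dim}d"
    # pattern seen in create_dataset.py's traceback, not the repo's docs.
    import urllib.request
    from pathlib import Path

    vector_dir = Path("vector_dir")
    vector_dir.mkdir(parents=True, exist_ok=True)
    url = "https://nlp.stanford.edu/data/glove.6B.zip"  # assumed source
    urllib.request.urlretrieve(url, vector_dir / "glove.6B.zip")
    # create_dataset.py would then derive a name like "glove.6B.300d".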

Error running the train command

Hi @zarzouram, I have some questions about your work.

I have a question about running create_dataset.py. I'm using the following command:

python code/create_dataset.py --dataset_dir dataset_dir --json_train json_train --json_val json_val --image_train image_train --image_val image_val --output_dir output_dir --vector_dir vector_dir --vector_dim 300 --min_freq 5 --max_len 52

However, when I execute this command, I encounter the following error:

Traceback (most recent call last):
  File "code/create_dataset.py", line 56, in <module>
    vector_name = f"{vector_name[0].name.strip('.zip')}.{args.vector_dim}d"
IndexError: list index out of range

  1. Also, where can I obtain the validation annotations, since there is only one JSON file?

Would you be so kind as to assist me in resolving this matter?
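For anyone hitting the same IndexError: the failing line takes the first element of a (presumably empty) list of matches found in vector_dir, so the error means no embedding archive is present there. A hedged reconstruction of what line 56 appears to do, with a friendlier guard; the glob pattern is an assumption:

    # Hedged reconstruction of the failing line in create_dataset.py;
    # the glob pattern is an assumption inferred from the traceback.
    from pathlib import Path

    vector_dim = 300
    vector_dir = Path("vector_dir")
    matches = sorted(vector_dir.glob("*.zip"))  # pre-trained vector zips
    if not matches:  # the empty case that raises IndexError upstream
        raise FileNotFoundError(
            f"no embedding .zip found in {vector_dir}; download one first")
    vector_name = f"{matches[0].stem}.{vector_dim}d"  # e.g. "glove.6B.300d"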

How can you measure the mean and standard deviation values?

Thanks for sharing the code. Excuse me, I need to measure the mean and standard deviation for the model under different weight initializations. Does that mean I need to train the model with different learning rates? Or should I train the model several times (for example, three), since each run gives a slightly different accuracy, evaluate the model after each run to get BLEU scores, and then average the scores?
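The usual practice is the latter: keep the learning rate and all other hyperparameters fixed, vary only the random seed (and hence the weight initialization) across runs, then report the mean and standard deviation of the metric. A minimal sketch; the BLEU-4 values below are hypothetical placeholders, not results from this repo:

    # Hedged sketch: aggregate BLEU-4 over runs that differ only in the
    # random seed. The scores are hypothetical placeholders.
    import statistics

    bleu4_scores = [0.291, 0.285, 0.288]  # one score per training run
    mean = statistics.mean(bleu4_scores)
    std = statistics.stdev(bleu4_scores)  # sample standard deviation
    print(f"BLEU-4: {mean:.3f} +/- {std:.3f}")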

I need help with the transformer's output, attns.

Hello. I am so grateful that you wrote this implementation code.

I want to use this code to train the model on the COCO dataset, but I ran into a problem, so I am leaving a question.
While running the "run_train.py" code, an error occurred, as shown in the screenshot below.

แ„‰แ…ณแ„แ…ณแ„…แ…ตแ†ซแ„‰แ…ฃแ†บ 2022-09-16 แ„‹แ…ฉแ„’แ…ฎ 9 00 49

At this point, attns should have the shape [layer_num, head_num, batch_size, max_len, code_size^2], but in my result the head_num dimension has disappeared, leaving [layer_num, batch_size, max_len, code_size^2].

To investigate, I looked at "models/IC_encoder_decoder/transformer.py"; the problem seems to occur when attns is assembled by the layer modules. I'm leaving this question because I was wondering whether there is a way to solve it.

I'll be waiting for your reply.

  • Also, because of my limited English, the tone of this question may come across as unpleasant. I ask for your generous understanding.
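A note on the shape question: if each decoder layer returns per-head attention of shape [head_num, batch_size, max_len, code_size^2], stacking the layers preserves the head dimension, while averaging (or summing) over heads inside a layer drops it and yields exactly the 4-D shape reported above. A hedged sketch; the names and sizes are illustrative, not this repo's actual code:

    # Hedged sketch: why the head dimension can disappear from attns.
    # All names and sizes here are illustrative assumptions.
    import torch

    layers, heads, batch, max_len, pixels = 2, 8, 4, 52, 196
    per_layer = [torch.rand(heads, batch, max_len, pixels)
                 for _ in range(layers)]

    kept = torch.stack(per_layer)        # [layers, heads, batch, len, pix]
    dropped = torch.stack([a.mean(dim=0)  # averaging over heads loses them
                           for a in per_layer])  # [layers, batch, len, pix]
    print(kept.shape, dropped.shape)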

Questions regarding the use of custom datasets

Hi,

I'm trying to use a dataset that has only one caption per image, but I'm having trouble finding where you handle the captions for training. How can I pass a different dataset to your code?
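A general workaround for single-caption datasets: COCO-style pipelines often assume a fixed number of captions per image (five in COCO), so one option is to repeat the lone caption up to the expected count before building the dataset. A hedged sketch assuming a simple image-to-captions mapping, not this repo's exact JSON format:

    # Hedged sketch: pad a single caption per image up to the count a
    # COCO-style pipeline expects. The data layout is an assumption.
    CAPTIONS_PER_IMAGE = 5  # assumed; COCO convention

    dataset = {"img_001.jpg": ["a dog runs on the beach"]}
    for image, captions in dataset.items():
        # Repeat the lone caption so downstream code sees a fixed count.
        captions.extend([captions[0]] * (CAPTIONS_PER_IMAGE - len(captions)))

    print(dataset["img_001.jpg"])  # five identical captions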
