
pytorch_bert's Introduction

Building BERT with PyTorch from scratch

This repository contains the code for the tutorial "Building BERT with PyTorch from scratch".

Installation

After you clone the repository and set up a virtual environment, install the dependencies:

pip install -r requirements.txt

Installation on Mac M1

You may have trouble installing tensorboard. Tensorboard depends on grpcio, which on this platform has to be installed with extra environment variables set. Read more on StackOverflow.

So your installation commands for Mac M1 should look like this:

export GRPC_PYTHON_BUILD_SYSTEM_OPENSSL=1
export GRPC_PYTHON_BUILD_SYSTEM_ZLIB=1

pip install -r requirements.txt

pytorch_bert's People

Contributors

devihor, mikhailkravets, ramesaliyev

pytorch_bert's Issues

Shouldn't we add an extra True token to the inverse_token_mask when adding extra CLS token to the sentence?

Hi, thank you for the article!

Here at dataset.py#L200, you add an extra CLS token to the sentence, but no matching extra True entry to the inverse_token_mask.

This creates an alignment issue which can be seen in your example:

masked_sentence    [[CLS], one, of, the, other, [MASK], has, ment...
masked_indices     [0, 5, 6, 7, 8, 2, 10, 11, 4825, 13, 2, 15, 16...
sentence           [[CLS], one, of, the, other, reviewers, has, m...
indices            [0, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,...
token_mask         [True, True, True, True, False, True, True, Fa...
is_next                                                            1

So the [MASK] token is actually at index 5 of the sentence, but in the token_mask row it is marked at index 4, because the mask was not shifted along with the tokens. Isn't this wrong? I've checked other parts of the code but couldn't find anything that handles this alignment issue, and I think it becomes even more of a problem when two sentences are concatenated.

Thanks!
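The fix proposed in this issue can be sketched as follows. This is a minimal illustration and not the repository's code: `prepend_cls` is a hypothetical helper, and it assumes, as in the example above, that a False entry in the inverse token mask marks a masked position.

```python
from typing import List, Tuple

CLS = "[CLS]"


def prepend_cls(
    tokens: List[str], inverse_token_mask: List[bool]
) -> Tuple[List[str], List[bool]]:
    """Prepend [CLS] together with a matching True mask entry.

    Hypothetical helper: adding both entries in one place keeps the
    token list and the inverse mask the same length and aligned.
    """
    if len(tokens) != len(inverse_token_mask):
        raise ValueError("tokens and inverse_token_mask must have equal length")
    # [CLS] is never a prediction target, so its mask entry is True.
    return [CLS] + tokens, [True] + inverse_token_mask


sample_tokens = ["one", "of", "the", "other", "[MASK]", "has"]
sample_mask = [True, True, True, True, False, True]

cls_tokens, cls_mask = prepend_cls(sample_tokens, sample_mask)

# Positions flagged as masked (False) now point at the [MASK] token.
masked_positions = [i for i, keep in enumerate(cls_mask) if not keep]
```

With the extra True entry in place, the only False entry in `cls_mask` sits at index 5, the same index as `[MASK]` in `cls_tokens`.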

Mismatch between the indices array and the inverse mask array at the CLS token position

Dear Ivan,

All of the work you did is great. But while using your IMDBDataset code I found some strange things. The array of indices and the array of inverse token mask values do not match each other from the beginning of the sequence because of the CLS token (see the reference below). It was also strange to find a CLS token in the second sentence. I may be wrong, but the original BERT uses only one CLS token, at the beginning. Lastly, calculating the length of the vocab each time could be very expensive. I hope my notes will help you make your code better and clearer. I would be proud if you gave me the chance to take part and help fix it.

I am looking forward to your reply,
Nesemenpolkov.

Reference:

def _create_item(self, first: typing.List[str], second: typing.List[str], target: int = 1):
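One way to address the points raised in this issue is sketched below. It is an illustration under stated assumptions, not the body of `_create_item`: `build_pair` and its parameters are hypothetical, a single [CLS] is placed at the start with [SEP] between the two sentences, and special tokens get True in the inverse mask because they are never prediction targets. Whether a trailing [SEP] is appended is left out of the sketch.

```python
from typing import List, Tuple

CLS, SEP = "[CLS]", "[SEP]"


def build_pair(
    first: List[str],
    first_mask: List[bool],
    second: List[str],
    second_mask: List[bool],
) -> Tuple[List[str], List[bool]]:
    """Join two already-masked sentences with one [CLS] and one [SEP].

    Hypothetical helper: every special token added to the token list
    gets a True entry at the same position in the inverse mask.
    """
    if len(first) != len(first_mask) or len(second) != len(second_mask):
        raise ValueError("each sentence needs a mask of the same length")
    pair_tokens = [CLS] + first + [SEP] + second
    pair_mask = [True] + first_mask + [True] + second_mask
    return pair_tokens, pair_mask


pair_tokens, pair_mask = build_pair(
    ["the", "[MASK]", "was", "good"],
    [True, False, True, True],
    ["i", "liked", "[MASK]"],
    [True, True, False],
)

# Tokens found at the positions the mask flags as masked (False).
flagged_tokens = [t for t, keep in zip(pair_tokens, pair_mask) if not keep]
```

Because tokens and mask entries are added in the same place, the two lists cannot drift apart when the sentences are concatenated.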
