Hello!
I noticed that you modified some parts of the code from the Huggingface repo, and I was wondering if you could explain the reasoning behind those changes.
For example, what changes were made to optimization.py, and why? It also seems like there's some similarity to the MT-DNN code, and I'm curious what your thought process was.
And for tokenization.py, why did you reimplement end-to-end tokenization as FullTokenizer as opposed to the original BertTokenizer?
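For context on what I mean by end-to-end tokenization: my understanding is that the FullTokenizer from Google's original BERT release chains a basic (lowercasing/whitespace) step with greedy longest-match-first WordPiece. A minimal sketch of that pipeline, with an illustrative toy vocabulary (the names `wordpiece` and `full_tokenize` and the vocab entries are mine, not from either repo):

```python
def wordpiece(token, vocab, unk="[UNK]"):
    # Greedy longest-match-first WordPiece: repeatedly take the longest
    # prefix found in the vocab; non-initial pieces carry a "##" prefix.
    pieces, start = [], 0
    while start < len(token):
        end, cur = len(token), None
        while start < end:
            sub = token[start:end]
            if start > 0:
                sub = "##" + sub
            if sub in vocab:
                cur = sub
                break
            end -= 1
        if cur is None:
            # No match at all for this position: the whole token becomes [UNK].
            return [unk]
        pieces.append(cur)
        start = end
    return pieces

def full_tokenize(text, vocab):
    # "End-to-end": basic normalization/splitting, then WordPiece per token.
    out = []
    for tok in text.lower().split():
        out.extend(wordpiece(tok, vocab))
    return out

vocab = {"un", "##aff", "##able", "hello"}
print(full_tokenize("Hello unaffable", vocab))
# ['hello', 'un', '##aff', '##able']
```

My (possibly wrong) impression is that BertTokenizer exposes the same two stages separately, so I'm curious whether the single end-to-end class was for convenience or something subtler.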
I just want to get a better understanding of the code, and I'd really appreciate a response!