Comments (5)
Hi!
Thank you so much for your time! I'm trying to learn more about Transformer-XL. Could you please help answer this beginner question?
Is it possible to change the loss function and fine-tune the Transformer-XL model to do classification, similar to BERT? Also, is it true that, unlike BERT, there is no single general pre-trained Transformer-XL model, but instead many pre-trained models for different tasks?
Thank you so much for your time and God bless!
from transformer-xl.
Yes, I believe it's a good direction. Transformer-XL is presumably well suited to document-level representations because it can handle long context. On short text, Transformer-XL might also have an edge (see the results on One Billion Word).
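To make "changing the loss function" concrete, here is a minimal sketch of a classification objective on top of per-token hidden states. It assumes the model's final layer gives one hidden vector per token; the helpers `mean_pool`, `classify_logits`, and `cross_entropy` are hypothetical names for illustration and are not part of the transformer-xl repository. Mean pooling is only one possible choice; using the last token's hidden state is another.

```python
import math
from typing import List

Vector = List[float]


def mean_pool(hidden_states: List[Vector]) -> Vector:
    """Average per-token hidden vectors into a single document vector."""
    n = len(hidden_states)
    dim = len(hidden_states[0])
    return [sum(h[d] for h in hidden_states) / n for d in range(dim)]


def classify_logits(doc_vec: Vector, weights: List[Vector], bias: Vector) -> Vector:
    """Linear classification head: one weight row and one bias per class."""
    return [
        sum(w * x for w, x in zip(row, doc_vec)) + b
        for row, b in zip(weights, bias)
    ]


def cross_entropy(logits: Vector, label: int) -> float:
    """Softmax cross-entropy for one example, using a stable log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


# Toy stand-in for the encoder output: 3 tokens, hidden size 2 (made-up values).
hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
doc_vec = mean_pool(hidden)

# Hypothetical head for 2 classes; in practice these would be trained.
weights = [[1.0, 0.0], [0.0, 1.0]]
bias = [0.0, 0.0]

logits = classify_logits(doc_vec, weights, bias)
loss = cross_entropy(logits, label=0)
```

During fine-tuning, this loss would replace the language-modeling loss, and gradients would flow into both the head and the pre-trained layers.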
@BoPengGit did you manage to create a Transformer-XL-based classifier like BERT?
I don't remember; this was a long time ago.
Is it possible to classify a document that is 30k tokens/words long using Transformer-XL?
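For reference, Transformer-XL reads long inputs segment by segment and carries a bounded memory from one segment to the next. The sketch below shows only that bookkeeping for a 30k-token input. It is a rough illustration under stated assumptions: `encode_segment` is a hypothetical placeholder for a real model call, and the memory here holds token ids, whereas the real model caches hidden states. Whether this yields good classification accuracy at that length is not established in this thread.

```python
from typing import Callable, List, Sequence, Tuple


def split_into_segments(tokens: Sequence[int], seg_len: int) -> List[List[int]]:
    """Cut a token sequence into consecutive segments of at most seg_len tokens."""
    return [list(tokens[i:i + seg_len]) for i in range(0, len(tokens), seg_len)]


def run_with_memory(
    tokens: Sequence[int],
    seg_len: int,
    mem_len: int,
    encode_segment: Callable[[List[int], List[int]], Tuple[int, int]],
) -> List[Tuple[int, int]]:
    """Process segments in order, passing a bounded memory to each call."""
    memory: List[int] = []
    outputs = []
    for segment in split_into_segments(tokens, seg_len):
        outputs.append(encode_segment(segment, memory))
        # Keep only the most recent mem_len items (stand-in for cached states).
        memory = (memory + segment)[-mem_len:] if mem_len > 0 else []
    return outputs


def toy_encoder(segment: List[int], memory: List[int]) -> Tuple[int, int]:
    """Hypothetical stand-in: reports how much input and memory it was given."""
    return (len(segment), len(memory))


document = list(range(30_000))  # made-up token ids
outputs = run_with_memory(document, seg_len=512, mem_len=512, encode_segment=toy_encoder)
```

A classifier could then pool the per-segment outputs, or use only the final segment's representation, before applying a classification head.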