guoday / ccf-bdci-sentiment-analysis-baseline
The code for CCF-BDCI-Sentiment-Analysis-Baseline
License: Apache License 2.0
/home/ming/anaconda3/lib/python3.7/site-packages/sklearn/metrics/classification.py:1439: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no true samples.
'recall', 'true', average, warn_for)
test 0.06457949662369551
F1 looks normal during training, but the test score is low and this warning appears. I've been searching for a long time without solving it. What could the cause be?
Hello!
I want to run roberta-english, using RobertaForSequenceClassification, RobertaConfig, and RobertaTokenizer from pytorch_transformers.
But it raises: RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling cublasCreate(handle)
What could be causing this?
Just curious; thanks for any explanation.
How is the fusion of the truncated BERT output segments handled, and where in the code is it implemented? Thanks for replying!
Could you explain how an article is fed into training after being cut into k segments? I couldn't find where the segments are "fed into the language model separately". If that is how it works, then in theory an article of any length could be processed by cutting it into many segments and feeding them in separately, without truncating the article at all, right?
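For reference, the splitting step being asked about can be sketched like this (split_tokens is a hypothetical helper written for illustration, not the repo's actual code): cut the token sequence into k contiguous chunks of roughly equal length, then run each chunk through the encoder as its own sequence.

```python
def split_tokens(tokens, split_num):
    """Cut a token list into split_num contiguous chunks of (near-)equal length.

    Hypothetical helper for illustration; not the repo's actual code.
    """
    chunk = (len(tokens) + split_num - 1) // split_num  # ceiling division
    return [tokens[i * chunk:(i + 1) * chunk] for i in range(split_num)]

tokens = list(range(10))          # stand-in for a tokenized article
chunks = split_tokens(tokens, 3)
print(chunks)                     # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Each chunk still has to fit the encoder's maximum sequence length, so arbitrarily long articles become feasible only at the cost of more forward passes per example, plus some way to fuse the per-chunk representations afterwards.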
In BertForSequenceClassification's forward:

    def forward(self, input_ids, token_type_ids=None, attention_mask=None, labels=None,
                position_ids=None, head_mask=None):
        flat_input_ids = input_ids.view(-1, input_ids.size(-1))
        flat_position_ids = position_ids.view(-1, position_ids.size(-1)) if position_ids is not None else None
        flat_token_type_ids = token_type_ids.view(-1, token_type_ids.size(-1)) if token_type_ids is not None else None
        flat_attention_mask = attention_mask.view(-1, attention_mask.size(-1)) if attention_mask is not None else None

For example with k=2, input_ids contains the article's two segments, but view flattens them again, so isn't the input length unchanged, the same as not splitting at all?
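Not the author, but the shape arithmetic may clear this up: with k segments the batch has shape [batch, k, seq_len], and view(-1, seq_len) folds the segment axis into the batch axis. The encoder therefore sees batch*k independent sequences of seq_len tokens each; the segments are not concatenated back into one long sequence. A sketch, assuming PyTorch (sizes are illustrative):

```python
import torch

batch, k, seq_len = 2, 2, 8           # e.g. split_num = 2, 8 tokens per segment
input_ids = torch.zeros(batch, k, seq_len, dtype=torch.long)

# The same reshape as in BertForSequenceClassification's forward:
# [batch, k, seq_len] -> [batch * k, seq_len]
flat_input_ids = input_ids.view(-1, input_ids.size(-1))

print(flat_input_ids.shape)           # torch.Size([4, 8])
```

So each forward pass is over a sequence no longer than seq_len; splitting trades one long sequence for several short ones in a bigger effective batch.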
I know it will run; the issue is just how long it takes. I really don't have that much compute, so I've given up on many models without even trying them. Kaggle has GPU resources and I'd like to try there, so I wanted to ask: how long does this model take you to train end to end?
If so, is the LSTM's input just the first output vector of BERT, or all of the output vectors?
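Not the author, but the two options being asked about can be sketched as follows (shapes only, assuming PyTorch; the hidden sizes are illustrative, not the repo's actual configuration):

```python
import torch
import torch.nn as nn

batch, seq_len, hidden = 2, 16, 768        # illustrative sizes
sequence_output = torch.randn(batch, seq_len, hidden)  # BERT's per-token outputs

lstm = nn.LSTM(hidden, 256, batch_first=True)

# Option A: feed ALL token vectors into the LSTM, one step per token
out_all, _ = lstm(sequence_output)
print(out_all.shape)                       # torch.Size([2, 16, 256])

# Option B: keep only the first token's vector (the [CLS] position),
# treated as a length-1 "sequence"
cls_vec = sequence_output[:, 0, :].unsqueeze(1)  # [batch, 1, hidden]
out_cls, _ = lstm(cls_vec)
print(out_cls.shape)                       # torch.Size([2, 1, 256])
```

With split-into-k-segments setups, a common third variant is an LSTM over the k per-segment [CLS] vectors (shape [batch, k, hidden]), which is what the fusion question above is about.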
While training, I noticed the eval_loss curve looked very strange. I then found that the original pytorch_transformers puts the computation of eval_loss inside with torch.no_grad, so I modified the code to:

    with torch.no_grad():
        tmp_eval_loss = model(input_ids=input_ids, token_type_ids=segment_ids, attention_mask=input_mask, labels=label_ids)
        logits = model(input_ids=input_ids, token_type_ids=segment_ids, attention_mask=input_mask)
    eval_loss += tmp_eval_loss.mean().item()

After that the curve looked normal. Could this have been the cause?
Could you test and compare the performance on XLNet_zh_Large?
(The current XLNet_zh_Large is an early-access release; if problems come up, we will help resolve them.)
Is the only difference between Roberta-large and Roberta-mid the parameters max_seq_length=512 and split_num=1? Can I get it just by editing the script directly?
Hi,
why does robert.sh use run_bert.py?
Is it because a RoBERTa model can be fine-tuned with the BERT code?
Also, I tried running an ALBERT pretrained model with run_bert.py, and it reports a torch size mismatch: one is [21128, 2048] and the other is (21128, 128).