brightmart / ai_law
All kinds of baseline models for long text classification (text categorization)
Can you provide the trained checkpoint so that your results can be reproduced?
How is the large dataset split into train/test/valid?
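Great job!
However, when I use your code to train on 'big_data' (1700k), CPU usage stays at 100% and I can no longer connect to my server.
Hello, I recently entered a long-text classification competition in which the texts average 720 words. Is there a model you would recommend? I started with DPCNN, but with only about 10 GB of memory I can only use a very small batch size, and training is very slow. Thanks.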
tensorflow.python.framework.errors_impl.DataLossError: Checksum does not match: stored 2982177973 vs. calculated on the restored bytes 1718988051
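I always get this error after uploading. Do you have any idea what might cause it?
Everything works when I test locally; the error only appears after uploading.
To check whether the problem is in my code or in the upload, I ran your code today. Running main.py locally worked fine, but after compressing and uploading it I got the same error. I compressed it with zip -r -9 predictor.zip predictor/. Since uploading your code produces the same error, I think the problem is not in the code but in the compression or upload step. How do you compress the files for upload?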
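One way to narrow this down is to confirm that the archive is intact before it is uploaded. The following is a minimal sketch, not part of the repository: `find_mismatches` is a hypothetical helper that compares the SHA-256 digest of each archive member with the corresponding file on disk. It assumes the archive was created from the parent directory of `predictor/`, as with the `zip` command quoted above, so member names start with `predictor/`.

```python
import hashlib
import os
import zipfile


def sha256_of_bytes(data):
    """Return the hex SHA-256 digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


def find_mismatches(source_dir, zip_path):
    """Return archive member names whose bytes differ from the files on disk.

    Hypothetical helper: assumes member names are relative to the parent
    directory of source_dir (e.g. "predictor/checkpoint/...").
    """
    parent = os.path.dirname(os.path.abspath(source_dir))
    mismatches = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            local_path = os.path.join(parent, info.filename)
            if not os.path.isfile(local_path):
                # Present in the archive but missing on disk.
                mismatches.append(info.filename)
                continue
            # Reads whole files into memory; fine for a quick check,
            # but very large checkpoints may call for chunked hashing.
            with open(local_path, "rb") as handle:
                local_digest = sha256_of_bytes(handle.read())
            if local_digest != sha256_of_bytes(archive.read(info.filename)):
                mismatches.append(info.filename)
    return mismatches
```

Calling `find_mismatches("predictor/", "predictor.zip")` from the directory that contains both should return an empty list if the archive matches the local files. An empty list only rules out the compression step; it says nothing about what happens to the file during or after upload.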
train:
tf.app.flags.DEFINE_string("ckpt_dir","./predictor/checkpoint/","checkpoint location for the model")
tf.app.flags.DEFINE_string("model","text_cnn","name of model:han,text_cnn,dp_cnn,c_gru,c_gru2,gru,pooling")
predictor:
tf.app.flags.DEFINE_string("model_dpcnn", "dp_cnn", "name of model:han,c_gru,c_gru2,gru,text_cnn")
tf.app.flags.DEFINE_string("ckpt_dir_dpcnn", "predictor/checkpoint_dpcnn_big32/", "checkpoint location for the model")
tf.app.flags.DEFINE_string("model", "text_cnn", "name of model:han,c_gru,c_gru2,gru,text_cnn")
tf.app.flags.DEFINE_string("ckpt_dir", "predictor/checkpoint/", "checkpoint location for the model")
Training uses text_cnn, so I don't understand why prediction uses both text_cnn and dpcnn. Does that mean dpcnn also has to be trained separately? Thanks for your help.
('using pre-trained word emebedding.started.word2vec_model_path:', 'data/news_12g_baidubaike_20g_novel_90g_embedding_64.bin')
Traceback (most recent call last):
File "/home/qiu/PycharmProjects/ai_law-master (2)/HAN_train.py", line 271, in
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "/home/qiu/PycharmProjects/ai_law-master (2)/HAN_train.py", line 101, in main
assign_pretrained_word_embedding(sess, vocabulary_index2word, vocab_size, model,FLAGS.word2vec_model_path,model.Embedding)
File "/home/qiu/PycharmProjects/ai_law-master (2)/HAN_train.py", line 238, in assign_pretrained_word_embedding
for word, vector in zip(word2vec_model.vocab, word2vec_model.vectors):
AttributeError: 'EuclideanKeyedVectors' object has no attribute 'vectors'
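This error usually points to a gensim version difference: older releases store the embedding matrix on the keyed-vectors object as `syn0`, while later releases call it `vectors`. The sketch below is a hedged workaround, not the repository's code. `get_words_and_vectors` is a hypothetical helper that tries the newer attribute names first and falls back to the older ones. It also takes the words from the ordered word list (`index_to_key` or `index2word`) rather than from the `vocab` dict, on the assumption that the dict's iteration order is not guaranteed to match the rows of the matrix.

```python
def get_words_and_vectors(word2vec_model):
    """Return (words, vectors) from a gensim KeyedVectors-like object.

    Hypothetical helper: tries newer attribute names first, then older ones.
    """
    # Embedding matrix: `vectors` in newer gensim, `syn0` in older releases.
    vectors = getattr(word2vec_model, "vectors", None)
    if vectors is None:
        vectors = getattr(word2vec_model, "syn0", None)
    if vectors is None:
        raise AttributeError("model has neither 'vectors' nor 'syn0'")

    # Ordered word list: `index_to_key` in newer gensim, `index2word` in older.
    words = getattr(word2vec_model, "index_to_key", None)
    if words is None:
        words = getattr(word2vec_model, "index2word", None)
    if words is None:
        raise AttributeError("model has neither 'index_to_key' nor 'index2word'")

    return list(words), vectors
```

With this helper, the failing loop could be written as `words, vectors = get_words_and_vectors(word2vec_model)` followed by `for word, vector in zip(words, vectors):`. Upgrading gensim to a release that provides `vectors` is the other option, but that may require further changes elsewhere in the script.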
Hello, how is the threshold in "Imbalance Classification for Skew Data" determined? Why is it num_mini_example/freq?
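The page does not include an answer. One plausible reading, offered purely as an assumption and not confirmed by the repository, is that `num_mini_example` is a target minimum number of examples per label and `freq` is how often a label occurs. Under that reading, the ratio is how many times each example of a rare label is repeated so that the label reaches roughly the target count. The sketch below illustrates that interpretation with hypothetical helpers `copies_per_label` and `oversample`.

```python
from collections import Counter


def copies_per_label(labels, num_mini_example):
    """Map each label to the number of copies made of each of its examples.

    Assumed interpretation: copies = num_mini_example // freq, at least 1.
    """
    freq = Counter(labels)
    return {label: max(1, num_mini_example // count)
            for label, count in freq.items()}


def oversample(examples, labels, num_mini_example):
    """Repeat the examples of rare labels so each label nears the target count."""
    copies = copies_per_label(labels, num_mini_example)
    out_examples, out_labels = [], []
    for example, label in zip(examples, labels):
        n = copies[label]
        out_examples.extend([example] * n)
        out_labels.extend([label] * n)
    return out_examples, out_labels
```

Under this interpretation, labels that already occur at least `num_mini_example` times are left untouched, because the integer ratio is 0 or 1 and is clamped to 1.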