ncov_sentence_simi's Introduction

2019-nCoV-related sentence similarity

If you find this useful, please consider giving the repository a star to encourage our work.

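introduction

ERNIE- and RoBERTa-based models for sentence similarity.

For example, each row holds an id, a disease category, two queries, and a label (1 if the two queries mean the same thing, 0 otherwise):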

387,支原体肺炎,支原体肺炎的症状及治疗方法是什么,肺炎衣原体与肺炎支原体有什么区别?,0
388,支原体肺炎,支原体肺炎的症状及治疗方法是什么,肺炎支原体培养及药敏的检验单怎么看?,0
389,支原体肺炎,支原体肺炎的症状及治疗方法是什么,小儿支原体与小儿支原体肺炎相同吗?,0
390,支原体肺炎,宝宝支原体肺炎感染的症状有哪些?,宝宝肺炎支原体感染的症状是什么?,1
391,支原体肺炎,宝宝支原体肺炎感染的症状有哪些?,宝宝支原体肺炎感染有什么症状?,1
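A minimal sketch of reading rows in this layout with the Python standard library. The field names (`row_id`, `category`, `query1`, `query2`, `label`) are assumptions made for illustration, since the sample rows carry no header; the repository's own loading code may differ.

```python
import csv
import io
from typing import List, NamedTuple


class Pair(NamedTuple):
    # Field names are assumed; the sample rows above have no header.
    row_id: int
    category: str
    query1: str
    query2: str
    label: int  # 1 = same meaning, 0 = different


def parse_pairs(text: str) -> List[Pair]:
    """Parse headerless comma-separated rows into Pair records."""
    pairs = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        row_id, category, query1, query2, label = row
        pairs.append(Pair(int(row_id), category, query1, query2, int(label)))
    return pairs


# Two of the sample rows shown above.
sample = (
    "389,支原体肺炎,支原体肺炎的症状及治疗方法是什么,小儿支原体与小儿支原体肺炎相同吗?,0\n"
    "390,支原体肺炎,宝宝支原体肺炎感染的症状有哪些?,宝宝肺炎支原体感染的症状是什么?,1\n"
)
pairs = parse_pairs(sample)
```

Each `Pair` can then be passed to a tokenizer as a two-sentence input, with `label` as the binary target.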

95.2% accuracy online (using only the 1st fold, 1/6)

  • ERNIE 1.0
  • Nadam optimizer with a learning rate of 2.0*1e-5
  • OHEM cross-entropy loss with label smoothing
  • cosine learning-rate scheduler with warmup
  • noisy training data cleaned with an overfitted model
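Two of the items above, the warmup plus cosine schedule and the OHEM cross-entropy with label smoothing, can be sketched without any framework. This is an illustrative sketch, not the code in train.py: the function names, the `smoothing=0.1` and `keep_ratio=0.5` defaults, and the choice to decay to zero are all assumptions.

```python
import math
from typing import Sequence


def warmup_cosine_factor(step: int, warmup_steps: int, total_steps: int) -> float:
    """Multiplier for the base lr: linear warmup, then cosine decay to 0."""
    if step < warmup_steps:
        return (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))


def smoothed_ce(logits: Sequence[float], target: int, smoothing: float) -> float:
    """Cross-entropy against a label-smoothed target distribution."""
    m = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    n = len(logits)
    loss = 0.0
    for k, x in enumerate(logits):
        # Smoothed target: (1 - smoothing) on the true class, smoothing / n on every class.
        q = smoothing / n + ((1.0 - smoothing) if k == target else 0.0)
        loss -= q * (x - log_z)
    return loss


def ohem_ce(
    batch_logits: Sequence[Sequence[float]],
    targets: Sequence[int],
    smoothing: float = 0.1,   # assumed value
    keep_ratio: float = 0.5,  # assumed value
) -> float:
    """Online hard example mining: average only the largest per-sample losses."""
    losses = sorted(
        (smoothed_ce(l, t, smoothing) for l, t in zip(batch_logits, targets)),
        reverse=True,
    )
    keep = max(1, int(len(losses) * keep_ratio))
    return sum(losses[:keep]) / keep


base_lr = 2.0 * 1e-5  # learning rate stated in the list above
lr_at_start = base_lr * warmup_cosine_factor(0, warmup_steps=100, total_steps=1000)
```

In PyTorch, a factor function like `warmup_cosine_factor` can be passed to `torch.optim.lr_scheduler.LambdaLR` as `lr_lambda`, and the same hard-example selection can be done on a per-sample loss tensor with `torch.topk`.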

more tricks that may help

  • simply change the model

  • add any 'word2vec' features

  • split the data into multiple pieces, train N BERT models,
    and use their outputs as features to train a tree-based
    model (LightGBM, XGBoost, ...)

  • for hard examples, maybe feed the nearest sentence
    (a pair with its label) into BERT as reference information

  • pseudo label

  • more open data (e.g. Ping An CHIP 2019)

  • ...
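As an illustration of the pseudo-label idea in the list above, the sketch below keeps only the unlabeled pairs on which a trained model is confident and assigns them hard labels. The function name and the 0.05 / 0.95 thresholds are assumptions chosen for the example, not values from this repository.

```python
from typing import List, Tuple


def select_pseudo_labels(
    probs: List[float], low: float = 0.05, high: float = 0.95
) -> List[Tuple[int, int]]:
    """Return (index, label) for unlabeled pairs with a confident prediction.

    probs[i] is the predicted probability that pair i is a match (label 1).
    """
    selected = []
    for i, p in enumerate(probs):
        if p >= high:
            selected.append((i, 1))  # confidently similar
        elif p <= low:
            selected.append((i, 0))  # confidently different
        # pairs in between are left unlabeled
    return selected


# Hypothetical model outputs for four unlabeled pairs.
probs = [0.99, 0.50, 0.02, 0.80]
pseudo = select_pseudo_labels(probs)
```

The selected pairs would then be added to the training set and the model retrained; tighter thresholds trade fewer extra examples for less label noise.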

dependencies

  • opencv-python
  • pytorch >= 1.4
  • pandas
  • yacs
  • scikit-learn (sklearn)

prepare

train

You may need to change the data paths; have a look at train.py and test.py.

export PYTHONPATH=./
sh train_pipeline.sh

ref

https://tianchi.aliyun.com/competition/entrance/231776/introduction?spm=5176.12281949.1003.4.21eb2448atCLQk
